ProteinNet: A standardized data set for machine learning of protein structure

March 26, 2018

Table of Contents

Protein structure prediction is one of the central problems of biochemistry. While the problem is well-studied within the biological and chemical sciences, it is less well represented within the machine learning community. We suspect this is due to two reasons: 1) a high barrier to entry for non-domain experts, and 2) lack of standardization in terms of training / validation / test splits that make fair and consistent comparisons across methods possible.

If these two issues are addressed, protein structure prediction can become a major source of innovation in ML research, alongside the canonical tasks of computer vision, NLP, and speech recognition. Much like ImageNet helped spur the development of new computer vision techniques, ProteinNet aims to facilitate ML research on protein structure by providing a standardized data set, and standardized training / validation / test splits, that any group can use with minimal effort to get started.

Source: github.com

ProteinNet: A standardized data set for machine learning of protein structure

Tags :

Share :

Related Posts

The 5 Deep Learning Frameworks Every Serious Machine Learner Should Be Familiar With

Automated front-end development using deep learning

This Unidentified Sea Creature Washed Up on a Georgia Beach, then Vanished