ProteinNet: A standardized data set for machine learning of protein structure

ProteinNet: A standardized data set for machine learning of protein structure

  • March 26, 2018
Table of Contents

ProteinNet: A standardized data set for machine learning of protein structure

Protein structure prediction is one of the central problems of biochemistry. While the problem is well-studied within the biological and chemical sciences, it is less well represented within the machine learning community. We suspect this is due to two reasons: 1) a high barrier to entry for non-domain experts, and 2) lack of standardization in terms of training / validation / test splits that make fair and consistent comparisons across methods possible.

If these two issues are addressed, protein structure prediction can become a major source of innovation in ML research, alongside the canonical tasks of computer vision, NLP, and speech recognition. Much like ImageNet helped spur the development of new computer vision techniques, ProteinNet aims to facilitate ML research on protein structure by providing a standardized data set, and standardized training / validation / test splits, that any group can use with minimal effort to get started.

Source: github.com

Share :
comments powered by Disqus

Related Posts

Using Machine Learning to Improve Streaming Quality at Netflix

Using Machine Learning to Improve Streaming Quality at Netflix

Network quality is difficult to characterize and predict. While the average bandwidth and round trip time supported by a network are well-known indicators of network quality, other characteristics such as stability and predictability make a big difference when it comes to video streaming. A richer characterization of network quality would prove useful for analyzing networks (for targeting/analyzing product improvements), determining initial video quality and/or adapting video quality throughout playback (more on that below).

Read More
Guide to Speech Recognition with Python

Guide to Speech Recognition with Python

Far from a being a fad, the overwhelming success of speech-enabled products like Amazon Alexa has proven that some degree of speech support will be an essential aspect of household tech for the foreseeable future. If you think about it, the reasons why are pretty obvious. Incorporating speech recognition into your Python application offers a level of interactivity and accessibility that few technologies can match.

Read More
Deploy TensorFlow models

Deploy TensorFlow models

Don’t follow the TensorFlow docs since they explain how to setup a docker image and compile TF serving that takes forever. We can do much better. Some guy made a docker image with everything already compile on it, so we are going to use that one.

Read More