AlphaFold workflow

Exciting news: AlphaFold code released

The protein folding problem

Predicting the 3D structures of proteins based solely on their amino acid sequences has been considered the holy grail of biology for decades. Even though we have managed to determine around 100,000 protein structures using techniques such as X-ray crystallography, cryogenic electron microscopy and nuclear magnetic resonance, this is merely a small fraction of the millions of known sequences.

Many computational approaches, based on either thermodynamics and kinetics or evolutionary information, have been developed over the years. However, until now, all of them have failed to live up to expectations, especially when no homologous structures were available. Everything changed when Google DeepMind presented their most recent work at the CASP14 assessment.

Deep neural networks to predict protein structure

DeepMind’s work on tertiary structure prediction resulted in AlphaFold, a novel machine learning approach that relies on deep neural networks trained to predict properties of a protein from its amino acid sequence. It infers the distances between pairs of residues and the angles between the chemical bonds that connect them, taking into account the geometric constraints of protein structure.
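To make the idea of pairwise residue distances more concrete, here is a minimal Python sketch. It is not AlphaFold code: the coordinates are made-up C-alpha positions for a toy four-residue chain, and the 8 Å contact cutoff is an assumed value chosen for illustration, not something taken from DeepMind's method.

```python
import math

# Hypothetical C-alpha coordinates (in angstroms) for a four-residue toy chain.
ca_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
]

CONTACT_CUTOFF = 8.0  # assumed threshold in angstroms, for illustration only


def distance_matrix(coords):
    """Return the symmetric matrix of Euclidean distances between residues."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def contact_pairs(dist, cutoff):
    """List residue index pairs (i < j) that lie closer than the cutoff."""
    n = len(dist)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if dist[i][j] < cutoff]


dist = distance_matrix(ca_coords)
contacts = contact_pairs(dist, CONTACT_CUTOFF)
print(contacts)
```

A structure predictor works in the opposite direction: it estimates a matrix like `dist` from the sequence and then searches for 3D coordinates that are consistent with it.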

AlphaFold uses an amino acid sequence to query several databases of protein sequences and constructs a multiple sequence alignment (MSA) of evolutionarily related proteins. Based on the alignment, it detects parts of the protein sequence that are prone to mutations and infers correlations between them. Together with pairwise distances extracted from structural databases, these data are fed into a machine learning algorithm to predict the final structure.
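The kind of signal an alignment carries can be illustrated with a small Python sketch. This is a classical, hand-written analysis and not what AlphaFold does internally, where the neural network learns such patterns itself. The four aligned sequences are invented, and the helper names (`entropy`, `mutual_information`) are our own. Per-column entropy flags positions that tolerate mutations, and mutual information between two columns flags positions that tend to mutate together.

```python
import math
from collections import Counter

# Toy alignment: made-up, equal-length sequences, one row per homologous protein.
msa = [
    "MKVLA",
    "MRVLA",
    "MKILG",
    "MRILG",
]


def column(alignment, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in alignment]


def entropy(symbols):
    """Shannon entropy (in bits) of the observed symbol distribution."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    col_i, col_j = column(alignment, i), column(alignment, j)
    joint = list(zip(col_i, col_j))
    return entropy(col_i) + entropy(col_j) - entropy(joint)


# High entropy means the position varies a lot across the aligned proteins.
variability = [entropy(column(msa, i)) for i in range(len(msa[0]))]

# Columns 2 and 4 change together (V with A, I with G); columns 1 and 2 do not.
coupled = mutual_information(msa, 2, 4)
independent = mutual_information(msa, 1, 2)
print(variability, coupled, independent)
```

Positions that co-vary in this way are often close in space in the folded protein, which is why the alignment is informative about structure.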

What’s next?

After the release of the CASP14 results, in which AlphaFold achieved a median backbone accuracy of 0.96 Å RMSD, the whole scientific community knew we were onto something. However, we did not get to try it firsthand until about two weeks ago, when DeepMind released the AlphaFold2 code.
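For readers unfamiliar with the metric, RMSD (root-mean-square deviation) measures the average distance between matching atoms of two structures. The Python sketch below is a simplified illustration: it assumes the two structures are already superimposed and have the same atoms in the same order, and the coordinates are invented. It is not the exact evaluation procedure used in CASP14.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical C-alpha positions (angstroms): every atom is displaced by 1 Å.
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
experimental = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

value = rmsd(predicted, experimental)
print(f"RMSD = {value:.2f} Å")
```

In practice, the two structures are first optimally superimposed (for example with the Kabsch algorithm) before the deviation is computed.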

At Nostrum Biodiscovery, we like to stay on top of things, so we have already started testing AlphaFold and validating its predictions, with the intention of using it in our drug discovery projects as soon as possible.

Source of the image: DeepMind blog

Highly accurate protein structure prediction for the human proteome