This week’s AI tip: how AlphaFold is delivering breakthroughs in computational biology
AlphaFold, an artificial intelligence program developed by DeepMind, a subsidiary of Alphabet, predicts 3D structures of proteins from their amino acid sequences with near-experimental accuracy.
It uses deep neural networks to combine evolutionary, physical, and geometric constraints on protein structure.
AlphaFold 3 expands capabilities to predict structures of protein complexes with DNA, RNA, post-translational modifications, and selected ligands and ions.
Core Architecture
AlphaFold is built on a deep learning architecture that relies heavily on attention, the mechanism behind transformers, a type of neural network that has revolutionized machine learning. Its structure module uses a specialized attention mechanism called Invariant Point Attention (IPA), designed specifically for working with three-dimensional structures: its output does not change when the whole protein is rotated or translated.
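The "invariant" in Invariant Point Attention refers to a geometric property: quantities such as distances between points stay the same when the whole structure is rotated or shifted. The sketch below is not IPA itself; it only checks that property on two made-up points, and every function name in it is hypothetical.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point around the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def rigid_transform(point, angle, shift):
    """Apply a rotation followed by a translation (a rigid motion)."""
    rotated = rotate_z(point, angle)
    return tuple(p + d for p, d in zip(rotated, shift))

def squared_distance(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Two made-up atom positions (arbitrary units).
atom_a = (1.0, 2.0, 3.0)
atom_b = (4.0, 0.0, -1.0)

before = squared_distance(atom_a, atom_b)
after = squared_distance(
    rigid_transform(atom_a, 0.7, (5.0, -2.0, 1.0)),
    rigid_transform(atom_b, 0.7, (5.0, -2.0, 1.0)),
)
# `before` and `after` agree up to floating-point rounding.
```

A score built only from such distances gives the same answer no matter how the protein is oriented, which is the kind of behavior an invariant attention mechanism needs.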
Multiple Sequence Alignment (MSA)
AlphaFold begins by generating a multiple sequence alignment (MSA) of the input protein sequence. The MSA is created by querying protein sequence databases to find similar proteins from different organisms. This alignment helps identify evolutionary relationships and co-evolutionary signals crucial for structure prediction.
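To make "co-evolutionary signal" concrete, here is a minimal sketch that measures how strongly two alignment columns vary together, using mutual information. This is a classical statistic chosen for illustration, not what AlphaFold computes internally (the network learns from the alignment directly), and the four-sequence alignment is invented.

```python
import math
from collections import Counter

def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Invented toy alignment: columns 1 and 3 always change together.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
]

coupled = mutual_information(toy_msa, 1, 3)    # columns that co-vary
conserved = mutual_information(toy_msa, 0, 1)  # column 0 never changes
```

Columns that mutate together across organisms score high, hinting that the two residues may be in contact in the folded protein; a fully conserved column carries no such signal.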
Pair Representations
The model creates “pair representations” for every pair of amino acid residues in the protein. These representations encode co-evolutionary relationships based on the MSA. This information is interpreted as relative positions and distances between amino acid residues.
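As a data structure, a pair representation is a table with one feature vector for every (i, j) residue pair. The sketch below builds such a table from a single simple feature, the clipped sequence offset i - j encoded as a one-hot vector. The function name and the small clipping value are assumptions made for illustration; the real model uses learned, much richer features.

```python
def relative_position_features(length, max_offset=2):
    """Return a length x length table of one-hot vectors encoding clipped (i - j)."""
    num_bins = 2 * max_offset + 1
    table = []
    for i in range(length):
        row = []
        for j in range(length):
            # Clip the offset so that all distant pairs share one bin per side.
            offset = max(-max_offset, min(max_offset, i - j))
            one_hot = [0] * num_bins
            one_hot[offset + max_offset] = 1
            row.append(one_hot)
        table.append(row)
    return table

# A 5-residue toy protein gives a 5 x 5 table of 5-dimensional vectors.
pair = relative_position_features(5)
```

Each later stage of the network can then read from and write to the entry for a given residue pair, which is where distance-like information accumulates.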
Evoformer Neural Network
AlphaFold uses a neural network called Evoformer to interpret and update both the MSA and pair representations. This network lets information flow continuously in both directions between the MSA and pair representations, refining the structural hypothesis.
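The two-way exchange can be sketched in a heavily simplified form. The code below assumes one scalar feature per entry and no learned weights, and all function names are hypothetical: the pair table biases an attention step over each MSA row, and the MSA then updates the pair table through an averaged outer product. The real Evoformer has more components than this toy shows.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def pair_to_msa(msa_rep, pair_rep):
    """Attention within each MSA row, with pair_rep[i][j] added as a bias."""
    updated = []
    for row in msa_rep:
        new_row = []
        for i in range(len(row)):
            logits = [row[i] * row[j] + pair_rep[i][j] for j in range(len(row))]
            weights = softmax(logits)
            new_row.append(sum(w * v for w, v in zip(weights, row)))
        updated.append(new_row)
    return updated

def msa_to_pair(msa_rep):
    """Average of row[i] * row[j] over all sequences (an outer-product mean)."""
    n_seq = len(msa_rep)
    length = len(msa_rep[0])
    return [
        [sum(row[i] * row[j] for row in msa_rep) / n_seq for j in range(length)]
        for i in range(length)
    ]

def evoformer_like_block(msa_rep, pair_rep):
    """One toy round: pair informs MSA, then MSA informs pair."""
    msa_rep = pair_to_msa(msa_rep, pair_rep)
    update = msa_to_pair(msa_rep)
    pair_rep = [
        [p + u for p, u in zip(pair_row, update_row)]
        for pair_row, update_row in zip(pair_rep, update)
    ]
    return msa_rep, pair_rep

# Two toy sequences of length 2, and an empty (all-zero) pair table.
msa_rep = [[1.0, 0.0], [0.0, 1.0]]
pair_rep = [[0.0, 0.0], [0.0, 0.0]]
new_msa, new_pair = evoformer_like_block(msa_rep, pair_rep)
```

Stacking many such blocks is what lets evidence from the alignment and evidence about residue pairs keep correcting each other.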
Structure Module
The structure module takes the updated pair representation, together with a per-residue representation of the input sequence drawn from the MSA representation, and generates a 3D structure. It first creates the protein backbone and then places and refines the positions of amino acid side chains.
Iterative Refinement
AlphaFold refines its prediction iteratively:
- The model exchanges information between its representations at every step, rather than only at the end.
- Hypotheses pass back and forth among AlphaFold’s components, improving prediction accuracy.
- A process called “recycling” feeds the MSA, pair representations, and 3D structure back into the neural network multiple times.
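The recycling idea in the last bullet can be sketched as a simple loop. This is a minimal illustration, assuming a hypothetical `model_step` function that accepts the previous output as an extra input; the toy step and the number of recycles are placeholders, not the real network or its settings.

```python
def predict_with_recycling(features, model_step, num_recycles=3):
    """Run model_step once, then num_recycles more times on its own output."""
    previous = None  # No earlier prediction exists on the first pass.
    for _ in range(num_recycles + 1):
        previous = model_step(features, previous)
    return previous

def toy_step(features, previous):
    """Stand-in for the network: move the estimate halfway to a target value."""
    estimate = 0.0 if previous is None else previous
    return estimate + 0.5 * (features["target"] - estimate)

# Each pass starts from the previous pass's answer and gets closer.
result = predict_with_recycling({"target": 8.0}, toy_step)
```

The same weights are reused on every pass, so recycling buys extra refinement without a larger model.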
AlphaFold represents a significant breakthrough in computational biology, offering unprecedented accuracy in protein structure prediction and potentially revolutionizing various fields of biological research and drug discovery.