Weaknesses of AlphaFold3: I discussed AlphaFold3 with Gemini (Google's AI chatbot) and had an interesting conversation. It understood the method well, even though the code has not been released yet, and explained the weaknesses effectively. Here are my summary notes after a few discussions: (1) General weaknesses as an AI-based system (2) AF3-specific weaknesses (3) Hallucinations —————————————————————— (1) AlphaFold3, like most AI models, shares some of the general weaknesses of AI: - Data Dependence: AlphaFold3 relies on the quality and quantity of the data it was trained on. If the training data has biases or gaps, the model's predictions will reflect them. For instance, if certain protein structures are underrepresented in the training data, AlphaFold3 may be less confident in its predictions for those types of proteins. - Limited Explainability: While AlphaFold3 provides a confidence score, it can be challenging to understand exactly how the model arrives at its final structure prediction. This lack of transparency makes it difficult to pinpoint potential errors or areas for improvement. —————————————————————— (2) Weaknesses specific to AlphaFold3 - Accuracy Limitations: While AlphaFold3 is incredibly powerful, its predictions are not always correct. Experimental validation is still crucial, especially for complex proteins or those with unusual structures. - Challenges with Interactions: Although AlphaFold3 has improved at predicting protein-protein interactions, this is still an ongoing area of development. The model may struggle with particularly complex or transient interactions. - Limited Data on Conformational Changes: The training data for AlphaFold3 likely emphasizes protein structures in their "resting" state. Data on the full range of conformations a protein might adopt under various conditions may be limited.
This can make it difficult for AlphaFold3 to predict how a protein changes shape in response to specific stimuli. —————————————————————— (3) Hallucinations Here are two potential examples of hallucinations (inaccuracies) in AlphaFold3's structure predictions: - Incorrect Folds for Proteins with Unseen Folds: AlphaFold3 is trained on a massive dataset of protein structures, but there will always be some proteins with unique folds absent from the training data. In these cases, AlphaFold3 might predict a structure that is physically impossible or highly unlikely for that protein, essentially hallucinating a fold it has never encountered. - Overconfident Predictions for Low-Quality Data: AlphaFold3 outputs a confidence score alongside each structure prediction. However, the model can be overconfident, especially for proteins with limited or noisy training data. This could lead researchers to believe a potentially inaccurate structure is correct. —————————————————————— (4) Someone else's example (reposting)
It seems rather convenient that two of the structures* left out of the Nature Portfolio paper for AlphaFold3 on RNA by Google DeepMind are ones where it does not do well (based on our multiple trials on the web server). Makes you wonder about many things, especially in the absence of code and data. Plots made by my PhD student Venkata Adury - you can generate them yourself and check our results. The AF3 RMSD will fluctuate a bit from trial to trial, and we report the result closest to the experimental structure from Rosetta. A true state-of-the-art comparison would be with Alchemy RNA, which does much better. * R1149 and R1156 from https://lnkd.in/euqaqrKe The PDBs were released on December 6, 2023, which is before the date the paper was received by Nature (December 13, 2023) and well before the paper was publicly posted in May 2024.
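Since the AF3 RMSD fluctuates from trial to trial, it may help to recall how RMSD against an experimental structure is typically computed: superpose the predicted coordinates onto the reference using the Kabsch algorithm, then take the root-mean-square deviation over the matched atoms. A minimal NumPy sketch (the coordinates below are an illustrative toy, not from the structures discussed):

```python
import numpy as np

def kabsch_rmsd(pred, ref):
    """RMSD between two (N, 3) coordinate sets after optimal
    rigid-body superposition (Kabsch algorithm)."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Center both coordinate sets on their centroids.
    P = pred - pred.mean(axis=0)
    Q = ref - ref.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix.
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Correct for a possible reflection (determinant = -1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ P.T).T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))

# Toy check: a rigidly rotated and translated copy superposes exactly.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                   [1.5, 1.5, 0.0], [0.0, 1.5, 1.5]])
rot = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = coords @ rot.T + np.array([5.0, -3.0, 2.0])
print(round(kabsch_rmsd(moved, coords), 6))  # 0.0
```

In practice, which atoms are matched (all heavy atoms vs. backbone only, and how residues are paired) also affects the reported RMSD, which is one reason independent trials can disagree.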