Start Date
5-12-2025 12:00 PM
End Date
5-12-2025 1:00 PM
Description
AlphaFold3 represents a major advance in protein structure prediction, yet its performance on intrinsically disordered proteins remains uncharacterized. We present the first systematic evaluation of AF3 on disordered systems, revealing a striking dichotomy. For monomers, AF3's pLDDT scores reliably predict disorder (MCC: 0.693), matching AlphaFold2 and rivaling dedicated predictors. This consistency across fundamentally different architectures confirms that disorder prediction emerges from training data, not model design. For multimers, the picture grows complex. Despite comparable aggregate performance (mean DockQ: 0.563 vs 0.571), AF3 and AF2 achieve these results through fundamentally different mechanisms.
Conventional structural features explain 58% of AF2's variance but only 42% of AF3's. Users cannot predict when AF3 will succeed or fail from interface properties alone. On disorder-to-order transitions (MFIB benchmark), both models perform equally well, successfully predicting final folded states. Yet seed variance analysis reveals AF3's failures are deterministic: the model converges to identical structures across independent runs, whether correct or incorrect, indicating that rigid structural priors override available information. Our findings establish AF3 as reliable for monomer disorder prediction but unpredictable for multimers. Architecture alone cannot overcome training data bias. Progress demands disorder-enriched datasets and ensemble sampling, not merely novel architectures.
AlphaFold3 and Intrinsically Disordered Proteins: Reliable Monomer Prediction, Unpredictable Multimer Performance
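The disorder-prediction score quoted above (MCC: 0.693) is the Matthews correlation coefficient over per-residue disorder labels. As a quick illustration of how such a score is computed from confusion-matrix counts (the counts below are invented for the example, not taken from the study):

```python
from math import sqrt

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns a value in [-1, 1]; 1 is perfect prediction, 0 is chance.
    """
    num = tp * tn - fp * fn
    den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # define MCC = 0 for a degenerate matrix

# Illustrative counts only: 70 residues correctly called disordered,
# 80 correctly called ordered, 20 false positives, 10 false negatives.
print(round(mcc(tp=70, tn=80, fp=20, fn=10), 3))  # → 0.671
```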