Фреймворк µFormer от Microsoft Research объединяет языковые модели белков и модули глубокого обучения для точного дизайна молекул. ИИ помогает преодолеть сложности ландшафта приспособленности, находя оптимальные белковые последовательности для нужд современной инженерии.
Protein engineering plays a pivotal role in designing novel proteins with desired functions, yet the rugged fitness landscape of proteins within their mutant space presents a major challenge, limiting the effective discovery of optimal sequences. To address this, we introduce µFormer, a deep learning framework that combines a pre-trained protein language model with custom-designed scoring modules to predict the mutational effects of proteins. µFormer achieves state-of-the-art performance in predicting high-order mutants, modeling epistatic interactions, and handling insertion. By integrating µFormer with a reinforcement learning framework, we enable efficient exploration of vast mutant spaces, encompassing trillions of mutation candidates, to design protein variants with enhanced activity. Remarkably, we successfully predicted mutants that exhibited a 2000-fold increase in bacterial growth rate due to enhanced enzymatic activity. These results highlight the effectiveness of our approach in identifying impactful mutations across diverse protein targets and fitness metrics, offering a powerful tool for optimizing proteins with significantly higher success rates.