NAEPro is a neural network for jointly designing the sequence and structure of proteins with specified functions. The model automatically identifies key functional motifs, allowing scientists to create new biomolecules with unprecedented accuracy.
Proteins are macromolecules responsible for essential functions in almost all living organisms. Designing plausible proteins with desired functions is therefore crucial. A protein's sequence and structure are strongly correlated, and together they determine its function. In this paper, we propose NAEPro, a model that jointly designs protein sequence and structure based on automatically detected functional and conserved sites. NAEPro is powered by an interleaving network of attention and equivariant layers, which captures both the global correlation across a whole sequence and the local influence of the nearest amino acids in three-dimensional (3D) space. Such an architecture facilitates effective yet economical message passing at two levels. We evaluate our model and several strong baselines on two protein datasets, β-lactamase and myoglobin. Experimental results show that our model achieves the highest binding affinity scores among the top-5, top-10 and top-30 candidates. These findings demonstrate the capability of our model to design functional proteins. In-depth analysis further confirms our model's ability to generate highly effective proteins capable of binding to their target metallocofactors.
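To make the interleaving idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It alternates a single-head self-attention layer over all residues (global sequence context) with an EGNN-style update restricted to each residue's k nearest neighbours in 3D (local spatial context). The function names, the weight shapes, the choice of `tanh`, and the EGNN-style coordinate update are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def global_attention(h, Wq, Wk, Wv):
    """Single-head self-attention over all residues (global context)."""
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return h + w @ v  # residual connection

def knn_indices(x, k):
    """Indices of the k nearest residues in 3D space, excluding self."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def equivariant_layer(h, x, k, We, Wx):
    """EGNN-style update over k nearest neighbours (assumed form).

    Features h are updated from rotation-invariant inputs only;
    coordinates x move along relative position vectors, so the
    output coordinates rotate and translate with the input.
    """
    nbr = knn_indices(x, k)                         # (n, k)
    diff = x[:, None, :] - x[nbr]                   # (n, k, 3)
    d2 = (diff ** 2).sum(-1, keepdims=True)         # (n, k, 1)
    h_i = np.repeat(h[:, None, :], k, axis=1)       # (n, k, d)
    feats = np.concatenate([h_i, h[nbr], d2], -1)   # (n, k, 2d + 1)
    m = np.tanh(feats @ We)                         # messages, (n, k, d)
    x_new = x + (diff * (m @ Wx)).mean(axis=1)      # equivariant shift
    h_new = h + m.sum(axis=1)                       # invariant update
    return h_new, x_new

def init_params(d, n_layers, rng, scale=0.1):
    """Random weights for illustration only (hypothetical helper)."""
    return [{
        "Wq": rng.normal(size=(d, d)) * scale,
        "Wk": rng.normal(size=(d, d)) * scale,
        "Wv": rng.normal(size=(d, d)) * scale,
        "We": rng.normal(size=(2 * d + 1, d)) * scale,
        "Wx": rng.normal(size=(d, 1)) * scale,
    } for _ in range(n_layers)]

def interleaved_forward(h, x, params, k=4):
    """Alternate global attention and local equivariant layers."""
    for p in params:
        h = global_attention(h, p["Wq"], p["Wk"], p["Wv"])
        h, x = equivariant_layer(h, x, k, p["We"], p["Wx"])
    return h, x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h0 = rng.normal(size=(10, 8))   # per-residue features
    x0 = rng.normal(size=(10, 3))   # per-residue 3D coordinates
    h1, x1 = interleaved_forward(h0, x0, init_params(8, 2, rng))
    print(h1.shape, x1.shape)
```

In this sketch the attention layer costs O(n²) in the number of residues but touches only features, while the spatial layer exchanges messages with just k neighbours per residue. That is one plausible reading of "effective yet economical message passing at two levels". Rotating or translating the input coordinates leaves the output features unchanged and transforms the output coordinates in the same way.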