Biology and AI

PeTriBERT

University of Montpellier, BionomeeX
Protein generation

PeTriBERT is a specialized AI model built to solve the protein "inverse folding" problem. It predicts the primary amino-acid sequence from a protein's three-dimensional structure, which significantly speeds up bioengineering research.

Proteins are biology's workhorses. Since the recent breakthrough of novel folding methods, the amount of available structural data has been increasing, closing the gap between data-driven sequence-based and structure-based methods. In this work, we focus on the inverse folding problem, which consists of predicting an amino-acid primary sequence from a protein's 3D structure. For this purpose, we introduce a simple Transformer model from Natural Language Processing augmented with 3D-structural data. We call the resulting model PeTriBERT: Proteins embedded in tridimensional representation in a BERT model. We train this small, 40-million-parameter model on more than 350,000 protein sequences retrieved from the newly available AlphaFoldDB database. Using PeTriBERT, we are able to generate in silico totally new proteins with a GFP-like structure. Nine out of ten of these GFP structural homologues show no resemblance to known sequences when BLASTed against the whole entry proteome database. This shows that PeTriBERT indeed captures protein folding rules and becomes a valuable tool for de novo protein design.
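To make the setup concrete, here is a minimal sketch of how one training example for a BERT-style model with structural input could be assembled: residues are replaced by a mask token, and each position also carries a 3D feature. Everything in it is an illustrative assumption rather than the authors' pipeline: the function names `center` and `build_example`, the mask id, the use of centered C-alpha coordinates as the structural feature, and the toy coordinates are all hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
MASK_ID = len(AMINO_ACIDS)             # hypothetical id for the [MASK] token
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def center(coords):
    """Translate coordinates so that their centroid sits at the origin."""
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    return [tuple(c[k] - centroid[k] for k in range(3)) for c in coords]


def build_example(sequence, coords, mask_prob, rng):
    """Return (input_ids, structure_features, labels) for one protein.

    labels[i] holds the true residue id where position i was masked, else -1.
    The structural features here are just centered coordinates (an assumption).
    """
    if len(sequence) != len(coords):
        raise ValueError("one 3D coordinate is required per residue")
    input_ids, labels = [], []
    for aa in sequence:
        true_id = TOKEN_ID[aa]
        if rng.random() < mask_prob:
            input_ids.append(MASK_ID)   # hide the residue from the model
            labels.append(true_id)      # ...and ask the model to recover it
        else:
            input_ids.append(true_id)
            labels.append(-1)           # position not scored
    return input_ids, center(coords), labels


# Toy data: a 5-residue fragment with made-up coordinates on a straight line.
seq = "MSKGE"
xyz = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
       (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]

# mask_prob=1.0 hides every residue, so only the structure is visible:
# this corresponds to the inverse-folding setting described above.
ids, feats, labels = build_example(seq, xyz, mask_prob=1.0, rng=random.Random(0))
```

With a lower `mask_prob`, the same function produces ordinary masked-language-model examples in which part of the sequence stays visible alongside the structure.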
