PDFll is a novel series of AI predictors designed to analyze the structure and function of intrinsically disordered proteins. Drawing on the concept of the "language of life", the model delivers high-accuracy predictions, which is critical for modern bioinformatics.
Identifying intrinsically disordered proteins and their functional roles depends largely on computational predictors, so these tools must meet a high standard of accuracy. We introduce a novel series of computational predictors, termed PDFll (Predictors of Disorder and Function of proteins from the Language of Life), designed to predict protein disorder and the associated functional roles accurately from protein sequences. PDFll is built in two steps. First, it uses large-scale protein language models (pLMs) trained on billions of protein sequences. Second, the embeddings derived from the pLMs are fed into compact deep-learning models that generate the predictions. These predictions outperform those of existing state-of-the-art predictors, particularly predictors of disorder and function that do not use evolutionary information.
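The two-step design described above can be illustrated with a minimal sketch. This is not the PDFll implementation: `toy_plm_embed` is a hypothetical stand-in for a pretrained pLM, `predict_disorder` is an assumed per-residue logistic head, and the weights are random and untrained, so the scores carry no biological meaning. The sketch only shows the data flow from sequence to per-residue embeddings to per-residue scores.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 8  # assumption: real pLM embeddings are far larger


def toy_plm_embed(sequence, dim=EMBED_DIM):
    """Hypothetical placeholder for step 1: one vector per residue.

    A real pipeline would call a pretrained protein language model here.
    This stand-in maps each amino acid to a fixed pseudo-random vector.
    """
    table = {}
    for index, aa in enumerate(AMINO_ACIDS):
        rng = random.Random(index)  # seeded so the lookup table is reproducible
        table[aa] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    unknown = [0.0] * dim  # non-standard residues get a zero vector
    return [table.get(aa, unknown) for aa in sequence.upper()]


def predict_disorder(embeddings, weights, bias):
    """Assumed step 2: a per-residue logistic head over the embeddings."""
    scores = []
    for vector in embeddings:
        logit = sum(w * x for w, x in zip(weights, vector)) + bias
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


# Untrained, randomly initialised parameters (illustration only).
_rng = random.Random(0)
WEIGHTS = [_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
BIAS = 0.0

sequence = "MKTAYIAKQR"  # arbitrary example sequence
scores = predict_disorder(toy_plm_embed(sequence), WEIGHTS, BIAS)
print(len(scores))  # one score per residue
```

Because the placeholder embedding ignores context, identical residues receive identical scores here. A real pLM produces context-dependent embeddings, so the same amino acid can score differently at different positions.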