Модель PLTNUM специализируется на предсказании периода полураспада белков, используя архитектуру языковых моделей ИИ. Она демонстрирует высокую точность на данных различных клеточных линий, помогая ученым лучше понимать динамику белков в организме.
We developed a protein half-life prediction model, PLTNUM, based on a protein language model using an extensive dataset of protein sequences and protein half-lives from the NIH3T3 mouse embryo fibroblast cell line as a training set. PLTNUM achieved an accuracy of 71% on validation data and showed robust performance with an ROC of 0.73 when applied to a human cell line dataset. By incorporating Shapley Additive Explanations (SHAP) into PLTNUM, we identified key factors contributing to shorter protein half-lives, such as cysteine-containing domains and intrinsically disordered regions. Using SHAP values, PLTNUM can also predict potential degron sequences that shorten protein half-lives. This model provides a platform for elucidating the sequence dependency of protein half-lives, while the uncertainty in predictions underscores the importance of biological context in influencing protein half-lives.