A specialized variant of the model, fine-tuned on the classic Penn Treebank (PTB) corpus. This model is optimized for linguistic research and for tasks where a deep understanding of the syntactic structure of language is critical.
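Cards like this usually include a quick-start snippet. The sketch below loads the checkpoint through the Hugging Face transformers pipeline; note that the repository id "your-org/gpt-neo-2.7B-ptb" is a placeholder, since the card does not name the actual fine-tuned checkpoint.

```python
# Minimal usage sketch with the Hugging Face transformers library.
# The model id below is a placeholder -- substitute the actual
# repository name of the PTB fine-tuned checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="your-org/gpt-neo-2.7B-ptb")

prompt = "The Penn Treebank annotates"
output = generator(prompt, max_length=50, do_sample=True, temperature=0.8)
print(output[0]["generated_text"])
```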
Model Description

GPT-Neo 2.7B is a transformer model designed using EleutherAI's replication of the GPT-3 architecture. GPT-Neo refers to the class of models, while 2.7B represents the number of parameters of this particular pre-trained model.

Training data

GPT-Neo 2.7B was trained on the Pile, a large-scale curated dataset created by EleutherAI for the purpose of training this model.

Training procedure

This model was trained for 420 billion tokens over 400,000 steps. It was trained as an autoregressive language model with causal masking, using cross-entropy loss.
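To make the objective concrete: an autoregressive language model predicts each token from the tokens before it, and the loss is the cross-entropy between the predicted distribution and the actual next token. The following is a minimal sketch of that shifted-logits computation, mirroring what transformers causal-LM classes do internally when labels are passed; it is an illustration, not the actual training code.

```python
# Sketch of the causal-LM cross-entropy objective: predict token t+1
# from tokens <= t, then average cross-entropy over all positions.
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    # logits: (batch, seq_len, vocab_size); input_ids: (batch, seq_len)
    shift_logits = logits[:, :-1, :]   # predictions for positions 0..n-2
    shift_labels = input_ids[:, 1:]    # targets are the next tokens 1..n-1
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )

# Toy check with random logits over a 100-token vocabulary.
logits = torch.randn(2, 16, 100)
tokens = torch.randint(0, 100, (2, 16))
print(causal_lm_loss(logits, tokens))
```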