A compact but powerful language model trained with a distinctive iterative method, the "Tootsie Roll" process. Marin 8B delivers strong results on code generation and complex mathematical reasoning, demonstrating that an innovative training approach can matter more than simply scaling up parameters.
The "Tootsie Roll" Process

A core premise of the Marin 8B run was that we didn't fully know the best recipe, so we just started training with what we had and planned to adapt along the way. Internally, we referred to this as the "Tootsie" process, a reference to Tootsie Rolls, which use a "graining" process where each day's batch contains a bit of the previous day's, seeding crystallization or something. (We are not food scientists.) This is admittedly a bit of a strained metaphor, but the idea was that we'd keep folding in new data, training techniques, and whatever else as the training process went on. (As it would turn out, dear reader, we would often change more than the data...)

Model Basics

Model Size

We decided to build a roughly 7-8 billion parameter model mostly out of pragmatism: we initially only had reserved capacity to train a model of that size for long enough.

Architecture

We settled on the Llama architecture for the usual reasons: it has been shown to work well, it is easy to plug into existing inference stacks, no one ever got fired for buying IBM, etc. We used the same settings as Llama 3.1 8B.
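For concreteness, the Llama 3.1 8B settings mentioned above can be sketched as a small config. The numbers are the published Llama 3.1 8B architecture hyperparameters; the class and field names here are illustrative, not the actual Marin training config:

```python
from dataclasses import dataclass

@dataclass
class LlamaArchConfig:
    """Sketch of the Llama 3.1 8B architecture hyperparameters (illustrative names)."""
    hidden_size: int = 4096          # model (residual stream) dimension
    intermediate_size: int = 14336   # SwiGLU MLP hidden width
    num_layers: int = 32             # transformer blocks
    num_attention_heads: int = 32    # query heads
    num_kv_heads: int = 8            # key/value heads (grouped-query attention)
    vocab_size: int = 128256         # Llama 3 tokenizer vocabulary
    rope_theta: float = 500_000.0    # RoPE base frequency

    @property
    def head_dim(self) -> int:
        # per-head dimension: 4096 / 32 = 128
        return self.hidden_size // self.num_attention_heads

    @property
    def kv_group_size(self) -> int:
        # query heads sharing each KV head: 32 / 8 = 4
        return self.num_attention_heads // self.num_kv_heads

cfg = LlamaArchConfig()
print(cfg.head_dim, cfg.kv_group_size)  # 128 4
```

Grouped-query attention (8 KV heads shared by 32 query heads) is what keeps the KV cache small relative to a full multi-head design, one of the reasons this architecture slots easily into existing inference stacks.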