Pixtral 12B: A Multimodal AI Model from Mistral AI

Q: Who developed Pixtral 12B?

Pixtral 12B was developed by Mistral AI (France).

Q: What tasks does Pixtral 12B handle?

Text generation, question answering, visual question answering, and code generation.

// description

Pixtral 12B from Mistral AI is a multimodal model that handles text, code, and images of varying sizes and aspect ratios. Its new 400M-parameter vision encoder helps it answer complex visual questions and work with visual context.

// abstract

- Natively multimodal, trained with interleaved image and text data
- Strong performance on multimodal tasks, excels in instruction following
- Maintains state-of-the-art performance on text-only benchmarks
- New 400M parameter vision encoder trained from scratch
- 12B parameter multimodal decoder based on Mistral Nemo
- Supports variable image sizes and aspect ratios
- Supports multiple images in the long context window of 128k tokens
- License: Apache 2.0

Pixtral is trained to understand both natural images and documents, achieving 52.5% on the MMMU reasoning benchmark, surpassing a number of larger models. The model shows strong abilities in tasks such as chart and figure understanding, document question answering, multimodal reasoning and instruction following.

Pixtral is able to ingest images at their natural resolution and aspect ratio, giving the user flexibility on the number of tokens used to process an image. Pixtral is also able to process any number of images in its long context window of 128k tokens. Unlike previous open-source models, Pixtral does not compromise on text benchmark performance to excel in multimodal tasks.
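Because images are ingested at their native resolution, the number of tokens an image consumes depends on its size, and the 128k-token window bounds how many images fit alongside the text. The sketch below is a rough budgeting aid, not Pixtral's actual tokenizer: it assumes one token per square patch, and the default patch size of 16 pixels, the exact window value of 128,000, and the helper names `image_tokens` and `fits_in_context` are all illustrative assumptions that do not come from this page. Any separator or end-of-image tokens the real model may add are ignored.

```python
import math

# Assumed exact value for the "128k tokens" context window mentioned above.
CONTEXT_WINDOW = 128_000


def image_tokens(width: int, height: int, patch: int = 16) -> int:
    """Estimate tokens for one image kept at its native resolution.

    Assumes one token per patch x patch tile. The patch size is a
    hypothetical default, not a figure stated on this page.
    """
    if width <= 0 or height <= 0 or patch <= 0:
        raise ValueError("width, height and patch must be positive")
    cols = math.ceil(width / patch)   # tiles across
    rows = math.ceil(height / patch)  # tiles down
    return rows * cols


def fits_in_context(sizes, text_tokens=0, patch=16, window=CONTEXT_WINDOW):
    """Return (estimated total tokens, whether the total fits in the window).

    `sizes` is an iterable of (width, height) pairs in pixels.
    """
    total = text_tokens + sum(image_tokens(w, h, patch) for w, h in sizes)
    return total, total <= window


if __name__ == "__main__":
    # Ten 1024x1024 images plus a 1,000-token text prompt.
    total, ok = fits_in_context([(1024, 1024)] * 10, text_tokens=1000)
    print(total, ok)
```

Under these assumptions, downscaling an image before sending it trades visual detail for a smaller token footprint, which is the flexibility the abstract describes.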

// similar models

HyperCLOVA X SEED 32B Think, NAVER, 32.0B

Nova 2, Amazon Web Services (AWS)

Gemini 3 Pro, Google DeepMind