
Mistral Moderation

Mistral AI
Text classification

Mistral Moderation is a specialized content-classification API designed to make AI use safe. The model relies on the same safeguards as the Le Chat service, letting developers flexibly configure ethical filters and protect their applications from undesirable content.

Safety plays a key role in making AI useful. At Mistral AI, we believe that system-level guardrails are critical to protecting downstream deployments. That's why we are releasing a new content moderation API. It is the same API that powers the moderation service in Le Chat, and we are launching it to empower our users to utilize and tailor this tool to their specific applications and safety standards.

Over the past few months, we've seen growing enthusiasm across the industry and research community for new LLM-based moderation systems, which can help make moderation more scalable and robust across applications. Our model is an LLM classifier trained to classify text inputs into nine categories. We are releasing two endpoints: one for raw text and one for conversational content. Because undesirable content is highly context-specific, we've trained the model to classify the last message of a conversation within its conversational context. Check out the technical documentation for more information.

The model is natively multilingual and is trained in particular on Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish.
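As a rough illustration of how the raw-text endpoint can be called over plain HTTP, the sketch below builds a request for it. The endpoint path, model name, and response shape are assumptions drawn from Mistral's public documentation, not guaranteed by this article; substitute your own API key.

```python
# Minimal sketch of a raw-text moderation call to the Mistral API.
# ASSUMPTIONS: endpoint path, model name, and payload/response field
# names follow Mistral's public docs and may change.

API_URL = "https://api.mistral.ai/v1/moderations"


def build_moderation_request(texts, api_key, model="mistral-moderation-latest"):
    """Return (headers, payload) for a raw-text moderation request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # One classification result is returned per input string.
    payload = {"model": model, "input": texts}
    return headers, payload


# Usage (requires a real API key and network access):
# import requests
# headers, payload = build_moderation_request(["some user text"], "YOUR_KEY")
# resp = requests.post(API_URL, headers=headers, json=payload)
# for result in resp.json()["results"]:
#     flagged = [name for name, hit in result["categories"].items() if hit]
#     print(flagged)
```

The conversational endpoint works analogously but accepts a list of chat messages, so the classifier can judge the last message in context, as described above.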
