Codestral Embed from Mistral AI is the company's first embedding model built specifically for source code. It performs strongly on retrieval and code-completion tasks, outperforming comparable models from OpenAI and Cohere.
We are excited to release Codestral Embed, our first embedding model specialized for code. It performs especially well for retrieval use cases on real-world code data. Codestral Embed significantly outperforms leading code embedders in the market today: Voyage Code 3, Cohere Embed v4.0 and OpenAI’s large embedding model.

Codestral Embed can output embeddings with different dimensions and precisions, and the figure below illustrates the trade-offs between retrieval quality and storage costs. Codestral Embed with dimension 256 and int8 precision still performs better than any model from our competitors. The dimensions of our embeddings are ordered by relevance. For any integer target dimension n, you can choose to keep the first n dimensions for a smooth trade-off between quality and cost.
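
To make the "keep the first n dimensions" idea concrete, here is a minimal client-side sketch in plain Python. It assumes you already have a full-precision embedding as a list of floats. The helper names (`truncate_embedding`, `quantize_int8`, `storage_bytes`) are hypothetical. Re-normalizing after truncation and the symmetric int8 scaling are assumptions chosen for illustration, not a description of how Codestral Embed produces its reduced outputs.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], n: int) -> List[float]:
    """Keep the first n dimensions, then rescale to unit length.

    Re-normalizing is an assumption: it keeps dot products comparable
    to cosine similarity after truncation.
    """
    if not 1 <= n <= len(vec):
        raise ValueError("n must be between 1 and len(vec)")
    head = list(vec[:n])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to rescale
    return [x / norm for x in head]


def quantize_int8(vec: Sequence[float]) -> List[int]:
    """Symmetric int8 quantization (illustrative scheme only).

    Assumes components lie in [-1, 1], which holds for unit-length
    vectors; values outside that range are clamped.
    """
    return [max(-127, min(127, round(x * 127))) for x in vec]


def storage_bytes(dimension: int, bytes_per_value: int) -> int:
    """Raw storage per vector, ignoring any index overhead."""
    return dimension * bytes_per_value


# Toy 3-dimensional vector standing in for a real embedding.
full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)   # first two dimensions, unit length
compact = quantize_int8(short)        # one signed byte per dimension
```

Under these assumptions, a 256-dimensional int8 vector takes `storage_bytes(256, 1)` = 256 bytes, against 1024 bytes for the same dimension stored as float32, which is the kind of quality-versus-cost trade-off the paragraph above describes.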