Amazon Transcribe is an advanced speech recognition service that now runs on a multi-billion parameter neural network model. The system supports more than 100 languages, letting you quickly turn audio into text with high accuracy across business use cases.
Amazon Transcribe is a fully managed automatic speech recognition (ASR) service that makes it straightforward for you to add speech-to-text capabilities to your applications. Today, we are happy to announce a next-generation system, powered by a multi-billion parameter speech foundation model, that expands automatic speech recognition to over 100 languages. In this post, we discuss some of the benefits of this system, how companies are using it, and how to get started. We also provide an example of the transcription output below.

Amazon Transcribe’s speech foundation model is trained using best-in-class, self-supervised algorithms to learn the inherent universal patterns of human speech across languages and accents. It is trained on millions of hours of unlabeled audio data from over 100 languages. The training recipes are optimized through smart data sampling to balance the training data between languages, ensuring that traditionally under-represented languages also reach high accuracy levels.
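
To make the idea of balancing training data between languages more concrete, here is a minimal sketch of one common approach from multilingual training in general: exponent-scaled (sometimes called temperature-based) sampling. This post does not describe the actual sampling recipe used for Amazon Transcribe, so treat this as an illustration under assumptions, not the service's method. The function name `balanced_sampling_weights`, the `alpha` parameter, and the hours per language are all hypothetical.

```python
def balanced_sampling_weights(hours_by_language, alpha=0.5):
    """Return per-language sampling probabilities proportional to (share ** alpha).

    alpha = 1.0 keeps the natural data proportions; alpha = 0.0 samples every
    language uniformly; values in between shift sampling probability toward
    under-represented languages. (Illustrative only; not Amazon's recipe.)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    total = sum(hours_by_language.values())
    if total <= 0:
        raise ValueError("total hours must be positive")

    # Raise each language's share of the data to the power alpha...
    scaled = {
        lang: (hours / total) ** alpha
        for lang, hours in hours_by_language.items()
    }
    # ...then renormalize so the probabilities sum to 1.
    norm = sum(scaled.values())
    return {lang: value / norm for lang, value in scaled.items()}


# Hypothetical audio hours per language (made-up numbers for illustration).
hours = {"en": 900_000, "hi": 90_000, "sw": 10_000}
weights = balanced_sampling_weights(hours, alpha=0.5)

for lang, weight in weights.items():
    natural_share = hours[lang] / sum(hours.values())
    print(f"{lang}: natural share {natural_share:.3f}, sampling weight {weight:.3f}")
```

With `alpha` below 1, the lowest-resource language in this toy example is sampled more often than its raw share of the data would suggest, while the highest-resource language is sampled less often. This is the general trade-off that any balancing scheme of this kind has to tune.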