ML//Multimodal//Whisper

2026-02-27

OpenAI's automatic speech recognition (ASR) model, released in 2022. Trained on 680K hours of weakly supervised multilingual audio.

Robust to accents, background noise, and technical jargon. Often described as the "GPT-3 moment" for speech recognition.

Distil-Whisper (Hugging Face) and large-v3-turbo (OpenAI): faster production variants. Weights are open, but there are no papers for large-v2 or large-v3.

Became the default ASR backbone in most voice AI pipelines.
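
A minimal sketch of how Whisper might slot into such a pipeline, assuming the `openai-whisper` package and `ffmpeg` are installed. The helper names `pick_checkpoint` and `transcribe` are hypothetical and exist only for illustration. Only `whisper.load_model` and `model.transcribe` come from the package.

```python
# Hypothetical helpers for illustration; only whisper.load_model and
# model.transcribe are part of the openai-whisper package.


def pick_checkpoint(prefer_speed: bool) -> str:
    """Return a checkpoint name: 'turbo' for low latency, 'large-v3' otherwise."""
    return "turbo" if prefer_speed else "large-v3"


def transcribe(audio_path: str, prefer_speed: bool = True) -> str:
    """Transcribe one audio file and return the recognized text."""
    import whisper  # imported lazily so pick_checkpoint works without the package

    # Downloads the weights on first use, then loads them from the local cache.
    model = whisper.load_model(pick_checkpoint(prefer_speed))
    # The result is a dict that includes "text", "segments", and "language".
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

Calling `transcribe("meeting.wav")` (a placeholder path) would load the turbo checkpoint and return the transcript as a single string. Passing `prefer_speed=False` switches to the larger, slower checkpoint.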