March 28, 2026 | Agent-Operable | Open Source | API

Cohere Transcribe — Open-Source Speech Model for Agent Voice Pipelines

Cohere launched Transcribe, its first open-source automatic speech recognition (ASR) model, on March 26. The launch was covered by TechCrunch, and the 2B-parameter model reached the top of the Hugging Face Open ASR leaderboard with a word error rate (WER) of 5.42%, lower than any competing model, including Zoom Scribe v1, IBM Granite 4.0 1B, ElevenLabs Scribe v2, and Qwen3-ASR-1.7B.
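
For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the leaderboard's scoring code; the function name is illustrative, and published benchmarks usually normalize casing and punctuation first, which this sketch does not.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.2%}")
```

A score of 5.42% therefore corresponds to roughly five or six word-level errors per hundred reference words.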

Transcribe supports 14 languages and is small enough to run on consumer-grade GPUs for self-hosting. In human evaluations, it achieved a 61% win rate against competing models on accuracy, coherence, and usability. The model is available under an open-source license on Hugging Face and, for free, through Cohere's API.
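
The claim about consumer-grade GPUs can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes exactly 2 billion parameters and counts weights only. The precision Transcribe actually ships in is not stated here, so the byte widths are assumptions, and activations and runtime overhead add to the total.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold model weights, in GiB (weights only)."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 2e9  # "2B-parameter" taken as exactly 2 billion (assumption)

# Common storage precisions and their per-parameter byte widths.
for name, width in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name:>9}: {weight_memory_gib(NUM_PARAMS, width):.2f} GiB")
```

At 16-bit precision the weights come to under 4 GiB, which is within the memory of many consumer cards.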

Critically, Cohere plans to integrate Transcribe into North, its enterprise agent orchestration platform. This positions Transcribe not just as a standalone ASR model but as the voice input layer for agentic workflows — enabling agents to process spoken instructions, transcribe meetings for action item extraction, and power voice-first agent interfaces.

With Mistral's Voxtral TTS launching the same week for voice output and Gemini 3.1 Flash Live adding real-time audio processing, March 2026 is shaping up as the month the full voice stack for AI agents came together: hear (Transcribe), think (LLMs), speak (Voxtral TTS).
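
The hear, think, speak split maps naturally onto three swappable components. The sketch below is purely illustrative: every class and method name is hypothetical, the stub implementations stand in for real models, and none of it reflects the actual Transcribe, North, or Voxtral interfaces.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    """Hear: audio in, transcript out (the role Transcribe would fill)."""
    def transcribe(self, audio: bytes) -> str: ...


class Reasoner(Protocol):
    """Think: transcript in, reply out (the role an LLM would fill)."""
    def respond(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    """Speak: reply in, audio out (the role a TTS model would fill)."""
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoiceAgent:
    ears: SpeechToText
    brain: Reasoner
    voice: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        transcript = self.ears.transcribe(audio_in)
        reply = self.brain.respond(transcript)
        return self.voice.synthesize(reply)


# Stand-in stubs so the sketch runs without any model or network access.
class FakeSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")  # pretend the audio bytes are already text


class FakeLLM:
    def respond(self, text: str) -> str:
        return f"Acknowledged: {text}"


class FakeTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")  # pretend the encoded text is audio


agent = VoiceAgent(ears=FakeSTT(), brain=FakeLLM(), voice=FakeTTS())
audio_out = agent.handle_turn(b"schedule a meeting for Friday")
```

Keeping each stage behind its own interface means any one of the three can be replaced, for example a self-hosted ASR model swapped for a hosted API, without touching the other two.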

Hugging Face: https://huggingface.co/CohereLabs/cohere-transcribe-03-2026
Blog: https://cohere.com/blog/transcribe