March 5, 2026 · 7 min
Native STT vs. Deepgram: The Cost-Quality Trade-off
TAMSIV's voice pipeline relies on Deepgram. Excellent quality, impeccable French support. The problem? Every second of audio has a price.
Native STT as an alternative
Every smartphone has a built-in speech recognition engine. It's free, local, and fast. It is less accurate than Deepgram on fast French speech and doesn't produce reliable punctuation. But for dictating a short task, it's sufficient.
The dual architecture
Two interchangeable modes, configurable from the admin:
- Cloud (Deepgram): Audio via WebSocket, integrated VAD, high quality. Pro/Team plans.
- Native: On-device recognition, no data leaves the phone. Free plan.
The frontend exposes a unified interface. Components don't know which engine is running.
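To make the idea concrete, here is a minimal sketch of what such a unified interface could look like. Every name in it (`SpeechEngine`, `AdminConfig`, `createEngine`, the two stub classes) is hypothetical and chosen for illustration; it is not TAMSIV's actual code, and the stubs stand in for the real Deepgram WebSocket client and the platform's speech API.

```typescript
// Hypothetical sketch: none of these names come from the TAMSIV codebase.

type Plan = "free" | "pro" | "team";
type SttMode = "cloud" | "native";
type TranscriptHandler = (text: string, isFinal: boolean) => void;

// The only contract UI components see, whichever engine is behind it.
interface SpeechEngine {
  readonly mode: SttMode;
  start(onTranscript: TranscriptHandler): Promise<void>;
  stop(): Promise<void>;
}

// Stub for the cloud path (audio streamed to Deepgram over a WebSocket).
class CloudEngine implements SpeechEngine {
  readonly mode: SttMode = "cloud";
  async start(_onTranscript: TranscriptHandler): Promise<void> {
    // A real implementation would open the socket and stream audio here.
  }
  async stop(): Promise<void> {
    // A real implementation would close the socket here.
  }
}

// Stub for the on-device path (no audio leaves the phone).
class NativeEngine implements SpeechEngine {
  readonly mode: SttMode = "native";
  async start(_onTranscript: TranscriptHandler): Promise<void> {
    // A real implementation would call the platform speech recognizer here.
  }
  async stop(): Promise<void> {
    // A real implementation would stop the recognizer here.
  }
}

// Assumed shape of the admin setting: one mode per plan.
type AdminConfig = Record<Plan, SttMode>;

const defaultConfig: AdminConfig = {
  free: "native",
  pro: "cloud",
  team: "cloud",
};

// Single place where the engine is chosen; callers only get a SpeechEngine.
function createEngine(plan: Plan, config: AdminConfig = defaultConfig): SpeechEngine {
  return config[plan] === "cloud" ? new CloudEngine() : new NativeEngine();
}
```

A component would call `createEngine(user.plan)` once and then only use `start` and `stop`. Switching a plan to the other mode is a change to the config object, not to the components.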
Real-world comparison
- Quiet environment: native 92%, Deepgram 98%
- With background noise: native 75%, Deepgram 94%
- French with accents: Deepgram significantly better
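The post doesn't say how these percentages were measured. One common way to score a transcript is word accuracy, defined as 1 minus the word error rate (WER). The sketch below assumes that metric; the function names are hypothetical. Punctuation is stripped before comparing, so the native engine isn't penalized twice for its missing punctuation.

```typescript
// Hypothetical sketch: assumes word accuracy (1 - WER) as the metric,
// which the article does not confirm.

// Lowercase, drop common punctuation, split on whitespace.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[.,;:!?"()]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

// WER = word-level edit distance (substitutions, insertions, deletions)
// divided by the number of words in the reference transcript.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) {
    return hyp.length === 0 ? 0 : 1;
  }

  // prev[j] = distance between the first i-1 reference words
  // and the first j hypothesis words.
  let prev: number[] = [];
  for (let j = 0; j <= hyp.length; j++) {
    prev.push(j);
  }

  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
      const deletion = prev[j] + 1;
      const insertion = curr[j - 1] + 1;
      curr.push(Math.min(substitution, deletion, insertion));
    }
    prev = curr;
  }

  return prev[hyp.length] / ref.length;
}

// Clamped at 0 because WER can exceed 1 when the hypothesis is much longer.
function wordAccuracy(reference: string, hypothesis: string): number {
  return Math.max(0, 1 - wordErrorRate(reference, hypothesis));
}
```

For example, if the reference is "appeler le plombier demain matin" and the engine returns "appeler le plombier demain", one word out of five is missing, so the WER is 0.2 and the word accuracy is 80%.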
The verdict: Deepgram is superior. But for a free plan, native is acceptable. Users who want the best quality have another reason to go Pro. Win-win.