The World's Largest
Catalog of Enterprise Audio
Dual-Channel. PII Redacted. Human-In-The-Loop Transcripts.
70M
CALLS
8M
HOURS
30
LANGUAGES
200+
ACCENTS
Legal Industry
500k hours · English
• Customer conversations from the legal services industry in the United States.
• Topics include business formation, trademarks and IP, estate planning, and tax.
• Metadata includes CSAT scores.
View Dataset
Financial Industry
5M hours · 20 languages
• Multilingual customer conversations from a global financial platform across 160 countries.
• Topics include transfers, fraud and disputes, identity verification, FX, banking, and credit cards.
• Languages include Russian, Mandarin, Arabic, Japanese, and Hindi.
View Dataset
The Language Corpus
8M hours · 30 languages · 200+ accents
• Every major language and many minor ones, including Catalan, Bengali, Tagalog, Greek, Bulgarian, Creole, and more.
• Variety of accents within a single language – e.g. 35 American regional/local accents.
• Wide variety of subject matter.
View Dataset
Tech Support
750k hours · English & Spanish
• Technical support conversations from the consumer electronics industry.
• Multi-turn procedural troubleshooting in the casual register of real consumers.
Coming Soon
©️ 2026 Sumo AI


