The World's Largest
Catalog of Enterprise Audio

Dual-Channel. PII Redacted. Human-In-The-Loop Transcripts.

70M

CALLS

8M

HOURS

30

LANGUAGES

200+

ACCENTS

Legal Industry

500k hours · English

Customer conversations from the legal services industry in the United States.
Topics include business formation, trademarks and IP, estate planning, and tax.
Metadata includes CSAT scores.

View Dataset

Financial Industry

5M hours · 20 languages

Multilingual customer conversations from a global financial platform across 160 countries.
Topics include transfers, fraud and disputes, identity verification, FX, banking, and credit cards.
Languages include Russian, Mandarin, Arabic, Japanese, and Hindi.

View Dataset

The Language Corpus

8M hours · 30 languages · 200+ accents

Every major language and many minor ones – Catalan, Bengali, Tagalog, Greek, Bulgarian, Creole & more.
A variety of accents within a single language – e.g. 35 American regional/local accents.
A wide variety of subject matter.

View Dataset

Tech Support

750k hours · English & Spanish

Technical support conversations from the consumer electronics industry.
Multi-turn procedural troubleshooting in the casual register of real consumers.

Coming Soon

©️ 2026 Sumo AI
