Datasets
Audio, transcript, and labeled training data collections
Sales Transcripts Q1 · v3
Transcripts
1,247 items · 342 MB · Created Jan 15
Label distribution:
Train: 70% · Test: 20% · Val: 10%
Quality: 92%
4/4 domains covered
VoiceBot Training Audio · v2
Audio
856 items · 2.1 GB · Created Feb 1
Label distribution:
Train: 80% · Test: 15% · Val: 5%
Quality: 88%
4/4 domains covered
Synthetic Examples · v1
Labeled Data
5,000 items · 28 MB · LLM-generated
Label distribution:
Not split yet
Quality: 67%
3/5 domains covered
Domain-Specific Lithuanian v1 · v1
Training Audio
245 items · 1.2 GB · 3h 15m total audio
Label distribution:
Train: 80% · Val: 10% · Test: 10%
Quality: 85%
4/4 domains covered