
100,000 Hours Milestone: VMX Dataset Library Reaches New Scale

I am announcing that our verified dataset library has surpassed 100,000 hours, with stronger multilingual and accent representation.

Jane Davis | Feb 1, 2025 | 4 min read

A Milestone Built on Quality

Crossing 100,000 hours is meaningful, but volume alone is not the point. I care most about the quality and verification integrity behind every hour in our library.

From day one in 2025, I have pushed our team to prioritize trust signals that customers can actually audit and rely on.

What Improved Alongside Scale

As the library expanded, we improved metadata consistency, speaker verification checks, and accent coverage depth across key markets. Scale and rigor have to move together.

We also improved contributor workflows to reduce friction while maintaining strong consent and review controls.

What Comes Next

This milestone gives us a stronger base for the next stage of growth. I am focused on expanding high-value language packs and improving delivery speed for enterprise teams.

I appreciate everyone who contributed to reaching this milestone: our team, our contributors, and our customers who hold us to a high standard.