Common Voice

Common Voice’s multi-language dataset is the largest publicly available voice dataset of its kind. Each entry consists of a unique MP3 file and a corresponding text file. The dataset currently contains 32,585 recorded hours, 21,594 of them validated, in 131 languages. Many of the recorded hours also include demographic metadata.
