Librispeech (all)

This config can be used to prepare the Librispeech dataset in the NeMo format.

It produces manifests for all splits of Librispeech.

This config performs the following data processing steps:

  1. Downloads Librispeech data

  2. Converts FLAC files to WAV files

  3. Calculates the length of wav files

  4. Converts transcriptions to lowercase
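Steps 3 and 4 above can be illustrated with a minimal standard-library sketch. This is not the config's actual implementation: the function names `wav_duration_seconds`, `lowercase_text`, and `write_silence` are hypothetical, and the sketch assumes uncompressed PCM WAV files as produced by step 2.

```python
import tempfile
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def lowercase_text(text):
    """Lowercase a transcription, mirroring step 4 (hypothetical helper)."""
    return text.lower()


def write_silence(path, seconds=1.0, sample_rate=16000):
    """Write a mono 16-bit silent WAV file; used only to make the demo runnable."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_wav = Path(tmp) / "demo.wav"
        write_silence(demo_wav, seconds=1.0)
        print(wav_duration_seconds(demo_wav))
        print(lowercase_text("CHAPTER ONE"))
```

The `wave` module only reads WAV data, which is one reason a FLAC-to-WAV conversion step has to come before the duration calculation in a pipeline like this.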

Required arguments.

  • workspace_dir: specify the workspace folder where all audio files will be stored.

Note that you can customize any part of this config, either by editing it directly or from the command line.

Output format.

This config generates output manifest files for all splits of the data:

  • ${workspace_dir}/dev-clean.json - dev-clean subset.

  • ${workspace_dir}/dev-other.json - dev-other subset.

  • ${workspace_dir}/test-clean.json - test-clean subset.

  • ${workspace_dir}/test-other.json - test-other subset.

  • ${workspace_dir}/train-clean-100.json - train-clean-100 subset.

  • ${workspace_dir}/train-clean-360.json - train-clean-360 subset.

  • ${workspace_dir}/train-other-500.json - train-other-500 subset.

Each output manifest contains the following fields:

  • audio_filepath (str): relative path to the audio file.

  • text (str): transcription (lowercase, without punctuation).

  • duration (float): audio duration in seconds.
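As a rough illustration of how these manifests might be consumed, the sketch below reads one and sums its audio duration. It assumes the usual NeMo manifest convention of one JSON object per line; the helper names `manifest_path`, `read_manifest`, and `total_hours` are hypothetical and not part of the config.

```python
import json
from pathlib import Path

# Split names taken from the output file list above.
SPLITS = [
    "dev-clean",
    "dev-other",
    "test-clean",
    "test-other",
    "train-clean-100",
    "train-clean-360",
    "train-other-500",
]


def manifest_path(workspace_dir, split):
    """Build the expected manifest path for a split (hypothetical helper)."""
    return Path(workspace_dir) / f"{split}.json"


def read_manifest(path):
    """Read a manifest, assuming one JSON object per non-empty line."""
    entries = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries


def total_hours(entries):
    """Sum the `duration` field (seconds) and convert to hours."""
    return sum(entry["duration"] for entry in entries) / 3600.0


if __name__ == "__main__":
    workspace_dir = "workspace"  # placeholder; use your own workspace_dir
    for split in SPLITS:
        path = manifest_path(workspace_dir, split)
        if path.exists():
            entries = read_manifest(path)
            print(f"{split}: {len(entries)} utterances, {total_hours(entries):.2f} h")
```

Because `audio_filepath` is relative, any code that loads the audio needs to resolve it against the appropriate base directory first.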

Config link: dataset_configs/english/librispeech/all.yaml