Librispeech
This config can be used to prepare the Librispeech dataset in the NeMo format.
It produces a manifest for the dev-clean split. To process a different split, change the split in the config. The available options are:
"dev-clean", "dev-other", "test-clean", "test-other", "train-clean-100", "train-clean-360", "train-other-500", "dev-clean-2", "train-clean-5"
This config performs the following data processing.
Downloads Librispeech data
Converts flac files to wav files
Calculates the length of wav files
Converts transcriptions to lowercase
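The last two steps can be illustrated with a minimal Python sketch. This is not the code the config runs; it only shows the idea using the standard library. The helper names (`wav_duration_seconds`, `normalize_text`, `write_silence`) and the 16 kHz sample rate are assumptions made for this example, and the flac-to-wav conversion is left out because it needs an external audio tool.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM wav file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def normalize_text(text: str) -> str:
    """Lowercase a transcription (hypothetical helper for this sketch)."""
    return text.lower()


def write_silence(path: str, seconds: float, sample_rate: int = 16000) -> None:
    """Write a mono 16-bit wav file of silence so the sketch has data to read."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))


# Demo on a temporary file; the sample rate is an assumption of this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.wav")
    write_silence(demo_path, 1.5)
    demo_duration = wav_duration_seconds(demo_path)

demo_text = normalize_text("HELLO WORLD")
print(demo_duration, demo_text)
```

Computing the duration from the wav header avoids decoding the audio, which is why the conversion to wav comes before the length calculation.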
Required arguments.
workspace_dir: specify the workspace folder where all audio files will be stored.
Note that you can customize any part of this config, either directly in the file or from the command line.
Output format.
This config generates the following output manifest file:
${workspace_dir}/manifest.json - dev-clean subset of the data.
Output manifest contains the following fields:
audio_filepath (str): relative path to the audio file.
text (str): transcription (lower-case without punctuation).
duration (float): audio duration in seconds.
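As a rough illustration of how such a manifest might be consumed, the sketch below reads entries and checks the three fields listed above. It assumes the manifest stores one JSON object per line, as NeMo manifests usually do. The `read_manifest` helper, the sample file paths, and the sample transcriptions are hypothetical and exist only for this example.

```python
import io
import json

# Field name -> accepted Python types after JSON decoding.
# int is accepted for duration because JSON may encode 4.0 as 4.
REQUIRED_FIELDS = {
    "audio_filepath": (str,),
    "text": (str,),
    "duration": (int, float),
}


def read_manifest(lines):
    """Parse manifest lines (one JSON object per line) and validate the fields."""
    entries = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        for field, accepted_types in REQUIRED_FIELDS.items():
            if not isinstance(entry.get(field), accepted_types):
                raise ValueError(f"line {line_number}: bad or missing '{field}'")
        entries.append(entry)
    return entries


# Hypothetical sample content; real paths and transcriptions will differ.
sample_manifest = io.StringIO(
    '{"audio_filepath": "audio/sample-0001.wav", "text": "hello world", "duration": 2.5}\n'
    '{"audio_filepath": "audio/sample-0002.wav", "text": "good morning", "duration": 4.0}\n'
)

entries = read_manifest(sample_manifest)
total_duration = sum(entry["duration"] for entry in entries)
print(len(entries), total_duration)
```

To run the same check on a real file, pass an open file handle for `${workspace_dir}/manifest.json` to `read_manifest` instead of the in-memory sample.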
Config link: dataset_configs/english/librispeech/config.yaml