Librispeech
This config prepares the Librispeech dataset in the NeMo format.
By default, it produces a manifest for the dev-clean split; to process another split, adjust the config accordingly. The available options are:
"dev-clean"
"dev-other"
"test-clean"
"test-other"
"train-clean-100"
"train-clean-360"
"train-other-500"
"dev-clean-2"
"train-clean-5"
This config performs the following data processing steps:
Downloads the Librispeech data
Converts the FLAC files to WAV files
Calculates the duration of the WAV files
Converts the transcriptions to lowercase
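The last two steps can be illustrated with a minimal sketch that uses only the Python standard library. It is not the code this config runs: the helper names `wav_duration_seconds` and `normalize_text` are hypothetical, and the download and FLAC-to-WAV conversion steps are left out. A one-second silent WAV file is generated as a stand-in for a converted recording.

```python
import tempfile
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav_file:
        # Duration is the number of frames divided by the frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def normalize_text(text):
    """Lowercase a transcription (hypothetical helper)."""
    return text.lower()


# Write one second of 16 kHz, 16-bit mono silence as a stand-in
# for a file produced by the FLAC-to-WAV conversion step.
with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = Path(tmp_dir) / "example.wav"
    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(16000)
        wav_file.writeframes(b"\x00\x00" * 16000)
    duration = wav_duration_seconds(wav_path)

text = normalize_text("HELLO World")
```

Here `duration` and `text` correspond to the values that end up in the duration and text fields of a manifest entry.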
Required arguments.
workspace_dir: specify the workspace folder where all audio files will be stored.
Note that you can customize any part of this config, either directly in the file or from the command line.
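As a rough sketch of a command-line override, the snippet below assembles a command with `key=value` arguments. It rests on two assumptions that this page does not confirm: that the config is run through a `main.py` entry point, and that overrides follow Hydra's `key=value` syntax. The helper `build_command` and the workspace path are hypothetical.

```python
import shlex


def build_command(workspace_dir, overrides=None):
    """Assemble a command line for this config (hypothetical helper).

    Assumes a `main.py` entry point and Hydra-style `key=value` overrides.
    """
    command = [
        "python",
        "main.py",  # assumed entry point, not confirmed by this page
        "--config-path=dataset_configs/english/librispeech",
        "--config-name=config.yaml",
        f"workspace_dir={workspace_dir}",  # the required argument
    ]
    # Any further config values are appended as key=value pairs.
    for key, value in (overrides or {}).items():
        command.append(f"{key}={value}")
    return command


command = build_command("/tmp/librispeech")
print(shlex.join(command))
```

Building the command as a list keeps each override a separate argument, so it can be passed to `subprocess.run` without shell quoting issues.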
Output format.
This config generates the following output manifest file:
${workspace_dir}/manifest.json - dev-clean subset of the data.
The output manifest contains the following fields:
audio_filepath (str): relative path to the audio file.
text (str): transcription (lowercase, without punctuation).
duration (float): audio duration in seconds.
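To make the fields above concrete, here is a small sketch that parses and checks one manifest entry. It assumes the usual NeMo convention of one JSON object per line. The helper `parse_manifest_line` and the sample values (path, text, duration) are made up for illustration.

```python
import json

# Field names and types as listed above; an integer duration is also accepted
# because JSON does not distinguish 2 from 2.0 when the value is whole.
REQUIRED_FIELDS = {
    "audio_filepath": str,
    "text": str,
    "duration": (int, float),
}


def parse_manifest_line(line):
    """Parse one manifest line and check the expected fields (hypothetical helper)."""
    entry = json.loads(line)
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in entry:
            raise KeyError(f"missing field: {field}")
        if not isinstance(entry[field], expected_type):
            raise TypeError(f"unexpected type for field: {field}")
    return entry


# Illustrative entry; the values are not taken from the real dataset.
example_line = json.dumps(
    {"audio_filepath": "audio/example.wav", "text": "hello world", "duration": 1.5}
)
entry = parse_manifest_line(example_line)
```

Reading a whole manifest then amounts to calling `parse_manifest_line` on each non-empty line of `${workspace_dir}/manifest.json`.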
Config link: dataset_configs/english/librispeech/config.yaml