Librispeech

This config can be used to prepare the Librispeech dataset in the NeMo format.

It produces manifests for the dev-clean split (to process other splits, change the corresponding config option). The options are:

This config performs the following data processing.

Required arguments.

workspace_dir: specify the workspace folder where all audio files will be stored.

Note that you can customize any part of this config, either by editing it directly or by passing overrides on the command line.

Output format.

This config generates the following output manifest file:

The output manifest contains the following fields:
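
As a rough illustration of how a manifest could be consumed, here is a minimal Python sketch. It assumes the manifest is a JSON-lines file (one JSON object per line) with the fields `audio_filepath`, `duration`, and `text`, which is the common NeMo convention but is an assumption for this particular config. The helper names `read_manifest` and `total_duration`, the file name, and the sample entries are all hypothetical.

```python
import json
import os
import tempfile


def read_manifest(path):
    """Read a JSON-lines manifest: one JSON object per non-empty line."""
    entries = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries


def total_duration(entries):
    """Sum the (assumed) 'duration' field, in seconds."""
    return sum(float(e["duration"]) for e in entries)


# Hypothetical sample entries; paths and values are made up for illustration.
sample = [
    {"audio_filepath": "audio/utt1.wav", "duration": 2.5, "text": "hello world"},
    {"audio_filepath": "audio/utt2.wav", "duration": 4.0, "text": "good morning"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the sample manifest, then read it back.
    manifest_path = os.path.join(tmp_dir, "manifest.json")
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in sample:
            f.write(json.dumps(entry) + "\n")
    entries = read_manifest(manifest_path)
    print(len(entries), total_duration(entries))
```

To inspect a real manifest, `read_manifest` could be pointed at the file produced under `workspace_dir`; the exact file name depends on the config.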