Speech Data Processor
Speech Data Processor (SDP) is a toolkit to make it easy to:
Write code to process a new dataset, minimizing the amount of boilerplate code required.
Share the steps for processing a speech dataset.
SDP is hosted in the NVIDIA/NeMo-speech-data-processor repository. It is mainly used to prepare datasets for the NeMo toolkit.
SDP’s philosophy is to represent processing operations as ‘processor’ classes. Each processor takes a path to a NeMo-style data manifest as input (or a path to the raw data directory if you do not yet have a NeMo-style manifest), applies some processing to it, and saves the result as an output manifest file.
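As a rough illustration of the processor idea, the sketch below reads a manifest, filters it, and writes a new manifest. It assumes the manifest is a JSON-lines file whose entries carry `audio_filepath`, `duration`, and `text` fields. The class name `DropLongUtterances`, its constructor arguments, and the helper functions are hypothetical and are not taken from SDP itself.

```python
import json
import os
import tempfile


class DropLongUtterances:
    """Hypothetical processor: keeps entries whose duration is within a limit."""

    def __init__(self, input_manifest_file, output_manifest_file, max_duration):
        self.input_manifest_file = input_manifest_file
        self.output_manifest_file = output_manifest_file
        self.max_duration = max_duration

    def process(self):
        # Read the input manifest line by line and write the surviving entries.
        with open(self.input_manifest_file, encoding="utf-8") as fin, \
                open(self.output_manifest_file, "w", encoding="utf-8") as fout:
            for line in fin:
                if not line.strip():
                    continue
                entry = json.loads(line)
                if entry["duration"] <= self.max_duration:
                    fout.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_manifest(path):
    """Load a JSON-lines manifest into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def demo():
    """Write a tiny made-up manifest, run the processor, return what is kept."""
    entries = [
        {"audio_filepath": "a.wav", "duration": 3.2, "text": "hello world"},
        {"audio_filepath": "b.wav", "duration": 25.0, "text": "a very long clip"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input_manifest.json")
        dst = os.path.join(tmp, "output_manifest.json")
        with open(src, "w", encoding="utf-8") as f:
            for entry in entries:
                f.write(json.dumps(entry) + "\n")
        DropLongUtterances(src, dst, max_duration=20.0).process()
        return read_manifest(dst)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the shape of the operation (manifest in, manifest out), which lets one processor's output serve as the next processor's input.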
You specify which processors you want to run using a YAML config file. Many common processing operations are provided, and it is easy to add your own.
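To give a feel for config-driven processing, here is a minimal sketch in which a Python dict stands in for the contents of a parsed YAML file. The key names (`processors`, `name`, `params`), the registry, and the two toy processor classes are assumptions made for illustration. They do not reproduce SDP's actual config schema or class names.

```python
import json
import os
import tempfile


class ToyProcessor:
    """Hypothetical base class: maps transform() over every manifest entry."""

    def __init__(self, input_manifest_file, output_manifest_file, **params):
        self.input_manifest_file = input_manifest_file
        self.output_manifest_file = output_manifest_file
        self.params = params

    def transform(self, entry):
        raise NotImplementedError

    def process(self):
        with open(self.input_manifest_file, encoding="utf-8") as fin, \
                open(self.output_manifest_file, "w", encoding="utf-8") as fout:
            for line in fin:
                if not line.strip():
                    continue
                result = self.transform(json.loads(line))
                if result is not None:  # None means "drop this entry"
                    fout.write(json.dumps(result) + "\n")


class LowercaseText(ToyProcessor):
    def transform(self, entry):
        entry["text"] = entry["text"].lower()
        return entry


class DropShortText(ToyProcessor):
    def transform(self, entry):
        enough = len(entry["text"].split()) >= self.params["min_words"]
        return entry if enough else None


# Maps the names used in the config to the classes that implement them.
REGISTRY = {"LowercaseText": LowercaseText, "DropShortText": DropShortText}


def run_pipeline(config, input_manifest, workdir):
    """Run the configured processors in order, chaining their manifests."""
    current = input_manifest
    for i, step in enumerate(config["processors"]):
        out = os.path.join(workdir, f"step_{i}.json")
        REGISTRY[step["name"]](current, out, **step.get("params", {})).process()
        current = out
    return current


def demo():
    # Stand-in for a parsed YAML config (illustrative schema only).
    config = {
        "processors": [
            {"name": "LowercaseText"},
            {"name": "DropShortText", "params": {"min_words": 2}},
        ]
    }
    entries = [
        {"audio_filepath": "a.wav", "duration": 1.5, "text": "Hello World"},
        {"audio_filepath": "b.wav", "duration": 0.4, "text": "Hi"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input.json")
        with open(src, "w", encoding="utf-8") as f:
            for entry in entries:
                f.write(json.dumps(entry) + "\n")
        final = run_pipeline(config, src, tmp)
        with open(final, encoding="utf-8") as f:
            return [json.loads(line) for line in f]


if __name__ == "__main__":
    print(demo())
```

In this sketch, adding a new operation means writing one small class and registering it under a name that the config can refer to.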
To learn more about SDP, have a look at the following sections.