Command-Line Interface
After you install energon, a script called energon is added to your PATH.
It provides commands to prepare, preview, or lint datasets on disk.
Here’s a simple example:
energon prepare /mnt/data/my_captioning_webdataset
The above command scans your existing off-the-shelf WebDataset and adds the metadata needed to make it compatible with Energon.
Below, you can see the sub-commands available under energon.
energon
A set of tools that energon provides.
Among other things, you can use it to lint or preprocess your dataset.
See the help of each command to learn more.
energon [OPTIONS] COMMAND [ARGS]...
Commands
- analyze-debug
Internal tool to analyze randomness.
- checkpoint
Tools for energon checkpoints.
- info
Get summarizing information about a dataset.
- lint
Check energon dataset for errors.
- mount
Mount an energon WebdatasetFileStore at…
- prepare
Prepare WebDataset for use with energon.
- preview
Preview samples of a dataset on the console.
energon prepare
An interactive tool to generate metadata for your existing WebDataset. This helps make the dataset compliant with the Energon format.
The tool will ask you for a train/val/test split and how to assign the WebDataset fields to the fields of the corresponding sample type in Energon.
See Data Preparation for more details on how to use this command.
energon info
Prints information about the dataset, such as the overall number of samples and its size.
energon lint
You can execute this tool on the prepared dataset to check if the data is valid and loadable. It will report any problems such as non-readable images.
energon mount
Use this to mount your prepared dataset as a virtual read-only filesystem and inspect it using ls or other file browsing tools.
It is as simple as running:
energon mount /PATH/TO/DATASET ./MY_MOUNT_POINT
This will leave the process in the foreground and the mount will exist as long as the program is running.
If you want to detach the process to the background, use the -d or --detach flag.
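For scripted use, the mount invocation above can be wrapped in a small Python helper. This is only a sketch: the function names build_mount_command and mount_dataset are hypothetical, placing --detach after the positional arguments is an assumption about argument order, and creating the mount point directory beforehand is an assumption about what the command expects.

```python
import shutil
import subprocess
from pathlib import Path


def build_mount_command(dataset, mount_point, detach=False):
    """Return the argument list for `energon mount` (hypothetical helper)."""
    cmd = ["energon", "mount", str(dataset), str(mount_point)]
    if detach:
        # Assumption: the flag is accepted after the positional arguments.
        cmd.append("--detach")
    return cmd


def mount_dataset(dataset, mount_point, detach=True):
    """Run `energon mount`; without detach, this blocks while the mount exists."""
    if shutil.which("energon") is None:
        raise RuntimeError("The energon script was not found on PATH.")
    # Assumption: the mount point should exist before mounting.
    Path(mount_point).mkdir(parents=True, exist_ok=True)
    return subprocess.run(build_mount_command(dataset, mount_point, detach), check=True)
```

Calling mount_dataset("/PATH/TO/DATASET", "./MY_MOUNT_POINT") would then run the same command as shown above with the --detach flag added.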
Two modes are supported by energon mount:
 | Flat mode (default) | Sample folder mode (flag …) |
---|---|---|
Description | All files from all shards listed at … | One folder per sample key, … |
Example | | |
Warning
You should not use the same sample keys in multiple shards of the same dataset.
If you do, energon mount will not work as intended and it will display WARNING files in the virtual mount.
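Before mounting, it can be useful to check whether any sample key occurs in more than one shard. The following is a minimal sketch, not part of energon: sample_key and find_duplicate_keys are hypothetical helpers, and the key is derived using the common WebDataset convention (everything before the first dot of the file name), which is assumed to match how energon groups files into samples.

```python
import tarfile
from collections import defaultdict


def sample_key(member_name):
    """Derive the sample key from a tar member name.

    Assumption: WebDataset convention, i.e. the path up to the first dot
    of the file name ("dir/0001.seg.png" -> "dir/0001").
    """
    directory, _, filename = member_name.rpartition("/")
    stem = filename.split(".", 1)[0]
    return f"{directory}/{stem}" if directory else stem


def find_duplicate_keys(shard_paths):
    """Return {sample key: sorted list of shards} for keys found in several shards."""
    shards_by_key = defaultdict(set)
    for shard in shard_paths:
        with tarfile.open(shard) as tar:
            for member in tar:
                if member.isfile():
                    shards_by_key[sample_key(member.name)].add(str(shard))
    return {
        key: sorted(shards)
        for key, shards in shards_by_key.items()
        if len(shards) > 1
    }
```

An empty result from find_duplicate_keys means that, under the stated key convention, no sample key is shared between the shards that were checked.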
energon preview
This command will load a dataset and display samples one-by-one on the console. Note that this will not work for datasets with non-standard flavors or crude datasets.