Command-Line Interface

After you install energon, a script called energon is added to your PATH. It provides commands to prepare, preview, or lint datasets on disk.

Here’s a simple example:

energon prepare /mnt/data/my_captioning_webdataset

This command scans your existing off-the-shelf WebDataset and adds the metadata needed to make it compatible with Energon.

Below, you can see the available sub-commands under energon.

energon

A set of tools that energon provides.

Among other things, you can use it to lint or preprocess your dataset.

See help of commands to learn more.

energon [OPTIONS] COMMAND [ARGS]...

Commands

analyze-debug  Internal tool to analyze randomness.
checkpoint     Tools for energon checkpoints.
info           Get summarizing information about a dataset.
lint           Check energon dataset for errors.
mount          Mount an energon WebdatasetFileStore at…
prepare        Prepare WebDataset for use with energon.
preview        Preview samples of a dataset on the console.

energon prepare

An interactive tool to generate metadata for your existing webdataset. This will help make the dataset compliant with our format.

The tool will ask you for a train/val/test split and how to assign the webdataset fields to the fields of the corresponding sample type in Energon.

See Data Preparation for more details on how to use this command.

energon info

Prints summary information about the dataset, such as the overall number of samples and its size.

energon lint

You can execute this tool on the prepared dataset to check if the data is valid and loadable. It will report any problems such as non-readable images.

energon mount

Use this to mount your prepared dataset as a virtual read-only filesystem and inspect it using ls or other file browsing tools. It is as simple as running:

energon mount /PATH/TO/DATASET ./MY_MOUNT_POINT

This will leave the process in the foreground and the mount will exist as long as the program is running. If you want to detach the process to the background, use the -d or --detach flag.
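When the mount is started from a script rather than an interactive shell, the command line can be assembled programmatically. The following is a minimal sketch, assuming energon is installed and on your PATH. The helper name build_mount_command is hypothetical and not part of energon, and placing the flag before the positional arguments is an assumption. It uses only the --detach flag described above.

```python
import shlex


def build_mount_command(dataset, mount_point, detach=False):
    """Assemble the argument list for `energon mount` (hypothetical helper)."""
    cmd = ["energon", "mount"]
    if detach:
        # --detach (short form: -d) sends the mount process to the background.
        cmd.append("--detach")
    # Positional arguments: dataset path first, then the mount point.
    cmd += [str(dataset), str(mount_point)]
    return cmd


cmd = build_mount_command("/PATH/TO/DATASET", "./MY_MOUNT_POINT", detach=True)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to start the mount. Without detach=True that call blocks for as long as the mount exists.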

Two modes are supported by energon mount:

Flat mode (default)

Description: All files from all shards are listed at the root of the mount point.

Example:

001.jpg
001.txt
002.jpg
002.txt
...

Sample folder mode (flag -s)

Description: One folder per sample key, each folder containing files named by the sample part extension.

Example:

001/
    jpg
    txt
002/
    jpg
    txt
...
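To make the difference between the two modes concrete, the sketch below derives both listings from the same set of shard member names. It is an illustration only, not energon's implementation. The function names are hypothetical, and splitting each name at the first dot of its base name follows the usual WebDataset convention, which is assumed here.

```python
from collections import defaultdict


def split_key_ext(name):
    """Split a member name into (sample key, part extension) at the first dot
    of its base name, e.g. '001.jpg' -> ('001', 'jpg')."""
    head, sep, base = name.rpartition("/")
    key, _, ext = base.partition(".")
    return head + sep + key, ext


def flat_layout(names):
    """Flat mode: every file appears directly at the root of the mount point."""
    return sorted(names)


def sample_folder_layout(names):
    """Sample folder mode: one folder per sample key, files named by extension."""
    folders = defaultdict(list)
    for name in names:
        key, ext = split_key_ext(name)
        folders[key].append(ext)
    return {key: sorted(exts) for key, exts in sorted(folders.items())}


members = ["001.jpg", "001.txt", "002.jpg", "002.txt"]
print(flat_layout(members))
print(sample_folder_layout(members))
```

For the four example files, the flat listing is the file names themselves, while the sample folder listing groups jpg and txt under the keys 001 and 002.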

Warning

You should not use the same sample keys in multiple shards of the same dataset. If you do, energon mount will not work as intended and it will display WARNING files in the virtual mount.
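Before mounting, it can help to check whether any sample key occurs in more than one shard. The following is a rough sketch that uses only Python's standard tarfile module. It assumes the shards are tar files and that the sample key is the member name up to the first dot of its base name (the usual WebDataset convention). The function names are hypothetical and not part of energon.

```python
import tarfile
from collections import defaultdict


def sample_key(member_name):
    """Return the part of the name before the first dot of the base name."""
    head, sep, base = member_name.rpartition("/")
    return head + sep + base.partition(".")[0]


def find_keys_in_multiple_shards(shard_paths):
    """Map each sample key found in more than one shard to those shards."""
    shards_by_key = defaultdict(set)
    for path in shard_paths:
        # Mode "r" lets tarfile detect compression transparently.
        with tarfile.open(path, "r") as tar:
            for member in tar:
                if member.isfile():
                    shards_by_key[sample_key(member.name)].add(str(path))
    return {
        key: sorted(shards)
        for key, shards in shards_by_key.items()
        if len(shards) > 1
    }
```

Calling find_keys_in_multiple_shards with the list of shard paths of a dataset returns an empty dictionary when no key is shared between shards. Any entry in the result names a key that is likely to trigger the warning described above.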

energon preview

This command loads a dataset and displays its samples one by one on the console. Note that this will not work for datasets with non-standard flavors or crude datasets.