megatron-energon

Packages and Modules

  • megatron.energon
    • AugmentTaskEncoder
    • BaseCoreDatasetFactory
    • BaseWebdatasetFactory
    • Batch
    • BatchDataset
    • BlendDataset
    • CaptioningSample
    • CaptioningWebdataset
    • ConcatDataset
    • Cooker
    • CrudeSample
    • CrudeWebdataset
    • DatasetLoader
    • DatasetLoaderInterface
    • DefaultDecoderWebdatasetFactory
    • DefaultGenericWebdatasetFactory
    • DefaultTaskEncoder
    • EpochizeDataset
    • FilterDataset
    • GcDataset
    • GroupBatchDataset
    • ImageClassificationSample
    • ImageClassificationWebdataset
    • ImageSample
    • ImageWebdataset
    • InterleavedSample
    • InterleavedWebdataset
    • IterMapDataset
    • JoinedWebdatasetFactory
    • LimitDataset
    • LogSampleDataset
    • MapDataset
    • Metadataset
    • MetadatasetV2
    • MixBatchDataset
    • MultiChoiceVQASample
    • MultiChoiceVQAWebdataset
    • OCRSample
    • OCRWebdataset
    • PackingDataset
    • RepeatDataset
    • Sample
    • SavableDataLoader
    • SavableDataset
    • ShuffleBufferDataset
    • SimilarityInterleavedSample
    • SimilarityInterleavedWebdataset
    • SkipSample
    • StandardWebdatasetFactory
    • TaskEncoder
    • TextSample
    • TextWebdataset
    • VQAOCRWebdataset
    • VQASample
    • VQAWebdataset
    • VidQASample
    • VidQAWebdataset
    • WorkerConfig
    • basic_sample_keys()
    • batch_list()
    • batch_pad_stack()
    • batch_stack()
    • concat_pad()
    • generic_batch()
    • generic_concat()
    • get_loader()
    • get_savable_loader()
    • get_train_dataset()
    • get_val_dataset()
    • get_val_datasets()
    • homogeneous_concat_mix()
    • load_dataset()
    • prepare_metadataset()
    • stateless()

© Copyright 2025 NVIDIA Corporation.
