# Introduction

The ``accvlab.on_demand_video_decoder`` package provides hardware-accelerated GPU on-demand decoding 
capabilities for NVIDIA GPUs.

The package internally leverages the Video Codec SDK's core C/C++ video decode APIs and provides user-friendly 
Python interfaces. The package offers efficient and convenient methods for video frame extraction.

![accvlab.on_demand_video_decoder Overview](images/overview.png)

## Target Use Cases

This package is specifically designed for scenarios that demand high video decoding throughput, including:

- **Autonomous Driving**: Process large volumes of video data for training perception models
- **Multimodal Large Language Models (MLLMs)**: Efficiently extract video frames for vision-language training
- **Video Understanding**: Enable high-throughput video analysis for inference and training workloads
- **Additional Scenarios Sensitive to Video-Decode Throughput**

## Key Benefits

### 1. Massive Storage Savings
Traditional workflows require extracting video frames and storing them as individual images on disk before 
training. This package eliminates this step by decoding frames **on-demand** directly from video files, saving 
approximately **90% of disk storage** with negligible performance overhead.

### 2. Reduced CPU Overhead
By offloading video decoding tasks to NVIDIA's dedicated **NVDEC hardware decoder**, this package frees up CPU 
resources **(~20%)** for other critical training pipeline operations, improving overall system performance.

### 3. Flexible Decoding Methods
``accvlab.on_demand_video_decoder`` provides multiple decoding modes optimized for different workload 
patterns:
- **Random Decode**: Optimized for large-batch scenarios with random video selection and random frame sampling
- **Stream Decode**: Optimized for large-batch sequential frame decoding from videos
- **Separate Decode**: Decouples demuxer and decoder components, enabling flexible configuration of the video 
  decoding pipeline
- **Demuxer-Free Decode**: Optimized for direct GOP (Group of Pictures) reading scenarios, balancing latency 
  and storage efficiency

### 4. Seamless Integration with PyTorch
``accvlab.on_demand_video_decoder`` integrates efficiently with PyTorch, and detailed examples are provided to 
help users quickly integrate the package and boost workload performance.

```{seealso}
- [Basic ``accvlab.on_demand_video_decoder`` Usage Sample Code](sample.md)
- [PyTorch Integration Examples](pytorch_integration_examples/index)
```

## Features

### Functional Features
* **Codecs**: H.264, HEVC, AV1.
* **Surface formats**: NV12 (8 bit), YUV 4:2:0 (10 bit), YUV 4:4:4 (8 and 10 bit).
* **[Video container formats](https://ffmpeg.org/ffmpeg-formats.html#Demuxers)**: MP4, MOV, FLV, etc.
* DLPack support to facilitate data exchange with popular DL frameworks like PyTorch and TensorRT.
* Contains Python sample applications demonstrating API usage.
* Flexible decode method (random, stream, demuxer-decoder separation, decoder-only)

### High-Performance Features
* **Caching**: Re-use of Demuxers, Decoders, Packets & internally used data
* **Map-free**: Avoid memory mapping for unneeded frames
* **Use GPU Memory Pool**: Avoid frequent memory re-allocation (re-allocate only if total needed memory 
  increases)
* **Producer-Customer Model**:  Demuxer as producer, decoder as consumer, GOP as products.
* **NVDEC Pipeline**: Pipeline utilizes all NVDEC units while ensuring load balancing with non-uniform GOP 
  length 

## Getting Started

Please refer to the following sections to get started:

* [**Installation**](installation)
* [**Sample Code Documentation**](sample)
* **PyTorch Integration Examples**
    * [**Random decode with PyTorch**](pytorch_integration_examples/dataloader_random_decode)
    * [**Stream decode with PyTorch**](pytorch_integration_examples/dataloader_stream_decode)
    * [**Separation decode with PyTorch**](pytorch_integration_examples/dataloader_separation_decode)
    * [**Demuxer-free decode with PyTorch**](pytorch_integration_examples/dataloader_demuxer_free_decode)

## Acknowledgements

The ``accvlab.on_demand_video_decoder`` package builds upon and integrates with several key technologies:

- **[FFmpeg](https://www.ffmpeg.org/)**: A complete, cross-platform solution for video and audio processing 
  that provides essential multimedia framework capabilities
- **[PyNvVideoCodec](https://developer.nvidia.com/pynvvideocodec)**: NVIDIA's Python bindings for video codec 
  operations
- **[NVIDIA Video Codec SDK](https://developer.nvidia.com/video-codec-sdk)**: The underlying SDK that provides 
  hardware-accelerated video encode and decode capabilities on NVIDIA GPUs

We are grateful to these projects and the open-source community for making high-performance video processing 
accessible.