Release Notes for NeMo Retriever Extraction

This documentation contains the release notes for NeMo Retriever extraction.

Note

NeMo Retriever extraction is also known as NVIDIA Ingest and nv-ingest.

Release 25.09 (25.9.0)

The NeMo Retriever extraction 25.09 release adds new hardware and software support, and other improvements, including the following:

Add functional support for RTX Pro 6000.
Add functional support for DGX B200.
Add support for nemoretriever-ocr-v1. For details, refer to Deploy With Docker Compose (Self-Hosted) and NV-Ingest Helm Charts.
Add support for llama-3.2-nemoretriever-1b-vlm-embed-v1.
Add support for Llama Nemotron VLM 8b NIM for image captioning. For details, refer to Extract Captions from Images.
Add support for custom vector database implementations. For details, refer to Build a Custom Vector Database Operator.
Add support for custom Lambda stages. For details, refer to Add User-defined Stages to Your NeMo Retriever Extraction Pipeline.
Expanded documentation for Library Mode.
New documentation: Configure Ray Logging.
New documentation: Use Multimodal Embedding.
Add support for integer, float, boolean, and array values in custom metadata during Milvus entity creation.
Add support for running more than one VLM at a time by using Helm. For details, refer to NV-Ingest Helm Charts.
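
As a rough illustration of the custom metadata item above, the following sketch maps Python values to the four value kinds named there. The function name `metadata_kind`, the sample field names, and the mapping from Python types to these kinds are assumptions for illustration and are not part of the nv-ingest API.

```python
from typing import Any, Dict


def metadata_kind(value: Any) -> str:
    """Classify a value as one of the kinds named in the release note.

    Hypothetical helper: the Python-type-to-kind mapping is an assumption.
    """
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, (list, tuple)):
        return "array"
    return "other"


# Illustrative metadata record; the field names are made up for this sketch.
example_metadata: Dict[str, Any] = {
    "page_count": 12,
    "confidence": 0.87,
    "reviewed": True,
    "tags": ["finance", "q3"],
}

kinds = {name: metadata_kind(value) for name, value in example_metadata.items()}
```

A check like this can be run on metadata before ingestion to spot values that fall outside the listed kinds.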

Known Issues

The following are the known issues for this release:

A10G and L40S are not supported. For details, refer to Support Matrix.
nemoretriever-parse is not supported on RTX Pro 6000 or B200. For details, refer to Support Matrix.
The NeMo Retriever extraction pipeline does not support ingestion of batches that include individual files greater than approximately 400MB.
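
One possible way to work around the file-size limitation above is to screen files before submitting a batch. The sketch below is a hypothetical pre-check, not part of nv-ingest. The names `partition_by_size` and `SIZE_LIMIT_BYTES` are illustrative, and the 400 MB value (taken here as 400 × 1024 × 1024 bytes) is only approximate because the release notes do not give an exact threshold.

```python
import os
from typing import Iterable, List, Tuple

# Approximate limit from the known issue; the exact threshold is an assumption.
SIZE_LIMIT_BYTES = 400 * 1024 * 1024


def partition_by_size(
    paths: Iterable[str], limit: int = SIZE_LIMIT_BYTES
) -> Tuple[List[str], List[str]]:
    """Split paths into (within_limit, too_large) based on on-disk size."""
    within_limit: List[str] = []
    too_large: List[str] = []
    for path in paths:
        # os.path.getsize returns the file size in bytes.
        if os.path.getsize(path) > limit:
            too_large.append(path)
        else:
            within_limit.append(path)
    return within_limit, too_large
```

Files in the second list can then be excluded from the batch or reduced in size before ingestion.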

Breaking Changes

There are no breaking changes in this version.

Upgrade

To upgrade the Helm Charts for this version, refer to NV-Ingest Helm Charts.

Release 25.6.3

The NeMo Retriever extraction 25.6.3 release is a patch release that updates the client's dependency on pypdfium2 to the latest stable version.

Only the release branch and the nv-ingest-client package have been updated in 25.6.3. The previously released 25.6.2 container on NGC remains unchanged.

Known Issues

The following are the known issues for this release:

The NeMo Retriever extraction pipeline does not support ingestion of batches that include individual files greater than approximately 400MB.

Release 25.6.2

The NeMo Retriever extraction 25.06 release focuses on accuracy improvements and feature expansions, including the following:

Improve reranker accuracy.
Upgrade Python version from 3.10 to 3.12.
Helm deployment now has similar throughput performance to Docker deployment.
Add support for the latest version of the OpenAI API.
Add MIG support. For details, see Enable NVIDIA GPU MIG.
Add time slicing support. For details, see Enable GPU time-slicing.
Add support for RIVA NIM for optional audio extraction. For details, see helm/values.yaml.
New notebook for How to add metadata to your documents and filter searches.

Known Issues

The following are the known issues for this release:

The NeMo Retriever extraction pipeline does not support ingestion of batches that include individual files greater than approximately 400MB.

Breaking Changes

There are no breaking changes in this version.

Upgrade

To upgrade the Helm Charts for this version, refer to NV-Ingest Helm Charts.

Release 25.4.2

The NeMo Retriever extraction 25.04 release focuses on small bug fixes and improvements, including the following:

Fixed a known issue where large text file ingestion failed.
The REST service is now more resilient, and recovers from worker failures and connection errors.
Various client-side improvements that reduce retry rates and improve overall usability.
New notebook for How to reindex a collection.
Expanded chunking documentation. For more information, refer to Split Documents.

Breaking Changes

There are no breaking changes in this version.

Upgrade

To upgrade the Helm Charts for this version, refer to NV-Ingest Helm Charts.

Release 25.3.0

The NeMo Retriever extraction 25.03 release includes accuracy improvements, feature expansions, and throughput improvements.

New Features

Consolidated NeMo Retriever extraction to run on a single GPU (H100, A100, L40S, or A10G). For details, refer to Support Matrix.
Added Library Mode for a lightweight no-GPU deployment that uses NIM endpoints hosted on build.nvidia.com. For details, refer to Deploy Without Containers (Library Mode).
Added support for infographics extraction.
Added support for RIVA NIM for audio extraction (Early Access). For details, refer to Audio Processing.
Added support for Llama-3.2 VLM for image captioning.
Added support for image detection and extraction from DOCX, PPTX, JPG, and PNG files.
Deprecated DePlot and CACHED NIMs.

Release 24.12.1

Cases where .split() tasks fail during ingestion are now fixed.

Release 24.12

We currently do not support OCR-based text extraction. This was discovered in an unsupported use case and is not a product functionality issue.