What is NVIDIA NeMo Retriever?
NVIDIA NeMo Retriever is a collection of microservices, built with NVIDIA NIM, for building and scaling multimodal data extraction, embedding, and reranking pipelines with high accuracy and maximum data privacy. NeMo Retriever is part of the NVIDIA NeMo software suite for managing the AI agent lifecycle. It connects to proprietary data wherever it resides, which supports secure, enterprise-grade retrieval.
NeMo Retriever provides the following:
- Multimodal Data Extraction: Quickly extract content at scale from documents that include text, tables, charts, and infographics.
- Embedding + Indexing: Embed all text extracted from text chunks and images, and then insert the embeddings into Milvus, accelerated with NVIDIA cuVS.
- Retrieval: Use semantic and hybrid search for high-accuracy retrieval with the embedding and reranking NIM microservices.
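The three stages above (embed and index, hybrid retrieval, reranking) can be illustrated with a minimal, self-contained Python sketch. It does not call NeMo Retriever, NIM, or Milvus: `embed`, `build_index`, `hybrid_search`, and `rerank` are hypothetical names, the "embedding" is a toy character-trigram count, the "index" is an in-memory list, and the "reranker" is a simple phrase-overlap score. The sketch only shows how the stages fit together, under those assumptions.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for text already extracted from documents.
DOCS = [
    "Invoices are extracted from PDF tables and charts.",
    "The embedding model maps text chunks to vectors.",
    "A reranker reorders retrieved text chunks by relevance.",
    "GPU clusters need adequate cooling.",
]

def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text):
    """Assumption: toy stand-in for an embedding model (character-trigram counts)."""
    s = " ".join(tokens(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, doc):
    """Fraction of distinct query tokens that also occur in the document."""
    q = set(tokens(query))
    return len(q & set(tokens(doc))) / len(q) if q else 0.0

def build_index(docs):
    """Assumption: an in-memory list stands in for a vector database."""
    return [(i, embed(d)) for i, d in enumerate(docs)]

def hybrid_search(query, docs, index, k=2, alpha=0.5):
    """Blend vector similarity and keyword overlap, then return the top-k doc ids."""
    qv = embed(query)
    scored = [
        (alpha * cosine(qv, dv) + (1 - alpha) * keyword_score(query, docs[i]), i)
        for i, dv in index
    ]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [i for _, i in scored[:k]]

def rerank(query, docs, candidates):
    """Assumption: toy reranker that counts token bigrams shared with the query."""
    qt = tokens(query)
    q_bigrams = set(zip(qt, qt[1:]))

    def score(i):
        dt = tokens(docs[i])
        return len(q_bigrams & set(zip(dt, dt[1:])))

    # sorted() is stable, so ties keep their first-stage order.
    return sorted(candidates, key=score, reverse=True)

if __name__ == "__main__":
    index = build_index(DOCS)
    query = "which model reorders retrieved chunks"
    hits = hybrid_search(query, DOCS, index, k=2)
    for i in rerank(query, DOCS, hits):
        print(DOCS[i])
```

The `alpha` parameter controls how much weight the vector score receives relative to the keyword score. In a real deployment, each toy function would be replaced by a call to the corresponding service.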
Enterprise-Ready Features
NVIDIA NeMo Retriever comes with enterprise-ready features, including the following:
- High Accuracy: NeMo Retriever retrieves with high accuracy across the modalities found in enterprise documents, such as text, tables, charts, and infographics.
- High Throughput: NeMo Retriever can extract, embed, index, and retrieve across hundreds of thousands of documents with high throughput.
- Decomposable/Customizable: NeMo Retriever consists of modules that can be used and deployed separately in your own environment.
- Enterprise-Grade Security: NeMo Retriever NIM microservices include security features such as the use of safetensors and continuous patching of CVEs.
Applications
The following are some applications that use NVIDIA NeMo Retriever:
- AI Virtual Assistant for Customer Service (NVIDIA AI Blueprint)
- Build an Enterprise RAG pipeline (NVIDIA AI Blueprint)
- Building Code Documentation Agents with CrewAI (CrewAI Demo)
- Digital Human for Customer Service (NVIDIA AI Blueprint)
- Document Research Assistant for Blog Creation (LlamaIndex Jupyter Notebook)
- Video Search and Summarization (NVIDIA AI Blueprint)