RAG Pipelines for Developers

  • About the RAG Pipelines
  • Support Matrix
  • API Catalog Models
  • Local GPUs
  • Multi-GPU for Inference
  • Query Decomposition
  • Quantized Model
  • Structured Data
  • Multimodal Data
  • Multi-turn
  • Sample Chat Application
  • Alternative Vector Database

Tools

  • Evaluation
  • Observability

Jupyter Notebooks

  • Basics: Prompt, Client, and Responses
  • LLM Streaming Client
  • Q&A with LangChain
  • Q&A with LlamaIndex
  • Advanced Q&A with LlamaIndex
  • Press Release Chat Bot
  • NVIDIA AI Endpoints with LangChain
  • LangChain with Local Llama 2 Model
  • NVIDIA AI Endpoints, LlamaIndex, and LangChain
  • HF Checkpoints with LlamaIndex and LangChain
  • Multimodal Models from NVIDIA AI Endpoints with LangChain Agent
  • Build a RAG chain by generating embeddings for NVIDIA Triton documentation

Software Components

  • Architecture
  • NeMo Framework Inference Server
  • RAG Playground Web Application
  • Jupyter Notebook Server
  • Chain Server
  • Software Component Configuration

NVIDIA Generative AI Examples

© Copyright 2023-2024, NVIDIA.