Jupyter Notebook Server
About the Notebooks
The Jupyter notebooks provide guidance for building knowledge-augmented chatbots.
The following Jupyter notebooks are provided with the AI workflow for the default canonical RAG example:
Upload Press Releases and Interact with REST FastAPI Server
This notebook demonstrates how to use the REST FastAPI server to upload the knowledge base and then ask a question, first without and then with the knowledge base.
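As a rough sketch of that interaction from Python (the port and the /documents and /generate endpoint paths are assumptions; check the notebook for the exact routes and payloads):

  import requests

  BASE_URL = "http://localhost:8081"  # assumed chain-server address

  # Upload a press release into the knowledge base.
  with open("press_release.pdf", "rb") as f:
      requests.post(f"{BASE_URL}/documents", files={"file": f}).raise_for_status()

  # Ask the same question with the knowledge base toggled off and then on.
  for use_kb in (False, True):
      response = requests.post(
          f"{BASE_URL}/generate",
          json={
              "messages": [{"role": "user", "content": "What was announced?"}],
              "use_knowledge_base": use_kb,
          },
      )
      print(f"use_knowledge_base={use_kb}:", response.text[:200])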
NVIDIA AI Endpoint Integration with LangChain
This notebook demonstrates how to build a Retrieval Augmented Generation (RAG) example using the NVIDIA AI endpoint integrated with LangChain, with FAISS as the vector store.
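A minimal sketch of that pattern, assuming an NVIDIA_API_KEY in the environment and the langchain-nvidia-ai-endpoints package (the sample texts are placeholders):

  from langchain_community.vectorstores import FAISS
  from langchain_core.output_parsers import StrOutputParser
  from langchain_core.prompts import ChatPromptTemplate
  from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings

  llm = ChatNVIDIA(model="ai-mixtral-8x7b-instruct")
  embedder = NVIDIAEmbeddings(model="ai-embed-qa-4")

  # Build the FAISS vector store from pre-chunked text.
  store = FAISS.from_texts(
      ["NVIDIA announced ...", "The press release says ..."], embedder
  )
  retriever = store.as_retriever()

  prompt = ChatPromptTemplate.from_template(
      "Answer using only this context:\n{context}\n\nQuestion: {question}"
  )
  chain = prompt | llm | StrOutputParser()

  question = "What did NVIDIA announce?"
  context = "\n\n".join(d.page_content for d in retriever.invoke(question))
  print(chain.invoke({"context": context, "question": question}))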
RAG with LangChain and local LLM model
This notebook demonstrates how to plug in a local LLM from Hugging Face Hub and build a simple RAG app using LangChain.
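A condensed sketch of the local plug-in step, assuming the transformers and langchain-community packages and enough GPU memory for the chosen model (the model ID is illustrative):

  from langchain_community.llms import HuggingFacePipeline

  # Downloads the model from Hugging Face Hub on first use.
  llm = HuggingFacePipeline.from_model_id(
      model_id="meta-llama/Llama-2-13b-chat-hf",
      task="text-generation",
      pipeline_kwargs={"max_new_tokens": 256},
  )
  print(llm.invoke("What is retrieval-augmented generation?"))

The resulting llm object is a drop-in replacement for a hosted model in any LangChain RAG chain.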
NVIDIA AI Endpoint with LlamaIndex and LangChain
This notebook demonstrates how to plug in the NVIDIA AI Endpoints model ai-mixtral-8x7b-instruct and the ai-embed-qa-4 embedding model, and bind them into LlamaIndex with customizations.
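A minimal sketch of the binding, assuming a pre-0.10 LlamaIndex release that ships the LangChain wrappers and a data/ directory of documents:

  from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
  from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
  from llama_index.embeddings import LangchainEmbedding
  from llama_index.llms import LangChainLLM

  # Wrap the LangChain-facing NVIDIA clients so LlamaIndex can drive them.
  llm = LangChainLLM(llm=ChatNVIDIA(model="ai-mixtral-8x7b-instruct"))
  embed_model = LangchainEmbedding(NVIDIAEmbeddings(model="ai-embed-qa-4"))

  service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
  index = VectorStoreIndex.from_documents(
      SimpleDirectoryReader("data").load_data(), service_context=service_context
  )
  print(index.as_query_engine().query("Summarize the documents."))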
Locally deployed model from Hugging Face integration with LlamaIndex and LangChain
This notebook demonstrates how to plug in a local LLM from Hugging Face Hub, Llama-2-13b-chat-hf, and the all-MiniLM-L6-v2 embedding model from Hugging Face, and bind them into LlamaIndex with customizations.
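The local variant follows the same wrapping pattern as the previous sketch, swapping in the Hugging Face models (the ServiceContext plumbing is unchanged):

  from langchain_community.embeddings import HuggingFaceEmbeddings
  from langchain_community.llms import HuggingFacePipeline
  from llama_index.embeddings import LangchainEmbedding
  from llama_index.llms import LangChainLLM

  # Local substitutes for the hosted LLM and embedding model.
  llm = LangChainLLM(
      llm=HuggingFacePipeline.from_model_id(
          model_id="meta-llama/Llama-2-13b-chat-hf", task="text-generation"
      )
  )
  embed_model = LangchainEmbedding(
      HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
  )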
LangChain agent with tools that plug in multiple models from NVIDIA AI Endpoints
This notebook demonstrates how to use multiple NVIDIA AI Endpoint models such as ai-mixtral-8x7b-instruct, Deplot, and Neva.
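In outline, each catalog model gets its own client that the agent can call as a tool; the model identifiers below are assumptions to check against the current catalog:

  from langchain_nvidia_ai_endpoints import ChatNVIDIA

  mixtral = ChatNVIDIA(model="ai-mixtral-8x7b-instruct")  # general reasoning
  deplot = ChatNVIDIA(model="ai-google-deplot")           # chart-to-table extraction
  neva = ChatNVIDIA(model="ai-neva-22b")                  # image understanding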
LangChain with HTML documents and NVIDIA AI Endpoints
This notebook demonstrates how to build a RAG using NVIDIA AI Endpoints for LangChain. The notebook creates a vector store by downloading web pages and generating their embeddings using FAISS. The notebook shows two different chat chains for querying the vector store.
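A sketch of the vector-store construction, assuming the langchain-community loaders and an NVIDIA_API_KEY for the embedding calls (the URL and chunking parameters are placeholders):

  from langchain_community.document_loaders import WebBaseLoader
  from langchain_community.vectorstores import FAISS
  from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
  from langchain_text_splitters import RecursiveCharacterTextSplitter

  # Download the pages, split them into chunks, and embed the chunks into FAISS.
  docs = WebBaseLoader(["https://example.com/page"]).load()
  chunks = RecursiveCharacterTextSplitter(
      chunk_size=1000, chunk_overlap=100
  ).split_documents(docs)
  store = FAISS.from_documents(chunks, NVIDIAEmbeddings(model="ai-embed-qa-4"))
  retriever = store.as_retriever(search_kwargs={"k": 4})

Both chat chains can then pull context from retriever before calling the LLM.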
LangChain agent executor with LangGraph
This notebook guides you through creating a basic agent executor using LangGraph. We demonstrate how to handle the logic of the agent's intermediate steps, leveraging the different provided tools, within LangGraph.
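A minimal LangGraph skeleton showing the shape of such an executor; the single node below stands in for the notebook's model-and-tools logic:

  from typing import TypedDict

  from langgraph.graph import END, StateGraph

  class AgentState(TypedDict):
      input: str
      output: str

  def call_agent(state: AgentState) -> dict:
      # Placeholder for the model call that picks a tool or returns a final answer.
      return {"output": f"answer to: {state['input']}"}

  graph = StateGraph(AgentState)
  graph.add_node("agent", call_agent)
  graph.set_entry_point("agent")
  graph.add_edge("agent", END)
  app = graph.compile()
  print(app.invoke({"input": "What is LangGraph?"}))

A real executor adds a tool-execution node and a conditional edge that loops back to the agent until it produces a final answer.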
RAG with Milvus vector store and NVIDIA AI Endpoints
In this notebook, we use Milvus as the vector store and the NVIDIA AI Endpoints models ai-mixtral-8x7b-instruct (as the LLM) and ai-embed-qa-4 (as the embedding model) to build a simple RAG example for chatting with NVIDIA financial reports.
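A sketch of the Milvus wiring, assuming a Milvus instance on the default port and placeholder report snippets:

  from langchain_community.vectorstores import Milvus
  from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

  store = Milvus.from_texts(
      ["Revenue grew ...", "The data-center segment ..."],
      NVIDIAEmbeddings(model="ai-embed-qa-4"),
      connection_args={"host": "localhost", "port": "19530"},
  )
  docs = store.similarity_search("How did the data-center segment perform?")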
RAG with locally deployed models using NIM
In this notebook we demonstrate how to build a RAG using NVIDIA Inference Microservices (NIM). We locally host a Llama3-8b-instruct model using the NIM LLM container and access it through NVIDIA AI Endpoints for LangChain. To run this notebook in a virtual environment, you need to launch the NIM Docker container in the background, outside of the notebook environment, before running the LangChain code in the notebook cells. Run the commands in the first three cells from a terminal, then begin with the fourth cell (the curl inference command) within the notebook environment.
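Once the NIM container is serving, the LangChain client can target it instead of the hosted catalog; the base URL and model name below are assumptions to match to your container:

  from langchain_nvidia_ai_endpoints import ChatNVIDIA

  # Point the client at the locally hosted NIM rather than the NVIDIA cloud API.
  llm = ChatNVIDIA(base_url="http://0.0.0.0:8000/v1", model="meta/llama3-8b-instruct")
  print(llm.invoke("Hello from a local NIM").content)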
Running JupyterLab Server Individually
To run the JupyterLab server for development purposes, run the following commands:
Optional: Notebooks 7 to 9 require GPUs. If you have a GPU and want to run one of these notebooks, update the jupyter-server service in the Docker Compose file to use ./notebooks/Dockerfile.gpu_notebook as the Dockerfile:

  jupyter-server:
    container_name: notebook-server
    image: notebook-server:latest
    build:
      context: ../../
      dockerfile: ./notebooks/Dockerfile.gpu_notebook
These notebooks can use more than one GPU. To use more than one, specify the GPU IDs in the device_ids field, or specify count: all:

  jupyter-server:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0', '1']
              capabilities: [gpu]
Build the container from source:
  $ source deploy/compose/compose.env
  $ docker compose -f deploy/compose/rag-app-text-chatbot.yaml build jupyter-server
Start the container, which starts the notebook server:

  $ source deploy/compose/compose.env
  $ docker compose -f deploy/compose/rag-app-text-chatbot.yaml up jupyter-server
Open the JupyterLab server at http://host-ip:8888.