Using LLMs hosted on NVIDIA API Catalog

This guide shows how to use NeMo Guardrails with LLMs hosted on NVIDIA API Catalog. It uses the ABC Bot configuration with the meta/llama-3.1-70b-instruct model. You can similarly use meta/llama-3.1-405b-instruct, meta/llama-3.1-8b-instruct, or any other AI Foundation Model.

Prerequisites

Before you begin, ensure you have the following prerequisites in place:

  1. Install the langchain-nvidia-ai-endpoints package:

pip install -U --quiet langchain-nvidia-ai-endpoints

  2. An NVIDIA NGC account to access AI Foundation Models. To create a free account, go to the NVIDIA NGC website.

  3. An API key from NVIDIA API Catalog:

    • Generate an API key by navigating to the AI Foundation Models section on the NVIDIA NGC website, selecting a model with an API endpoint, and generating an API key. You can use this API key for all models available in the NVIDIA API Catalog.

    • Export the NVIDIA API key as an environment variable:

export NVIDIA_API_KEY="<your-api-key>" # Replace with your own key

  4. If you’re running this inside a notebook, patch the AsyncIO loop:

import nest_asyncio

nest_asyncio.apply()
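
Because a missing API key usually only surfaces as an authentication error on the first model call, it can help to check for it up front. The following is a minimal sketch, not part of NeMo Guardrails: the helper name `require_api_key` is hypothetical, and it assumes only that the key is read from the `NVIDIA_API_KEY` environment variable as described above.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None,
                    name: str = "NVIDIA_API_KEY") -> str:
    """Return the API key from `env`, or raise if it is missing or blank.

    `require_api_key` is a hypothetical helper for illustration only.
    """
    # Default to the real process environment when no mapping is passed.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before loading the guardrails configuration."
        )
    return key


# Demo with an explicit mapping and a placeholder value, so the sketch runs
# without a real key. In practice, call require_api_key() with no arguments.
print(len(require_api_key({"NVIDIA_API_KEY": "placeholder-key"})))
```

Calling `require_api_key()` with no arguments before creating the rails object would fail fast with a readable message instead of a later request error.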

Configuration

To get started, copy the ABC Bot configuration into a subdirectory called config:

cp -r ../../../../examples/bots/abc config

Update the models section of the config.yml file to use the desired model from those supported by NVIDIA API Catalog:

...
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.1-70b-instruct
...
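
Since only the model name changes between variants, the models section can be thought of as a small template. The sketch below illustrates this under stated assumptions: `render_models_section` is a hypothetical helper, not a NeMo Guardrails API, and it simply reproduces the YAML fragment shown above with a different model name.

```python
from string import Template

# Template mirroring the `models` section shown above; only the model name varies.
MODELS_TEMPLATE = Template(
    "models:\n"
    "  - type: main\n"
    "    engine: nvidia_ai_endpoints\n"
    "    model: $model\n"
)


def render_models_section(model: str) -> str:
    """Return the `models` YAML fragment for the given model name.

    Hypothetical helper for illustration; it does not validate that the
    model is actually available in the NVIDIA API Catalog.
    """
    return MODELS_TEMPLATE.substitute(model=model)


# Render the fragment for one of the other models mentioned in this guide.
print(render_models_section("meta/llama-3.1-8b-instruct"))
```

The printed fragment can be pasted over the existing models section in config.yml; the rest of the file stays unchanged.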

Usage

Load the guardrail configuration:

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

Test that it works:

response = rails.generate(messages=[
    {
        "role": "user",
        "content": "How many vacation days do I have per year?"
    }
])
print(response["content"])

According to our company policy, you are eligible for 20 days of vacation per year, accrued monthly.

You can see that the bot responds correctly.
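
When you want to try several questions, the message construction above can be wrapped in a small function. This is a minimal sketch under stated assumptions: `ask` and `StubRails` are hypothetical names introduced here. `ask` relies only on the `generate(messages=...)` call and the `content` key used above, and `StubRails` stands in for `LLMRails` so the example runs without an API key or network access.

```python
from typing import Any, Dict, List


def ask(rails: Any, question: str) -> str:
    """Send one user message through `rails.generate` and return the reply text."""
    messages: List[Dict[str, str]] = [{"role": "user", "content": question}]
    response = rails.generate(messages=messages)
    return response["content"]


class StubRails:
    """Hypothetical stand-in for LLMRails; it never calls a hosted model."""

    def generate(self, messages: List[Dict[str, str]]) -> Dict[str, str]:
        # Echo the last user message so the example is deterministic.
        return {
            "role": "assistant",
            "content": "stub reply to: " + messages[-1]["content"],
        }


print(ask(StubRails(), "How many vacation days do I have per year?"))
```

With the real configuration loaded, passing the `rails` object created earlier in place of `StubRails()` would send the question to the hosted model.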

Conclusion

In this guide, you learned how to connect a NeMo Guardrails configuration to an LLM hosted on NVIDIA API Catalog. This guide uses meta/llama-3.1-70b-instruct, but you can connect any other model by following the same steps.