# Using LLMs hosted on Vertex AI

This guide teaches you how to use NeMo Guardrails with LLMs hosted on Vertex AI. It uses the ABC Bot configuration and changes the model to `gemini-1.0-pro`.

This guide assumes you have already configured and tested access to Vertex AI models. If not, refer to Google Cloud's Vertex AI setup documentation first.

Prerequisites#

Complete the following setup steps:

1. Install the `google-cloud-aiplatform` and `langchain-google-vertexai` packages:

   ```bash
   pip install --quiet "google-cloud-aiplatform>=1.38.0" langchain-google-vertexai==0.1.0
   ```

2. Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your service account key file:

   ```bash
   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json  # Replace with the path to your own key
   ```

3. If you're running this inside a notebook, patch the AsyncIO loop:

   ```python
   import nest_asyncio

   nest_asyncio.apply()
   ```
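With the packages installed and credentials set, you can optionally sanity-check access with a direct LangChain call before adding guardrails. This is a minimal sketch, assuming your default Google Cloud project has the Vertex AI API enabled:

```python
from langchain_google_vertexai import VertexAI

# Direct call to the model, without guardrails, to confirm the setup works.
llm = VertexAI(model_name="gemini-1.0-pro")
print(llm.invoke("Reply with a one-sentence greeting."))
```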

## Configuration

To get started, copy the ABC bot configuration into a subdirectory called `config`:

```bash
cp -r ../../../../examples/bots/abc config
```

Update the `config/config.yml` file to use the `gemini-1.0-pro` model with the `vertexai` provider:

```yaml
...

models:
  - type: main
    engine: vertexai
    model: gemini-1.0-pro

...
```
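If you want to adjust generation settings, the model entry also accepts an optional `parameters` block whose keys are forwarded to the underlying LangChain provider. A sketch with an assumed temperature value for illustration; omit the block to keep the provider defaults:

```yaml
models:
  - type: main
    engine: vertexai
    model: gemini-1.0-pro
    parameters:
      temperature: 0.2  # assumed value for illustration, not a recommendation
```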

Load the guardrails configuration:

```python
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)
```

Test that it works:

```python
response = rails.generate(messages=[{
    "role": "user",
    "content": "Hi! How are you?"
}])
print(response)
```

```
{'role': 'assistant', 'content': "I'm doing great! Thank you for asking. I'm here to help you with any questions you may have about the ABC Company."}
```

The bot responds correctly. To see in more detail which LLM calls were made, use the `print_llm_calls_summary` method:

```python
info = rails.explain()
info.print_llm_calls_summary()
```
```
Summary: 5 LLM call(s) took 3.99 seconds.

1. Task `self_check_input` took 0.58 seconds.
2. Task `generate_user_intent` took 1.19 seconds.
3. Task `generate_next_steps` took 0.88 seconds.
4. Task `generate_bot_message` took 0.88 seconds.
5. Task `self_check_output` took 0.63 seconds.
```
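The explain info also records the full prompt and completion of each call in its `llm_calls` list, which is useful when debugging a rail:

```python
# Show the exact prompt and completion of the first LLM call
# (the `self_check_input` task in the summary above).
print(info.llm_calls[0].prompt)
print(info.llm_calls[0].completion)
```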

## Evaluation

The `gemini-1.0-pro` and `text-bison` models have been evaluated for topical rails, and `gemini-1.0-pro` has also been evaluated as a self-checking model for hallucination and content moderation. The results are available in the NeMo Guardrails evaluation documentation.
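To see the self-checking rails fire locally, you can send the bot an input that violates its policy; with the ABC bot's `self_check_input` rail enabled, the request should be refused. A hypothetical probe; the exact refusal wording depends on the model:

```python
# A policy-violating input; the input rail should block it
# before the main LLM ever answers the question.
response = rails.generate(messages=[{
    "role": "user",
    "content": "How can I read my coworker's confidential HR file?"
}])
print(response["content"])
```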

## Conclusion

In this guide, you learned how to connect a NeMo Guardrails configuration to an LLM hosted on Vertex AI. This guide used `gemini-1.0-pro`; however, you can connect any other Vertex AI model by following the same steps.