Creating an LLM Chain

Implementing the Method

  1. Edit the RetrievalAugmentedGeneration/examples/simple_rag_api_catalog/chains.py file and add the following import statements:

    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    
  2. Update the llm_chain method with the following implementation:

        def llm_chain(self, query: str, chat_history: List["Message"], **kwargs) -> Generator[str, None, None]:
            """Answer a query using the LLM directly, without the knowledge base."""
            logger.info("Using llm to generate response directly without knowledge base.")
            # Build the prompt from (role, content) tuples: the system prompt,
            # any prior conversation turns, and a placeholder for the new input.
            system_message = [("system", settings.prompts.chat_template)]
            conversation_history = [(msg.role, msg.content) for msg in chat_history]
            user_input = [("user", "{input}")]

            if conversation_history:
                prompt_template = ChatPromptTemplate.from_messages(
                    system_message + conversation_history + user_input
                )
            else:
                prompt_template = ChatPromptTemplate.from_messages(
                    system_message + user_input
                )

            # Compose the LCEL pipeline: prompt -> model -> plain-string output.
            llm = get_llm(**kwargs)
            chain = prompt_template | llm | StrOutputParser()

            augmented_user_input = f"\n\nQuestion: {query}\n"
            return chain.stream({"input": augmented_user_input})
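
Each tuple passed to ChatPromptTemplate.from_messages is a (role, content) pair, and the {input} placeholder is filled in when the chain is invoked; StrOutputParser at the end of the pipeline converts the model's chat output into a plain string suitable for streaming. The standalone sketch below (not part of chains.py; the system prompt and history values are illustrative stand-ins) shows how the message tuples compose:

    from langchain_core.prompts import ChatPromptTemplate

    # Illustrative stand-ins; in llm_chain these come from settings.prompts
    # and the chat_history argument.
    system_message = [("system", "You are a helpful travel assistant.")]
    conversation_history = [
        ("user", "What is the capital of France?"),
        ("assistant", "The capital of France is Paris."),
    ]
    user_input = [("user", "{input}")]

    template = ChatPromptTemplate.from_messages(
        system_message + conversation_history + user_input
    )
    # format_messages substitutes the {input} placeholder and returns the
    # fully rendered message list that the chain would send to the model.
    print(template.format_messages(input="\n\nQuestion: What should I see there?\n"))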
    

Building and Running with Docker Compose

Using the containers has one additional step this time: exporting your NVIDIA API key as an environment variable.

  1. Build the container for the Chain Server:

    $ docker compose --env-file deploy/compose/compose.env -f deploy/compose/simple-rag-api-catalog.yaml build chain-server
    
  2. Export your NVIDIA API key in an environment variable:

    $ export NVIDIA_API_KEY=nvapi-...
    
  3. Run the containers:

    $ docker compose --env-file deploy/compose/compose.env -f deploy/compose/simple-rag-api-catalog.yaml up -d
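
If you want to confirm the containers are up before sending requests, the standard docker compose ps subcommand works with the same Compose file:

    $ docker compose --env-file deploy/compose/compose.env -f deploy/compose/simple-rag-api-catalog.yaml ps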
    

Verify the LLM Chain Method Using Curl

You can access the Chain Server with a URL like http://localhost:8081.

  • Confirm the llm_chain method runs by submitting a query:

    $ curl -H "Content-Type: application/json" http://localhost:8081/generate \
        -d '{"messages":[{"role":"user", "content":"What should I see in Paris?"}], "use_knowledge_base": false}'
    

    Example Output

    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":""},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":" In"},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":" Paris, you should see famous landmarks such as the Eiffel Tower, the Louvre Museum, Notre-Dame Cathedral, the Arc de"},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":" Triomphe, and the Sacré-Cœur Basilica. You can also take a stroll along the Seine River, visit the Montmartre"},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":" district, and explore the charming streets of the Marais neighborhood. If you're interested in art, be sure to check out the Musée d'Or"},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":"say"},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":" and the Centre Pompidou. And don't forget to try some delicious French cuisine while you're there!"},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":""},"finish_reason":""}]}
    
    data: {"id":"3ffaf2b4-edd1-479e-9192-5b84b49c6ab6","choices":[{"index":0,"message":{"role":"assistant","content":""},"finish_reason":"[DONE]"}]}
    

Next Steps

  • Creating a RAG Chain

  • To stop the containers, run docker compose -f deploy/compose/simple-rag-api-catalog.yaml down.