Ecosystem#

NeMo Flow is the agent execution runtime layer in the NVIDIA NeMo ecosystem. It does not replace an agent framework, model provider, guardrail authoring system, or deployment platform. Instead, it gives those systems one shared way to model execution scopes, lifecycle events, middleware, plugins, adaptive behavior, and observability around tool and LLM calls.

Use this page to understand where NeMo Flow fits:

  • Inside the NVIDIA NeMo software stack

  • Inside agent frameworks, harnesses, and provider adapters

  • Across the Rust, Python, Node.js, Go, WebAssembly, and C FFI surfaces in this repository

How NeMo Flow Fits In The NVIDIA NeMo Ecosystem#

The NVIDIA NeMo ecosystem spans model development, agent construction, guardrailing, inference, optimization, and runtime operations. NeMo Flow has a narrower responsibility: it is the portable execution substrate that agent systems can call when actual work crosses a scope, tool, or model boundary.

| Layer | Typical Responsibility | NeMo Flow Relationship |
| --- | --- | --- |
| NeMo model, inference, and deployment components | Provide or serve the models an agent uses. | NeMo Flow records and controls LLM execution boundaries, but it does not train, host, or route model inference by itself. |
| NeMo Agent Toolkit and agent application frameworks | Build, run, profile, and optimize agent workflows across tools, data sources, and framework choices. | NeMo Flow can sit below these systems as the shared runtime contract for scopes, middleware, lifecycle events, subscribers, and plugins. |
| NeMo Guardrails and policy systems | Define safety, control, and compliance behavior for LLM applications. | NeMo Flow can host runtime guardrails and intercepts around managed tool and LLM calls, while higher-level guardrail systems can still own policy authoring and orchestration. |
| Application harnesses and workflow code | Decide the agent pattern, planner, memory, retries, scheduling, and user-facing behavior. | NeMo Flow instruments the execution boundaries that the harness already owns. |
| Observability and evaluation backends | Store traces, trajectories, metrics, and analysis data. | NeMo Flow emits lifecycle events and exports them to in-process subscribers, ATIF, OpenTelemetry, OpenInference-compatible traces, or other backends. |

In practical terms, NeMo Flow answers a different question than higher-level agent products. A framework asks, “What should the agent do next?” NeMo Flow asks, “When the agent does work, which scope owns it, which middleware applies, what events are emitted, and which subscribers can consume the result?”

```mermaid
flowchart TB
    User[User / Application]
    Framework[Agent Framework or Harness]
    Toolkit[NeMo Agent Toolkit / Framework Integrations]
    Flow[NeMo Flow Runtime]
    Provider[Model, Tool, or Provider SDK]
    Obs[Subscribers and Observability Backends]
    Policy[Guardrails, Intercepts, and Plugins]

    User --> Framework
    Framework --> Toolkit
    Toolkit --> Flow
    Framework -. direct instrumentation .-> Flow
    Flow --> Provider
    Policy --> Flow
    Flow --> Obs

    class Framework yellow-lightest;
    class Toolkit blue-lightest;
    class Flow green-light;
    class Provider purple-lightest;
    class Obs grey-light;
    class Policy green-lightest;
```

The dotted path matters. An application or custom harness can call NeMo Flow directly without adopting a higher-level framework. A framework integration can also call NeMo Flow on behalf of application code when the framework owns the tool or provider boundary.
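The shape of that direct path can be sketched with a self-contained toy. Every name below (`ToyScope`, `run_tool`, the event types) is invented for illustration and is not the NeMo Flow API; the sketch only shows the pattern: a scope owns a unit of work, middleware wraps the call inside that scope, and subscribers receive JSON-compatible lifecycle events.

```python
# Toy illustration of direct instrumentation: a scope that owns a tool call,
# applies middleware, and emits lifecycle events to subscribers.
# All names here are invented; this is NOT the NeMo Flow API.
import json

class ToyScope:
    def __init__(self, name, subscribers, middleware):
        self.name = name
        self.subscribers = subscribers
        self.middleware = middleware

    def emit(self, event_type, payload):
        # Payloads stay JSON-compatible so any subscriber can consume them.
        event = {"scope": self.name, "type": event_type, "payload": payload}
        json.dumps(event)  # would raise if the payload were not serializable
        for subscriber in self.subscribers:
            subscriber(event)

    def run_tool(self, tool_fn, args):
        self.emit("tool.start", {"args": args})
        call = tool_fn
        # Middleware wraps the call inside the owning scope, innermost first.
        for mw in reversed(self.middleware):
            call = mw(call)
        result = call(args)
        self.emit("tool.end", {"result": result})
        return result

events = []
scope = ToyScope(
    name="agent-run-1",
    subscribers=[events.append],
    middleware=[lambda f: (lambda a: f(a).upper())],
)
scope.run_tool(lambda a: f"echo:{a['text']}", {"text": "hi"})
# events now holds a tool.start and a tool.end record owned by "agent-run-1"
```

The point of the sketch is the ownership question from the previous section: the harness decides *what* to call, while the scope decides which middleware applies and which subscribers see the resulting event stream.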

How NeMo Flow Fits Agent Frameworks And Harnesses#

The agent framework and harness landscape is intentionally mixed. A team might use NeMo Agent Toolkit, LangChain, LangGraph, an internal orchestration layer, a provider SDK, or direct application code. NeMo Flow is designed to meet those systems at stable execution boundaries instead of requiring one framework shape.

| Integration Point | Use NeMo Flow For | Keep In The Framework Or Harness |
| --- | --- | --- |
| Request, run, workflow, or agent lifecycle hooks | Create scopes, emit scope start and end events, and isolate concurrent work. | Scheduling, routing, retry policy, planner choice, memory, and user session state. |
| Tool invocation callbacks | Run managed tool execution, apply tool middleware, emit tool lifecycle events, and preserve parent scope context. | Tool discovery, tool schema presentation, framework-specific callback signatures, and application-visible result handling. |
| LLM or provider adapter calls | Run managed LLM execution, attach model metadata, apply LLM middleware, handle stream lifecycle events, and emit normalized observability payloads. | Provider clients, authentication, transport, provider-native request objects, and provider-specific response types. |
| Framework internals that cannot hand over a callback | Use explicit lifecycle APIs, request-intercept helpers, guardrail helpers, or mark events. | The actual invocation path when the framework must retain control. |
| Cross-cutting behavior | Package middleware, subscribers, adaptive behavior, and reusable policy as plugins. | Framework configuration, agent definitions, deployment topology, and business logic. |

Prefer a managed execution wrapper when a framework exposes a stable callback that NeMo Flow can own. Use explicit lifecycle calls or standalone helpers when the framework owns the callback internally but exposes reliable start, finish, or request transformation hooks.
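The two integration shapes can be contrasted in a small self-contained sketch. The function names and event fields below are invented for illustration (they are not the NeMo Flow API): `managed_tool` stands in for a managed execution wrapper that owns the callback boundary, and `framework_internal_call` stands in for a framework that keeps the invocation path and only reports boundaries through explicit lifecycle calls.

```python
# Two integration shapes, with invented names (NOT the NeMo Flow API).
events = []

def emit(kind, data):
    events.append({"kind": kind, **data})

# Shape 1: managed wrapper -- the runtime owns the callback boundary,
# so start/end events and error handling are applied uniformly.
def managed_tool(name, fn):
    def wrapped(payload):
        emit("tool.start", {"tool": name})
        try:
            result = fn(payload)
            emit("tool.end", {"tool": name, "ok": True})
            return result
        except Exception:
            emit("tool.end", {"tool": name, "ok": False})
            raise
    return wrapped

# Shape 2: explicit lifecycle calls -- the framework retains the actual
# invocation path and reports only the boundaries it controls.
def framework_internal_call():
    emit("llm.start", {"model": "example-model"})
    response = "stub response"  # framework-owned provider call happens here
    emit("llm.end", {"model": "example-model"})
    return response

search = managed_tool("search", lambda p: f"results for {p['q']}")
search({"q": "nemo"})
framework_internal_call()
```

The design trade-off mirrors the guidance above: the managed wrapper gives stronger guarantees (consistent error events, middleware application) but requires the framework to hand over the callback, while explicit calls work anywhere the framework exposes reliable start and finish hooks.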

This lets NeMo Flow provide consistent runtime semantics without forcing a framework migration:

  • Applications keep their existing agent orchestration model

  • Framework adapters preserve public behavior and callback signatures

  • Non-serializable provider objects stay in framework-owned storage

  • NeMo Flow receives JSON-compatible payloads for middleware and events

  • Subscribers see a consistent scope, tool, and LLM event stream across integrations
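The third and fourth bullets describe a payload-splitting pattern that can be sketched concretely. Everything below is an illustrative toy, not the NeMo Flow API: `FakeProviderResponse` stands in for a non-serializable provider SDK object, `provider_store` is framework-owned storage, and only a JSON-compatible payload with an opaque reference crosses into middleware and events.

```python
# Sketch of the payload-splitting pattern: the framework keeps a handle to a
# non-serializable provider object, while only JSON-compatible data reaches
# middleware and event subscribers. All names are illustrative only.
import json
import uuid

class FakeProviderResponse:
    """Stands in for a provider SDK object that json.dumps cannot handle."""
    def __init__(self):
        self.text = "hello"
        self.raw_connection = object()  # not JSON-serializable

provider_store = {}  # framework-owned storage, keyed by an opaque id

def register(provider_obj):
    key = str(uuid.uuid4())
    provider_store[key] = provider_obj
    # Only serializable fields plus the storage key cross into the runtime.
    payload = {"ref": key, "text": provider_obj.text}
    json.dumps(payload)  # raises if anything non-serializable leaked in
    return payload

payload = register(FakeProviderResponse())
# Middleware and subscribers see only `payload`; the framework can later
# resolve provider_store[payload["ref"]] to recover the full SDK object.
```

This split is what lets subscribers see a consistent event stream across integrations: every payload is serializable by construction, while provider-specific objects never leave the framework's own storage.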