Inference Profiles#
NemoClaw configures inference through the OpenShell gateway.
The agent inside the sandbox talks to inference.local, and OpenShell routes that traffic to the provider you selected during onboarding.
Routed Provider Model#
NemoClaw keeps provider credentials on the host. The sandbox does not receive your raw OpenAI, Anthropic, Gemini, or NVIDIA API key.
At onboard time, NemoClaw configures:
an OpenShell provider
an OpenShell inference route
the baked OpenClaw model reference inside the sandbox
That means the sandbox knows which model family to use, while OpenShell owns the actual provider credential and upstream endpoint.
Supported Providers#
The following non-experimental provider paths are available through nemoclaw onboard.
Provider |
Endpoint Type |
Notes |
|---|---|---|
NVIDIA Endpoints |
OpenAI-compatible |
Hosted models on |
OpenAI |
Native OpenAI-compatible |
Uses OpenAI model IDs |
Other OpenAI-compatible endpoint |
Custom OpenAI-compatible |
For compatible proxies and gateways |
Anthropic |
Native Anthropic |
Uses |
Other Anthropic-compatible endpoint |
Custom Anthropic-compatible |
For Claude proxies and compatible gateways |
Google Gemini |
OpenAI-compatible |
Uses Google’s OpenAI-compatible endpoint |
Local Ollama |
OpenAI-compatible |
Local Ollama runtime routed through |
Validation During Onboarding#
NemoClaw validates the selected provider and model before it creates the sandbox.
OpenAI-compatible providers: NemoClaw tries
/responsesfirst, then/chat/completions.Anthropic-compatible providers: NemoClaw tries
/v1/messages.NVIDIA Endpoints manual model entry: NemoClaw also validates the model name against
https://integrate.api.nvidia.com/v1/models.Compatible endpoint flows: NemoClaw validates by sending a real inference request, because many proxies do not expose a reliable
/modelsendpoint.
If validation fails, the wizard does not continue to sandbox creation.
Local Providers#
Some local providers use the same routed inference.local pattern, but the upstream runtime is local to the host.
Local Ollama
Local NVIDIA NIM
Local vLLM
Only Local NVIDIA NIM and Local vLLM are behind the NEMOCLAW_EXPERIMENTAL=1 gate.
Ollama gets additional onboarding help:
if no models are installed, NemoClaw offers starter models
it pulls the selected model
it warms the model
it validates the model before continuing
Runtime Switching#
For runtime switching guidance, refer to Switch Inference Models.