Switch Inference Models at Runtime#
Change the active inference model while the sandbox is running. No restart is required.
Prerequisites#
A running NemoClaw sandbox.
The OpenShell CLI on your PATH.
Switch to a Different Model#
Switching happens through the OpenShell inference route. Pass the provider and model that match the upstream you want to target.
NVIDIA Endpoints#
$ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b
OpenAI#
$ openshell inference set --provider openai-api --model gpt-5.4
Anthropic#
$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6
Google Gemini#
$ openshell inference set --provider gemini-api --model gemini-2.5-flash
Compatible Endpoints#
If you onboarded a custom compatible endpoint, switch models with the provider created for that endpoint:
$ openshell inference set --provider compatible-endpoint --model <model-name>
$ openshell inference set --provider compatible-anthropic-endpoint --model <model-name>
If the provider itself needs to change, rerun nemoclaw onboard.
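The switch commands above can be wrapped so a missing model name fails fast instead of reaching the CLI. This is only a sketch: the `switch_model` function name is illustrative and the real `openshell` call is left as a comment stub.

```shell
# Hypothetical wrapper around the switch command; not part of either CLI.
switch_model() {
  provider=$1
  model=$2
  if [ -z "$model" ]; then
    echo "usage: switch_model <provider> <model>" >&2
    return 1
  fi
  # The real call would be:
  #   openshell inference set --provider "$provider" --model "$model"
  echo "would switch: provider=$provider model=$model"
}

switch_model compatible-endpoint my-model
```

Replacing the echo stub with the commented `openshell` invocation turns this into a one-line guard for scripts that switch models frequently.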
Verify the Active Model#
Run the status command to confirm the change:
$ nemoclaw <name> status
Add the --json flag for machine-readable output:
$ nemoclaw <name> status --json
The output includes the active provider, model, and endpoint.
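As a sketch, the `--json` output can be parsed in a script. The field names below (`provider`, `model`, `endpoint`) and the sample payload are assumptions based on the description above, not a documented schema; inspect your actual output first.

```shell
# Hypothetical status JSON, shaped like the fields described in the docs.
# In practice you would capture it instead:
#   status_json=$(nemoclaw <name> status --json)
status_json='{"provider": "openai-api", "model": "gpt-5.4", "endpoint": "https://api.openai.com/v1"}'

# Extract the "model" field with python3 (avoids a jq dependency).
active_model=$(printf '%s' "$status_json" | python3 -c \
  'import json, sys; print(json.load(sys.stdin)["model"])')

echo "active model: $active_model"
```

A check like this is useful as a post-switch assertion in automation, so a script can fail loudly if the route did not change as expected.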
Notes#
The host keeps provider credentials.
The sandbox continues to use inference.local. Runtime switching changes the OpenShell route; it does not rewrite your stored credentials.