Switch Inference Models at Runtime#

Change the active inference model while the sandbox is running. No restart is required.

Prerequisites#

  • A running NemoClaw sandbox.

  • The OpenShell CLI on your PATH.

Switch to a Different Model#

Switching happens through the OpenShell inference route. Pass the provider and model that match the upstream you want to target.

NVIDIA Endpoints#

$ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b

OpenAI#

$ openshell inference set --provider openai-api --model gpt-5.4

Anthropic#

$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6

Google Gemini#

$ openshell inference set --provider gemini-api --model gemini-2.5-flash

Compatible Endpoints#

If you onboarded a custom compatible endpoint, switch models with the provider created for that endpoint:

$ openshell inference set --provider compatible-endpoint --model <model-name>
$ openshell inference set --provider compatible-anthropic-endpoint --model <model-name>
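If you script the switch, you can select between the two providers up front. A minimal sketch, assuming the provider names above; the mapping between provider and upstream API style is inferred from those names, and the api_style variable is illustrative:

```shell
# Pick the compatible provider by upstream API style.
# Provider names come from the examples above; the api_style
# variable and this routing are illustrative assumptions.
api_style="anthropic"   # or "openai"
if [ "$api_style" = "anthropic" ]; then
  provider="compatible-anthropic-endpoint"
else
  provider="compatible-endpoint"
fi
echo "openshell inference set --provider $provider --model <model-name>"
```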

If the provider itself needs to change, rerun nemoclaw onboard.

Verify the Active Model#

Run the status command to confirm the change:

$ nemoclaw <name> status

Add the --json flag for machine-readable output:

$ nemoclaw <name> status --json

The output includes the active provider, model, and endpoint.
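If you script the verification step, the JSON output can be parsed directly. A minimal sketch, assuming the output exposes the fields named above under keys like "provider" and "model" (the exact key names are not documented here, so check your actual output):

```shell
# Hypothetical sample of `nemoclaw <name> status --json` output.
# The key names below are assumptions inferred from the fields the
# status output is said to include.
status='{"provider":"anthropic-prod","model":"claude-sonnet-4-6","endpoint":"inference.local"}'

# Extract the active model with sed, so the sketch does not assume jq is installed.
model=$(printf '%s' "$status" | sed -n 's/.*"model":"\([^"]*\)".*/\1/p')
echo "$model"
```

With jq installed, `jq -r '.model'` is the more robust equivalent.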

Notes#

  • The host keeps provider credentials.

  • The sandbox continues to use inference.local.

  • Runtime switching changes only the OpenShell route; it does not rewrite your stored credentials.