Switch Inference Models at Runtime#

Change the active inference model while the sandbox is running. No restart is required.

Prerequisites#

  • A running NemoClaw sandbox.

  • The OpenShell CLI on your PATH.

Switch to a Different Model#

Switching happens through the OpenShell inference route. Pass the provider and model that match the upstream you want to target.

NVIDIA Endpoints#

$ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b

OpenAI#

$ openshell inference set --provider openai-api --model gpt-5.4

Anthropic#

$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6

Google Gemini#

$ openshell inference set --provider gemini-api --model gemini-2.5-flash

Compatible Endpoints#

If you onboarded a custom compatible endpoint, switch models with the provider created for that endpoint:

$ openshell inference set --provider compatible-endpoint --model <model-name>
$ openshell inference set --provider compatible-anthropic-endpoint --model <model-name>
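If you script the switch, you can select between the two providers up front. A minimal sketch, assuming the provider names above; the mapping between provider and upstream API style is inferred from those names, and the api_style variable is illustrative:

```shell
# Pick the compatible provider by upstream API style.
# Provider names come from the examples above; the api_style
# variable and this routing are illustrative assumptions.
api_style="anthropic"   # or "openai"
if [ "$api_style" = "anthropic" ]; then
  provider="compatible-anthropic-endpoint"
else
  provider="compatible-endpoint"
fi
echo "openshell inference set --provider $provider --model <model-name>"
```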

If the provider itself needs to change, rerun nemoclaw onboard.

Verify the Active Model#

Run the status command to confirm the change:

$ nemoclaw <name> status

Add the --json flag for machine-readable output:

$ nemoclaw <name> status --json

The output includes the active provider, model, and endpoint.
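If you script the verification step, the JSON output can be parsed directly. A minimal sketch, assuming the output exposes the fields named above under keys like "provider" and "model" (the exact key names are not documented here, so check your actual output):

```shell
# Hypothetical sample of `nemoclaw <name> status --json` output.
# The key names below are assumptions inferred from the fields the
# status output is said to include.
status='{"provider":"anthropic-prod","model":"claude-sonnet-4-6","endpoint":"inference.local"}'

# Extract the active model with sed, so the sketch does not assume jq is installed.
model=$(printf '%s' "$status" | sed -n 's/.*"model":"\([^"]*\)".*/\1/p')
echo "$model"
```

With jq installed, `jq -r '.model'` is the more robust equivalent.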

Notes#

  • The host keeps provider credentials.

  • The sandbox continues to use inference.local.

  • Runtime switching changes only the OpenShell route; it does not rewrite your stored credentials.