About Inference Routing#
OpenShell handles inference in two ways:
Path |
How It Works |
|---|---|
External endpoints |
Traffic to hosts like |
|
A special endpoint exposed inside every sandbox for inference that should stay local to the host for privacy and security. The privacy router strips the original credentials, injects the configured backend credentials, and forwards to the managed model endpoint. |
How inference.local Works#
When code inside a sandbox calls https://inference.local, the privacy router routes the request to the configured backend for that gateway. The configured model is applied to generation requests, and provider credentials are supplied by OpenShell rather than by code inside the sandbox.
If code calls an external inference host directly, that traffic is evaluated only by network_policies.
Property |
Detail |
|---|---|
Credentials |
No sandbox API keys needed — credentials come from the configured provider record. |
Configuration |
One provider and one model define sandbox inference. |
Provider support |
OpenAI, Anthropic, and NVIDIA providers all work through the same endpoint. |
Hot-refresh |
Provider credential changes and inference updates are picked up without recreating sandboxes. |
Supported API Patterns#
Supported request patterns depend on the provider configured for inference.local.
Pattern |
Method |
Path |
|---|---|---|
Chat Completions |
|
|
Completions |
|
|
Responses |
|
|
Model Discovery |
|
|
Model Discovery |
|
|
Pattern |
Method |
Path |
|---|---|---|
Messages |
|
|
Requests to inference.local that do not match the configured provider’s supported patterns are denied.
Next Steps#
Continue with one of the following:
To set up the backend behind
inference.local, refer to Configure Inference Routing.To control external endpoints, refer to Network Access Rules.