TL;DR — Run `interpreter --api_base https://api.jusinfer.com/v1 --api_key <jinf_token> --model jusInfer-auto`. OI's litellm layer passes the OpenAI-compatible request through; jusInfer picks the cheapest capable model per turn. Local code execution, tool approvals, and conversation memory are unaffected.
Open Interpreter + custom provider — keep the agent, drop the bill
Open Interpreter (OI) lets a model run code on your machine to actually accomplish a task — read files, install packages, query databases, plot data. The agent loop is short and tight: model emits code → OI runs it → OI feeds stdout/stderr back → model decides next step. That tight loop means a lot of model calls per task, which means a lot of money on a frontier model. Pointing OI at jusInfer drops the per-call cost without changing the code-execution behavior at all.
Why it works
OI's model layer is litellm, which speaks OpenAI-compatible by default. Anything that accepts a base URL and bearer token works.
Setup
1. Mint a jusInfer API key
Sign in at jusinfer.com/login. Open jusinfer.com/developer → Keys tab → Mint key. Copy the jinf_… token.
2. Launch OI against jusInfer
CLI flags (one-off):
interpreter \
--api_base https://api.jusinfer.com/v1 \
--api_key jinf_your_token_here \
--model jusInfer-auto
Or set them once in ~/.openinterpreter/config.yaml:
llm:
api_base: https://api.jusinfer.com/v1
api_key: jinf_your_token_here
model: jusInfer-auto
3. Verify
Run interpreter with no further args, then ask it what python version is installed. You'll see OI emit a one-liner, ask permission to run, execute, and report — same flow as before. The model behind it is now jusInfer-auto.
What changes vs default
| Aspect | Default OI (frontier model) | OI + jusInfer |
|---|---|---|
| First-token latency | 400-800ms | 200-500ms (smaller models warm up faster) |
| Cost per "read file → decide" loop step | $0.01-0.05 | $0.002-0.01 |
| Cost per "write 200-line script" step | $0.04-0.10 | $0.02-0.05 |
| Code-execution behavior | unchanged | unchanged |
| Conversation memory | unchanged | unchanged |
| Tool approval flow | unchanged | unchanged |
| Local file access | unchanged | unchanged |
What about safety mode / --safe_mode?
OI's safe mode (interactive approval before each code execution) is a CLIENT-side feature. The model behind it doesn't know whether you'll approve, deny, or modify the code. Switching to jusInfer doesn't weaken safe mode — your approvals still gate every execution.
What about offline / --local?
--local runs an Ollama or LM Studio model on your machine. Don't combine --local with --api_base — they're mutually exclusive. Use --local when you want privacy + zero per-call cost; use jusInfer when you want frontier quality at routed-down cost.
Multi-step tasks: where the savings compound
OI tasks tend to be long. "Clean this dataset and produce three charts" might be 30-50 model calls — read CSV, inspect schema, write cleaning script, run it, check output, write plotting script, run it, check output, iterate. Per-call cost matters more than per-token cost.
Sample task — "load sales.csv, find the top 5 products by revenue in 2025, write each to a separate JSON file":
| Model | Total calls | Total cost | Wall time |
|---|---|---|---|
| Claude Sonnet (direct) | 12 | ~$0.18 | 38s |
| GPT-4.1 (direct) | 11 | ~$0.14 | 41s |
| jusInfer-auto | 12 | ~$0.03 | 35s |
The wall time barely moves because the bottleneck is local code execution, not the model. The cost moves a lot because every "tell me what the columns are" step now lands on a small fast model.
When you'd stay on the direct provider
- Custom system prompts that depend on provider-specific tool calling — OI uses litellm's normalized tool-use, so this is rare.
- Your org has an inference contract you need to bill against — jusInfer is a passthrough; underlying providers see jusInfer's account.
Switching back
Drop the --api_base / --api_key flags or comment them out in config.yaml. OI falls back to whatever was set in environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
Further reading
- Custom agent harness on an OpenAI-compatible base URL — the same pattern for any agent runtime
- Aider + cheap inference — for non-code-executing pair-programming agents
- jusInfer API reference — the endpoint OI's litellm layer hits