TL;DR — Add a `models` entry in ~/.continue/config.json with provider "openai", apiBase https://api.jusinfer.com/v1, your jinf_ key, and model "jusInfer-auto". Restart VS Code and the new model appears in Continue's model picker. Works the same for JetBrains.
Continue.dev — configure a custom model in 30 seconds
Continue is the most config-driven open-source coding agent — every model, provider, and slash-command is declared in ~/.continue/config.json. That makes it the easiest agent to point at a custom OpenAI-compatible endpoint like jusInfer.
The minimal change
Edit ~/.continue/config.json. Add a models array entry:
{
"models": [
{
"title": "jusInfer",
"provider": "openai",
"apiBase": "https://api.jusinfer.com/v1",
"apiKey": "jinf_your_key_here",
"model": "jusInfer-auto"
}
]
}
Restart VS Code (or your JetBrains IDE). Open the Continue panel, click the model picker, and pick jusInfer. Done.
Getting a key
Sign in at jusinfer.com/login. Open jusinfer.com/developer → Keys tab → Mint key. Copy the jinf_… token (shown once). Paste into the apiKey field above.
Multiple jusInfer models at once
If you want different routing for chat vs autocomplete vs agentic edits, add multiple entries with explicit model IDs:
{
"models": [
{
"title": "jusInfer (auto)",
"provider": "openai",
"apiBase": "https://api.jusinfer.com/v1",
"apiKey": "jinf_…",
"model": "jusInfer-auto"
},
{
"title": "Claude Sonnet via jusInfer",
"provider": "openai",
"apiBase": "https://api.jusinfer.com/v1",
"apiKey": "jinf_…",
"model": "anthropic/claude-sonnet-4.5"
},
{
"title": "Hermes 4 via jusInfer",
"provider": "openai",
"apiBase": "https://api.jusinfer.com/v1",
"apiKey": "jinf_…",
"model": "nousresearch/hermes-4-405b"
}
]
}
Continue's model picker lets you switch on the fly. Useful when you want to force a specific model for a tough refactor.
Tab autocomplete
Continue's tab autocomplete (the inline suggestion as you type) reads from a separate tabAutocompleteModel block. Point that at a cheap small model via jusInfer:
{
"tabAutocompleteModel": {
"title": "jusInfer tab",
"provider": "openai",
"apiBase": "https://api.jusinfer.com/v1",
"apiKey": "jinf_…",
"model": "qwen/qwen3-coder-8b"
}
}
Tab autocomplete fires constantly while you type. Routing it to an 8B model instead of a frontier model is the single biggest cost lever for Continue users. Quality is indistinguishable for the tactical "what comes next" call.
Embeddings (RAG)
If you use Continue's @codebase retrieval, the embeddings model is separate. jusInfer doesn't currently route embeddings — keep using Continue's default (or your own embedding provider) for that subset. Inference is what we route; embeddings are a different workload.
Verifying
After restarting, open the Continue panel and ask "what's in this file?" with the cursor on an open file. The call goes through jusInfer. Open jusinfer.com/developer → Usage tab — you'll see the spend tick up.
If you don't see the model in Continue's picker after editing config.json, double-check JSON syntax (a trailing comma will make Continue silently ignore the file).
What works, what doesn't
| Feature | Works? | Notes |
|---|---|---|
| Chat | ✅ | |
Slash commands (/edit, /comment, /test) | ✅ | |
@codebase retrieval | ✅ | Inference yes; embeddings out of scope (use Continue's default) |
@docs / @url context | ✅ | Continue handles fetch; jusInfer handles inference |
| Tab autocomplete | ✅ | Cheap-model routing has huge cost leverage here |
| Image inputs | ✅ | Auto-routes to vision-capable model |
| Custom slash commands | ✅ | Whatever you've defined in config.json works |
| Local models (Ollama path) | n/a | Not a jusInfer concern — use Continue's local provider for those |
Team configuration
Continue supports a team config that fetches config.json from a shared URL. Host a team config at e.g. https://your-domain.com/continue-config.json with the jusInfer endpoint, distribute the URL to your team, and everyone gets the same routing without each engineer hand-editing their config.
You can also use a secrets.json for the API key so team members don't share one — just generate per-user jinf_ keys (each engineer mints their own via /developer).
Why Continue benefits from per-call routing
Continue is unusually heavy on small calls. Tab autocomplete fires hundreds of times per hour. @codebase retrieval fans out to multiple sub-queries. Each chat turn is a roundtrip. If every one of those goes to Sonnet 4.5, you're paying $10-40/day per engineer.
jusInfer picks the cheapest capable model for each call. A tab autocomplete goes to Qwen3 8B for ~$0.05/M tokens. A multi-file refactor goes to Sonnet for ~$3/M tokens. The blended bill drops 60-80% with no quality regression on the tactical traffic that makes up the bulk.
Setup checklist
- Sign up at jusinfer.com/login.
- Mint a
jinf_key at /developer → Keys. - Edit
~/.continue/config.json— add themodelsblock above. - Restart VS Code or JetBrains.
- Pick "jusInfer" from the model selector.
- (Optional) point
tabAutocompleteModelat a small model for autocomplete savings.
Related reading
- OpenAI-compatible drop-in (Cursor, Aider, Cline, Goose…)
- Cline + custom endpoint — cut your VS Code agent bill 60%+
- Aider on a budget
- The cheapest LLM API for coding agents in 2026
Raw markdown: /blog/continue-custom-model.md