Skip to content
2026-05-26 · Kalmantic

TL;DR — Add a `models` entry in ~/.continue/config.json with provider "openai", apiBase https://api.jusinfer.com/v1, your jinf_ key, and model "jusInfer-auto". Restart VS Code and the new model appears in Continue's model picker. Works the same for JetBrains.

Continue.dev — configure a custom model in 30 seconds

Continue is the most config-driven open-source coding agent — every model, provider, and slash-command is declared in ~/.continue/config.json. That makes it the easiest agent to point at a custom OpenAI-compatible endpoint like jusInfer.

The minimal change

Edit ~/.continue/config.json. Add a models array entry:

{
  "models": [
    {
      "title": "jusInfer",
      "provider": "openai",
      "apiBase": "https://api.jusinfer.com/v1",
      "apiKey": "jinf_your_key_here",
      "model": "jusInfer-auto"
    }
  ]
}

Restart VS Code (or your JetBrains IDE). Open the Continue panel, click the model picker, and pick jusInfer. Done.

Getting a key

Sign in at jusinfer.com/login. Open jusinfer.com/developerKeys tab → Mint key. Copy the jinf_… token (shown once). Paste into the apiKey field above.

Multiple jusInfer models at once

If you want different routing for chat vs autocomplete vs agentic edits, add multiple entries with explicit model IDs:

{
  "models": [
    {
      "title": "jusInfer (auto)",
      "provider": "openai",
      "apiBase": "https://api.jusinfer.com/v1",
      "apiKey": "jinf_…",
      "model": "jusInfer-auto"
    },
    {
      "title": "Claude Sonnet via jusInfer",
      "provider": "openai",
      "apiBase": "https://api.jusinfer.com/v1",
      "apiKey": "jinf_…",
      "model": "anthropic/claude-sonnet-4.5"
    },
    {
      "title": "Hermes 4 via jusInfer",
      "provider": "openai",
      "apiBase": "https://api.jusinfer.com/v1",
      "apiKey": "jinf_…",
      "model": "nousresearch/hermes-4-405b"
    }
  ]
}

Continue's model picker lets you switch on the fly. Useful when you want to force a specific model for a tough refactor.

Tab autocomplete

Continue's tab autocomplete (the inline suggestion as you type) reads from a separate tabAutocompleteModel block. Point that at a cheap small model via jusInfer:

{
  "tabAutocompleteModel": {
    "title": "jusInfer tab",
    "provider": "openai",
    "apiBase": "https://api.jusinfer.com/v1",
    "apiKey": "jinf_…",
    "model": "qwen/qwen3-coder-8b"
  }
}

Tab autocomplete fires constantly while you type. Routing it to an 8B model instead of a frontier model is the single biggest cost lever for Continue users. Quality is indistinguishable for the tactical "what comes next" call.

Embeddings (RAG)

If you use Continue's @codebase retrieval, the embeddings model is separate. jusInfer doesn't currently route embeddings — keep using Continue's default (or your own embedding provider) for that subset. Inference is what we route; embeddings are a different workload.

Verifying

After restarting, open the Continue panel and ask "what's in this file?" with the cursor on an open file. The call goes through jusInfer. Open jusinfer.com/developerUsage tab — you'll see the spend tick up.

If you don't see the model in Continue's picker after editing config.json, double-check JSON syntax (a trailing comma will make Continue silently ignore the file).

What works, what doesn't

FeatureWorks?Notes
Chat
Slash commands (/edit, /comment, /test)
@codebase retrievalInference yes; embeddings out of scope (use Continue's default)
@docs / @url contextContinue handles fetch; jusInfer handles inference
Tab autocompleteCheap-model routing has huge cost leverage here
Image inputsAuto-routes to vision-capable model
Custom slash commandsWhatever you've defined in config.json works
Local models (Ollama path)n/aNot a jusInfer concern — use Continue's local provider for those

Team configuration

Continue supports a team config that fetches config.json from a shared URL. Host a team config at e.g. https://your-domain.com/continue-config.json with the jusInfer endpoint, distribute the URL to your team, and everyone gets the same routing without each engineer hand-editing their config.

You can also use a secrets.json for the API key so team members don't share one — just generate per-user jinf_ keys (each engineer mints their own via /developer).

Why Continue benefits from per-call routing

Continue is unusually heavy on small calls. Tab autocomplete fires hundreds of times per hour. @codebase retrieval fans out to multiple sub-queries. Each chat turn is a roundtrip. If every one of those goes to Sonnet 4.5, you're paying $10-40/day per engineer.

jusInfer picks the cheapest capable model for each call. A tab autocomplete goes to Qwen3 8B for ~$0.05/M tokens. A multi-file refactor goes to Sonnet for ~$3/M tokens. The blended bill drops 60-80% with no quality regression on the tactical traffic that makes up the bulk.

Setup checklist

  1. Sign up at jusinfer.com/login.
  2. Mint a jinf_ key at /developer → Keys.
  3. Edit ~/.continue/config.json — add the models block above.
  4. Restart VS Code or JetBrains.
  5. Pick "jusInfer" from the model selector.
  6. (Optional) point tabAutocompleteModel at a small model for autocomplete savings.

Raw markdown: /blog/continue-custom-model.md

continuecontinue-devcustom-modelopenai-compatiblevscodejetbrains