TL;DR — Set ANTHROPIC_BASE_URL=https://api.jusinfer.com and ANTHROPIC_AUTH_TOKEN=jinf_... in your shell. jusInfer speaks the Anthropic Messages API natively at /v1/messages — Claude Code's default — and routes every call to the cheapest capable model under the hood.
Use jusInfer with Claude Code
Claude Code is Anthropic's terminal-native coding agent. It speaks the Anthropic Messages API (POST /v1/messages) by default. jusInfer exposes that exact surface — so Claude Code points at api.jusinfer.com with one env-var change and everything else (tool use, streaming, thinking, system prompts) keeps working unchanged. Under the hood, jusInfer routes each call to the cheapest capable model.
Why route through jusInfer?
- Cost. We pick the most cost-effective model that meets the bar for each call — frontier when you need it, smaller when you don't. Same API, lower bill.
- No vendor lock-in. Your code only ever talks to
api.jusinfer.com. We swap models behind the scenes as the market moves. - One bill. Credits roll over, never expire. Per-user soft caps. No surprise overages.
- Same DX. Your existing Claude Code workflow doesn't change — just the endpoint.
30-second setup
1. Get a jusInfer API key
# Sign in at https://jusinfer.com/login (Google or Microsoft)
# Then visit https://jusinfer.com/developer → Keys tab → "Mint key"
# Copy the jinf_… token. You'll only see it once.
2. Point Claude Code at jusInfer
Claude Code uses two environment variables for OpenAI-compatible mode:
export ANTHROPIC_BASE_URL="https://api.jusinfer.com"
export ANTHROPIC_AUTH_TOKEN="jinf_your_key_here"
Add those to your shell profile (~/.zshrc, ~/.bashrc) and reload:
source ~/.zshrc
3. Verify
claude --version
claude # should start a normal session
Inside Claude Code, ask it anything — say, "summarize this directory". The call is routed through jusInfer. Verify the spend on your dashboard.
Per-project override
If you only want jusInfer for one repo:
# Create .envrc in your repo (with direnv) or a shell alias:
alias claude-jusinfer='ANTHROPIC_BASE_URL=https://api.jusinfer.com \
ANTHROPIC_AUTH_TOKEN=jinf_... claude'
What jusInfer picks
jusInfer routes every request through three filters:
- Capability — does the task need hard reasoning, long context, or tool use?
- Cost — per-token rate, cache hit rate, retry cost
- Fit — task type, codebase size, your spend profile
The chosen model is opaque by default. You can inspect any call's actual upstream model from the Usage tab if you want to.
Limits and gotchas
- Streaming works. Server-sent events are forwarded transparently.
- Tool use works. Both Anthropic-style and OpenAI-style tool calls are normalized.
- System prompts work. No magic — they pass through untouched.
- Image inputs route to a vision-capable model automatically.
- Rate limits: 60 RPM per key, 600 RPM per tenant. Bump in the dashboard if you need more.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
401 INVALID_KEY | Key was revoked or never minted | Mint a fresh key in the dashboard |
403 SEAT_REQUIRED | Tenant has >1 member with no seat sub | Subscribe in Billing tab |
429 RATE_LIMIT | Burst hit the per-key 60 RPM | Wait 60s or raise the limit |
502 | Upstream model temporarily unavailable | Retry; jusInfer automatically falls back |
Related
This page is also available as raw markdown for AI agents to consume directly: /docs/claude-code.md