Common questions about jusInfer.
Short, specific answers. Each answer is also formatted as schema.org FAQPage JSON-LD so AI answer engines and Google can cite them directly.
What is jusInfer?
jusInfer is an OpenAI-compatible inference gateway at https://api.jusinfer.com that routes each request to the cheapest capable model for the task. It's designed for coding agents — Claude Code, OpenCode, Cursor, Aider, Cline, Continue, Goose, and any custom OpenAI-compatible harness — and reduces typical inference costs 50–80% with no workflow change.
How is jusInfer different from OpenRouter?
OpenRouter is an aggregator — you pick the model per call, they route to the provider. jusInfer is a smart router — we pick the model per call, optimized for coding-task quality and cost. OpenRouter is best when you already know which model you want. jusInfer is best when you'd rather have the system decide.
Does jusInfer work with Claude Code?
Yes. Set ANTHROPIC_BASE_URL=https://api.jusinfer.com and ANTHROPIC_AUTH_TOKEN=jinf_... in your shell. Claude Code's OpenAI-compatible mode routes every call through jusInfer without other workflow changes. Full guide at https://jusinfer.com/docs/claude-code/.
Does jusInfer work with Cursor?
Yes. In Cursor Settings → Models, set the Custom OpenAI Base URL to https://api.jusinfer.com/v1 and the API key to your jinf_ token. Add jusInfer-auto to Custom Models. Cursor calls route through jusInfer for 60–80% savings. Detailed guide at https://jusinfer.com/blog/cursor-too-expensive-options/.
Is jusInfer OpenAI-compatible?
Yes — both the Chat Completions API (/v1/chat/completions) and the newer Responses API (/v1/responses) are supported with the standard OpenAI request shape. Streaming, tool calls, vision inputs, JSON mode, and parallel tool calls all work. Any client built for OpenAI works by changing base_url and api_key only.
How much does jusInfer cost?
Pay-as-you-go credits, $5/$10/$25/$50 packs via Stripe. No subscription required for solo accounts. Teams with more than one member need a $1/seat/month subscription (the first user is always free). Per-token rates depend on the underlying model selected per call — typical blended cost for coding workloads is $0.20–$1.00 per million tokens. Credits never expire.
Is there a free tier?
Every new signup gets $0.05 in starter credit — enough to test the full integration end to end and run a few dozen real coding-agent tasks before topping up. There is no time-based free trial; credits don't expire.
Can I use jusInfer in production?
Yes. The gateway runs on Cloudflare Workers with global edge deployment. Per-key and per-tenant rate limits, per-user soft caps, anomaly alerts, and margin-drift monitoring are built in. We do not yet offer a formal SLA or enterprise tier — if you need either, email hello@jusinfer.com.
What models does jusInfer use?
We route to a curated set of frontier and open-weights models across Anthropic (Claude), OpenAI (GPT), Together (Qwen, Llama, Hermes), Fireworks (DeepSeek), and Cloudflare Workers AI (Kimi, Qwen, Llama variants). The specific model per call is selected based on task type, context length, and cost — visible in the Usage tab on the dashboard for any individual call.
Do you store my prompts or generated code?
We store token counts, model selection, and timestamps for billing and anomaly detection. We do not store prompt content, completion content, or generated code beyond a short transient window required to forward the request to the upstream provider. Upstream providers (Anthropic, OpenAI, etc.) have their own retention policies — see their docs.
Where is jusInfer hosted?
Cloudflare Workers — global edge deployment with regional routing based on caller location. Data (auth tokens, usage records, pre-orders) lives in Cloudflare D1 in a single primary region. The site (jusinfer.com) is on Firebase Hosting; the API (api.jusinfer.com) is on Cloudflare Workers.
How do I migrate from OpenAI / Anthropic / OpenRouter direct?
Change two values in your client config: base_url to https://api.jusinfer.com/v1 and api_key to your jinf_ token (mint at https://jusinfer.com/developer). No other code changes needed. Existing model-name overrides are honored — pass nousresearch/hermes-4-405b or anthropic/claude-sonnet-4.5 and we route accordingly.
What if a model is temporarily unavailable?
jusInfer retries transparently across providers. If Anthropic Sonnet 4.5 is rate-limited, we fall back to an equivalent-tier model (GPT-5 or a top-tier open-weights option) automatically. Your client sees a successful response, not a 5xx error. Failure is only surfaced when all candidates fail.
Can I see which underlying model was used for a call?
Yes. The Usage tab in the dashboard (https://jusinfer.com/developer) shows the actual upstream model id per call. By default this is hidden from the API response to keep clients vendor-agnostic, but you can opt in to including it in usage metadata for individual calls.