TL;DR — OpenRouter = price-shopping marketplace. Helicone = observability proxy bolted on top of your direct provider keys. Portkey = enterprise routing + guardrails. jusInfer = coding-agent-specific router with per-step model selection. Pick the one whose primary axis matches your highest-pain problem.
LLM gateway comparison 2026
"LLM gateway" gets used to mean four different products. Picking the wrong shape is the most common mistake we see — a team installs an observability proxy when they actually needed a router, or installs a router when they actually needed cost telemetry on their existing provider keys.
The four shapes:
| Shape | Example | Optimizes for | What you give up |
|---|---|---|---|
| Marketplace | OpenRouter | Lowest-price access to many models | No per-step routing intelligence; pay-per-use markup |
| Observability proxy | Helicone | Telemetry, logs, retries on YOUR keys | No price optimization; you still pay the underlying provider directly |
| Enterprise router | Portkey | Governance, fallbacks, PII redaction | Heavier setup; priced for orgs not individuals |
| Coding-agent router | jusInfer | Per-step model selection for coding workloads | Not optimized for chat, image, embeddings |
OpenRouter
What it is: a marketplace. You send a request specifying a model id (anthropic/claude-sonnet-4, deepseek/deepseek-chat, etc.); OpenRouter forwards to the cheapest-currently-available provider hosting that model and charges you their price plus a small markup.
Best for: getting access to many models behind a single key, especially open-weight models hosted by multiple inference providers (Together, Fireworks, DeepInfra, etc.) where price varies day to day.
Where it's weak for coding agents: you still pick the model. OpenRouter doesn't watch your conversation and route an "is this file a JSON config?" question to a small model. If you pin Claude Sonnet in your harness, every step uses Claude Sonnet.
Helicone
What it is: a proxy you point your existing OpenAI/Anthropic SDK at by swapping the base URL. Your keys stay yours; Helicone logs every request, computes per-user cost, exposes a dashboard. Free tier covers a generous quota.
Best for: teams that already standardized on a direct provider and want telemetry without changing their billing model.
Where it's weak: no price optimization layer. If your bill is $X today with the direct provider, your bill is still $X with Helicone — you just see it itemized. Adding routing requires migrating to a different product class.
Portkey
What it is: an enterprise-shaped router with guardrails (PII redaction, prompt firewalls), multi-provider fallback chains, cost budgets per virtual key, and an audit-log surface.
Best for: orgs with a procurement process, compliance requirements, or multi-team budget allocation. The configuration surface is rich because that's what enterprises need.
Where it's heavier than you want: for an individual developer or a small team, Portkey's setup-to-value time is longer than the alternatives. The features pay back at scale, not at one-developer scale.
jusInfer
What it is: a router specifically for coding-agent traffic. Same OpenAI-compatible API surface, but the model selection inside the gateway is informed by what a coding loop looks like — short "read this file" turns get cheap models, "rewrite this 400-line function" gets a stronger one, "design this system" gets reasoning-capable.
Best for: people running Claude Code, Cursor, Cline, OpenCode, Aider, or any other coding agent and watching their monthly bill grow with usage.
Where it's weak: not optimized for image generation, embeddings, fine-tuning, or chat-style consumer products. The per-step routing logic encodes "what does a coding turn look like" — applied to chatbot traffic, it just looks like a regular cheap-by-default proxy.
Decision matrix
Pick by your highest-pain problem:
| Your pain | Pick |
|---|---|
| "I want access to 100+ models behind one key, even niche open-weight ones" | OpenRouter |
| "My provider bill is fine, but I have no visibility into who's burning the most" | Helicone |
| "Compliance + budgets + multi-team governance is the bottleneck" | Portkey |
| "My coding-agent bill is growing faster than my team" | jusInfer |
| "I want all of the above" | Compose two: Helicone in front of jusInfer, for example. They're not exclusive. |
What about combining?
You can. Helicone is content-neutral — it'll happily proxy any OpenAI-compatible endpoint, including jusInfer's. Stack order:
your harness → Helicone (telemetry) → jusInfer (per-step routing) → underlying provider
Your Helicone dashboard shows the request to jusInfer; jusInfer's dashboard shows the routed underlying model. Two layers of visibility, one bill at the bottom.
What you can't usefully stack: two price-optimization layers (OpenRouter behind jusInfer, or vice versa). One of them ends up just being a passthrough that adds latency without changing the routed model.
Honest weaknesses of each (we've used all four)
- OpenRouter — quality varies day to day as it shifts between sub-providers. Fine for batch, occasionally surprising for interactive coding.
- Helicone — sampling at scale needs paid plan; free tier hits limits quickly under heavy agent traffic.
- Portkey — config-as-data is powerful but takes a day to internalize for a new user. Not a "5-minute install."
- jusInfer — narrower scope; if your workload is 50% coding agents and 50% something else, the savings only land on the coding half.
When the right answer is "none, go direct"
Three cases:
- You're under $20/month of inference — gateway cost (free or otherwise) is solving a problem you don't have.
- You're locked to one model intentionally (a research project measuring one specific model's behavior).
- Your tooling doesn't support custom base URLs (rare in 2026; even Cursor and Copilot have it now).
Further reading
- Cheapest LLM API for coding 2026 — narrower per-model price comparison
- OpenRouter alternatives 2026 — what to pick if OpenRouter isn't fitting
- Together vs Fireworks vs jusInfer — inference provider comparison (different layer)