---
title: Cline + custom endpoint — cut your VS Code agent bill 60%+
description: Cline supports any OpenAI-compatible base URL via its "OpenAI Compatible" provider. Here's the 2-minute config that drops your autonomous-edit bill 60-80% without changing how you use it.
tldr: In Cline → Settings → API Provider, pick "OpenAI Compatible". Set Base URL to https://api.jusinfer.com/v1, paste a jinf_ token, and use jusInfer-auto as the model. Cline keeps full autonomous-edit and tool-use behavior; jusInfer picks the cheapest capable model per step.
date: 2026-05-26
author: jusInfer
cluster: integration
tags: cline, vscode-agent, openai-compatible, custom-base-url, cost-optimization
---

# Cline + custom endpoint — cut your bill 60%+

[Cline](https://cline.bot) (formerly Claude Dev) is the most polished autonomous-edit agent in VS Code. It's also expensive on default settings because every "read file → propose edit → run test" loop step talks to a frontier model. Cline natively supports any OpenAI-compatible base URL, so you can route through jusInfer and drop the bill 60-80% without changing your workflow.

## What you'll change

Two values in Cline's settings panel. No code, no extension reinstall, no workflow change.

## Setup

### 1. Mint a jusInfer API key

Sign in at [jusinfer.com/login](https://jusinfer.com/login) with Google or Microsoft. Open [jusinfer.com/developer](https://jusinfer.com/developer) → **Keys** tab → **Mint key**. Copy the `jinf_…` token (shown once).

### 2. Configure Cline

Open VS Code's Cline panel → click the gear icon → **API Provider** dropdown → select **"OpenAI Compatible"**. Then fill in:

```
Base URL:   https://api.jusinfer.com/v1
API Key:    jinf_your_key_here
Model ID:   jusInfer-auto
```

Click **Save**.

### 3. Verify

Start a Cline task — say, "rename this function and update all call sites." It should work exactly as before. After it finishes, open [jusinfer.com/developer](https://jusinfer.com/developer) → **Usage** tab. You'll see the spend tick up at a fraction of what Sonnet 4.5 would cost.

## What you get

| Behavior | Status |
|---|---|
| Autonomous edit mode | ✅ unchanged |
| Tool use (read_file, write_file, execute_command) | ✅ unchanged |
| Multi-step plans | ✅ unchanged |
| File context / repo awareness | ✅ unchanged |
| Image inputs (when needed) | ✅ auto-routes to vision-capable model |
| Streaming responses | ✅ |
| Cost per task | **down 60-80% on real workloads** |

## Pinning a specific model

`jusInfer-auto` lets us pick. If you want every Cline call to use a specific upstream — say Sonnet 4.5 for hard refactors or Hermes 4 405B for tool-heavy stuff — use the provider-prefixed model id instead:

```
Model ID: anthropic/claude-sonnet-4.5
# or
Model ID: nousresearch/hermes-4-405b
# or
Model ID: openai/gpt-5
```

jusInfer normalizes provider prefixes — you don't need separate accounts at each upstream.

## Why Cline benefits so much from per-call routing

Cline operates on tight loops: **read** → **think** → **edit** → **verify** → **repeat**. A typical task is 10-30 LLM calls. Most of those calls are tactical (apply a small edit, parse a tool result) and don't need a frontier model. A few are strategic (decide architecture, debug a hard failure) and do.

jusInfer looks at each request's shape (context length, tool-call pattern, recent turn history) and routes accordingly. Tactical calls go to an 8-30B parameter model that's fast and cheap; strategic calls go to Sonnet or GPT-5. You get the same task-completion rate at a fraction of the cost.

## Real numbers from a real team

A 6-engineer team using Cline daily switched from default Sonnet 4.5 to jusInfer in March 2026. Before: $480/seat/month average. After: $115/seat/month over the following 30 days. No quality regression on their internal eval (which is a fixed set of 20 real refactor tasks they grade by hand). The savings came from 80% of their daily Cline traffic routing to mid-tier models that handle "apply this small edit" perfectly well.

## Common gotchas

- **The model ID must exist in your Cline settings dropdown.** Some Cline versions require you to type `jusInfer-auto` in a custom-models field before it appears as selectable.
- **Tool-call shape is normalized.** Cline expects OpenAI-style tool calls; jusInfer normalizes Anthropic-format responses transparently so it doesn't matter which upstream actually runs.
- **Streaming is forwarded end-to-end.** If your Cline UI lags on responses, that's network — not the gateway.

## What about Cline's Anthropic-direct path?

Cline also supports the **Anthropic** provider directly. That path uses the Anthropic Messages API, not OpenAI Chat Completions. To route Anthropic-shape traffic through jusInfer, see [Use jusInfer with Claude Code](/docs/claude-code/) — same two env vars (`ANTHROPIC_BASE_URL` + `ANTHROPIC_AUTH_TOKEN`), and Cline picks them up if you set them in your shell before launching VS Code.

## Setup checklist

1. Sign up at [jusinfer.com/login](https://jusinfer.com/login).
2. Mint a `jinf_` key at [/developer](https://jusinfer.com/developer) → Keys.
3. Cline → Settings → API Provider: OpenAI Compatible.
4. Base URL: `https://api.jusinfer.com/v1`. API Key: your jinf_ token. Model: `jusInfer-auto`.
5. Run a normal Cline task. Check spend on the dashboard.
6. Set per-user caps in [Tenant tab](https://jusinfer.com/developer) if multi-engineer.

## Related reading

- [OpenAI-compatible drop-in (Cursor, Aider, Continue, Goose…)](/docs/openai-drop-in/)
- [Your Cursor bill is too high — three ways to cut it](/blog/cursor-too-expensive-options/)
- [Aider on a budget](/blog/aider-cheap-inference/)
- [The cheapest LLM API for coding agents in 2026](/blog/cheapest-llm-api-for-coding-2026/)

---

*Raw markdown: [/blog/cline-custom-endpoint.md](/blog/cline-custom-endpoint.md)*
