Skip to content

Overview

A harness is the agent runtime that drives the loop — the CLI or service that reads your prompt, calls tools, and edits files. Each one already speaks some model API; the move is always the same: point it at the local proxy at http://127.0.0.1:4356 (Routeplane Cloud is on the Phase D roadmap; not yet shipping) instead of the vendor’s, and address models by their provider/model id. From there the same harness can run on Anthropic, OpenAI, Google, or an open model — with provider selection and fallback underneath.

Two of these share a name with a model source, and it’s worth keeping them straight:

You want to… Use
Run the Claude Code CLI on any model Claude Code (harness)
Spend your Claude plan as tokens Claude subscription (model)
Run the Codex CLI on any model Codex (harness)
Spend your ChatGPT plan as tokens Codex subscription (model)

A harness is what runs; a model source is where the tokens come from. You can combine them freely — e.g. drive Claude Code (harness) against your ChatGPT plan (Codex subscription), or against a local Ollama model.

Start with the Quick Start to get the proxy running, then come back and point your harness at it.