Skip to content

Ollama

Ollama is the easiest way to run open models locally. It serves an OpenAI-compatible API at http://localhost:11434/v1, so it drops into routeplane.yaml as one provider block.

  • Routeplane installed, with a routeplane.yaml (scaffold one with routeplane init).

  • Ollama running, with a model pulled:

    Terminal window
    ollama serve & # default port 11434
    ollama pull llama3.1
routeplane.yaml
providers:
ollama:
api_base: http://localhost:11434/v1
api_protocol:
- "*": chat_completions
models:
- id: llama3.1

Each models entry is a name you’ve pulled with ollama pull. List what’s available with ollama list.

**No API key.** Ollama accepts anonymous loopback requests, so the block has no `api_key`. (Its own SDK examples pass a dummy `"ollama"` key only because the OpenAI client library demands a non-empty string — Routeplane doesn't.)
Terminal window
routeplane route ollama:llama3.1

Then start Routeplane and send a request. Use the provider-qualified id ollama:llama3.1 to pin the request, or the bare llama3.1 to let Routeplane cascade.