Ollama

Ollama is the easiest way to run open models locally. It serves an OpenAI-compatible API at http://localhost:11434/v1, so it drops into routeplane.yaml as one provider block.

Prerequisites

Routeplane installed, with a routeplane.yaml (scaffold one with routeplane init).

Ollama running, with a model pulled:

ollama serve &          # default port 11434
ollama pull llama3.1

Add Ollama to Routeplane

providers:
  ollama:
    api_base: http://localhost:11434/v1
    api_protocol:
      - "*": chat_completions
    models:
      - id: llama3.1

Each models entry is a name you’ve pulled with ollama pull. List what’s available with ollama list.

**No API key.** Ollama accepts anonymous loopback requests, so the block has no `api_key`. (Its own SDK examples pass a dummy `"ollama"` key only because the OpenAI client library demands a non-empty string — Routeplane doesn't.)

Route to it

routeplane route ollama:llama3.1

Then start Routeplane and send a request. Use the provider-qualified id ollama:llama3.1 to pin the request, or the bare llama3.1 to let Routeplane cascade.

Learn more

Ollama — OpenAI compatibility
Model fallback — fail over from local Ollama to a hosted model.