Using vLLM with OpenClaw

This guide shows you how to connect a vLLM server to OpenClaw using the OpenAI-compatible `openai-completions` API. You will configure both auto-discovered and explicit vLLM models, including custom base URLs and token limits.

By the end, your OpenClaw agents will call vLLM models via the `vllm` provider ID.

Setup flow

Prerequisites

  • A running vLLM server exposing OpenAI-compatible `/v1` endpoints such as `/v1/models` and `/v1/chat/completions`.
  • Network access from your OpenClaw runtime to the vLLM base URL, commonly `http://127.0.0.1:8000/v1`.
  • The OpenClaw CLI installed so you can run `openclaw models list --provider vllm`.

Steps

  1. Start vLLM with an OpenAI-compatible server

    Run vLLM in OpenAI-compatible server mode so it exposes `/v1` endpoints that OpenClaw can call. Your base URL should serve `/v1/models` and `/v1/chat/completions`; during local development it is commonly the address shown below.

    text
    GET http://127.0.0.1:8000/v1/models
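
    If you have not started the server yet, a minimal launch looks like the following sketch. It assumes a recent vLLM release that ships the `vllm serve` CLI; substitute your own model name and adjust flags to your deployment.

    bash
    # Start vLLM's OpenAI-compatible server on the default local address.
    vllm serve your-model-id --host 127.0.0.1 --port 8000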
  2. Set the VLLM_API_KEY environment variable

    Set `VLLM_API_KEY` so OpenClaw knows to enable the vLLM provider and, if you do not configure it explicitly, to auto-discover models. If your vLLM server does not enforce auth, any non-empty value works as the opt-in signal.

    bash
    export VLLM_API_KEY="vllm-local"
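
    If you also want vLLM itself to enforce this key, you can pass the same value when starting the server. This is optional and assumes vLLM's `--api-key` flag for the OpenAI-compatible server:

    bash
    # vLLM will then reject requests that do not present this value as a bearer token.
    vllm serve your-model-id --api-key "$VLLM_API_KEY"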
  3. Select a vLLM model for your agents

    Point your agent defaults at a vLLM model by using the `vllm/` prefix and a model ID that exists on your vLLM server. This is the minimal configuration when you rely on auto-discovery and the default base URL.

    json
    {
      agents: {
        defaults: {
          model: { primary: "vllm/your-model-id" },
        },
      },
    }
  4. Verify that OpenClaw can list vLLM models

    Use the models CLI to confirm that OpenClaw can reach vLLM and list available models. If this command fails or returns no models, you either have a connectivity issue or auto-discovery is disabled by explicit config.

    bash
    openclaw models list --provider vllm
  5. Configure vLLM explicitly with local models and limits

    Switch to explicit configuration when you need a non-default host/port, pinned `contextWindow` or `maxTokens`, or custom auth behavior. This block defines the `vllm` provider on the default base URL and declares a local model with cost and token settings.

    json
    {
      models: {
        providers: {
          vllm: {
            baseUrl: "http://127.0.0.1:8000/v1",
            apiKey: "${VLLM_API_KEY}",
            api: "openai-completions",
            models: [
              {
                id: "your-model-id",
                name: "Local vLLM Model",
                reasoning: false,
                input: ["text"],
                cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
                contextWindow: 128000,
                maxTokens: 8192,
              },
            ],
          },
        },
      },
    }
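
    The `id` you declare must match a model ID your vLLM server actually serves. One quick way to check, assuming `curl` and `jq` are available on the machine running OpenClaw, is to query the models endpoint directly:

    bash
    # Print the model IDs exposed by the vLLM server.
    curl -s http://127.0.0.1:8000/v1/models | jq -r '.data[].id'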
  6. Point OpenClaw at a remote vLLM server with a custom base URL

    When vLLM runs on another host or port, set `baseUrl` in the provider config so OpenClaw targets the correct `/v1` endpoint. This example also shows how to define a remote model with its own context window and max token limits.

    json
    {
      models: {
        providers: {
          vllm: {
            baseUrl: "http://192.168.1.50:9000/v1",
            apiKey: "${VLLM_API_KEY}",
            api: "openai-completions",
            models: [
              {
                id: "my-custom-model",
                name: "Remote vLLM Model",
                reasoning: false,
                input: ["text"],
                contextWindow: 64000,
                maxTokens: 4096,
              },
            ],
          },
        },
      },
    }
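
    Before relying on this config, it is worth confirming that the remote `/v1` endpoint is reachable from the machine running OpenClaw; adjust the host and port to match your `baseUrl`:

    bash
    # A JSON model list in the response means the remote vLLM server is reachable.
    curl http://192.168.1.50:9000/v1/models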

Configuration

| Option | Description | Example |
| --- | --- | --- |
| `VLLM_API_KEY` | Auth token for the vLLM OpenAI-compatible server and the opt-in signal that enables the vLLM provider and model auto-discovery when no explicit `models.providers.vllm` is defined. | `vllm-local` |
| `models.providers.vllm.baseUrl` | The base URL for the vLLM OpenAI-compatible `/v1` API that OpenClaw calls. | `http://127.0.0.1:8000/v1` |
| `models.providers.vllm.apiKey` | The API key value OpenClaw sends to vLLM, typically wired from `VLLM_API_KEY`. | `${VLLM_API_KEY}` |
| `models.providers.vllm.api` | The API type OpenClaw uses for vLLM; vLLM speaks the OpenAI-compatible completions API. | `openai-completions` |
| `agents.defaults.model.primary` | The default primary model reference for your agents, using the `vllm/` prefix and a vLLM model ID. | `vllm/your-model-id` |
| `models.providers.vllm.models[].contextWindow` | The maximum context window size in tokens that OpenClaw assumes for the vLLM model. | `128000` |
| `models.providers.vllm.models[].maxTokens` | The maximum number of output tokens OpenClaw requests from the vLLM model. | `8192` |

Troubleshooting

curl to the vLLM models endpoint fails or hangs when checking connectivity

OpenClaw cannot reach vLLM if the server is down, bound to a different host/port, or not running in OpenAI-compatible mode. Hit the models endpoint directly to confirm connectivity and that `/v1` is exposed.

bash
curl http://127.0.0.1:8000/v1/models

Requests to vLLM fail with auth errors even though VLLM_API_KEY is set

Your vLLM server likely expects a specific API key or header configuration. Confirm that the value you export in `VLLM_API_KEY` matches the key the server enforces, or define the provider explicitly under `models.providers.vllm` so you control the auth behavior.
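
To check whether the server accepts your key outside of OpenClaw, send it as a bearer token with curl. This is a quick sanity check; adjust the URL to match your base URL.

bash
# A 401 here means the server rejects the key itself, not OpenClaw's config.
curl -i -H "Authorization: Bearer $VLLM_API_KEY" http://127.0.0.1:8000/v1/models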

No models appear when you run `openclaw models list --provider vllm`

Auto-discovery only runs when `VLLM_API_KEY` is set and there is no explicit `models.providers.vllm` config entry. If you have defined the provider manually, OpenClaw skips discovery and uses only your declared models, so add your vLLM models to the explicit config.
