Model providers

Using Inferrs with OpenClaw


This guide shows you how to wire up Inferrs as an OpenAI-compatible backend for OpenClaw using the generic openai-completions path. You will start a local Inferrs server, verify it with curl, and register it as a provider so your agents can run against a Gemma 4 model.

By the end, you will have OpenClaw calling a local Inferrs-served model and a couple of flags tuned for the quirks of Inferrs’ chat API.

Setup flow

Prerequisites

  • An Inferrs installation with the `inferrs` CLI available on your PATH.
  • A local model that Inferrs can serve, such as `google/gemma-4-E2B-it`.
  • An OpenClaw setup where you can edit the agents and models configuration and run `openclaw infer model run`.

Steps

  1. Start Inferrs with a local model

    Start the Inferrs server and bind it to a host and port that OpenClaw can reach. This example binds to `127.0.0.1:8080` using the `metal` device, which the later OpenClaw config expects.

    bash
    inferrs serve google/gemma-4-E2B-it \
      --host 127.0.0.1 \
      --port 8080 \
      --device metal
  2. Verify the Inferrs server is reachable

    Before touching OpenClaw, confirm that Inferrs is actually listening and exposing the OpenAI-compatible endpoints. These curl checks hit the health probe and list models; if either fails, fix Inferrs networking or model loading first.

    bash
    curl http://127.0.0.1:8080/health
    curl http://127.0.0.1:8080/v1/models
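The two checks above can also be scripted as a small readiness probe that retries until the server comes up. This is a sketch in Python using only the standard library; the endpoint paths match the curl calls, but the helper name and retry policy are our own, not part of Inferrs:

```python
import json
import time
import urllib.request

def wait_for_inferrs(base="http://127.0.0.1:8080", retries=10, delay=1.0):
    """Poll /health until the server answers, then return the /v1/models payload."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(f"{base}/health", timeout=2) as health:
                if health.status == 200:
                    with urllib.request.urlopen(f"{base}/v1/models", timeout=2) as models:
                        return json.loads(models.read())
        except OSError:
            # server not up yet (connection refused) or slow to respond
            time.sleep(delay)
    raise RuntimeError(f"Inferrs did not become reachable at {base}")
```

If this raises instead of returning a model list, fix Inferrs networking or model loading before moving on to the OpenClaw configuration.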
  3. Configure Inferrs as an OpenClaw provider

    Add a provider entry that points OpenClaw at your Inferrs `/v1` base URL and describes the model capabilities. Set `compat.requiresStringContent` so OpenClaw flattens content into plain strings for Inferrs.

    json
    {
      "agents": {
        "defaults": {
          "model": { "primary": "inferrs/google/gemma-4-E2B-it" },
          "models": {
            "inferrs/google/gemma-4-E2B-it": {
              "alias": "Gemma 4 (inferrs)"
            }
          }
        }
      },
      "models": {
        "mode": "merge",
        "providers": {
          "inferrs": {
            "baseUrl": "http://127.0.0.1:8080/v1",
            "apiKey": "inferrs-local",
            "api": "openai-completions",
            "models": [
              {
                "id": "google/gemma-4-E2B-it",
                "name": "Gemma 4 E2B (inferrs)",
                "reasoning": false,
                "input": ["text"],
                "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
                "contextWindow": 131072,
                "maxTokens": 4096,
                "compat": {
                  "requiresStringContent": true
                }
              }
            ]
          }
        }
      }
    }
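A few of the invariants the config above relies on can be checked mechanically before you reload OpenClaw. This is a hypothetical sanity checker, not an official OpenClaw validator; the field names simply mirror the provider entry above:

```python
def check_provider(provider: dict) -> list[str]:
    """Return a list of problems found in an Inferrs provider entry (illustrative only)."""
    problems = []
    if not provider.get("baseUrl", "").endswith("/v1"):
        problems.append("baseUrl should end with /v1 for openai-completions")
    if provider.get("api") != "openai-completions":
        problems.append("api must be openai-completions for the generic path")
    for m in provider.get("models", []):
        if m.get("maxTokens", 0) > m.get("contextWindow", 0):
            problems.append(f"{m.get('id')}: maxTokens exceeds contextWindow")
    return problems

provider = {
    "baseUrl": "http://127.0.0.1:8080/v1",
    "api": "openai-completions",
    "models": [{"id": "google/gemma-4-E2B-it", "contextWindow": 131072, "maxTokens": 4096}],
}
print(check_provider(provider))  # → []
```

An empty list means the entry is at least internally consistent; it does not prove the server at `baseUrl` is running.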
  4. Run a manual Inferrs chat completion smoke test

    Test Inferrs directly with a minimal `/v1/chat/completions` request to confirm the model responds before involving OpenClaw. This isolates Inferrs issues from OpenClaw configuration problems and gives you a known-good baseline.

    bash
    curl http://127.0.0.1:8080/v1/chat/completions \
      -H 'content-type: application/json' \
      -d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}'
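To pull the model's answer out of that response, index into the standard chat-completions shape. The payload below is illustrative, trimmed to the fields that matter; it is not captured Inferrs output:

```python
import json

# A trimmed example of the response shape the OpenAI-compatible route returns
# (field values here are illustrative, not actual Inferrs output).
raw = '''{
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "2 + 2 = 4."},
     "finish_reason": "stop"}
  ],
  "model": "google/gemma-4-E2B-it"
}'''

def first_reply(payload: str) -> str:
    """Extract the assistant text from a chat completions response body."""
    return json.loads(payload)["choices"][0]["message"]["content"]

print(first_reply(raw))  # → 2 + 2 = 4.
```

If the real response has an `error` key instead of `choices`, the model failed to load or the request body was rejected.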
  5. Run an OpenClaw model inference against Inferrs

    Once the direct curl call works, exercise the full OpenClaw → Inferrs path. This command uses the configured `inferrs/google/gemma-4-E2B-it` model and returns JSON so you can see exactly what OpenClaw got back.

    bash
    openclaw infer model run \
      --model inferrs/google/gemma-4-E2B-it \
      --prompt "What is 2 + 2? Reply with one short sentence." \
      --json
  6. Tune Inferrs compatibility flags for Gemma and tools

    If you see schema errors or tool-related crashes, adjust the `compat` block for your Inferrs model. Disabling tools with `supportsTools: false` can help when Gemma accepts small direct calls but fails on full agent turns.

    text
    compat: {
      requiresStringContent: true,
      supportsTools: false
    }
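What `requiresStringContent` asks OpenClaw to do can be sketched as a flattening pass over the outgoing messages. This is an illustrative reimplementation of the idea, not OpenClaw's actual code:

```python
def flatten_content(messages):
    """Collapse list-style content parts into plain strings, as chat routes
    that only accept string messages[].content require."""
    out = []
    for msg in messages:
        content = msg["content"]
        if isinstance(content, list):
            # keep only the text parts and join them into one string
            content = "".join(p["text"] for p in content if p.get("type") == "text")
        out.append({**msg, "content": content})
    return out

msgs = [{"role": "user", "content": [{"type": "text", "text": "What is 2 + 2?"}]}]
print(flatten_content(msgs))  # → [{'role': 'user', 'content': 'What is 2 + 2?'}]
```

The trade-off is that non-text parts (such as images) are dropped, which is why the flag is scoped per model rather than global.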

Configuration

Option | Description | Example
agents.defaults.model.primary | Sets the primary default model for agents, here pointing to the Inferrs-backed Gemma 4 model. | inferrs/google/gemma-4-E2B-it
agents.defaults.models["inferrs/google/gemma-4-E2B-it"].alias | Human-friendly alias for the Inferrs Gemma 4 model shown in OpenClaw. | Gemma 4 (inferrs)
models.mode | Controls how the models configuration is applied; `merge` merges with existing providers. | merge
models.providers.inferrs.baseUrl | Base URL for the Inferrs OpenAI-compatible `/v1` API that OpenClaw calls. | http://127.0.0.1:8080/v1
models.providers.inferrs.apiKey | API key value OpenClaw sends to Inferrs; for local setups this can be a placeholder. | inferrs-local
models.providers.inferrs.api | Specifies that this provider uses the generic OpenAI completions-compatible path. | openai-completions
models.providers.inferrs.models[0].id | Model identifier as exposed by Inferrs, used in requests to the `/v1` API. | google/gemma-4-E2B-it
models.providers.inferrs.models[0].name | Display name for the Inferrs model inside OpenClaw. | Gemma 4 E2B (inferrs)
models.providers.inferrs.models[0].reasoning | Flags whether the model supports reasoning features; set to false for this Inferrs Gemma model. | false
models.providers.inferrs.models[0].input | Lists the input modalities supported by the model; Inferrs Gemma here accepts text. | ["text"]
models.providers.inferrs.models[0].cost.input | Per-token input cost for the Inferrs model, set to 0 for local usage. | 0
models.providers.inferrs.models[0].cost.output | Per-token output cost for the Inferrs model, set to 0 for local usage. | 0
models.providers.inferrs.models[0].cost.cacheRead | Cost for cache reads, set to 0 for this Inferrs configuration. | 0
models.providers.inferrs.models[0].cost.cacheWrite | Cost for cache writes, set to 0 for this Inferrs configuration. | 0
models.providers.inferrs.models[0].contextWindow | Maximum context window size in tokens for the Inferrs Gemma model. | 131072
models.providers.inferrs.models[0].maxTokens | Maximum number of tokens the model can generate in a single completion. | 4096
models.providers.inferrs.models[0].compat.requiresStringContent | When true, OpenClaw flattens content parts into plain strings to satisfy Inferrs chat routes that only accept string `messages[].content`. | true
models.providers.inferrs.models[0].compat.supportsTools | When set to false, disables OpenClaw’s tool schema surface for this Inferrs model to avoid tool-related crashes. | false

Troubleshooting

curl /v1/models fails

This usually means Inferrs is not running, not reachable, or not bound to the host/port you configured. Start Inferrs with the expected `--host` and `--port` values and re-run the health and models curl checks to confirm it is listening.
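A quick way to distinguish "not running" from "bound to the wrong host or port" is a raw TCP probe before debugging HTTP. The helper below is a generic sketch, not part of either CLI:

```python
import socket

def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # connection refused or timed out: nothing listening there
        return False

# If this prints False, Inferrs is not bound where the provider config expects.
print(is_listening("127.0.0.1", 8080))
```

If the probe succeeds but `/v1/models` still fails, the port is taken by a different process or the model has not finished loading.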

messages[1].content: invalid type: sequence, expected a string

This error means the Inferrs chat route only accepts string `messages[].content`. Set `requiresStringContent: true` so OpenClaw flattens pure text content parts into strings before sending the request.

text
compat: {
  requiresStringContent: true
}

Direct /v1/chat/completions calls pass but openclaw infer model run fails

If a small direct curl request works but `openclaw infer model run` fails, the tool schema surface may be too heavy for your Inferrs + Gemma combo. Set `supportsTools: false` in the model entry to disable tools and reduce prompt pressure.

text
compat: {
  requiresStringContent: true,
  supportsTools: false
}

inferrs still crashes on larger agent turns

When schema errors are gone but Inferrs continues to crash on larger agent turns, you are likely hitting an upstream Inferrs or model limitation. Reduce prompt size or switch to a different local backend or model, since OpenClaw’s transport layer is already sending a compatible payload.
