OpenClaw: "Native hook relay unavailable" blocking tool calls

Your agent was running fine, and now every shell command, apply_patch, or other tool call dies before it executes with "Native hook relay unavailable". Here's the cause: in OpenClaw 2026.5.x the pre-tool hooks moved out of process, into a small "native hook relay" that the Codex app-server talks to over a localhost bridge. When that relay goes stale or expires, the hook can't get a verdict, so it fails closed: pre_tool_use returns a BLOCK and permission_request returns a DENY. "Unavailable" isn't a warning; it means the tool call was denied. The fix, in one line: end the wedged session and start a fresh one (/new or /reset). A gateway restart won't heal a session that's already mid-flight, and can leave it permanently blocked.

What you'll see

PreToolUse: blocked — Native hook relay unavailable

What the "native hook relay" actually is

Before 2026.5.x, pre-tool hooks ran in-process. Starting in 2026.5.x they're out-of-process: the vendored Codex app-server fires pre_tool_use / permission_request hooks by shelling out to codex hooks relay --relay-id <id> --generation <uuid> --event ..., which connects over HTTP to a localhost bridge server inside the node gateway. Each live relay drops a descriptor file at /tmp/openclaw-native-hook-relays-<uid>/<hash>.json containing its port, a token, and an expiresAtMs timestamp.

The bridge is the thing that says "allow" or "deny" for each tool call. When the relay can't reach a valid bridge registration, the fallback message is literally Native hook relay unavailable, and the safe default is to block the tool. That fail-closed choice is why a stale relay doesn't degrade gracefully — it stops your agent from doing anything.
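To make the fail-closed behavior concrete, here is a minimal sketch in Python. It is a toy model, not OpenClaw's code: run_hook and ask_bridge are hypothetical names standing in for the relay and its HTTP call to the bridge, and the "ALLOW" string is only a placeholder for a normal verdict.

```python
from typing import Callable, Optional, Tuple

# Verdict names come from the article; the mapping itself is an illustrative model.
FAIL_CLOSED = {"pre_tool_use": "BLOCK", "permission_request": "DENY"}
UNAVAILABLE = "Native hook relay unavailable"


def run_hook(event: str, ask_bridge: Callable[[str], str]) -> Tuple[str, Optional[str]]:
    """Return (verdict, reason). ask_bridge is a hypothetical stand-in for the relay's bridge call."""
    try:
        return ask_bridge(event), None
    except (ConnectionError, FileNotFoundError, TimeoutError):
        # No verdict is available, so deny rather than let the tool run unchecked.
        return FAIL_CLOSED[event], UNAVAILABLE


def bridge_down(event: str) -> str:
    """Simulates a bridge whose port is closed (the ECONNREFUSED case)."""
    raise ConnectionRefusedError("bridge port is closed")


if __name__ == "__main__":
    print(run_hook("pre_tool_use", bridge_down))
    print(run_hook("permission_request", bridge_down))
```

The point of the sketch is the except branch: any failure to reach the bridge is converted into a denial, which is why an unreachable relay stops the agent instead of letting tool calls through unchecked.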

Why it happens (two triggers)

There are two common ways the relay goes "unavailable," and they need different responses.

1) The relay expired (long task / long model thinking)

A relay has a TTL of about 30 minutes and has to be renewed to stay alive. The catch: the renewal pings are driven by the Codex app-server's tool-call cadence, and there's no independent keepalive timer. So if the model goes into a long thinking stretch (or a long-running session sits idle between tool calls), the renewals stop, the bridge gets garbage-collected, and the next tool call lands on a relay that's already gone. The result is "unavailable." This is exactly the failure described upstream in openclaw/openclaw#89147. Long-running, stacking sessions (think an hourly cron whose run drags on long enough to overlap the next one) are the usual victims: they outlive their own relay.
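The expiry mechanism can be sketched as a small simulation. This is a simplified model under stated assumptions: the TTL is exactly 30 minutes (the article only says "about"), the relay is registered at time 0, and every tool call that arrives in time renews the relay for a full TTL. first_expired_call is a hypothetical helper, not part of OpenClaw.

```python
TTL_MS = 30 * 60 * 1000  # assumed exact value; the article says "about 30 minutes"


def first_expired_call(tool_call_times_ms, ttl_ms=TTL_MS):
    """Return the index of the first tool call that finds the relay expired, or None.

    Renewal happens only when a tool call arrives (there is no keepalive timer).
    The relay is assumed to be registered at time 0.
    """
    expires_at = ttl_ms
    for i, t in enumerate(tool_call_times_ms):
        if t > expires_at:
            return i  # the relay was already gone when this call arrived
        expires_at = t + ttl_ms  # a call that arrives in time renews the TTL
    return None


def minutes(m):
    """Convert minutes to milliseconds."""
    return m * 60_000


if __name__ == "__main__":
    # Calls at 5, 20, 45 and 80 minutes: the 35-minute gap before the last call is fatal.
    print(first_expired_call([minutes(5), minutes(20), minutes(45), minutes(80)]))
```

In this model, total session length is not what kills the relay; a single gap between tool calls that exceeds the TTL is enough.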

2) The registration went stale after a gateway restart

Every relay registers with a generation (a fresh UUID). When you restart the gateway, it re-registers the bridge with a new generation. Any Codex session that was already in-flight is still holding the old generation, so the bridge answers it with HTTP 410 "native hook relay bridge stale registration" — and that's a block.

This is the trap: restarting the gateway is the obvious reflex, but it does not heal the session that's currently wedged. It only makes new sessions work. The in-flight session is now carrying a generation the bridge will reject forever. Operators hit this and report exactly that — the relay stays unavailable after a gateway restart even though hooks reload and report ready (openclaw/openclaw#73723).
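A toy model of the generation check shows why the restart helps new sessions but not the wedged one. The Bridge class and its method names are hypothetical. Only the 410 status and its message come from the behavior described above, and the 200/"ok" response is an assumption.

```python
import uuid


class Bridge:
    """Toy model of the bridge's registration check (not OpenClaw's real code)."""

    def __init__(self):
        self.generation = str(uuid.uuid4())

    def restart(self):
        # A gateway restart re-registers the bridge under a fresh generation.
        self.generation = str(uuid.uuid4())

    def handle(self, session_generation):
        """Return (status, message) for a hook call carrying session_generation."""
        if session_generation != self.generation:
            return 410, "native hook relay bridge stale registration"
        return 200, "ok"  # assumed success response, for illustration only


if __name__ == "__main__":
    bridge = Bridge()
    in_flight = bridge.generation         # a session started before the restart
    bridge.restart()
    print(bridge.handle(in_flight))           # stale generation is rejected
    print(bridge.handle(bridge.generation))   # a fresh session is accepted
```

Nothing in the in-flight session ever updates its generation, so in this model the only way out is a new session that picks up the current one.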

A third, rarer cause: the bridge descriptor is simply gone or the port is down (ECONNREFUSED/ENOENT) — same block.

Fix it (self-hosted)

Do these in order. The order matters because the obvious first move (restart the gateway) is the one that strands the wedged session.

1) Confirm it's the relay, not memory pressure

A flood of blocks across every session, with pod restarts, points at something systemic — usually an out-of-memory crash, not a stale relay. Check for restarts and runaway log growth (the codex-home/logs_2.sqlite page-cache OOM pattern) first. A stale relay is typically isolated: one or two sessions wedged, no restarts, normal memory. If the whole instance is crash-looping, fix that first — this page is for the isolated case.
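As a rough sketch, the triage in this step can be written as a small decision function. The triage function and its thresholds are hypothetical: the "two sessions" cutoff only mirrors the "one or two sessions wedged" rule of thumb above, and you would gather the counts yourself from your own monitoring.

```python
def triage(blocked_sessions: int, total_sessions: int, pod_restarts: int) -> str:
    """Rough heuristic: isolated blocks with no restarts suggest a stale relay."""
    if pod_restarts > 0 and blocked_sessions == total_sessions:
        return "systemic: investigate memory/crash loop first"
    if pod_restarts == 0 and blocked_sessions <= 2:
        return "isolated: likely stale relay, start a fresh session"
    return "unclear: check memory and the relay descriptor"


if __name__ == "__main__":
    print(triage(blocked_sessions=1, total_sessions=8, pod_restarts=0))
    print(triage(blocked_sessions=8, total_sessions=8, pod_restarts=3))
```

Treat the output as a hint about where to look first, not as a diagnosis.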

2) End the wedged session and start a fresh one

In a bound Codex chat, run /new (or /reset) to begin a fresh session — it registers a new relay with the current generation and works immediately. For a long-running agent or cron run, abort that specific run; its next run gets a clean relay. This is the actual fix: a fresh session sidesteps both the expired-TTL and stale-generation cases.

3) Only then consider an app-server / gateway restart

If even a fresh session fails on its first tool call, the bridge itself is down (case 3 above). Restart the Codex app-server or the gateway so new sessions get a clean bridge — but know what you're buying: this fixes future sessions, it does not revive anything mid-turn. Don't lead with it.

4) Nuclear option

Restarting the pod or the whole process wipes every /tmp relay descriptor and every in-flight session, so it clears the wedge — at the cost of killing all running work. Use it only when sessions are wedged across the board and you can afford to lose in-flight runs.

Diagnose the relay directly

Look at the descriptor the bridge wrote:

cat /tmp/openclaw-native-hook-relays-*/*.json

Compare expiresAtMs with the current time in milliseconds (if it is in the past, you are in the TTL case) and confirm that the port is actually listening. The directory is ephemeral, so a missing file means the bridge is gone (case 3).
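Those two checks can be scripted. The following is a minimal sketch, assuming the descriptor is a flat JSON object with top-level port and expiresAtMs keys and that the bridge listens on 127.0.0.1. The article names the fields but not the exact layout, so adjust to what your files actually contain. The helper names are hypothetical, and the script deliberately never prints the token.

```python
import glob
import json
import socket
import time

# Path pattern from the article; the flat JSON layout is an assumption.
DESCRIPTOR_GLOB = "/tmp/openclaw-native-hook-relays-*/*.json"


def port_is_listening(port, host="127.0.0.1", timeout=0.5):
    """True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_descriptor(path, now_ms=None):
    """Classify one descriptor file as 'expired', 'port down', or 'ok'."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    with open(path) as f:
        desc = json.load(f)
    if desc["expiresAtMs"] <= now_ms:
        return "expired"      # the TTL case
    if not port_is_listening(int(desc["port"])):
        return "port down"    # descriptor exists but nothing answers
    return "ok"


def check_all(pattern=DESCRIPTOR_GLOB):
    """Map each descriptor path to its status; report when none exist."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        return {"<none>": "no descriptor found (bridge gone)"}
    return {p: check_descriptor(p) for p in paths}


if __name__ == "__main__":
    for path, status in check_all().items():
        print(f"{status:10} {path}")
```

An "ok" result here only means the descriptor is unexpired and something accepts connections on the port. It cannot detect a stale generation, because that check happens inside the bridge.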

Don't do this

Don't restart the gateway as your first move. It feels right, but it re-registers the bridge with a new generation, and any session that was already running now holds a stale one — the bridge will answer it with HTTP 410 forever. Restarting the gateway makes new sessions work while leaving the wedged one permanently blocked. End the wedged session first; restart the gateway only for the bridge-is-down case, knowingly.

Stop babysitting your OpenClaw box

Fix it once — or stop fixing it for good.

Apply the checklist above and keep self-hosting, or skip the maintenance entirely: run your OpenClaw on managed hosting from $6.90/mo, starting with a 7-day free trial. We handle the stale locks, gateway restarts, version upgrades, and uptime — and you can import your existing instance in a couple of minutes. Cancel anytime.

Managed hosting, from $6.90/mo: Your own hosted OpenClaw instance with automatic restarts and version upgrades. Starts with a 7-day free trial: import your current setup, keep your channels, cancel anytime.

$199 managed setup (optional): Prefer we do it for you? One workspace configured end-to-end: first-run config, one 30-minute onboarding/debug session, and a 7-day follow-up. Limited weekly slots.

Managed hosting handles stale .jsonl.lock files, gateway restarts, and version upgrades for you
Import your existing OpenClaw setup in minutes — keep your channels and configuration
The optional $199 setup is scoped: no custom development, enterprise/SRE support, or unsupported self-hosting repair

If you would rather compare options first, review OpenClaw cloud hosting or see the best OpenClaw hosting options before deciding.

Screenshot: OpenClaw import first screen in the OpenClaw Setup dashboard (step 1: Paste import payload)

Screenshot: OpenClaw import completed screen in the OpenClaw Setup dashboard (step 2: Review and launch)

Related, but not this

Your OpenAI/Codex tools went missing right after the 5.27 update (rather than a mid-session block) — that's an OAuth-and-native-hook recovery checklist, covered in why OpenClaw OpenAI OAuth tools go missing after the 5.27 update.
The session ended with "Codex stopped before confirming the turn was complete" — a long final tool call tripping a completion-idle timeout, not a hook block: see why Codex stops before confirming the turn was complete.

How managed hosting avoids this

Both triggers come from how the instance is run, not from a bug you can configure away. On managed OpenClaw hosting from Lobsterland, session and cron lifecycles are supervised so runs don't stack and outlive the 30-minute relay TTL, and config changes use restart-tiering — a routine setting edit doesn't trigger a full gateway restart that would bump every in-flight relay's generation out from under it. When a relay does go stale, clearing it is an operational step the platform handles, not a judgment call you make under pressure about whether to abort a session or restart the gateway. If you'd rather not learn the difference between an expired relay and a stale generation the hard way, that's the gap managed hosting closes.

Frequently asked questions

What does "Native hook relay unavailable" mean in OpenClaw?

In OpenClaw 2026.5.x the pre-tool hooks run out-of-process, in a "native hook relay" the Codex app-server reaches over a localhost bridge. When that relay is expired, stale, or gone, the hook can't return a verdict, so it fails closed: pre_tool_use returns a BLOCK and permission_request returns a DENY. So the message means your tool call was denied because the relay couldn't be reached — not that something is merely slow.

Why doesn't restarting the gateway fix "Native hook relay unavailable"?

Because a gateway restart re-registers the bridge with a new generation UUID, while the session that's already running still holds the old one. The bridge rejects the old generation with HTTP 410 "stale registration," so the in-flight session stays blocked. A restart only makes new sessions work; it cannot heal the wedged one. End the wedged session (start a fresh one) instead.

How do I fix "Native hook relay unavailable" without killing my other sessions?

Start a fresh session for just the affected agent — /new or /reset in a bound chat, or abort the specific long-running run. That registers a new relay with the current generation and unblocks tool calls, without the collateral damage of a gateway or pod restart, which would reset relays for every session on the instance.

Why does the native hook relay go unavailable during a long task?

The relay's TTL is about 30 minutes and its renewal pings are driven by the app-server's tool-call cadence — there's no separate keepalive timer. During a long model-thinking gap or a long-running session, the renewals stop, the bridge is garbage-collected, and the next tool call finds the relay expired. Long, overlapping sessions are the most common victims because they outlive their own relay.

Does "Native hook relay unavailable" mean my instance is broken or out of memory?

Usually no. A stale relay is normally isolated — one or two sessions blocked, no pod restarts, normal memory. If instead every session is failing and the pod is restarting, suspect an out-of-memory crash (for example, runaway logs_2.sqlite growth) rather than the relay, and fix that first. Check restarts and memory before assuming it's the hook relay.