OpenClaw: "Native hook relay unavailable" blocking tool calls
Your agent was running fine, and now every shell command, apply_patch, or tool call dies before
it executes with "Native hook relay unavailable". Here's the cause: in OpenClaw 2026.5.x the
pre-tool hooks moved out of process — into a small "native hook relay" the Codex app-server
talks to over a localhost bridge. When that relay goes stale or expires, the hook can't get a verdict, so it
fails closed: pre_tool_use returns a BLOCK (a
permission_request returns a DENY). "Unavailable" isn't a warning — it's the
tool being denied. The fix, in one line: end the wedged session and start a fresh one
(/new or /reset) — a gateway restart won't heal a session that's already mid-flight,
and can leave it permanently un-runnable.
PreToolUse: blocked — Native hook relay unavailable
What the "native hook relay" actually is
Before 2026.5.x, pre-tool hooks ran in-process. Starting in 2026.5.x they're out-of-process: the vendored
Codex app-server fires pre_tool_use / permission_request hooks by
shelling out to codex hooks relay --relay-id <id> --generation <uuid> --event ...,
which HTTP-connects to a localhost bridge server inside the node gateway. Each live relay drops a descriptor
file at /tmp/openclaw-native-hook-relays-<uid>/<hash>.json containing its
port, a token, and an expiresAtMs.
The bridge is the thing that says "allow" or "deny" for each tool call. When the relay can't reach a
valid bridge registration, the fallback message is literally Native hook relay unavailable,
and the safe default is to block the tool. That fail-closed choice is why a stale relay
doesn't degrade gracefully — it stops your agent from doing anything.
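The fail-closed behaviour described above can be sketched in a few lines of Python. This is an illustration only, not OpenClaw's code: `hook_verdict`, `ask_bridge`, and the `Verdict` enum are hypothetical names, and the only details taken from this article are the two event names, the BLOCK/DENY outcomes, and the fallback message.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"  # refusal for pre_tool_use
    DENY = "deny"    # refusal for permission_request


FALLBACK_REASON = "Native hook relay unavailable"


def hook_verdict(event: str, ask_bridge: Callable[[], Optional[bool]]) -> Tuple[Verdict, str]:
    """Return (verdict, reason) for one hook event, failing closed on bridge errors.

    ask_bridge is a hypothetical callable: True/False is the bridge's answer,
    None means no valid registration, and OSError means the bridge is unreachable.
    """
    refusal = Verdict.BLOCK if event == "pre_tool_use" else Verdict.DENY
    try:
        allowed = ask_bridge()
    except OSError:
        # ConnectionRefusedError and FileNotFoundError are both OSError subclasses.
        return refusal, FALLBACK_REASON
    if allowed is None:
        return refusal, FALLBACK_REASON
    if allowed:
        return Verdict.ALLOW, ""
    return refusal, "denied by policy"
```

The point of the sketch is that there is no "allow because we could not ask" branch: every path that fails to reach a verdict ends in a refusal.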
Why it happens (two triggers)
There are two common ways the relay goes "unavailable," and they need different responses.
1) The relay expired (long task / long model thinking)
A relay has a TTL of about 30 minutes and has to be renewed to stay alive. The catch: the
renewal pings are driven by the Codex app-server's tool-call cadence — there's no independent
keepalive timer. So if the model goes into a long thinking stretch (or a long-running session sits between
tool calls), the renewals stop, the bridge gets garbage-collected, and the next tool call lands on a relay
that's already gone → "unavailable." This is exactly the failure described upstream in
openclaw/openclaw#89147.
Long-running, stacking sessions (think an hourly cron that sits processing for minutes and
overlaps the next run) are the usual victims — they outlive their own relay.
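To see why a quiet stretch is enough to lose the relay, here is a small simulation of renewal driven only by tool calls. It is a sketch under stated assumptions: the TTL is taken as exactly 30 minutes (the article says "about"), each tool call is assumed to push expiry to call time plus TTL, and `first_expired_call` is a hypothetical helper, not part of OpenClaw.

```python
from typing import List, Optional

TTL_MS = 30 * 60 * 1000  # assumed exact value; the article only says "about 30 minutes"


def first_expired_call(tool_call_times_ms: List[int], ttl_ms: int = TTL_MS) -> Optional[int]:
    """Return the time of the first tool call that finds the relay expired, else None.

    Models the relay as registered at the first call and renewed only by
    later tool calls, with no independent keepalive timer.
    """
    if not tool_call_times_ms:
        return None
    expires_at = tool_call_times_ms[0] + ttl_ms
    for t in tool_call_times_ms[1:]:
        if t >= expires_at:
            return t  # this call lands on a relay that is already gone
        expires_at = t + ttl_ms  # renewal piggybacks on the tool call
    return None


MINUTE = 60 * 1000
steady = [0, 20 * MINUTE, 40 * MINUTE, 60 * MINUTE]  # gaps always under the TTL
thinking = [0, 10 * MINUTE, 25 * MINUTE, 60 * MINUTE]  # 35-minute gap before the last call
```

With these inputs, `first_expired_call(steady)` returns `None`, while `first_expired_call(thinking)` returns the 60-minute call: the session is not older in total, it simply went more than one TTL without a tool call.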
2) The registration went stale after a gateway restart
Every relay registers with a generation (a fresh UUID). When you restart the gateway,
it re-registers the bridge with a new generation. Any Codex session that was already in-flight is
still holding the old generation, so the bridge answers it with
HTTP 410 "native hook relay bridge stale registration" — and that's a block.
This is the trap: restarting the gateway is the obvious reflex, but it does not heal the session that's currently wedged. It only makes new sessions work. The in-flight session is now carrying a generation the bridge will reject forever. Operators hit this and report exactly that — the relay stays unavailable after a gateway restart even though hooks reload and report ready (openclaw/openclaw#73723).
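The generation mismatch is easy to model. The class below is a toy, with hypothetical names (`Bridge`, `restart`, `check`); the only facts taken from this article are that each registration carries a fresh UUID generation and that a mismatch is answered with HTTP 410.

```python
import uuid
from http import HTTPStatus


class Bridge:
    """Toy model of the bridge's registration check (not OpenClaw code)."""

    def __init__(self) -> None:
        self.generation = str(uuid.uuid4())

    def restart(self) -> None:
        # A gateway restart re-registers the bridge under a new generation.
        self.generation = str(uuid.uuid4())

    def check(self, session_generation: str) -> int:
        # A session holding any other generation is treated as stale.
        if session_generation != self.generation:
            return HTTPStatus.GONE.value  # 410
        return HTTPStatus.OK.value  # 200


bridge = Bridge()
in_flight = bridge.generation      # what the running session holds
bridge.restart()                   # the "obvious reflex"
fresh = bridge.generation          # what a new session would get
```

After the restart, `bridge.check(in_flight)` is 410 and `bridge.check(fresh)` is 200. Nothing in this model ever updates the generation an in-flight session holds, which is why only a new session recovers.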
A third, rarer cause: the bridge descriptor is simply gone or the port is down
(ECONNREFUSED/ENOENT) — same block.
Fix it (self-hosted)
Do these in order. The order matters because the obvious first move (restart the gateway) is the one that strands the wedged session.
1) Confirm it's the relay, not memory pressure
A flood of blocks across every session, with pod restarts, points at something systemic — usually an
out-of-memory crash, not a stale relay. Check for restarts and runaway log growth (the
codex-home/logs_2.sqlite page-cache OOM pattern) first. A stale relay is typically isolated: one
or two sessions wedged, no restarts, normal memory. If the whole instance is crash-looping, fix that first —
this page is for the isolated case.
2) End the wedged session and start a fresh one
In a bound Codex chat, run /new (or /reset) to begin a fresh session — it registers
a new relay with the current generation and works immediately. For a long-running agent or cron run,
abort that specific run; its next run gets a clean relay. This is the actual fix: a fresh
session sidesteps both the expired-TTL and stale-generation cases.
3) Only then consider an app-server / gateway restart
If even a fresh session fails on its first tool call, the bridge itself is down (case 3 above). Restart the Codex app-server or the gateway so new sessions get a clean bridge — but know what you're buying: this fixes future sessions, it does not revive anything mid-turn. Don't lead with it.
4) Nuclear option
Restarting the pod or the whole process wipes every /tmp relay descriptor and every in-flight
session, so it clears the wedge — at the cost of killing all running work. Use it only when sessions are
wedged across the board and you can afford to lose in-flight runs.
To tell which case you're in, look at the descriptor the bridge wrote:
cat /tmp/openclaw-native-hook-relays-*/*.json
Compare expiresAtMs with the current time (the Ms suffix suggests milliseconds; a value in the past is the TTL case), and confirm that port
is actually listening. The directory is ephemeral, so a missing file means the bridge is gone (case 3).
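The same manual check can be scripted. The sketch below assumes each descriptor is a flat JSON object with the `port` and `expiresAtMs` keys named in this article, that `expiresAtMs` is a Unix timestamp in milliseconds, and that the bridge listens on 127.0.0.1; none of that is confirmed beyond the field names. It deliberately never prints the token.

```python
import glob
import json
import socket
import time
from typing import List, Optional


def port_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def inspect_descriptor(path: str, now_ms: Optional[int] = None) -> dict:
    """Report whether one relay descriptor is expired and whether its port answers."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    with open(path, encoding="utf-8") as fh:
        desc = json.load(fh)
    return {
        "path": path,
        "expired": desc["expiresAtMs"] <= now_ms,  # assumed: epoch milliseconds
        "listening": port_is_listening(int(desc["port"])),
    }


def scan(pattern: str = "/tmp/openclaw-native-hook-relays-*/*.json") -> List[dict]:
    """Inspect every descriptor matching the glob; an empty list means none exist."""
    return [inspect_descriptor(p) for p in sorted(glob.glob(pattern))]
```

Reading the result: `expired` true points at the TTL case, `listening` false or an empty list from `scan()` points at the bridge-is-down case. A stale generation will not show up here, because the descriptor on disk belongs to the current registration.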
Don't restart the gateway as your first move. It feels right, but it re-registers the bridge with a new generation, and any session that was already running now holds a stale one — the bridge will answer it with HTTP 410 forever. Restarting the gateway makes new sessions work while leaving the wedged one permanently blocked. End the wedged session first; restart the gateway only for the bridge-is-down case, knowingly.
Stop babysitting your OpenClaw box
Fix it once — or stop fixing it for good.
Apply the checklist above and keep self-hosting, or skip the maintenance entirely: run your OpenClaw on managed hosting from $6.90/mo, starting with a 7-day free trial. We handle the stale locks, gateway restarts, version upgrades, and uptime — and you can import your existing instance in a couple of minutes. Cancel anytime.
- Managed hosting handles stale .jsonl.lock files, gateway restarts, and version upgrades for you
- Import your existing OpenClaw setup in minutes, keeping your channels and configuration
- The optional $199 setup is scoped: no custom development, enterprise/SRE support, or unsupported self-hosting repair
If you would rather compare options first, review OpenClaw cloud hosting or see the best OpenClaw hosting options before deciding.
Related, but not this
- Your OpenAI/Codex tools went missing right after the 5.27 update (rather than a mid-session block) — that's an OAuth-and-native-hook recovery checklist, covered in why OpenClaw OpenAI OAuth tools go missing after the 5.27 update.
- The session ended with "Codex stopped before confirming the turn was complete" — a long final tool call tripping a completion-idle timeout, not a hook block: see why Codex stops before confirming the turn was complete.
How managed hosting avoids this
Both triggers come from how the instance is run, not from a bug you can configure away. On managed OpenClaw hosting from Lobsterland, session and cron lifecycles are supervised so runs don't stack and outlive the 30-minute relay TTL, and config changes use restart-tiering — a routine setting edit doesn't trigger a full gateway restart that would bump every in-flight relay's generation out from under it. When a relay does go stale, clearing it is an operational step the platform handles, not a judgment call you make under pressure about whether to abort a session or restart the gateway. If you'd rather not learn the difference between an expired relay and a stale generation the hard way, that's the gap managed hosting closes.
Frequently asked questions
What does "Native hook relay unavailable" mean in OpenClaw?
In OpenClaw 2026.5.x the pre-tool hooks run out-of-process, in a "native hook relay" the Codex app-server
reaches over a localhost bridge. When that relay is expired, stale, or gone, the hook can't return a verdict,
so it fails closed: pre_tool_use returns a BLOCK and permission_request returns a
DENY. So the message means your tool call was denied because the relay couldn't be reached — not that
something is merely slow.
Why doesn't restarting the gateway fix "Native hook relay unavailable"?
Because a gateway restart re-registers the bridge with a new generation UUID, while the session
that's already running still holds the old one. The bridge rejects the old generation with HTTP 410 "stale
registration," so the in-flight session stays blocked. A restart only makes new sessions work; it
cannot heal the wedged one. End the wedged session (start a fresh one) instead.
How do I fix "Native hook relay unavailable" without killing my other sessions?
Start a fresh session for just the affected agent — /new or /reset in a bound chat,
or abort the specific long-running run. That registers a new relay with the current generation and unblocks
tool calls, without the collateral damage of a gateway or pod restart, which would reset relays for every
session on the instance.
Why does the native hook relay go unavailable during a long task?
The relay's TTL is about 30 minutes and its renewal pings are driven by the app-server's tool-call cadence — there's no separate keepalive timer. During a long model-thinking gap or a long-running session, the renewals stop, the bridge is garbage-collected, and the next tool call finds the relay expired. Long, overlapping sessions are the most common victims because they outlive their own relay.
Does "Native hook relay unavailable" mean my instance is broken or out of memory?
Usually no. A stale relay is normally isolated — one or two sessions blocked, no pod restarts, normal memory.
If instead every session is failing and the pod is restarting, suspect an out-of-memory crash (for
example, runaway logs_2.sqlite growth) rather than the relay, and fix that first. Check restarts
and memory before assuming it's the hook relay.