Fix owner-PID lifecycle monitoring for cross-platform reliability

Two bugs caused the brainstorm server to self-terminate within 60s:

1. ownerAlive() treated EPERM (permission denied) as "process dead".
   When the owner PID belongs to a different user (Tailscale SSH,
   system daemons), process.kill(pid, 0) throws EPERM — but the
   process IS alive. Fixed: return e.code === 'EPERM'.

2. On WSL, the grandparent PID resolves to a short-lived subprocess
   that exits before the first 60s lifecycle check. The PID is
   genuinely dead (ESRCH), so the EPERM fix alone doesn't help.
   Fixed: validate the owner PID at server startup — if it's already
   dead, it was a bad resolution, so disable monitoring and rely on
   the 30-minute idle timeout.

This also removes the Windows/MSYS2-specific OWNER_PID="" carve-out
from start-server.sh, since the server now handles invalid PIDs
generically at startup regardless of platform.

Tested on Linux (magic-kingdom) via Tailscale SSH:
- Root-owned owner PID (EPERM): server survives ✓
- Dead owner PID at startup (WSL sim): monitoring disabled, survives ✓
- Valid owner that dies: server shuts down within 60s ✓

Fixes #879
This commit is contained in:
Jesse Vincent
2026-03-24 14:39:20 -07:00
parent 738a18d6ff
commit af025aa35b
3 changed files with 17 additions and 10 deletions

View File

@@ -107,12 +107,6 @@ if [[ -z "$OWNER_PID" || "$OWNER_PID" == "1" ]]; then
OWNER_PID="$PPID"
fi
# On Windows/MSYS2, the MSYS2 PID namespace is invisible to Node.js.
# Skip owner-PID monitoring — the 30-minute idle timeout prevents orphans.
case "${OSTYPE:-}" in
msys*|cygwin*|mingw*) OWNER_PID="" ;;
esac
# Foreground mode for environments that reap detached/background processes.
if [[ "$FOREGROUND" == "true" ]]; then
echo "$$" > "$PID_FILE"