M .gitignore => .gitignore +1 -0
@@ 1,6 1,7 @@
node_modules/
browse/dist/
.gstack/
+.claude/skills/
/tmp/
*.log
bun.lock
M BROWSER.md => BROWSER.md +15 -23
@@ 33,7 33,7 @@ gstack's browser is a compiled CLI binary that talks to a persistent local Chrom
│ ▼ │
│ ┌──────────┐ HTTP POST ┌──────────────┐ │
│ │ browse │ ──────────────── │ Bun HTTP │ │
-│ │ CLI │ localhost:9400 │ server │ │
+│ │ CLI │ localhost:rand │ server │ │
│ │ │ Bearer token │ │ │
│ │ compiled │ ◄────────────── │ Playwright │──── Chromium │
│ │ binary │ plain text │ API calls │ (headless) │
@@ 46,7 46,7 @@ gstack's browser is a compiled CLI binary that talks to a persistent local Chrom
### Lifecycle
-1. **First call**: CLI checks `/tmp/browse-server.json` for a running server. None found — it spawns `bun run browse/src/server.ts` in the background. The server launches headless Chromium via Playwright, picks a port (9400-9410), generates a bearer token, writes the state file, and starts accepting HTTP requests. This takes ~3 seconds.
+1. **First call**: CLI checks `.gstack/browse.json` (in the project root) for a running server. None found — it spawns `bun run browse/src/server.ts` in the background. The server launches headless Chromium via Playwright, picks a random port (10000-60000), generates a bearer token, writes the state file, and starts accepting HTTP requests. This takes ~3 seconds.
2. **Subsequent calls**: CLI reads the state file, sends an HTTP POST with the bearer token, prints the response. ~100-200ms round trip.
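The subsequent-call round trip can be sketched as follows. This is illustrative only: the `buildRequest` helper and the `/command` route name are assumptions, and the real logic lives in `browse/src/cli.ts`.

```typescript
// Hypothetical helper showing the shape of a subsequent call: read the
// state file's port + token, then build an authenticated POST.
// The /command route name is an assumption, not the documented API.
interface ServerState {
  pid: number;
  port: number;
  token: string;
}

function buildRequest(state: ServerState, command: string) {
  return {
    url: `http://localhost:${state.port}/command`, // assumed route
    method: "POST",
    headers: { Authorization: `Bearer ${state.token}` },
    body: command,
  };
}
```

The CLI prints whatever the server returns, so the whole exchange stays within the ~100-200ms budget described above.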
@@ 94,15 94,15 @@ No DOM mutation. No injected scripts. Just Playwright's native accessibility API
### Authentication
-Each server session generates a random UUID as a bearer token. The token is written to the state file (`/tmp/browse-server.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser.
+Each server session generates a random UUID as a bearer token. The token is written to the state file (`.gstack/browse.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser.
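A minimal sketch of the check the server performs, assuming the exact header format above. The function name and the 401 status are assumptions; the real validation lives in `browse/src/server.ts`.

```typescript
// Reject any request whose Authorization header doesn't exactly match
// the session token. Returning 401 on mismatch is an assumption; the
// "Bearer <token>" format comes from the text above.
function authStatus(header: string | null, sessionToken: string): number {
  return header === `Bearer ${sessionToken}` ? 200 : 401;
}
```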
### Console, network, and dialog capture
The server hooks into Playwright's `page.on('console')`, `page.on('response')`, and `page.on('dialog')` events. All entries are kept in fixed-capacity circular buffers (50,000 entries each, O(1) writes) and flushed to disk asynchronously via `Bun.write()`:
-- Console: `/tmp/browse-console.log`
-- Network: `/tmp/browse-network.log`
-- Dialog: `/tmp/browse-dialog.log`
+- Console: `.gstack/browse-console.log`
+- Network: `.gstack/browse-network.log`
+- Dialog: `.gstack/browse-dialog.log`
The `console`, `network`, and `dialog` commands read from the in-memory buffers, not disk.
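The buffer semantics described above (fixed capacity, O(1) writes, oldest entries overwritten first) can be sketched as a minimal class. This is illustrative, not the server's actual implementation; the real buffers hold 50,000 entries each.

```typescript
// Minimal fixed-capacity circular buffer: appends are O(1), and once
// full, each push overwrites the oldest entry in place.
class CircularBuffer<T> {
  private items: T[] = [];
  private start = 0;

  constructor(private capacity: number) {}

  push(item: T): void {
    if (this.items.length < this.capacity) {
      this.items.push(item); // still filling: plain append
    } else {
      this.items[this.start] = item; // full: overwrite oldest slot
      this.start = (this.start + 1) % this.capacity;
    }
  }

  // Oldest-to-newest view, unwrapping around the start index.
  toArray(): T[] {
    return [...this.items.slice(this.start), ...this.items.slice(0, this.start)];
  }
}
```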
@@ 112,30 112,22 @@ Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser
### Multi-workspace support
-Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs.
+Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`).
-If `CONDUCTOR_PORT` is set (e.g., by [Conductor](https://conductor.dev)), the browse port is derived deterministically:
+| Workspace | State file | Port |
+|-----------|------------|------|
+| `/code/project-a` | `/code/project-a/.gstack/browse.json` | random (10000-60000) |
+| `/code/project-b` | `/code/project-b/.gstack/browse.json` | random (10000-60000) |
-```
-browse_port = CONDUCTOR_PORT - 45600
-```
-
-| Workspace | CONDUCTOR_PORT | Browse port | State file |
-|-----------|---------------|-------------|------------|
-| Workspace A | 55040 | 9440 | `/tmp/browse-server-9440.json` |
-| Workspace B | 55041 | 9441 | `/tmp/browse-server-9441.json` |
-| No Conductor | — | 9400 (scan) | `/tmp/browse-server.json` |
-
-You can also set `BROWSE_PORT` directly.
+No port collisions. No shared state. Each project is fully isolated.
### Environment variables
| Variable | Default | Description |
|----------|---------|-------------|
-| `BROWSE_PORT` | 0 (auto-scan 9400-9410) | Fixed port for the HTTP server |
-| `CONDUCTOR_PORT` | — | If set, browse port = this - 45600 |
+| `BROWSE_PORT` | 0 (random 10000-60000) | Fixed port for the HTTP server (debug override) |
| `BROWSE_IDLE_TIMEOUT` | 1800000 (30 min) | Idle shutdown timeout in ms |
-| `BROWSE_STATE_FILE` | `/tmp/browse-server.json` | Path to state file |
+| `BROWSE_STATE_FILE` | `.gstack/browse.json` | Path to state file (the CLI passes it to the spawned server) |
| `BROWSE_SERVER_SCRIPT` | auto-detected | Path to server.ts |
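The precedence implied by the table can be sketched as a small resolver. `resolveStateFile` is hypothetical; the real path resolution lives in `browse/src/config.ts`.

```typescript
// Hypothetical resolver mirroring the documented precedence:
// BROWSE_STATE_FILE wins outright; otherwise the state file lives
// under .gstack/ in the git root; fall back to cwd when there is
// no git root at all.
function resolveStateFile(
  env: { BROWSE_STATE_FILE?: string },
  gitRoot: string | null,
  cwd: string,
): string {
  if (env.BROWSE_STATE_FILE) return env.BROWSE_STATE_FILE;
  const root = gitRoot ?? cwd;
  return `${root}/.gstack/browse.json`;
}
```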
### Performance
@@ 206,7 198,7 @@ Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fi
| File | Role |
|------|------|
-| `browse/src/cli.ts` | Entry point. Reads `/tmp/browse-server.json`, sends HTTP to the server, prints response. |
+| `browse/src/cli.ts` | Entry point. Reads `.gstack/browse.json`, sends HTTP to the server, prints response. |
| `browse/src/server.ts` | Bun HTTP server. Routes commands to the right handler. Manages idle timeout. |
| `browse/src/browser-manager.ts` | Chromium lifecycle — launch, tab management, ref map, crash detection. |
| `browse/src/snapshot.ts` | Parses accessibility tree, assigns `@e`/`@c` refs, builds Locator map. Handles `--diff`, `--annotate`, `-C`. |
M CHANGELOG.md => CHANGELOG.md +43 -0
@@ 1,5 1,48 @@
# Changelog
+## 0.3.2 — 2026-03-13
+
+### Fixed
+- Cookie import picker now returns JSON instead of HTML — `jsonResponse()` referenced `url` out of scope, crashing every API call
+- `help` command routed correctly (was unreachable due to META_COMMANDS dispatch ordering)
+- Stale servers from global install no longer shadow local changes — removed legacy `~/.claude/skills/gstack` fallback from `resolveServerScript()`
+- Crash log path references updated from `/tmp/` to `.gstack/`
+
+### Added
+- **Diff-aware QA mode** — `/qa` on a feature branch auto-analyzes `git diff`, identifies affected pages/routes, detects the running app on localhost, and tests only what changed. No URL needed.
+- **Project-local browse state** — state file, logs, and all server state now live in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`). No more `/tmp` state files.
+- **Shared config module** (`browse/src/config.ts`) — centralizes path resolution for CLI and server, eliminates duplicated port/state logic
+- **Random port selection** — server picks a random port 10000-60000 instead of scanning 9400-9409. No more CONDUCTOR_PORT magic offset. No more port collisions across workspaces.
+- **Binary version tracking** — state file includes `binaryVersion` SHA; CLI auto-restarts the server when the binary is rebuilt
+- **Legacy /tmp cleanup** — CLI scans for and removes old `/tmp/browse-server*.json` files, verifying each PID actually belongs to a browse server before sending signals
+- **Greptile integration** — `/review` and `/ship` fetch and triage Greptile bot comments; `/retro` tracks Greptile batting average across weeks
+- **Local dev mode** — `bin/dev-setup` symlinks skills from the repo for in-place development; `bin/dev-teardown` restores global install
+- `help` command — agents can self-discover all commands and snapshot flags
+- Version-aware `find-browse` with META signal protocol — detects stale binaries and prompts agents to update
+- `browse/dist/find-browse` compiled binary with git SHA comparison against origin/main (cached for 4 hours)
+- `.version` file written at build time for binary version tracking
+- Route-level tests for cookie picker (13 tests) and find-browse version check (10 tests)
+- Config resolution tests (14 tests) covering git root detection, BROWSE_STATE_FILE override, ensureStateDir, readVersionHash, resolveServerScript, and version mismatch detection
+- Browser interaction guidance in CLAUDE.md — prevents Claude from using mcp\_\_claude-in-chrome\_\_\* tools
+- CONTRIBUTING.md with quick start, dev mode explanation, and instructions for testing branches in other repos
+
+### Changed
+- State file location: `.gstack/browse.json` (was `/tmp/browse-server.json`)
+- Log files location: `.gstack/browse-{console,network,dialog}.log` (was `/tmp/browse-*.log`)
+- Atomic state file writes: `.json.tmp` → rename (prevents partial reads)
+- CLI passes `BROWSE_STATE_FILE` to spawned server (server derives all paths from it)
+- SKILL.md setup checks parse META signals and handle `META:UPDATE_AVAILABLE`
+- `/qa` SKILL.md now describes four modes (diff-aware, full, quick, regression) with diff-aware as the default on feature branches
+- `jsonResponse`/`errorResponse` use options objects to prevent positional parameter confusion
+- Build script compiles both `browse` and `find-browse` binaries, cleans up `.bun-build` temp files
+- README updated with Greptile setup instructions, diff-aware QA examples, and revised demo transcript
+
+### Removed
+- `CONDUCTOR_PORT` magic offset (`browse_port = CONDUCTOR_PORT - 45600`)
+- Port scan range 9400-9409
+- Legacy fallback to `~/.claude/skills/gstack/browse/src/server.ts`
+- `DEVELOPING_GSTACK.md` (renamed to CONTRIBUTING.md)
+
## 0.3.1 — 2026-03-12
### Phase 3.5: Browser cookie import
M CLAUDE.md => CLAUDE.md +7 -0
@@ 27,6 27,13 @@ gstack/
└── package.json # Build scripts for browse
```
+## Browser interaction
+
+When you need to interact with a browser (QA, dogfooding, cookie setup), use the
+`/browse` skill or run the browse binary directly via `$B <command>`. NEVER use
+`mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this
+project uses.
+
## Deploying to the active skill
The active skill lives at `~/.claude/skills/gstack/`. After making changes:
A CONTRIBUTING.md => CONTRIBUTING.md +153 -0
@@ 0,0 1,153 @@
+# Contributing to gstack
+
+Thanks for wanting to make gstack better. Whether you're fixing a typo in a skill prompt or building an entirely new workflow, this guide will get you up and running fast.
+
+## Quick start
+
+gstack skills are Markdown files that Claude Code discovers from a `skills/` directory. Normally they live at `~/.claude/skills/gstack/` (your global install). But when you're developing gstack itself, you want Claude Code to use the skills *in your working tree* — so edits take effect instantly without copying or deploying anything.
+
+That's what dev mode does. It symlinks your repo into the local `.claude/skills/` directory so Claude Code reads skills straight from your checkout.
+
+```bash
+git clone <repo> && cd gstack
+bun install # install dependencies
+bin/dev-setup # activate dev mode
+```
+
+Now edit any `SKILL.md`, invoke it in Claude Code (e.g. `/review`), and see your changes live. When you're done developing:
+
+```bash
+bin/dev-teardown # deactivate — back to your global install
+```
+
+## How dev mode works
+
+`bin/dev-setup` creates a `.claude/skills/` directory inside the repo (gitignored) and fills it with symlinks pointing back to your working tree. Claude Code sees the project-local `.claude/skills/` first, so your edits win over the global install.
+
+```
+gstack/ <- your working tree
+├── .claude/skills/ <- created by dev-setup (gitignored)
+│ ├── gstack -> ../../ <- symlink back to repo root
+│ ├── review -> gstack/review
+│ ├── ship -> gstack/ship
+│ └── ... <- one symlink per skill
+├── review/
+│ └── SKILL.md <- edit this, test with /review
+├── ship/
+│ └── SKILL.md
+├── browse/
+│ ├── src/ <- TypeScript source
+│ └── dist/ <- compiled binary (gitignored)
+└── ...
+```
+
+## Day-to-day workflow
+
+```bash
+# 1. Enter dev mode
+bin/dev-setup
+
+# 2. Edit a skill
+vim review/SKILL.md
+
+# 3. Test it in Claude Code — changes are live
+# > /review
+
+# 4. Editing browse source? Rebuild the binary
+bun run build
+
+# 5. Done for the day? Tear down
+bin/dev-teardown
+```
+
+## Running tests
+
+```bash
+bun test # all tests (browse integration + snapshot)
+bun run dev <cmd> # run CLI in dev mode, e.g. bun run dev goto https://example.com
+bun run build # compile binary to browse/dist/browse
+```
+
+Tests run against the browse binary directly — they don't require dev mode.
+
+## Things to know
+
+- **SKILL.md changes are instant.** They're just Markdown. Edit, save, invoke.
+- **Browse source changes need a rebuild.** If you touch `browse/src/*.ts`, run `bun run build`.
+- **Dev mode shadows your global install.** Project-local skills take priority over `~/.claude/skills/gstack`. `bin/dev-teardown` restores the global one.
+- **Conductor workspaces are independent.** Each workspace is its own clone. Run `bin/dev-setup` in the one you're working in.
+- **`.claude/skills/` is gitignored.** The symlinks never get committed.
+
+## Testing a branch in another repo
+
+When you're developing gstack in one workspace and want to test your branch in a
+different project (e.g. testing browse changes against your real app), there are
+two cases depending on how gstack is installed in that project.
+
+### Global install only (no `.claude/skills/gstack/` in the project)
+
+Point your global install at the branch:
+
+```bash
+cd ~/.claude/skills/gstack
+git fetch origin
+git checkout origin/<branch> # e.g. origin/v0.3.2
+bun install # in case deps changed
+bun run build # rebuild the binary
+```
+
+Now open Claude Code in the other project — it picks up skills from
+`~/.claude/skills/` automatically. To go back to main when you're done:
+
+```bash
+cd ~/.claude/skills/gstack
+git checkout main && git pull
+bun run build
+```
+
+### Vendored project copy (`.claude/skills/gstack/` checked into the project)
+
+Some projects vendor gstack by copying it into the repo (no `.git` inside the
+copy). Project-local skills take priority over global, so you need to update
+the vendored copy too. This is a three-step process:
+
+1. **Update your global install to the branch** (so you have the source):
+ ```bash
+ cd ~/.claude/skills/gstack
+ git fetch origin
+ git checkout origin/<branch> # e.g. origin/v0.3.2
+ bun install && bun run build
+ ```
+
+2. **Replace the vendored copy** in the other project:
+ ```bash
+ cd /path/to/other-project
+
+ # Remove old skill symlinks and vendored copy
+ for s in browse plan-ceo-review plan-eng-review review ship retro qa setup-browser-cookies; do
+ rm -f .claude/skills/$s
+ done
+ rm -rf .claude/skills/gstack
+
+ # Copy from global install (strips .git so it stays vendored)
+ cp -Rf ~/.claude/skills/gstack .claude/skills/gstack
+ rm -rf .claude/skills/gstack/.git
+
+ # Rebuild binary and re-create skill symlinks
+ cd .claude/skills/gstack && ./setup
+ ```
+
+3. **Test your changes** — open Claude Code in that project and use the skills.
+
+To revert to main later, repeat steps 1-2 with `git checkout main && git pull`
+instead of `git checkout origin/<branch>`.
+
+## Shipping your changes
+
+When you're happy with your skill edits:
+
+```bash
+/ship
+```
+
+This runs tests, reviews the diff, bumps the version, and opens a PR. See `ship/SKILL.md` for the full workflow.
M README.md => README.md +99 -13
@@ 19,10 19,10 @@ Eight opinionated workflow skills for [Claude Code](https://docs.anthropic.com/e
|-------|------|--------------|
| `/plan-ceo-review` | Founder / CEO | Rethink the problem. Find the 10-star product hiding inside the request. |
| `/plan-eng-review` | Eng manager / tech lead | Lock in architecture, data flow, diagrams, edge cases, and tests. |
-| `/review` | Paranoid staff engineer | Find the bugs that pass CI but blow up in production. Not a style nitpick pass. |
-| `/ship` | Release engineer | Sync main, run tests, push, open PR. For a ready branch, not for deciding what to build. |
+| `/review` | Paranoid staff engineer | Find the bugs that pass CI but blow up in production. Triages Greptile review comments. |
+| `/ship` | Release engineer | Sync main, run tests, resolve Greptile reviews, push, open PR. For a ready branch, not for deciding what to build. |
| `/browse` | QA engineer | Give the agent eyes. It logs in, clicks through your app, takes screenshots, catches breakage. Full QA pass in 60 seconds. |
-| `/qa` | QA lead | Systematic QA testing with structured reports, health scores, screenshots, and regression tracking. Three modes: full, quick, regression. |
+| `/qa` | QA lead | Systematic QA testing. On a feature branch, auto-analyzes your diff, identifies affected pages, and tests them. Also: full exploration, quick smoke test, regression mode. |
| `/setup-browser-cookies` | Session manager | Import cookies from your real browser (Comet, Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages without logging in manually. |
| `/retro` | Engineering manager | Team-aware retro: your deep-dive + per-person praise and growth opportunities for every contributor. |
@@ 63,6 63,12 @@ You: /ship
Claude: [Syncs main, runs tests, pushes branch, opens PR — 6 tool calls, done]
+You: /qa
+
+Claude: Analyzing branch diff... 8 files changed, 3 routes affected.
+ [Tests /listings/new, /listings/:id, /api/listings against localhost:3000]
+ All 3 routes working. Upload + enrichment flow passes end to end.
+
You: /setup-browser-cookies staging.myapp.com
Claude: Imported 8 cookies for staging.myapp.com from Chrome.
@@ 71,12 77,6 @@ You: /qa https://staging.myapp.com --quick
Claude: [Smoke test: homepage + 5 pages, 30 seconds]
Health Score: 91/100. No critical issues. 1 medium: mobile nav overlap.
-
-You: /browse staging.myapp.com/listings/new — test the upload flow specifically
-
-Claude: [22 tool calls — navigates routes, fills the upload form, verifies
- enrichment renders, checks console for errors, screenshots each step]
- Listing flow works end to end on staging.
```
## Who this is for
@@ 91,7 91,7 @@ gstack is powerful with one Claude Code session. It is transformative with ten.
[Conductor](https://conductor.build) runs multiple Claude Code sessions in parallel — each in its own isolated workspace. That means you can have one session running `/qa` on staging, another doing `/review` on a PR, a third implementing a feature, and seven more working on other branches. All at the same time.
-gstack is Conductor-aware out of the box. Each workspace gets its own isolated browser instance (separate Chromium process, cookies, tabs, and logs) so `/browse` and `/qa` sessions never collide. No configuration needed — it just works.
+Each workspace gets its own isolated browser instance automatically — separate Chromium process, cookies, tabs, and logs stored in `.gstack/` inside each project root. No port collisions, no shared state, no configuration needed. `/browse` and `/qa` sessions never interfere with each other, even across ten parallel workspaces.
This is the setup I use. One person, ten parallel agents, each with the right cognitive mode for its task. That is not incremental improvement. That is a different way of building software.
@@ 338,6 338,72 @@ I want the plane landed.
---
+## Greptile integration
+
+[Greptile](https://greptile.com) is a YC company that reviews your PRs automatically. It catches real bugs — race conditions, security issues, things that pass CI and blow up in production. It has genuinely saved my ass more than once. I love these guys.
+
+**Setup:** Install Greptile on your GitHub repo at [greptile.com](https://greptile.com) — it takes about 30 seconds. Once it's reviewing your PRs, gstack picks up its comments automatically. No additional configuration.
+
+The problem with any automated reviewer is triage. Greptile is good, but not every comment is a real issue. Some are false positives. Some flag things you already fixed three commits ago. Without a triage layer, the comments pile up and you start ignoring them — which defeats the purpose.
+
+gstack solves this. `/review` and `/ship` are now Greptile-aware. They read Greptile's comments, classify each one, and take action:
+
+- **Valid issues** get added to the critical findings and fixed before shipping
+- **Already-fixed issues** get an auto-reply acknowledging the catch
+- **False positives** get pushed back — you confirm, and a reply goes out explaining why it's wrong
+
+The result is a two-layer review: Greptile catches things asynchronously on the PR, then `/review` and `/ship` triage those findings as part of the normal workflow. Nothing falls through the cracks.
+
+It also learns. Every false positive you confirm gets saved to `~/.gstack/greptile-history.md`. Future runs auto-skip known FP patterns for your codebase. And `/retro` tracks Greptile's batting average over time — so you can see whether the signal-to-noise ratio is improving.
+
+### Example
+
+```
+You: /ship
+
+Claude: [syncs main, runs tests, pre-landing review...]
+
+ Greptile found 3 comments on this PR:
+
+ [VALID] app/services/payment_service.rb:47 — Race condition:
+ concurrent charges can double-debit. Recommend DB-level advisory
+ lock around the charge block.
+ → https://github.com/you/app/pull/42#discussion_r123
+
+ [ALREADY FIXED] app/models/user.rb:12 — Missing null check
+ on email before downcase.
+ → Already fixed in commit abc1234. Auto-replying.
+
+ [FALSE POSITIVE] lib/auth.rb:88 — "Token comparison should
+ use constant-time comparison." We already use
+ ActiveSupport::SecurityUtils.secure_compare here.
+ → https://github.com/you/app/pull/42#discussion_r125
+
+ Fixing the race condition in payment_service.rb...
+ Applied advisory lock. Committed: fix: address Greptile
+ review — advisory lock on concurrent charges
+
+ The auth token comment is a false positive — we're already
+ using secure_compare.
+
+ Options for the false positive:
+ A) Reply to Greptile explaining (recommended)
+ B) Fix it anyway
+ C) Ignore
+
+You: A
+
+Claude: Replied to Greptile. Re-running tests after the fix...
+ All tests pass. Continuing to version bump.
+
+ [creates PR with Greptile Review section in body]
+ PR: https://github.com/you/app/pull/42
+```
+
+Three Greptile comments. One real fix. One auto-acknowledged. One false positive pushed back with a reply. Total extra time: about 30 seconds. And the PR body has the full audit trail.
+
+---
+
## `/browse`
This is my **QA engineer mode**.
@@ 405,11 471,31 @@ This is my **QA lead mode**.
`/browse` gives the agent eyes. `/qa` gives it a testing methodology.
-Where `/browse` is a single command — go here, click this, screenshot that — `/qa` is a full systematic test pass. It explores every reachable page, fills forms, clicks buttons, checks console errors, tests responsive layouts, and produces a structured report with a health score, screenshots as evidence, and ranked issues with repro steps.
+The most common use case: you're on a feature branch, you just finished coding, and you want to verify everything works. Just say `/qa` — it reads your git diff, identifies which pages and routes your changes affect, spins up the browser, and tests each one. No URL required. No manual test plan. It figures out what to test from the code you changed.
+
+```
+You: /qa
+
+Claude: Analyzing branch diff against main...
+ 12 files changed: 3 controllers, 2 views, 4 services, 3 tests
+
+ Affected routes: /listings/new, /listings/:id, /api/listings
+ Detected app running on localhost:3000.
+
+ [Tests each affected page — navigates, fills forms, clicks buttons,
+ screenshots, checks console errors]
+
+ QA Report: 3 routes tested, all working.
+ - /listings/new: upload + enrichment flow works end to end
+ - /listings/:id: detail page renders correctly
+ - /api/listings: returns 200 with expected shape
+ No console errors. No regressions on adjacent pages.
+```
-Three modes:
+Four modes:
-- **Full** (default) — systematic exploration of the entire app. 5-15 minutes depending on app size. Documents 5-10 well-evidenced issues.
+- **Diff-aware** (automatic on feature branches) — reads `git diff main`, identifies affected pages, tests them specifically. The fastest path from "I just wrote code" to "it works."
+- **Full** — systematic exploration of the entire app. 5-15 minutes depending on app size. Documents 5-10 well-evidenced issues.
- **Quick** (`--quick`) — 30-second smoke test. Homepage + top 5 nav targets. Loads? Console errors? Broken links?
- **Regression** (`--regression baseline.json`) — run full mode, then diff against a previous baseline. Which issues are fixed? Which are new? What's the score delta?
M SKILL.md => SKILL.md +11 -1
@@ 21,9 21,12 @@ Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs,
## SETUP (run this check BEFORE any browse command)
```bash
-B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
+BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
+B=$(echo "$BROWSE_OUTPUT" | head -1)
+META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true)
if [ -n "$B" ]; then
echo "READY: $B"
+ [ -n "$META" ] && echo "$META"
else
echo "NEEDS_SETUP"
fi
@@ 34,6 37,13 @@ If `NEEDS_SETUP`:
2. Run: `cd <SKILL_DIR> && ./setup`
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
+If you see `META:UPDATE_AVAILABLE`:
+1. Parse the JSON payload to get `current`, `latest`, and `command`.
+2. Tell the user: "A gstack update is available (current: X, latest: Y). OK to update?"
+3. **STOP and wait for approval.**
+4. Run the command from the META payload.
+5. Re-run the setup check above to get the updated binary path.
+
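The META handling in the steps above could be parsed like this. The payload field names (`current`, `latest`, `command`) come from the steps; the exact wire format (JSON on the same line after the signal name) is an assumption.

```typescript
// Hypothetical parser for a META:UPDATE_AVAILABLE signal line.
// Assumes the JSON payload follows the signal name on the same line.
interface UpdateMeta {
  current: string;
  latest: string;
  command: string;
}

function parseUpdateMeta(line: string): UpdateMeta | null {
  const prefix = "META:UPDATE_AVAILABLE ";
  if (!line.startsWith(prefix)) return null;
  try {
    return JSON.parse(line.slice(prefix.length)) as UpdateMeta;
  } catch {
    return null; // malformed payload: treat as no update signal
  }
}
```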
## IMPORTANT
- Use the compiled binary via Bash: `$B <command>`
M TODO.md => TODO.md +2 -0
@@ 77,6 77,7 @@
- Pass/fail with evidence
## Phase 5: State & Sessions
+ - [ ] Bundle server.ts into compiled binary (eliminate resolveServerScript() fallback chain entirely) (P2, M)
- [ ] v20 encryption format support (AES-256-GCM) — future Chromium versions may change from v10
- [ ] Sessions (isolated browser instances with separate cookies/storage/history)
- [ ] State persistence (save/load cookies + localStorage to JSON files)
@@ 103,6 104,7 @@
- [ ] Trend tracking across QA runs — compare baseline.json over time, detect regressions (P2, S)
- [ ] CI/CD integration — `/qa` as GitHub Action step, fail PR if health score drops (P2, M)
- [ ] Accessibility audit mode — `--a11y` flag for focused accessibility testing (P3, S)
+ - [ ] Greptile training feedback loop — export suppression patterns to Greptile team for model improvement (P3, S)
## Ideas & Notes
- Browser is the nervous system — every skill should be able to see, interact with, and verify the web
M VERSION => VERSION +1 -1
@@ 1,1 1,1 @@
-0.3.1
+0.3.2
A bin/dev-setup => bin/dev-setup +36 -0
@@ 0,0 1,36 @@
+#!/usr/bin/env bash
+# Set up gstack for local development — test skills from within this repo.
+#
+# Creates .claude/skills/gstack → (symlink to repo root) so Claude Code
+# discovers skills from your working tree. Changes take effect immediately.
+#
+# Usage: bin/dev-setup # set up
+# bin/dev-teardown # clean up
+set -e
+
+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+
+# 1. Create .claude/skills/ inside the repo
+mkdir -p "$REPO_ROOT/.claude/skills"
+
+# 2. Symlink .claude/skills/gstack → repo root
+# This makes setup think it's inside a real .claude/skills/ directory
+GSTACK_LINK="$REPO_ROOT/.claude/skills/gstack"
+if [ -L "$GSTACK_LINK" ]; then
+ echo "Updating existing symlink..."
+ rm "$GSTACK_LINK"
+elif [ -d "$GSTACK_LINK" ]; then
+ echo "Error: .claude/skills/gstack is a real directory, not a symlink." >&2
+ echo "Remove it manually if you want to use dev mode." >&2
+ exit 1
+fi
+ln -s "$REPO_ROOT" "$GSTACK_LINK"
+
+# 3. Run setup via the symlink so it detects .claude/skills/ as its parent
+"$GSTACK_LINK/setup"
+
+echo ""
+echo "Dev mode active. Skills resolve from this working tree."
+echo "Edit any SKILL.md and test immediately — no copy/deploy needed."
+echo ""
+echo "To tear down: bin/dev-teardown"
A bin/dev-teardown => bin/dev-teardown +39 -0
@@ 0,0 1,39 @@
+#!/usr/bin/env bash
+# Remove local dev skill symlinks. Restores global gstack as the active install.
+set -e
+
+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+SKILLS_DIR="$REPO_ROOT/.claude/skills"
+
+if [ ! -d "$SKILLS_DIR" ]; then
+ echo "Nothing to tear down — .claude/skills/ doesn't exist."
+ exit 0
+fi
+
+# Remove individual skill symlinks
+removed=()
+for link in "$SKILLS_DIR"/*/; do
+ name="$(basename "$link")"
+ [ "$name" = "gstack" ] && continue
+ if [ -L "${link%/}" ]; then
+ rm "${link%/}"
+ removed+=("$name")
+ fi
+done
+
+# Remove the gstack symlink
+if [ -L "$SKILLS_DIR/gstack" ]; then
+ rm "$SKILLS_DIR/gstack"
+ removed+=("gstack")
+fi
+
+# Clean up empty dirs
+rmdir "$SKILLS_DIR" 2>/dev/null || true
+rmdir "$REPO_ROOT/.claude" 2>/dev/null || true
+
+if [ ${#removed[@]} -gt 0 ]; then
+ echo "Removed: ${removed[*]}"
+else
+ echo "No symlinks found."
+fi
+echo "Dev mode deactivated. Global gstack (~/.claude/skills/gstack) is now active."
M browse/bin/find-browse => browse/bin/find-browse +7 -1
@@ 1,5 1,11 @@
#!/bin/bash
-# Find the gstack browse binary. Echoes path and exits 0, or exits 1 if not found.
+# Shim: delegates to compiled find-browse binary, falls back to basic discovery.
+# The compiled binary adds version checking and META signal support.
+DIR="$(cd "$(dirname "$0")/.." && pwd)/dist"
+if test -x "$DIR/find-browse"; then
+ exec "$DIR/find-browse" "$@"
+fi
+# Fallback: basic discovery (no version check)
ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
if [ -n "$ROOT" ] && test -x "$ROOT/.claude/skills/gstack/browse/dist/browse"; then
echo "$ROOT/.claude/skills/gstack/browse/dist/browse"
M browse/src/browser-manager.ts => browse/src/browser-manager.ts +1 -1
@@ 47,7 47,7 @@ export class BrowserManager {
// Chromium crash → exit with clear message
this.browser.on('disconnected', () => {
console.error('[browse] FATAL: Chromium process crashed or was killed. Server exiting.');
- console.error('[browse] Console/network logs flushed to /tmp/browse-*.log');
+ console.error('[browse] Console/network logs flushed to .gstack/browse-*.log');
process.exit(1);
});
M browse/src/cli.ts => browse/src/cli.ts +80 -13
@@ 2,22 2,18 @@
* gstack CLI — thin wrapper that talks to the persistent server
*
* Flow:
- * 1. Read /tmp/browse-server.json for port + token
+ * 1. Read .gstack/browse.json for port + token
* 2. If missing or stale PID → start server in background
- * 3. Health check
+ * 3. Health check + version mismatch detection
* 4. Send command via HTTP POST
* 5. Print response to stdout (or stderr for errors)
*/
import * as fs from 'fs';
import * as path from 'path';
+import { resolveConfig, ensureStateDir, readVersionHash } from './config';
-const PORT_OFFSET = 45600;
-const BROWSE_PORT = process.env.CONDUCTOR_PORT
- ? parseInt(process.env.CONDUCTOR_PORT, 10) - PORT_OFFSET
- : parseInt(process.env.BROWSE_PORT || '0', 10);
-const INSTANCE_SUFFIX = BROWSE_PORT ? `-${BROWSE_PORT}` : '';
-const STATE_FILE = process.env.BROWSE_STATE_FILE || `/tmp/browse-server${INSTANCE_SUFFIX}.json`;
+const config = resolveConfig();
const MAX_START_WAIT = 8000; // 8 seconds to start
export function resolveServerScript(
@@ 45,8 41,9 @@ export function resolveServerScript(
}
}
- // Legacy fallback for user-level installs
- return path.resolve(env.HOME || '/tmp', '.claude/skills/gstack/browse/src/server.ts');
+ throw new Error(
+ 'Cannot find server.ts. Set BROWSE_SERVER_SCRIPT env or run from the browse source tree.'
+ );
}
const SERVER_SCRIPT = resolveServerScript();
@@ 57,12 54,13 @@ interface ServerState {
token: string;
startedAt: string;
serverPath: string;
+ binaryVersion?: string;
}
// ─── State File ────────────────────────────────────────────────
function readState(): ServerState | null {
try {
- const data = fs.readFileSync(STATE_FILE, 'utf-8');
+ const data = fs.readFileSync(config.stateFile, 'utf-8');
return JSON.parse(data);
} catch {
return null;
@@ 78,15 76,73 @@ function isProcessAlive(pid: number): boolean {
}
}
+// ─── Process Management ─────────────────────────────────────────
+async function killServer(pid: number): Promise<void> {
+ if (!isProcessAlive(pid)) return;
+
+ try { process.kill(pid, 'SIGTERM'); } catch { return; }
+
+ // Wait up to 2s for graceful shutdown
+ const deadline = Date.now() + 2000;
+ while (Date.now() < deadline && isProcessAlive(pid)) {
+ await Bun.sleep(100);
+ }
+
+ // Force kill if still alive
+ if (isProcessAlive(pid)) {
+ try { process.kill(pid, 'SIGKILL'); } catch {}
+ }
+}
+
+/**
+ * Clean up legacy /tmp/browse-server*.json files left over from before the
+ * move to project-local state. Verifies PID ownership before sending signals.
+ */
+function cleanupLegacyState(): void {
+ try {
+ const files = fs.readdirSync('/tmp').filter(f => f.startsWith('browse-server') && f.endsWith('.json'));
+ for (const file of files) {
+ const fullPath = `/tmp/${file}`;
+ try {
+ const data = JSON.parse(fs.readFileSync(fullPath, 'utf-8'));
+ if (data.pid && isProcessAlive(data.pid)) {
+ // Verify this is actually a browse server before killing
+ const check = Bun.spawnSync(['ps', '-p', String(data.pid), '-o', 'command='], {
+ stdout: 'pipe', stderr: 'pipe', timeout: 2000,
+ });
+ const cmd = check.stdout.toString().trim();
+        if (cmd.includes('bun') && cmd.includes('server.ts')) {
+ try { process.kill(data.pid, 'SIGTERM'); } catch {}
+ }
+ }
+ fs.unlinkSync(fullPath);
+ } catch {
+ // Best effort — skip files we can't parse or clean up
+ }
+ }
+ // Clean up legacy log files too
+ const logFiles = fs.readdirSync('/tmp').filter(f =>
+ f.startsWith('browse-console') || f.startsWith('browse-network') || f.startsWith('browse-dialog')
+ );
+ for (const file of logFiles) {
+ try { fs.unlinkSync(`/tmp/${file}`); } catch {}
+ }
+ } catch {
+ // /tmp read failed — skip legacy cleanup
+ }
+}
+
// ─── Server Lifecycle ──────────────────────────────────────────
async function startServer(): Promise<ServerState> {
+ ensureStateDir(config);
+
// Clean up stale state file
- try { fs.unlinkSync(STATE_FILE); } catch {}
+ try { fs.unlinkSync(config.stateFile); } catch {}
// Start server as detached background process
const proc = Bun.spawn(['bun', 'run', SERVER_SCRIPT], {
stdio: ['ignore', 'pipe', 'pipe'],
- env: { ...process.env },
+ env: { ...process.env, BROWSE_STATE_FILE: config.stateFile },
});
// Don't hold the CLI open
@@ 120,6 176,14 @@ async function ensureServer(): Promise<ServerState> {
const state = readState();
if (state && isProcessAlive(state.pid)) {
+ // Check for binary version mismatch (auto-restart on update)
+ const currentVersion = readVersionHash();
+ if (currentVersion && state.binaryVersion && currentVersion !== state.binaryVersion) {
+ console.error('[browse] Binary updated, restarting server...');
+ await killServer(state.pid);
+ return startServer();
+ }
+
// Server appears alive — do a health check
try {
const resp = await fetch(`http://127.0.0.1:${state.port}/health`, {
@@ 237,6 301,9 @@ Refs: After 'snapshot', use @e1, @e2... as selectors:
process.exit(0);
}
+ // One-time cleanup of legacy /tmp state files
+ cleanupLegacyState();
+
const command = args[0];
const commandArgs = args.slice(1);
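The restart decision in `ensureServer` reduces to a small pure predicate; a sketch (hypothetical function name, mirroring the truthiness checks in the hunk above) of when a binary update forces a server restart:

```typescript
// Restart only when both versions are known and differ; a missing .version
// file on either side (null/undefined) never forces a restart.
function shouldRestart(current: string | null, stateVersion: string | undefined): boolean {
  return current !== null && stateVersion !== undefined && current !== stateVersion;
}

console.log(shouldRestart('def456', 'abc123')); // → true
console.log(shouldRestart('abc123', 'abc123')); // → false
console.log(shouldRestart(null, 'abc123'));     // → false
```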
A browse/src/config.ts => browse/src/config.ts +105 -0
@@ 0,0 1,105 @@
+/**
+ * Shared config for browse CLI + server.
+ *
+ * Resolution:
+ * 1. BROWSE_STATE_FILE env → derive stateDir from parent
+ * 2. git rev-parse --show-toplevel → projectDir/.gstack/
+ * 3. process.cwd() fallback (non-git environments)
+ *
+ * The CLI computes the config and passes BROWSE_STATE_FILE to the
+ * spawned server. The server derives all paths from that env var.
+ */
+
+import * as fs from 'fs';
+import * as path from 'path';
+
+export interface BrowseConfig {
+ projectDir: string;
+ stateDir: string;
+ stateFile: string;
+ consoleLog: string;
+ networkLog: string;
+ dialogLog: string;
+}
+
+/**
+ * Detect the git repository root, or null if not in a repo / git unavailable.
+ */
+export function getGitRoot(): string | null {
+ try {
+ const proc = Bun.spawnSync(['git', 'rev-parse', '--show-toplevel'], {
+ stdout: 'pipe',
+ stderr: 'pipe',
+ timeout: 2_000, // Don't hang if .git is broken
+ });
+ if (proc.exitCode !== 0) return null;
+ return proc.stdout.toString().trim() || null;
+ } catch {
+ return null;
+ }
+}
+
+/**
+ * Resolve all browse config paths.
+ *
+ * If BROWSE_STATE_FILE is set (e.g. by CLI when spawning server, or by
+ * tests for isolation), all paths are derived from it. Otherwise, the
+ * project root is detected via git or cwd.
+ */
+export function resolveConfig(
+ env: Record<string, string | undefined> = process.env,
+): BrowseConfig {
+ let stateFile: string;
+ let stateDir: string;
+ let projectDir: string;
+
+ if (env.BROWSE_STATE_FILE) {
+ stateFile = env.BROWSE_STATE_FILE;
+ stateDir = path.dirname(stateFile);
+ projectDir = path.dirname(stateDir); // parent of .gstack/
+ } else {
+ projectDir = getGitRoot() || process.cwd();
+ stateDir = path.join(projectDir, '.gstack');
+ stateFile = path.join(stateDir, 'browse.json');
+ }
+
+ return {
+ projectDir,
+ stateDir,
+ stateFile,
+ consoleLog: path.join(stateDir, 'browse-console.log'),
+ networkLog: path.join(stateDir, 'browse-network.log'),
+ dialogLog: path.join(stateDir, 'browse-dialog.log'),
+ };
+}
+
+/**
+ * Create the .gstack/ state directory if it doesn't exist.
+ * Throws with a clear message on permission errors.
+ */
+export function ensureStateDir(config: BrowseConfig): void {
+ try {
+ fs.mkdirSync(config.stateDir, { recursive: true });
+ } catch (err: any) {
+ if (err.code === 'EACCES') {
+ throw new Error(`Cannot create state directory ${config.stateDir}: permission denied`);
+ }
+ if (err.code === 'ENOTDIR') {
+ throw new Error(`Cannot create state directory ${config.stateDir}: a file exists at that path`);
+ }
+ throw err;
+ }
+}
+
+/**
+ * Read the binary version (git SHA) from the .version file adjacent to
+ * execPath (browse/dist/.version for the compiled binary).
+ * Returns null if the file doesn't exist or can't be read.
+ */
+export function readVersionHash(execPath: string = process.execPath): string | null {
+ try {
+ const versionFile = path.resolve(path.dirname(execPath), '.version');
+ return fs.readFileSync(versionFile, 'utf-8').trim() || null;
+ } catch {
+ return null;
+ }
+}
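The env-override branch of `resolveConfig` is just two `dirname` hops from the state file; a standalone sketch (hypothetical paths, duplicating the derivation rather than importing the module):

```typescript
import * as path from 'path';

// Mirror of resolveConfig's BROWSE_STATE_FILE branch: every path is derived
// from the state file, so CLI and spawned server agree by construction.
function deriveFromStateFile(stateFile: string) {
  const stateDir = path.dirname(stateFile);   // .../.gstack
  const projectDir = path.dirname(stateDir);  // parent of .gstack/
  return {
    projectDir,
    stateDir,
    stateFile,
    consoleLog: path.join(stateDir, 'browse-console.log'),
  };
}

const c = deriveFromStateFile('/work/app/.gstack/browse.json');
console.log(c.projectDir); // → /work/app
console.log(c.consoleLog); // → /work/app/.gstack/browse-console.log
```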
M browse/src/cookie-picker-routes.ts => browse/src/cookie-picker-routes.ts +28 -21
@@ 26,18 26,25 @@ const importedCounts = new Map<string, number>();
// ─── JSON Helpers ───────────────────────────────────────────────
-function jsonResponse(data: any, status = 200): Response {
+function corsOrigin(port: number): string {
+ return `http://127.0.0.1:${port}`;
+}
+
+function jsonResponse(data: any, opts: { port: number; status?: number }): Response {
return new Response(JSON.stringify(data), {
- status,
+ status: opts.status ?? 200,
headers: {
'Content-Type': 'application/json',
- 'Access-Control-Allow-Origin': `http://127.0.0.1:${parseInt(url.port, 10) || 9400}`,
+ 'Access-Control-Allow-Origin': corsOrigin(opts.port),
},
});
}
-function errorResponse(message: string, code: string, status = 400, action?: string): Response {
- return jsonResponse({ error: message, code, ...(action ? { action } : {}) }, status);
+function errorResponse(message: string, code: string, opts: { port: number; status?: number; action?: string }): Response {
+ return jsonResponse(
+ { error: message, code, ...(opts.action ? { action: opts.action } : {}) },
+ { port: opts.port, status: opts.status ?? 400 },
+ );
}
// ─── Route Handler ──────────────────────────────────────────────
@@ 48,13 55,14 @@ export async function handleCookiePickerRoute(
bm: BrowserManager,
): Promise<Response> {
const pathname = url.pathname;
+ const port = parseInt(url.port, 10) || 9400;
// CORS preflight
if (req.method === 'OPTIONS') {
return new Response(null, {
status: 204,
headers: {
- 'Access-Control-Allow-Origin': `http://127.0.0.1:${parseInt(url.port, 10) || 9400}`,
+ 'Access-Control-Allow-Origin': corsOrigin(port),
'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type',
},
@@ 64,7 72,6 @@ export async function handleCookiePickerRoute(
try {
// GET /cookie-picker — serve the picker UI
if (pathname === '/cookie-picker' && req.method === 'GET') {
- const port = parseInt(url.port, 10) || 9400;
const html = getCookiePickerHTML(port);
return new Response(html, {
status: 200,
@@ 80,20 87,20 @@ export async function handleCookiePickerRoute(
name: b.name,
aliases: b.aliases,
})),
- });
+ }, { port });
}
// GET /cookie-picker/domains?browser=<name> — list domains + counts
if (pathname === '/cookie-picker/domains' && req.method === 'GET') {
const browserName = url.searchParams.get('browser');
if (!browserName) {
- return errorResponse("Missing 'browser' parameter", 'missing_param');
+ return errorResponse("Missing 'browser' parameter", 'missing_param', { port });
}
const result = listDomains(browserName);
return jsonResponse({
browser: result.browser,
domains: result.domains,
- });
+ }, { port });
}
// POST /cookie-picker/import — decrypt + import to Playwright session
@@ 102,13 109,13 @@ export async function handleCookiePickerRoute(
try {
body = await req.json();
} catch {
- return errorResponse('Invalid JSON body', 'bad_request');
+ return errorResponse('Invalid JSON body', 'bad_request', { port });
}
const { browser, domains } = body;
- if (!browser) return errorResponse("Missing 'browser' field", 'missing_param');
+ if (!browser) return errorResponse("Missing 'browser' field", 'missing_param', { port });
if (!domains || !Array.isArray(domains) || domains.length === 0) {
- return errorResponse("Missing or empty 'domains' array", 'missing_param');
+ return errorResponse("Missing or empty 'domains' array", 'missing_param', { port });
}
// Decrypt cookies from the browser DB
@@ 122,7 129,7 @@ export async function handleCookiePickerRoute(
message: result.failed > 0
? `All ${result.failed} cookies failed to decrypt`
: 'No cookies found for the specified domains',
- });
+ }, { port });
}
// Add to Playwright context
@@ 141,7 148,7 @@ export async function handleCookiePickerRoute(
imported: result.count,
failed: result.failed,
domainCounts: result.domainCounts,
- });
+ }, { port });
}
// POST /cookie-picker/remove — clear cookies for domains
@@ 150,12 157,12 @@ export async function handleCookiePickerRoute(
try {
body = await req.json();
} catch {
- return errorResponse('Invalid JSON body', 'bad_request');
+ return errorResponse('Invalid JSON body', 'bad_request', { port });
}
const { domains } = body;
if (!domains || !Array.isArray(domains) || domains.length === 0) {
- return errorResponse("Missing or empty 'domains' array", 'missing_param');
+ return errorResponse("Missing or empty 'domains' array", 'missing_param', { port });
}
const page = bm.getPage();
@@ 171,7 178,7 @@ export async function handleCookiePickerRoute(
return jsonResponse({
removed: domains.length,
domains,
- });
+ }, { port });
}
// GET /cookie-picker/imported — currently imported domains + counts
@@ 186,15 193,15 @@ export async function handleCookiePickerRoute(
domains: entries,
totalDomains: entries.length,
totalCookies: entries.reduce((sum, e) => sum + e.count, 0),
- });
+ }, { port });
}
return new Response('Not found', { status: 404 });
} catch (err: any) {
if (err instanceof CookieImportError) {
- return errorResponse(err.message, err.code, 400, err.action);
+ return errorResponse(err.message, err.code, { port, status: 400, action: err.action });
}
console.error(`[cookie-picker] Error: ${err.message}`);
- return errorResponse(err.message || 'Internal error', 'internal_error', 500);
+ return errorResponse(err.message || 'Internal error', 'internal_error', { port, status: 500 });
}
}
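The refactor above threads the request's port through every response so the CORS origin is derived rather than hard-coded; a minimal standalone sketch of the helper shape (plain objects instead of the Fetch `Response` class, for illustration only):

```typescript
// Port-threaded JSON helper: the CORS origin tracks whatever port the
// server actually bound, which matters now that ports are randomized.
function jsonResponse(
  data: unknown,
  opts: { port: number; status?: number },
): { status: number; headers: Record<string, string>; body: string } {
  return {
    status: opts.status ?? 200,
    headers: {
      'Content-Type': 'application/json',
      'Access-Control-Allow-Origin': `http://127.0.0.1:${opts.port}`,
    },
    body: JSON.stringify(data),
  };
}

const res = jsonResponse({ ok: true }, { port: 12345 });
console.log(res.headers['Access-Control-Allow-Origin']); // → http://127.0.0.1:12345
```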
A browse/src/find-browse.ts => browse/src/find-browse.ts +181 -0
@@ 0,0 1,181 @@
+/**
+ * find-browse — locate the gstack browse binary + check for updates.
+ *
+ * Compiled to browse/dist/find-browse (standalone binary, no bun runtime needed).
+ *
+ * Output protocol:
+ * Line 1: /path/to/binary (always present)
+ * Line 2+: META:<TYPE> <json-payload> (optional, 0 or more)
+ *
+ * META types:
+ * META:UPDATE_AVAILABLE — local binary is behind origin/main
+ *
+ * All version checks are best-effort: network failures, missing files, and
+ * cache errors degrade gracefully to outputting only the binary path.
+ */
+
+import { existsSync, readFileSync, writeFileSync } from 'fs';
+import { join, dirname } from 'path';
+import { homedir } from 'os';
+
+const REPO_URL = 'https://github.com/garrytan/gstack.git';
+const CACHE_PATH = '/tmp/gstack-latest-version';
+const CACHE_TTL = 14400; // 4 hours in seconds
+
+// ─── Binary Discovery ───────────────────────────────────────────
+
+function getGitRoot(): string | null {
+ try {
+ const proc = Bun.spawnSync(['git', 'rev-parse', '--show-toplevel'], {
+ stdout: 'pipe',
+      stderr: 'pipe',
+      timeout: 2_000, // Match config.ts: don't hang if .git is broken
+    });
+ if (proc.exitCode !== 0) return null;
+ return proc.stdout.toString().trim();
+ } catch {
+ return null;
+ }
+}
+
+export function locateBinary(): string | null {
+ const root = getGitRoot();
+ const home = homedir();
+
+ // Workspace-local takes priority (for development)
+ if (root) {
+ const local = join(root, '.claude', 'skills', 'gstack', 'browse', 'dist', 'browse');
+ if (existsSync(local)) return local;
+ }
+
+ // Global fallback
+ const global = join(home, '.claude', 'skills', 'gstack', 'browse', 'dist', 'browse');
+ if (existsSync(global)) return global;
+
+ return null;
+}
+
+// ─── Version Check ──────────────────────────────────────────────
+
+interface CacheEntry {
+ sha: string;
+ timestamp: number;
+}
+
+function readCache(): CacheEntry | null {
+ try {
+ const content = readFileSync(CACHE_PATH, 'utf-8').trim();
+ const parts = content.split(/\s+/);
+ if (parts.length < 2) return null;
+ const sha = parts[0];
+ const timestamp = parseInt(parts[1], 10);
+ if (!sha || isNaN(timestamp)) return null;
+ // Validate SHA is hex
+ if (!/^[0-9a-f]{40}$/i.test(sha)) return null;
+ return { sha, timestamp };
+ } catch {
+ return null;
+ }
+}
+
+function writeCache(sha: string, timestamp: number): void {
+ try {
+ writeFileSync(CACHE_PATH, `${sha} ${timestamp}\n`);
+ } catch {
+ // Cache write failure is non-fatal
+ }
+}
+
+function fetchRemoteSHA(): string | null {
+ try {
+ const proc = Bun.spawnSync(['git', 'ls-remote', REPO_URL, 'refs/heads/main'], {
+ stdout: 'pipe',
+ stderr: 'pipe',
+ timeout: 10_000, // 10s timeout
+ });
+ if (proc.exitCode !== 0) return null;
+ const output = proc.stdout.toString().trim();
+ const sha = output.split(/\s+/)[0];
+ if (!sha || !/^[0-9a-f]{40}$/i.test(sha)) return null;
+ return sha;
+ } catch {
+ return null;
+ }
+}
+
+function resolveSkillDir(binaryPath: string): string | null {
+ const home = homedir();
+ const globalPrefix = join(home, '.claude', 'skills', 'gstack');
+ if (binaryPath.startsWith(globalPrefix)) return globalPrefix;
+
+ // Workspace-local: binary is at $ROOT/.claude/skills/gstack/browse/dist/browse
+ // Skill dir is $ROOT/.claude/skills/gstack
+ const parts = binaryPath.split('/.claude/skills/gstack/');
+ if (parts.length === 2) return parts[0] + '/.claude/skills/gstack';
+
+ return null;
+}
+
+export function checkVersion(binaryDir: string): string | null {
+ // Read local version
+ const versionFile = join(binaryDir, '.version');
+ let localSHA: string;
+ try {
+ localSHA = readFileSync(versionFile, 'utf-8').trim();
+ } catch {
+ return null; // No .version file → skip check
+ }
+ if (!localSHA) return null;
+
+ const now = Math.floor(Date.now() / 1000);
+
+ // Check cache
+ let remoteSHA: string | null = null;
+ const cache = readCache();
+ if (cache && (now - cache.timestamp) < CACHE_TTL) {
+ remoteSHA = cache.sha;
+ }
+
+ // Fetch from remote if cache miss
+ if (!remoteSHA) {
+ remoteSHA = fetchRemoteSHA();
+ if (remoteSHA) {
+ writeCache(remoteSHA, now);
+ }
+ }
+
+ if (!remoteSHA) return null; // Offline or error → skip check
+
+ // Compare
+ if (localSHA === remoteSHA) return null; // Up to date
+
+ // Determine skill directory for update command
+ const binaryPath = join(binaryDir, 'browse');
+ const skillDir = resolveSkillDir(binaryPath);
+ if (!skillDir) return null;
+
+ const payload = JSON.stringify({
+ current: localSHA.slice(0, 8),
+ latest: remoteSHA.slice(0, 8),
+ command: `cd ${skillDir} && git stash && git fetch origin && git reset --hard origin/main && ./setup`,
+ });
+
+ return `META:UPDATE_AVAILABLE ${payload}`;
+}
+
+// ─── Main ───────────────────────────────────────────────────────
+
+function main() {
+ const bin = locateBinary();
+ if (!bin) {
+ process.stderr.write('ERROR: browse binary not found. Run: cd <skill-dir> && ./setup\n');
+ process.exit(1);
+ }
+
+ console.log(bin);
+
+ const meta = checkVersion(dirname(bin));
+ if (meta) console.log(meta);
+}
+
+main();
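A caller consumes the line protocol by splitting on the first newline; a sketch with a hypothetical captured output (in real use, `out=$(find-browse)`):

```shell
# Line 1 is always the binary path; later META: lines carry a type tag
# followed by a JSON payload.
out='/home/me/.claude/skills/gstack/browse/dist/browse
META:UPDATE_AVAILABLE {"current":"abc12345","latest":"def67890"}'

bin=$(printf '%s\n' "$out" | head -n1)
meta=$(printf '%s\n' "$out" | grep '^META:UPDATE_AVAILABLE' | cut -d' ' -f2-)

echo "$bin"
echo "$meta"
```

Because version checks degrade to emitting only line 1, the `grep` simply matches nothing on an up-to-date or offline install and `meta` stays empty.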
M browse/src/server.ts => browse/src/server.ts +60 -21
@@ 6,6 6,11 @@
* Console/network/dialog buffers: CircularBuffer in-memory + async disk flush
* Chromium crash → server EXITS with clear error (CLI auto-restarts)
* Auto-shutdown after BROWSE_IDLE_TIMEOUT (default 30 min)
+ *
+ * State:
+ * State file: <project-root>/.gstack/browse.json (set via BROWSE_STATE_FILE env)
+ * Log files: <project-root>/.gstack/browse-{console,network,dialog}.log
+ * Port: random 10000-60000 (or BROWSE_PORT env for debug override)
*/
import { BrowserManager } from './browser-manager';
@@ 13,18 18,18 @@ import { handleReadCommand } from './read-commands';
import { handleWriteCommand } from './write-commands';
import { handleMetaCommand } from './meta-commands';
import { handleCookiePickerRoute } from './cookie-picker-routes';
+import { resolveConfig, ensureStateDir, readVersionHash } from './config';
import * as fs from 'fs';
import * as path from 'path';
import * as crypto from 'crypto';
-// ─── Auth (inline) ─────────────────────────────────────────────
+// ─── Config ─────────────────────────────────────────────────────
+const config = resolveConfig();
+ensureStateDir(config);
+
+// ─── Auth ───────────────────────────────────────────────────────
const AUTH_TOKEN = crypto.randomUUID();
-const PORT_OFFSET = 45600;
-const BROWSE_PORT = process.env.CONDUCTOR_PORT
- ? parseInt(process.env.CONDUCTOR_PORT, 10) - PORT_OFFSET
- : parseInt(process.env.BROWSE_PORT || '0', 10); // 0 = auto-scan
-const INSTANCE_SUFFIX = BROWSE_PORT ? `-${BROWSE_PORT}` : '';
-const STATE_FILE = process.env.BROWSE_STATE_FILE || `/tmp/browse-server${INSTANCE_SUFFIX}.json`;
+const BROWSE_PORT = parseInt(process.env.BROWSE_PORT || '0', 10);
const IDLE_TIMEOUT_MS = parseInt(process.env.BROWSE_IDLE_TIMEOUT || '1800000', 10); // 30 min
function validateAuth(req: Request): boolean {
@@ 36,9 41,9 @@ function validateAuth(req: Request): boolean {
import { consoleBuffer, networkBuffer, dialogBuffer, addConsoleEntry, addNetworkEntry, addDialogEntry, type LogEntry, type NetworkEntry, type DialogEntry } from './buffers';
export { consoleBuffer, networkBuffer, dialogBuffer, addConsoleEntry, addNetworkEntry, addDialogEntry, type LogEntry, type NetworkEntry, type DialogEntry };
-const CONSOLE_LOG_PATH = `/tmp/browse-console${INSTANCE_SUFFIX}.log`;
-const NETWORK_LOG_PATH = `/tmp/browse-network${INSTANCE_SUFFIX}.log`;
-const DIALOG_LOG_PATH = `/tmp/browse-dialog${INSTANCE_SUFFIX}.log`;
+const CONSOLE_LOG_PATH = config.consoleLog;
+const NETWORK_LOG_PATH = config.networkLog;
+const DIALOG_LOG_PATH = config.dialogLog;
let lastConsoleFlushed = 0;
let lastNetworkFlushed = 0;
let lastDialogFlushed = 0;
@@ 132,22 137,25 @@ export const META_COMMANDS = new Set([
const browserManager = new BrowserManager();
let isShuttingDown = false;
-// Find port: deterministic from CONDUCTOR_PORT, or scan range
+// Find port: explicit BROWSE_PORT, or random in 10000-60000
async function findPort(): Promise<number> {
- // Deterministic port from CONDUCTOR_PORT (e.g., 55040 - 45600 = 9440)
+ // Explicit port override (for debugging)
if (BROWSE_PORT) {
try {
const testServer = Bun.serve({ port: BROWSE_PORT, fetch: () => new Response('ok') });
testServer.stop();
return BROWSE_PORT;
} catch {
- throw new Error(`[browse] Port ${BROWSE_PORT} (from CONDUCTOR_PORT ${process.env.CONDUCTOR_PORT}) is in use`);
+ throw new Error(`[browse] Port ${BROWSE_PORT} (from BROWSE_PORT env) is in use`);
}
}
- // Fallback: scan range
- const start = parseInt(process.env.BROWSE_PORT_START || '9400', 10);
- for (let port = start; port < start + 10; port++) {
+ // Random port with retry
+ const MIN_PORT = 10000;
+ const MAX_PORT = 60000;
+ const MAX_RETRIES = 5;
+ for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
+ const port = MIN_PORT + Math.floor(Math.random() * (MAX_PORT - MIN_PORT));
try {
const testServer = Bun.serve({ port, fetch: () => new Response('ok') });
testServer.stop();
@@ 156,7 164,7 @@ async function findPort(): Promise<number> {
continue;
}
}
- throw new Error(`[browse] No available port in range ${start}-${start + 9}`);
+ throw new Error(`[browse] No available port after ${MAX_RETRIES} attempts in range ${MIN_PORT}-${MAX_PORT}`);
}
/**
@@ 201,6 209,34 @@ async function handleCommand(body: any): Promise<Response> {
result = await handleWriteCommand(command, args, browserManager);
} else if (META_COMMANDS.has(command)) {
result = await handleMetaCommand(command, args, browserManager, shutdown);
+ } else if (command === 'help') {
+ const helpText = [
+ 'gstack browse — headless browser for AI agents',
+ '',
+ 'Commands:',
+ ' Navigation: goto <url>, back, forward, reload',
+ ' Interaction: click <sel>, fill <sel> <text>, select <sel> <val>, hover, type, press, scroll, wait',
+ ' Read: text [sel], html [sel], links, forms, accessibility, cookies, storage, console, network, perf',
+ ' Evaluate: js <expr>, eval <expr>, css <sel> <prop>, attrs <sel>, is <sel> <state>',
+ ' Snapshot: snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o path] [-C]',
+ ' Screenshot: screenshot [path], pdf [path], responsive <widths>',
+ ' Tabs: tabs, tab <id>, newtab [url], closetab [id]',
+ ' State: cookie <set|get|clear>, cookie-import <json>, cookie-import-browser [browser]',
+ ' Headers: header <set|clear> [name] [value], useragent [string]',
+ ' Upload: upload <sel> <file1> [file2...]',
+ ' Dialogs: dialog, dialog-accept [text], dialog-dismiss',
+ ' Meta: status, stop, restart, diff, chain, help',
+ '',
+ 'Snapshot flags:',
+ ' -i interactive only -c compact (remove empty nodes)',
+ ' -d N limit depth -s sel scope to CSS selector',
+ ' -D diff vs previous -a annotated screenshot with ref labels',
+ ' -o path output file -C cursor-interactive elements',
+ ].join('\n');
+ return new Response(helpText, {
+ status: 200,
+ headers: { 'Content-Type': 'text/plain' },
+ });
} else {
return new Response(JSON.stringify({
error: `Unknown command: ${command}`,
@@ 235,7 271,7 @@ async function shutdown() {
await browserManager.close();
// Clean up state file
- try { fs.unlinkSync(STATE_FILE); } catch {}
+ try { fs.unlinkSync(config.stateFile); } catch {}
process.exit(0);
}
@@ 301,19 337,22 @@ async function start() {
},
});
- // Write state file
+ // Write state file (atomic: write .tmp then rename)
const state = {
pid: process.pid,
port,
token: AUTH_TOKEN,
startedAt: new Date().toISOString(),
serverPath: path.resolve(import.meta.dir, 'server.ts'),
+ binaryVersion: readVersionHash() || undefined,
};
- fs.writeFileSync(STATE_FILE, JSON.stringify(state, null, 2), { mode: 0o600 });
+ const tmpFile = config.stateFile + '.tmp';
+ fs.writeFileSync(tmpFile, JSON.stringify(state, null, 2), { mode: 0o600 });
+ fs.renameSync(tmpFile, config.stateFile);
browserManager.serverPort = port;
console.log(`[browse] Server running on http://127.0.0.1:${port} (PID: ${process.pid})`);
- console.log(`[browse] State file: ${STATE_FILE}`);
+ console.log(`[browse] State file: ${config.stateFile}`);
console.log(`[browse] Idle timeout: ${IDLE_TIMEOUT_MS / 1000}s`);
}
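The atomic state-file write added above follows the standard write-temp-then-rename pattern; a self-contained sketch using a throwaway temp directory:

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// rename() within one filesystem is atomic, so a concurrent CLI reading the
// state file sees either the old file or the complete new one, never a
// half-written JSON document.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'atomic-demo-'));
const target = path.join(dir, 'browse.json');
const tmp = target + '.tmp';

fs.writeFileSync(tmp, JSON.stringify({ pid: 123, port: 12345 }, null, 2), { mode: 0o600 });
fs.renameSync(tmp, target);

console.log(JSON.parse(fs.readFileSync(target, 'utf-8')).port); // → 12345
fs.rmSync(dir, { recursive: true, force: true });
```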
M browse/test/commands.test.ts => browse/test/commands.test.ts +1 -4
@@ 457,14 457,11 @@ describe('CLI lifecycle', () => {
}));
const cliPath = path.resolve(__dirname, '../src/cli.ts');
- // Build env without CONDUCTOR_PORT/BROWSE_PORT so BROWSE_PORT_START takes effect
const cliEnv: Record<string, string> = {};
for (const [k, v] of Object.entries(process.env)) {
- if (k !== 'CONDUCTOR_PORT' && k !== 'BROWSE_PORT' && v !== undefined) cliEnv[k] = v;
+ if (v !== undefined) cliEnv[k] = v;
}
cliEnv.BROWSE_STATE_FILE = stateFile;
- // Use a random high port to avoid conflicts with running servers
- cliEnv.BROWSE_PORT_START = String(9600 + Math.floor(Math.random() * 100));
const result = await new Promise<{ code: number; stdout: string; stderr: string }>((resolve) => {
const proc = spawn('bun', ['run', cliPath, 'status'], {
timeout: 15000,
A browse/test/config.test.ts => browse/test/config.test.ts +125 -0
@@ 0,0 1,125 @@
+import { describe, test, expect } from 'bun:test';
+import { resolveConfig, ensureStateDir, readVersionHash, getGitRoot } from '../src/config';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+
+describe('config', () => {
+ describe('getGitRoot', () => {
+ test('returns a path when in a git repo', () => {
+ const root = getGitRoot();
+ expect(root).not.toBeNull();
+ expect(fs.existsSync(path.join(root!, '.git'))).toBe(true);
+ });
+ });
+
+ describe('resolveConfig', () => {
+ test('uses git root by default', () => {
+ const config = resolveConfig({});
+ const gitRoot = getGitRoot();
+ expect(gitRoot).not.toBeNull();
+ expect(config.projectDir).toBe(gitRoot);
+ expect(config.stateDir).toBe(path.join(gitRoot!, '.gstack'));
+ expect(config.stateFile).toBe(path.join(gitRoot!, '.gstack', 'browse.json'));
+ });
+
+ test('derives paths from BROWSE_STATE_FILE when set', () => {
+ const stateFile = '/tmp/test-config/.gstack/browse.json';
+ const config = resolveConfig({ BROWSE_STATE_FILE: stateFile });
+ expect(config.stateFile).toBe(stateFile);
+ expect(config.stateDir).toBe('/tmp/test-config/.gstack');
+ expect(config.projectDir).toBe('/tmp/test-config');
+ });
+
+ test('log paths are in stateDir', () => {
+ const config = resolveConfig({});
+ expect(config.consoleLog).toBe(path.join(config.stateDir, 'browse-console.log'));
+ expect(config.networkLog).toBe(path.join(config.stateDir, 'browse-network.log'));
+ expect(config.dialogLog).toBe(path.join(config.stateDir, 'browse-dialog.log'));
+ });
+ });
+
+ describe('ensureStateDir', () => {
+ test('creates directory if it does not exist', () => {
+ const tmpDir = path.join(os.tmpdir(), `browse-config-test-${Date.now()}`);
+ const config = resolveConfig({ BROWSE_STATE_FILE: path.join(tmpDir, '.gstack', 'browse.json') });
+ expect(fs.existsSync(config.stateDir)).toBe(false);
+ ensureStateDir(config);
+ expect(fs.existsSync(config.stateDir)).toBe(true);
+ // Cleanup
+ fs.rmSync(tmpDir, { recursive: true, force: true });
+ });
+
+ test('is a no-op if directory already exists', () => {
+ const tmpDir = path.join(os.tmpdir(), `browse-config-test-${Date.now()}`);
+ const stateDir = path.join(tmpDir, '.gstack');
+ fs.mkdirSync(stateDir, { recursive: true });
+ const config = resolveConfig({ BROWSE_STATE_FILE: path.join(stateDir, 'browse.json') });
+ ensureStateDir(config); // should not throw
+ expect(fs.existsSync(config.stateDir)).toBe(true);
+ // Cleanup
+ fs.rmSync(tmpDir, { recursive: true, force: true });
+ });
+ });
+
+ describe('readVersionHash', () => {
+ test('returns null when .version file does not exist', () => {
+ const result = readVersionHash('/nonexistent/path/browse');
+ expect(result).toBeNull();
+ });
+
+ test('reads version from .version file adjacent to execPath', () => {
+ const tmpDir = path.join(os.tmpdir(), `browse-version-test-${Date.now()}`);
+ fs.mkdirSync(tmpDir, { recursive: true });
+ const versionFile = path.join(tmpDir, '.version');
+ fs.writeFileSync(versionFile, 'abc123def\n');
+ const result = readVersionHash(path.join(tmpDir, 'browse'));
+ expect(result).toBe('abc123def');
+ // Cleanup
+ fs.rmSync(tmpDir, { recursive: true, force: true });
+ });
+ });
+});
+
+describe('resolveServerScript', () => {
+ // Import the function from cli.ts
+ const { resolveServerScript } = require('../src/cli');
+
+ test('uses BROWSE_SERVER_SCRIPT env when set', () => {
+ const result = resolveServerScript({ BROWSE_SERVER_SCRIPT: '/custom/server.ts' }, '', '');
+ expect(result).toBe('/custom/server.ts');
+ });
+
+ test('finds server.ts adjacent to cli.ts in dev mode', () => {
+ const srcDir = path.resolve(__dirname, '../src');
+ const result = resolveServerScript({}, srcDir, '');
+ expect(result).toBe(path.join(srcDir, 'server.ts'));
+ });
+
+ test('throws when server.ts cannot be found', () => {
+ expect(() => resolveServerScript({}, '/nonexistent/$bunfs', '/nonexistent/browse'))
+ .toThrow('Cannot find server.ts');
+ });
+});
+
+describe('version mismatch detection', () => {
+ test('detects when versions differ', () => {
+ const stateVersion = 'abc123';
+ const currentVersion = 'def456';
+ expect(stateVersion !== currentVersion).toBe(true);
+ });
+
+ test('no mismatch when versions match', () => {
+ const stateVersion = 'abc123';
+ const currentVersion = 'abc123';
+ expect(stateVersion !== currentVersion).toBe(false);
+ });
+
+ test('no mismatch when either version is null', () => {
+ const currentVersion: string | null = null;
+ const stateVersion: string | undefined = 'abc123';
+ // Version mismatch only triggers when both are present
+ const shouldRestart = currentVersion !== null && stateVersion !== undefined && currentVersion !== stateVersion;
+ expect(shouldRestart).toBe(false);
+ });
+});
A browse/test/cookie-picker-routes.test.ts => browse/test/cookie-picker-routes.test.ts +205 -0
@@ 0,0 1,205 @@
+/**
+ * Tests for cookie-picker route handler
+ *
+ * Tests the HTTP glue layer directly with mock BrowserManager objects.
+ * Verifies that all routes return valid JSON (not HTML) with correct CORS headers.
+ */
+
+import { describe, test, expect } from 'bun:test';
+import { handleCookiePickerRoute } from '../src/cookie-picker-routes';
+
+// ─── Mock BrowserManager ──────────────────────────────────────
+
+function mockBrowserManager() {
+ const addedCookies: any[] = [];
+ const clearedDomains: string[] = [];
+ return {
+ bm: {
+ getPage: () => ({
+ context: () => ({
+ addCookies: (cookies: any[]) => { addedCookies.push(...cookies); },
+ clearCookies: (opts: { domain: string }) => { clearedDomains.push(opts.domain); },
+ }),
+ }),
+ } as any,
+ addedCookies,
+ clearedDomains,
+ };
+}
+
+function makeUrl(path: string, port = 9470): URL {
+ return new URL(`http://127.0.0.1:${port}${path}`);
+}
+
+function makeReq(method: string, body?: any): Request {
+ const opts: RequestInit = { method };
+ if (body) {
+ opts.body = JSON.stringify(body);
+ opts.headers = { 'Content-Type': 'application/json' };
+ }
+ return new Request('http://127.0.0.1:9470', opts);
+}
+
+// ─── Tests ──────────────────────────────────────────────────────
+
+describe('cookie-picker-routes', () => {
+ describe('CORS', () => {
+ test('OPTIONS returns 204 with correct CORS headers', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/browsers');
+ const req = new Request('http://127.0.0.1:9470', { method: 'OPTIONS' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(204);
+ expect(res.headers.get('Access-Control-Allow-Origin')).toBe('http://127.0.0.1:9470');
+ expect(res.headers.get('Access-Control-Allow-Methods')).toContain('POST');
+ });
+
+ test('JSON responses include correct CORS origin with port', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/browsers', 9450);
+ const req = new Request('http://127.0.0.1:9450', { method: 'GET' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.headers.get('Access-Control-Allow-Origin')).toBe('http://127.0.0.1:9450');
+ });
+ });
+
+ describe('JSON responses (not HTML)', () => {
+ test('GET /cookie-picker/browsers returns JSON', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/browsers');
+ const req = new Request('http://127.0.0.1:9470', { method: 'GET' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(200);
+ expect(res.headers.get('Content-Type')).toBe('application/json');
+ const body = await res.json();
+ expect(body).toHaveProperty('browsers');
+ expect(Array.isArray(body.browsers)).toBe(true);
+ });
+
+ test('GET /cookie-picker/domains without browser param returns JSON error', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/domains');
+ const req = new Request('http://127.0.0.1:9470', { method: 'GET' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(400);
+ expect(res.headers.get('Content-Type')).toBe('application/json');
+ const body = await res.json();
+ expect(body).toHaveProperty('error');
+ expect(body).toHaveProperty('code', 'missing_param');
+ });
+
+ test('POST /cookie-picker/import with invalid JSON returns JSON error', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/import');
+ const req = new Request('http://127.0.0.1:9470', {
+ method: 'POST',
+ body: 'not json',
+ headers: { 'Content-Type': 'application/json' },
+ });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(400);
+ expect(res.headers.get('Content-Type')).toBe('application/json');
+ const body = await res.json();
+ expect(body.code).toBe('bad_request');
+ });
+
+ test('POST /cookie-picker/import missing browser field returns JSON error', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/import');
+ const req = makeReq('POST', { domains: ['.example.com'] });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(400);
+ const body = await res.json();
+ expect(body.code).toBe('missing_param');
+ });
+
+ test('POST /cookie-picker/import missing domains returns JSON error', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/import');
+ const req = makeReq('POST', { browser: 'Chrome' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(400);
+ const body = await res.json();
+ expect(body.code).toBe('missing_param');
+ });
+
+ test('POST /cookie-picker/remove with invalid JSON returns JSON error', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/remove');
+ const req = new Request('http://127.0.0.1:9470', {
+ method: 'POST',
+ body: '{bad',
+ headers: { 'Content-Type': 'application/json' },
+ });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(400);
+ expect(res.headers.get('Content-Type')).toBe('application/json');
+ });
+
+ test('POST /cookie-picker/remove missing domains returns JSON error', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/remove');
+ const req = makeReq('POST', {});
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(400);
+ const body = await res.json();
+ expect(body.code).toBe('missing_param');
+ });
+
+ test('GET /cookie-picker/imported returns JSON with domain list', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/imported');
+ const req = new Request('http://127.0.0.1:9470', { method: 'GET' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(200);
+ expect(res.headers.get('Content-Type')).toBe('application/json');
+ const body = await res.json();
+ expect(body).toHaveProperty('domains');
+ expect(body).toHaveProperty('totalDomains');
+ expect(body).toHaveProperty('totalCookies');
+ });
+ });
+
+ describe('routing', () => {
+ test('GET /cookie-picker returns HTML', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker');
+ const req = new Request('http://127.0.0.1:9470', { method: 'GET' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(200);
+ expect(res.headers.get('Content-Type')).toContain('text/html');
+ });
+
+ test('unknown path returns 404', async () => {
+ const { bm } = mockBrowserManager();
+ const url = makeUrl('/cookie-picker/nonexistent');
+ const req = new Request('http://127.0.0.1:9470', { method: 'GET' });
+
+ const res = await handleCookiePickerRoute(url, req, bm);
+
+ expect(res.status).toBe(404);
+ });
+ });
+});
A browse/test/find-browse.test.ts => browse/test/find-browse.test.ts +144 -0
@@ 0,0 1,144 @@
+/**
+ * Tests for find-browse version check logic
+ *
+ * Tests the checkVersion() and locateBinary() functions directly.
+ * Uses temp directories with mock .version files and cache files.
+ */
+
+import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
+import { checkVersion, locateBinary } from '../src/find-browse';
+import { mkdtempSync, writeFileSync, rmSync, existsSync, mkdirSync } from 'fs';
+import { join } from 'path';
+import { tmpdir } from 'os';
+
+let tempDir: string;
+
+beforeEach(() => {
+ tempDir = mkdtempSync(join(tmpdir(), 'find-browse-test-'));
+});
+
+afterEach(() => {
+ rmSync(tempDir, { recursive: true, force: true });
+ // Clean up test cache
+ try { rmSync('/tmp/gstack-latest-version'); } catch {}
+});
+
+describe('checkVersion', () => {
+ test('returns null when .version file is missing', () => {
+ const result = checkVersion(tempDir);
+ expect(result).toBeNull();
+ });
+
+ test('returns null when .version file is empty', () => {
+ writeFileSync(join(tempDir, '.version'), '');
+ const result = checkVersion(tempDir);
+ expect(result).toBeNull();
+ });
+
+ test('returns null when .version has only whitespace', () => {
+ writeFileSync(join(tempDir, '.version'), ' \n');
+ const result = checkVersion(tempDir);
+ expect(result).toBeNull();
+ });
+
+ test('returns null when local SHA matches remote (cache hit)', () => {
+ const sha = 'a'.repeat(40);
+ writeFileSync(join(tempDir, '.version'), sha);
+ // Write cache with same SHA, recent timestamp
+ const now = Math.floor(Date.now() / 1000);
+ writeFileSync('/tmp/gstack-latest-version', `${sha} ${now}\n`);
+
+ const result = checkVersion(tempDir);
+ expect(result).toBeNull();
+ });
+
+ test('returns META:UPDATE_AVAILABLE when SHAs differ (cache hit)', () => {
+ const localSha = 'a'.repeat(40);
+ const remoteSha = 'b'.repeat(40);
+ writeFileSync(join(tempDir, '.version'), localSha);
+ // Create a fake browse binary path so resolveSkillDir works
+ const browsePath = join(tempDir, 'browse');
+ writeFileSync(browsePath, '');
+ // Write cache with different SHA, recent timestamp
+ const now = Math.floor(Date.now() / 1000);
+ writeFileSync('/tmp/gstack-latest-version', `${remoteSha} ${now}\n`);
+
+ const result = checkVersion(tempDir);
+ // Result may be null if resolveSkillDir can't determine skill dir from temp path
+ // That's expected — the META signal requires a known skill dir path
+ if (result !== null) {
+ expect(result).toStartWith('META:UPDATE_AVAILABLE');
+ const jsonStr = result.replace('META:UPDATE_AVAILABLE ', '');
+ const payload = JSON.parse(jsonStr);
+ expect(payload.current).toBe('a'.repeat(8));
+ expect(payload.latest).toBe('b'.repeat(8));
+ expect(payload.command).toContain('git stash');
+ expect(payload.command).toContain('git reset --hard origin/main');
+ expect(payload.command).toContain('./setup');
+ }
+ });
+
+ test('uses cached SHA when cache is fresh (< 4hr)', () => {
+ const localSha = 'a'.repeat(40);
+ const remoteSha = 'a'.repeat(40);
+ writeFileSync(join(tempDir, '.version'), localSha);
+ // Cache is 1 hour old — should still be valid
+ const oneHourAgo = Math.floor(Date.now() / 1000) - 3600;
+ writeFileSync('/tmp/gstack-latest-version', `${remoteSha} ${oneHourAgo}\n`);
+
+ const result = checkVersion(tempDir);
+ expect(result).toBeNull(); // SHAs match
+ });
+
+ test('treats expired cache as stale', () => {
+ const localSha = 'a'.repeat(40);
+ writeFileSync(join(tempDir, '.version'), localSha);
+ // Cache is 5 hours old — should be stale
+ const fiveHoursAgo = Math.floor(Date.now() / 1000) - 18000;
+ writeFileSync('/tmp/gstack-latest-version', `${'b'.repeat(40)} ${fiveHoursAgo}\n`);
+
+ // This will try git ls-remote which may fail in test env — that's OK
+ // The important thing is it doesn't use the stale cache value
+ const result = checkVersion(tempDir);
+ // Result depends on whether git ls-remote succeeds in test environment
+ // If offline, returns null (graceful degradation)
+ expect(result === null || typeof result === 'string').toBe(true);
+ });
+
+ test('handles corrupt cache file gracefully', () => {
+ const localSha = 'a'.repeat(40);
+ writeFileSync(join(tempDir, '.version'), localSha);
+ writeFileSync('/tmp/gstack-latest-version', 'garbage data here');
+
+ // Should not throw, should treat as stale
+ const result = checkVersion(tempDir);
+ expect(result === null || typeof result === 'string').toBe(true);
+ });
+
+ test('handles cache with invalid SHA gracefully', () => {
+ const localSha = 'a'.repeat(40);
+ writeFileSync(join(tempDir, '.version'), localSha);
+ writeFileSync('/tmp/gstack-latest-version', `not-a-sha ${Math.floor(Date.now() / 1000)}\n`);
+
+ // Invalid SHA should be treated as no cache
+ const result = checkVersion(tempDir);
+ expect(result === null || typeof result === 'string').toBe(true);
+ });
+});
+
+describe('locateBinary', () => {
+ test('returns null when no binary exists at known paths', () => {
+ // This test depends on the test environment — if a real binary exists at
+ // ~/.claude/skills/gstack/browse/dist/browse, it will find it.
+ // We mainly test that the function doesn't throw.
+ const result = locateBinary();
+ expect(result === null || typeof result === 'string').toBe(true);
+ });
+
+ test('returns string path when binary exists', () => {
+ const result = locateBinary();
+ if (result !== null) {
+ expect(existsSync(result)).toBe(true);
+ }
+ });
+});
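The cache semantics these tests exercise can be sketched as a small shell routine — a minimal sketch assuming the `<40-hex-sha> <unix-timestamp>` cache-line format and 4-hour TTL inferred from the tests; the function name is illustrative, not the actual implementation:

```shell
# Hypothetical sketch of the freshness check the tests above exercise.
# Assumed cache format: "<40-hex-sha> <unix-timestamp>" on one line.
read_cached_sha() {  # $1 = cache file; prints the SHA only if fresh and valid
  [ -f "$1" ] || return 1
  read -r sha ts < "$1" || return 1
  case "$sha" in *[!0-9a-f]*|'') return 1 ;; esac   # corrupt: non-hex SHA
  [ "${#sha}" -eq 40 ] || return 1
  case "$ts" in ''|*[!0-9]*) return 1 ;; esac       # corrupt: non-numeric timestamp
  [ $(( $(date +%s) - ts )) -lt 14400 ] || return 1 # stale after 4 hours
  printf '%s\n' "$sha"
}
```

Validating both fields before the arithmetic is what makes the "corrupt cache file" tests pass without throwing.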
M package.json => package.json +2 -2
@@ 1,6 1,6 @@
{
"name": "gstack",
- "version": "0.3.1",
+ "version": "0.3.2",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
"license": "MIT",
"type": "module",
@@ 8,7 8,7 @@
"browse": "./browse/dist/browse"
},
"scripts": {
- "build": "bun build --compile browse/src/cli.ts --outfile browse/dist/browse",
+ "build": "bun build --compile browse/src/cli.ts --outfile browse/dist/browse && bun build --compile browse/src/find-browse.ts --outfile browse/dist/find-browse && git rev-parse HEAD > browse/dist/.version && rm -f .*.bun-build",
"dev": "bun run browse/src/cli.ts",
"server": "bun run browse/src/server.ts",
"test": "bun test",
M qa/SKILL.md => qa/SKILL.md +58 -7
@@ 3,9 3,10 @@ name: qa
version: 1.0.0
description: |
Systematically QA test a web application. Use when asked to "qa", "QA", "test this site",
- "find bugs", "dogfood", or review quality. Three modes: full (systematic exploration),
- quick (30-second smoke test), regression (compare against baseline). Produces structured
- report with health score, screenshots, and repro steps.
+ "find bugs", "dogfood", or review quality. Four modes: diff-aware (automatic on feature
+ branches — analyzes git diff, identifies affected pages, tests them), full (systematic
+ exploration), quick (30-second smoke test), regression (compare against baseline). Produces
+ structured report with health score, screenshots, and repro steps.
allowed-tools:
- Bash
- Read
@@ 22,22 23,30 @@ You are a QA engineer. Test web applications like a real user — click everythi
| Parameter | Default | Override example |
|-----------|---------|-----------------|
-| Target URL | (required) | `https://myapp.com`, `http://localhost:3000` |
+| Target URL | (auto-detect or required) | `https://myapp.com`, `http://localhost:3000` |
| Mode | full | `--quick`, `--regression .gstack/qa-reports/baseline.json` |
| Output dir | `.gstack/qa-reports/` | `Output to /tmp/qa` |
-| Scope | Full app | `Focus on the billing page` |
+| Scope | Full app (or diff-scoped) | `Focus on the billing page` |
| Auth | None | `Sign in to user@example.com`, `Import cookies from cookies.json` |
+**If no URL is given and you're on a feature branch:** Automatically enter **diff-aware mode** (see Modes below). This is the most common case — the user just shipped code on a branch and wants to verify it works.
+
**Find the browse binary:**
```bash
-B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
+BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
+B=$(echo "$BROWSE_OUTPUT" | head -1)
+META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true)
if [ -z "$B" ]; then
echo "ERROR: browse binary not found"
exit 1
fi
+echo "READY: $B"
+[ -n "$META" ] && echo "$META"
```
+If you see `META:UPDATE_AVAILABLE`: tell the user an update is available, STOP and wait for approval, then run the command from the META payload and re-run the setup check.
+
**Create output directories:**
```bash
@@ 49,7 58,49 @@ mkdir -p "$REPORT_DIR/screenshots"
## Modes
-### Full (default)
+### Diff-aware (automatic when on a feature branch with no URL)
+
+This is the **primary mode** for developers verifying their work. When the user says `/qa` without a URL and the repo is on a feature branch, automatically:
+
+1. **Analyze the branch diff** to understand what changed:
+ ```bash
+ git diff main...HEAD --name-only
+ git log main..HEAD --oneline
+ ```
+
+2. **Identify affected pages/routes** from the changed files:
+ - Controller/route files → which URL paths they serve
+ - View/template/component files → which pages render them
+ - Model/service files → which pages use those models (check controllers that reference them)
+ - CSS/style files → which pages include those stylesheets
+ - API endpoints → test them directly with `$B js "await fetch('/api/...')"`
+ - Static pages (markdown, HTML) → navigate to them directly
+
+3. **Detect the running app** — check common local dev ports:
+ ```bash
+   for PORT in 3000 4000 8080; do
+     if $B goto "http://localhost:$PORT" 2>/dev/null; then
+       echo "Found app on :$PORT"; break
+     fi
+   done
+ ```
+ If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.
+
+4. **Test each affected page/route:**
+ - Navigate to the page
+ - Take a screenshot
+ - Check console for errors
+ - If the change was interactive (forms, buttons, flows), test the interaction end-to-end
+ - Use `snapshot -D` before and after actions to verify the change had the expected effect
+
+5. **Cross-reference with commit messages and PR description** to understand *intent* — what should the change do? Verify it actually does that.
+
+6. **Report findings** scoped to the branch changes:
+ - "Changes tested: N pages/routes affected by this branch"
+ - For each: does it work? Screenshot evidence.
+ - Any regressions on adjacent pages?
+
+**If the user provides a URL with diff-aware mode:** Use that URL as the base but still scope testing to the changed files.
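
Step 2's file-to-target mapping can be sketched as a classifier — the bucket patterns here are assumptions (a Rails-ish layout), not prescribed by the skill; adapt them to the project's actual structure:

```shell
# Hypothetical sketch of step 2 — patterns are illustrative assumptions.
classify_changed() {  # $1 = changed file path; prints a test-target bucket
  case "$1" in
    *controllers/*|*routes*)  echo "route"  ;;
    *views/*|*components/*)   echo "page"   ;;
    *models/*|*services/*)    echo "model"  ;;
    *.css|*.scss)             echo "style"  ;;
    *api/*)                   echo "api"    ;;
    *.md|*.html)              echo "static" ;;
    *)                        echo "other"  ;;
  esac
}
```

Feed it `git diff main...HEAD --name-only` output to get one bucket per changed file.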
+
+### Full (default when URL is provided)
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size.
### Quick (`--quick`)
M retro/SKILL.md => retro/SKILL.md +16 -1
@@ 80,6 80,9 @@ git log origin/main --since="<window>" --format="AUTHOR:%aN" --name-only
# 7. Per-author commit counts (quick summary)
git shortlog origin/main --since="<window>" -sn --no-merges
+
+# 8. Greptile triage history (if available)
+cat ~/.gstack/greptile-history.md 2>/dev/null || true
```
### Step 2: Compute Metrics
@@ 100,6 103,7 @@ Calculate and present these metrics in a summary table:
| Active days | N |
| Detected sessions | N |
| Avg LOC/session-hour | N |
+| Greptile signal | N% (Y catches, Z FPs) |
Then show a **per-author leaderboard** immediately below:
@@ 112,6 116,8 @@ bob 3 +120/-40 tests/
Sort by commits descending. The current user (from `git config user.name`) always appears first, labeled "You (name)".
+**Greptile signal (if history exists):** Read `~/.gstack/greptile-history.md` (fetched in Step 1, command 8). Filter entries within the retro time window by date. Count entries by type: `fix`, `fp`, `already-fixed`. Compute signal ratio: `(fix + already-fixed) / (fix + already-fixed + fp)`. If no entries exist in the window or the file doesn't exist, skip the Greptile metric row. Skip unparseable lines silently.
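+
+The ratio computation can be sketched in one awk pass — a minimal sketch of the formula above, not the skill's required method (date-window filtering omitted for brevity):
+
```shell
# Hypothetical sketch: signal percentage from history lines.
# Line format: <date> | <repo> | <type:fp|fix|already-fixed> | <file-pattern> | <category>
signal() {  # $1 = history file
  awk -F'|' '
    { t = $3; gsub(/[[:space:]]/, "", t) }
    t == "fix" || t == "already-fixed" { good++ }
    t == "fp"                          { fp++ }
    END {
      if (good + fp > 0)
        printf "%d%% (%d catches, %d FPs)\n", int(100 * good / (good + fp)), good, fp
    }
  ' "$1"
}
```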
+
### Step 3: Commit Time Distribution
Show hourly histogram in Pacific time using bar chart:
@@ 297,10 303,18 @@ Use the Write tool to save the JSON file with this schema:
},
"version_range": ["1.16.0.0", "1.16.1.0"],
"streak_days": 47,
- "tweetable": "Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 38% tests, 12 PRs, peak: 10pm"
+ "tweetable": "Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 38% tests, 12 PRs, peak: 10pm",
+ "greptile": {
+ "fixes": 3,
+ "fps": 1,
+ "already_fixed": 2,
+ "signal_pct": 83
+ }
}
```
+**Note:** Only include the `greptile` field if `~/.gstack/greptile-history.md` exists and has entries within the time window. If no history data is available, omit the field entirely.
+
### Step 14: Write the Narrative
Structure the output as:
@@ 342,6 356,7 @@ Narrative covering:
- Test LOC ratio trend
- Hotspot analysis (are the same files churning?)
- Any XL PRs that should have been split
+- Greptile signal ratio and trend (if history exists): "Greptile: X% signal (Y valid catches, Z false positives)"
### Focus & Highlights
(from Step 8)
M review/SKILL.md => review/SKILL.md +34 -0
@@ 36,6 36,16 @@ Read `.claude/skills/review/checklist.md`.
---
+## Step 2.5: Check for Greptile review comments
+
+Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, and classify steps.
+
+**If no PR exists, `gh` fails, API returns an error, or there are zero Greptile comments:** Skip this step silently. Greptile integration is additive — the review works without it.
+
+**If Greptile comments are found:** Store the classifications (VALID & ACTIONABLE, VALID BUT ALREADY FIXED, FALSE POSITIVE, SUPPRESSED) — you will need them in Step 5.
+
+---
+
## Step 3: Get the diff
Fetch the latest main to avoid false positives from a stale local main:
@@ 68,6 78,30 @@ Follow the output format specified in the checklist. Respect the suppressions
- If only non-critical issues found: output findings. No further action needed.
- If no issues found: output `Pre-Landing Review: No issues found.`
+### Greptile comment resolution
+
+After outputting your own findings, if Greptile comments were classified in Step 2.5:
+
+**Include a Greptile summary in your output header:** `+ N Greptile comments (X valid, Y fixed, Z FP)`
+
+1. **VALID & ACTIONABLE comments:** These are already included in your CRITICAL findings — they follow the same AskUserQuestion flow (A: Fix it now, B: Acknowledge, C: False positive). If the user chooses C (false positive), post a reply using the appropriate API from the triage doc and save the pattern to `~/.gstack/greptile-history.md` (type: fp).
+
+2. **FALSE POSITIVE comments:** Present each one via AskUserQuestion:
+ - Show the Greptile comment: file:line (or [top-level]) + body summary + permalink URL
+ - Explain concisely why it's a false positive
+ - Options:
+ - A) Reply to Greptile explaining why this is incorrect (recommended if clearly wrong)
+ - B) Fix it anyway (if low-effort and harmless)
+ - C) Ignore — don't reply, don't fix
+
+ If the user chooses A, post a reply using the appropriate API from the triage doc and save the pattern to `~/.gstack/greptile-history.md` (type: fp).
+
+3. **VALID BUT ALREADY FIXED comments:** Reply acknowledging the catch — no AskUserQuestion needed:
+ - Post reply: `"Good catch — already fixed in <commit-sha>."`
+ - Save to `~/.gstack/greptile-history.md` (type: already-fixed)
+
+4. **SUPPRESSED comments:** Skip silently — these are known false positives from previous triage.
+
---
## Important Rules
A review/greptile-triage.md => review/greptile-triage.md +122 -0
@@ 0,0 1,122 @@
+# Greptile Comment Triage
+
+Shared reference for fetching, filtering, and classifying Greptile review comments on GitHub PRs. Both `/review` (Step 2.5) and `/ship` (Step 3.75) reference this document.
+
+---
+
+## Fetch
+
+Run these commands to detect the PR, then fetch comments. The two comment-fetch API calls below run in parallel.
+
+```bash
+REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner' 2>/dev/null)
+PR_NUMBER=$(gh pr view --json number --jq '.number' 2>/dev/null)
+```
+
+**If either fails or is empty:** Skip Greptile triage silently. This integration is additive — the workflow works without it.
+
+```bash
+# Fetch line-level review comments AND top-level PR comments in parallel
+gh api repos/$REPO/pulls/$PR_NUMBER/comments \
+ --jq '.[] | select(.user.login == "greptile-apps[bot]") | select(.position != null) | {id: .id, path: .path, line: .line, body: .body, html_url: .html_url, source: "line-level"}' > /tmp/greptile_line.json &
+gh api repos/$REPO/issues/$PR_NUMBER/comments \
+ --jq '.[] | select(.user.login == "greptile-apps[bot]") | {id: .id, body: .body, html_url: .html_url, source: "top-level"}' > /tmp/greptile_top.json &
+wait
+```
+
+**If API errors or zero Greptile comments across both endpoints:** Skip silently.
+
+The `position != null` filter on line-level comments automatically skips outdated comments from force-pushed code.
+
+---
+
+## Suppressions Check
+
+Read `~/.gstack/greptile-history.md` if it exists. Each line records a previous triage outcome:
+
+```
+<date> | <repo> | <type:fp|fix|already-fixed> | <file-pattern> | <category>
+```
+
+**Categories** (fixed set): `race-condition`, `null-check`, `error-handling`, `style`, `type-safety`, `security`, `performance`, `correctness`, `other`
+
+Match each fetched comment against entries where:
+- `type == fp` (only suppress known false positives, not previously fixed real issues)
+- `repo` matches the current repo
+- `file-pattern` matches the comment's file path
+- `category` matches the issue type in the comment
+
+Skip matched comments as **SUPPRESSED**.
+
+If the history file doesn't exist or has unparseable lines, skip those lines and continue — never fail on a malformed history file.
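+
+The matching rules above can be sketched as follows — a minimal sketch; the helper name and the glob-based file-pattern match are illustrative assumptions:
+
```shell
# Hypothetical sketch of the suppression match; helper name is illustrative.
is_suppressed() {  # usage: is_suppressed <repo> <file-path> <category> <history-file>
  [ -f "$4" ] || return 1
  while IFS='|' read -r _d repo type pattern category; do
    repo=$(printf '%s' "$repo" | xargs);       type=$(printf '%s' "$type" | xargs)
    pattern=$(printf '%s' "$pattern" | xargs); category=$(printf '%s' "$category" | xargs)
    [ "$type" = "fp" ] || continue           # only suppress prior false positives
    [ "$repo" = "$1" ] || continue
    [ "$category" = "$3" ] || continue
    case "$2" in $pattern) return 0 ;; esac  # glob match on the file pattern
  done < "$4"
  return 1
}
```
+
+Unparseable lines simply fail the `type == fp` check and are skipped, satisfying the never-fail rule.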
+
+---
+
+## Classify
+
+For each non-suppressed comment:
+
+1. **Line-level comments:** Read the file at the indicated `path:line` and surrounding context (±10 lines)
+2. **Top-level comments:** Read the full comment body
+3. Cross-reference the comment against the full diff (`git diff origin/main`) and the review checklist
+4. Classify:
+ - **VALID & ACTIONABLE** — a real bug, race condition, security issue, or correctness problem that exists in the current code
+ - **VALID BUT ALREADY FIXED** — a real issue that was addressed in a subsequent commit on the branch. Identify the fixing commit SHA.
+ - **FALSE POSITIVE** — the comment misunderstands the code, flags something handled elsewhere, or is stylistic noise
+ - **SUPPRESSED** — already filtered in the suppressions check above
+
+---
+
+## Reply APIs
+
+When replying to Greptile comments, use the correct endpoint based on comment source:
+
+**Line-level comments** (from `pulls/$PR/comments`):
+```bash
+gh api repos/$REPO/pulls/$PR_NUMBER/comments/$COMMENT_ID/replies \
+ -f body="<reply text>"
+```
+
+**Top-level comments** (from `issues/$PR/comments`):
+```bash
+gh api repos/$REPO/issues/$PR_NUMBER/comments \
+ -f body="<reply text>"
+```
+
+**If a reply POST fails** (e.g., PR was closed, no write permission): warn and continue. Do not stop the workflow for a failed reply.
+
+---
+
+## History File Writes
+
+Before writing, ensure the directory exists:
+```bash
+mkdir -p ~/.gstack
+```
+
+Append one line per triage outcome to `~/.gstack/greptile-history.md`:
+```
+<YYYY-MM-DD> | <owner/repo> | <type> | <file-pattern> | <category>
+```
+
+Example entries:
+```
+2026-03-13 | garrytan/myapp | fp | app/services/auth_service.rb | race-condition
+2026-03-13 | garrytan/myapp | fix | app/models/user.rb | null-check
+2026-03-13 | garrytan/myapp | already-fixed | lib/payments.rb | error-handling
+```
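+
+A minimal append helper consistent with the format above — the optional file argument is an illustration for testing; the skill itself always writes to `~/.gstack/greptile-history.md`:
+
```shell
# Hypothetical helper matching the history line format above.
append_history() {  # usage: append_history <repo> <type> <file-pattern> <category> [file]
  hist="${5:-$HOME/.gstack/greptile-history.md}"
  mkdir -p "$(dirname "$hist")"
  printf '%s | %s | %s | %s | %s\n' "$(date +%F)" "$1" "$2" "$3" "$4" >> "$hist"
}
```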
+
+---
+
+## Output Format
+
+Include a Greptile summary in the output header:
+```
++ N Greptile comments (X valid, Y fixed, Z FP)
+```
+
+For each classified comment, show:
+- Classification tag: `[VALID]`, `[FIXED]`, `[FALSE POSITIVE]`, `[SUPPRESSED]`
+- File:line reference (for line-level) or `[top-level]` (for top-level)
+- One-line body summary
+- Permalink URL (the `html_url` field)
M setup => setup +4 -0
@@ 32,6 32,10 @@ if [ "$NEEDS_BUILD" -eq 1 ]; then
bun install
bun run build
)
+ # Safety net: write .version if build script didn't (e.g., git not available during build)
+ if [ ! -f "$GSTACK_DIR/browse/dist/.version" ]; then
+ git -C "$GSTACK_DIR" rev-parse HEAD > "$GSTACK_DIR/browse/dist/.version" 2>/dev/null || true
+ fi
fi
if [ ! -x "$BROWSE_BIN" ]; then
M setup-browser-cookies/SKILL.md => setup-browser-cookies/SKILL.md +6 -1
@@ 26,9 26,12 @@ Import logged-in sessions from your real Chromium browser into the headless brow
### 1. Find the browse binary
```bash
-B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
+BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
+B=$(echo "$BROWSE_OUTPUT" | head -1)
+META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true)
if [ -n "$B" ]; then
echo "READY: $B"
+ [ -n "$META" ] && echo "$META"
else
echo "NEEDS_SETUP"
fi
@@ 39,6 42,8 @@ If `NEEDS_SETUP`:
2. Run: `cd <SKILL_DIR> && ./setup`
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
+If you see `META:UPDATE_AVAILABLE`: tell the user an update is available, STOP and wait for approval, then run the command from the META payload and re-run the setup check.
+
### 2. Open the cookie picker
```bash
M ship/SKILL.md => ship/SKILL.md +43 -0
@@ 23,6 23,7 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
- Test failures (stop, show failures)
- Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip)
- MINOR or MAJOR version bump needed (ask — see Step 4)
+- Greptile review comments that need user decision (complex fixes, false positives)
**Never stop for:**
- Uncommitted changes (always include them)
@@ 171,6 172,43 @@ Save the review output — it goes into the PR body in Step 8.
---
+## Step 3.75: Address Greptile review comments (if PR exists)
+
+Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, and classify steps.
+
+**If no PR exists, `gh` fails, API returns an error, or there are zero Greptile comments:** Skip this step silently. Continue to Step 4.
+
+**If Greptile comments are found:**
+
+Include a Greptile summary in your output: `+ N Greptile comments (X valid, Y fixed, Z FP)`
+
+For each classified comment:
+
+**VALID & ACTIONABLE:** Use AskUserQuestion with:
+- The comment (file:line or [top-level] + body summary + permalink URL)
+- Your recommended fix
+- Options: A) Fix now (recommended), B) Acknowledge and ship anyway, C) It's a false positive
+- If user chooses A: apply the fix, commit the fixed files (`git add <fixed-files> && git commit -m "fix: address Greptile review — <brief description>"`), reply to the comment (`"Fixed in <commit-sha>."`), and save to `~/.gstack/greptile-history.md` (type: fix).
+- If user chooses C: reply explaining the false positive, save to history (type: fp).
+
+**VALID BUT ALREADY FIXED:** Reply acknowledging the catch — no AskUserQuestion needed:
+- Post reply: `"Good catch — already fixed in <commit-sha>."`
+- Save to `~/.gstack/greptile-history.md` (type: already-fixed)
+
+**FALSE POSITIVE:** Use AskUserQuestion:
+- Show the comment and why you think it's wrong (file:line or [top-level] + body summary + permalink URL)
+- Options:
+ - A) Reply to Greptile explaining the false positive (recommended if clearly wrong)
+ - B) Fix it anyway (if trivial)
+ - C) Ignore silently
+- If user chooses A: post reply using the appropriate API from the triage doc, save to history (type: fp)
+
+**SUPPRESSED:** Skip silently — these are known false positives from previous triage.
+
+**After all comments are resolved:** If any fixes were applied, the tests from Step 3 are now stale. **Re-run tests** (Step 3) before continuing to Step 4. If no fixes were applied, continue to Step 4.
+
+---
+
## Step 4: Version bump (auto-decide)
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
@@ 275,6 313,11 @@ gh pr create --title "<type>: <summary>" --body "$(cat <<'EOF'
## Eval Results
<If evals ran: suite names, pass/fail counts, cost dashboard summary. If skipped: "No prompt-related files changed — evals skipped.">
+## Greptile Review
+<If Greptile comments were found: bullet list with [FIXED] / [FALSE POSITIVE] / [ALREADY FIXED] tag + one-line summary per comment>
+<If no Greptile comments found: "No Greptile comments.">
+<If no PR existed during Step 3.75: omit this section entirely>
+
## Test plan
- [x] All Rails tests pass (N runs, 0 failures)
- [x] All Vitest tests pass (N tests)