~cytrogen/gstack

ref: c0f3c3a91a8da5882d6c0f2aaf5d82ad91776910 gstack/browse/src/browser-manager.ts -rw-r--r-- 20.7 KiB
c0f3c3a9 — Garry Tan a month ago
fix: security hardening + issue triage (v0.8.3) (#205)

* fix: check for bun before running setup (#147)

Users without bun installed got a cryptic "command not found" error.
Now prints a clear message with install instructions.

Closes #147

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: block SSRF via URL validation in browse commands (#17)

Adds validateNavigationUrl() that blocks non-HTTP(S) schemes (file://,
javascript:, data:) and cloud metadata endpoints (169.254.169.254,
metadata.google.internal). Applied to goto, diff, and newTab commands.
Localhost and private IPs remain allowed for local dev QA.

Closes #17

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace eval $(gstack-slug) with source <(...) (#133)

Eliminates unnecessary use of eval across all skill templates and
generated files. source <(...) has identical behavior without the
shell injection surface. Also hardens gstack-diff-scope usage.

Closes #133

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename /debug to /investigate to avoid Claude Code conflict (#190)

Claude Code has a built-in /debug command that shadows the gstack skill.
Renaming to /investigate which better reflects the systematic root-cause
investigation methodology.

Closes #190

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add unit tests for path validation helpers

validateOutputPath() and validateReadPath() are security-critical
functions with zero test coverage. Adds 14 tests covering safe paths,
traversal attacks, and prefix collision edge cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.8.3)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update /debug → /investigate references in docs

CLAUDE.md, README.md, and docs/skills.md still referenced the old
/debug skill name after the rename.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden URL validation against hostname bypasses (Codex P1)

Codex review found that metadata IPs could be reached via hex
(0xA9FEA9FE), decimal (2852039166), octal, trailing dot, and IPv6
bracket forms. Now normalizes hostnames before checking the blocklist
and probes numeric IP representations via URL constructor.

Also moves URL validation before page allocation in newTab() to
prevent zombie tabs on rejection (Codex P3).

5 new test cases for bypass variants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2d97ab99 — Garry Tan a month ago
feat: browse handoff — headless-to-headed browser switching (v0.7.4) (#201)

* feat: browse handoff — headless-to-headed browser switching

Add `handoff` and `resume` commands that let users take over a visible
Chrome when the headless browser gets stuck (CAPTCHAs, auth walls, MFA).

Architecture: launch-first-close-second for safe rollback. State transfer
via extracted saveState()/restoreState() helpers (DRY with recreateContext).
Auto-handoff hint after 3 consecutive command failures.

* test: handoff unit + integration tests (15 tests)

Covers saveState/restoreState, failure tracking, edge cases (already
headed, resume without handoff), and full integration flow with cookie
and tab preservation across headless-to-headed switch.

* docs: handoff section in browse template + TODOS update

Add User Handoff section to browse/SKILL.md.tmpl with usage examples.
Update State Persistence TODO noting saveState/restoreState reusability.

* chore: bump version and changelog (v0.7.4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
716e4c93 — Garry Tan a month ago
Merge pull request #128 from xr843/fix/chain-duplication-and-flush-perf

fix: eliminate duplicate command sets in chain, improve flush perf and type safety
1a100a2a — xianren a month ago
fix: eliminate duplicate command sets in chain, improve flush perf and type safety

- Remove duplicate CHAIN_READ/CHAIN_WRITE/CHAIN_META sets from meta-commands.ts
  and import from commands.ts (single source of truth). The duplicated sets would
  silently fail to route new commands added to commands.ts.
- Replace read+concat+write log flush with fs.appendFileSync — O(new entries)
  instead of O(total log size) per flush cycle.
- Replace `any` types for contextOptions with Playwright's BrowserContextOptions
  and add proper types for storage state in recreateContext().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
318ffdbd — Garry Tan a month ago
fix: js statement wrapping + click auto-routes option to selectOption (v0.4.5) (#117)

* fix: js statement wrapping + click auto-routes option to selectOption

Bug 1: js command wrapped all code as expressions — const, semicolons,
and multi-line code broke with SyntaxError. Added needsBlockWrapper()
and wrapForEvaluate() helpers (shared with eval) to detect statements
and use block wrapper {…} instead of expression wrapper (…).

Bug 2: clicking <option> refs hung forever because Playwright can't
.click() native select UI. Click handler now checks ARIA role + DOM
tagName and auto-routes to selectOption() via parent <select>.

Bug 3: click timeouts on <option> elements gave no guidance. Now
throws helpful error: "Use browse select instead of click."

* chore: bump version and changelog (v0.4.5)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
f3ee0ee2 — Garry Tan a month ago
feat: QA restructure, browser ref staleness, eval efficiency metrics (v0.4.0) (#83)

* feat: browser ref staleness detection via async count() validation

resolveRef() now checks element count to detect stale refs after page
mutations (e.g. SPA navigation). RefEntry stores role+name metadata
for better diagnostics. 3 new snapshot tests for staleness detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: qa-only skill, qa fix loop, plan-to-QA artifact flow

Add /qa-only (report-only, Edit tool blocked), restructure /qa with
find-fix-verify cycle, add {{QA_METHODOLOGY}} DRY placeholder for
shared methodology. /plan-eng-review now writes test-plan artifacts
to ~/.gstack/projects/<slug>/ for QA consumption.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: eval efficiency metrics — turns, duration, commentary across all surfaces

Add generateCommentary() for natural-language delta interpretation,
per-test turns/duration in comparison and summary output, judgePassed
unit tests, 3 new E2E tests (qa-only, qa fix loop, plan artifact).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump version and changelog (v0.4.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update ARCHITECTURE, BROWSER, CONTRIBUTING, README for v0.4.0

- ARCHITECTURE: add ref staleness detection section, update RefEntry type
- BROWSER: add ref staleness paragraph to snapshot system docs
- CONTRIBUTING: update eval tool descriptions with commentary feature
- README: fix missing qa-only in project-local uninstall command

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add user-facing benefit descriptions to v0.4.0 changelog

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
07b4e15b — Garry Tan a month ago
feat: v0.3.2 — project-local state, diff-aware QA, Greptile integration (#36)

* fix: cookie import picker returns JSON instead of HTML

jsonResponse() was defined at module scope but referenced `url` which
only existed as a parameter of handleCookiePickerRoute(). Every API call
crashed, the catch block also crashed, and Bun returned a default HTML
page that the frontend couldn't parse as JSON.

Thread port via corsOrigin() helper and options objects. Add route-level
tests to prevent this class of bug from shipping again.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add help command to browse server

Agents that don't have SKILL.md loaded (or misread flags) had no way to
self-discover the CLI. The help command returns a formatted reference of
all commands and snapshot flags.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: version-aware find-browse with META signal protocol

Agents in other workspaces found stale browse binaries that were missing
newer flags. find-browse now compares the local binary's git SHA against
origin/main via git ls-remote (4hr cache), and emits META:UPDATE_AVAILABLE
when behind. SKILL.md setup checks parse META signals and prompt the user
to update.

- New compiled binary: browse/dist/find-browse (TypeScript, testable)
- Bash shim at browse/bin/find-browse delegates to compiled binary
- .version file written at build time with git commit SHA
- Build script compiles both browse and find-browse binaries
- Graceful degradation: offline, missing .version, corrupt cache all skip check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: clean up .bun-build temp files after compile

bun build --compile leaves ~58MB temp files in the working directory.
Add rm -f .*.bun-build to the build script to clean up after each build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make help command reachable by removing it from META_COMMANDS

help was in META_COMMANDS, so it dispatched to handleMetaCommand() which
threw "Unknown meta command: help". Removing it from the set lets the
dedicated else-if handler in handleCommand() execute correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump version and changelog (v0.3.2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add shared Greptile comment triage reference doc

Shared reference for fetching, filtering, and classifying Greptile
review comments on GitHub PRs. Used by both /review and /ship skills.
Includes parallel API fetching, suppressions check, classification
logic, reply APIs, and history file writes.

* feat: make /review and /ship Greptile-aware

/review: Step 2.5 fetches and classifies Greptile comments, Step 5
resolves them with AskUserQuestion for valid issues and false positives.

/ship: Step 3.75 triages Greptile comments between pre-landing review
and version bump. Adds Greptile Review section to PR body in Step 8.
Re-runs tests if any Greptile fixes are applied.

* feat: add Greptile batting average to /retro

Reads ~/.gstack/greptile-history.md, computes signal ratio
(valid catches vs false positives), includes in metrics table,
JSON snapshot, and Code Quality Signals narrative.

* docs: add Greptile integration section to README

Personal endorsement, two-layer review narrative, full UX walkthrough
transcript, skills table updates. Add Greptile training feedback loop
to TODO.md future ideas.

* feat: add local dev mode for testing skills from within the repo

bin/dev-setup creates .claude/skills/gstack symlink to the working tree
so Claude Code discovers skills locally. bin/dev-teardown cleans up.
DEVELOPING_GSTACK.md documents the workflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: narrow gitignore to .claude/skills/ instead of all .claude/

Avoids ignoring legitimate Claude Code config like settings.json or CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: rename DEVELOPING_GSTACK.md to CONTRIBUTING.md

Rewritten as a contributor-friendly guide instead of a dry plan doc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: explain why dev-setup is needed in CONTRIBUTING.md quick start

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add browser interaction guidance to CLAUDE.md

Prevents Claude from using mcp__claude-in-chrome__* tools instead of /browse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add shared config module for project-local browse state

Centralizes path resolution (git root detection, state dir, log paths) into
config.ts. Both cli.ts and server.ts import from it, eliminating duplicated
PORT_OFFSET/BROWSE_PORT/STATE_FILE logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: rewrite port selection to use random ports

Replace CONDUCTOR_PORT magic offset and 9400-9409 scan with random port
10000-60000. Atomic state file writes, log paths from config module,
binaryVersion field for auto-restart on update.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: move browse state from /tmp to project-local .gstack/

CLI now uses config module for state paths, passes BROWSE_STATE_FILE to
spawned server. Adds version mismatch auto-restart, legacy /tmp cleanup
with PID verification, and removes stale global install fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update crash log path reference to .gstack/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add config tests and update CLI lifecycle test

14 new tests for config resolution, ensureStateDir, readVersionHash,
resolveServerScript, and version mismatch detection. Remove obsolete
CONDUCTOR_PORT/BROWSE_PORT filtering from commands.test.ts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update BROWSER.md and TODO.md for project-local state

Replace /tmp paths with .gstack/, remove CONDUCTOR_PORT docs, document
random port selection and per-project isolation. Add server bundling TODO.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update README, CHANGELOG, and CONTRIBUTING for v0.3.2

- README: replace Conductor-aware language with project-local isolation,
  add Greptile setup note
- CHANGELOG: comprehensive v0.3.2 entry with all state management changes
- CONTRIBUTING: add instructions for testing branches in other repos

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add diff-aware mode to /qa — auto-tests affected pages from branch diff

When on a feature branch, /qa now reads git diff main, identifies affected
pages/routes from changed files, and tests them automatically. No URL required.
The most natural flow: write code, /ship, /qa.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: update CHANGELOG for complete v0.3.2 coverage

Add missing entries: diff-aware QA mode, Greptile integration,
local dev mode, crash log path fix, README/SKILL.md updates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
f7b95329 — Garry Tan a month ago
feat: Phase 3.5 — cookie import, QA testing, team retro (v0.3.1) (#29)

* Phase 2: Enhanced browser — dialog handling, upload, state checks, snapshots

- CircularBuffer O(1) ring buffer for console/network/dialog (was O(n) array+shift)
- Async buffer flush with Bun.write() (was appendFileSync)
- Dialog auto-accept/dismiss with buffer + prompt text support
- File upload command (upload <sel> <file...>)
- Element state checks (is visible/hidden/enabled/disabled/checked/editable/focused)
- Annotated screenshots with ref labels overlaid (-a flag)
- Snapshot diffing against previous snapshot (-D flag)
- Cursor-interactive element scan for non-ARIA clickables (-C flag)
- Snapshot scoping depth limit (-d N flag)
- Health check with page.evaluate + 2s timeout
- Playwright error wrapping — actionable messages for AI agents
- Fix useragent — context recreation preserves cookies/storage/URLs
- wait --networkidle / --load / --domcontentloaded flags
- console --errors filter (error + warning only)
- cookie-import <json-file> with auto-fill domain from page URL
- 166 integration tests (was ~63)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Phase 2: Rewrite SKILL.md as QA playbook + command reference

Reorient SKILL.md files from raw command reference to QA-first playbook
with 10 workflow patterns (test user flows, verify deployments, dogfood
features, responsive layouts, file upload, forms, dialogs, compare pages).
Compact command reference tables at the bottom.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Phase 3: /qa skill — systematic QA testing with health scores

New /qa skill for systematic web app QA testing. Three modes:
- full: 5-10 documented issues with screenshots and repro steps
- quick: 30-second smoke test with health score
- regression: compare against saved baseline

Includes issue taxonomy (7 categories, 4 severity levels), structured
report template, health score rubric (weighted across 7 categories),
framework detection guidance (Next.js, Rails, WordPress, SPA).

Also adds browse/bin/find-browse (DRY binary discovery using git
rev-parse), .gstack/ to .gitignore, and updated TODO roadmap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Bump to v0.3.0 — Phase 2 + Phase 3 changelog

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: cookie-import-browser — Chromium cookie decryption module + tests

Pure logic module for reading and decrypting cookies from macOS Chromium
browsers (Comet, Chrome, Arc, Brave, Edge). Supports v10 AES-128-CBC
encryption with macOS Keychain access, PBKDF2 key derivation, and
per-browser key caching. 18 unit tests with encrypted cookie fixtures.

* feat: cookie picker web UI + route handler

Two-panel dark-theme picker served from the browse server. Left panel
shows source browser domains with search and import buttons. Right panel
shows imported domains with trash buttons. No cookie values exposed.
6 API endpoints, importedDomains Set tracking, inline clearCookies.

* feat: wire cookie-import-browser into browse server

Add cookie-picker route dispatch (no auth, localhost-only), add
cookie-import-browser to WRITE_COMMANDS and CHAIN_WRITE, add serverPort
property to BrowserManager, add write command with two modes (picker UI
vs --domain direct import), update CLI help text.

* chore: /setup-browser-cookies skill + docs (Phase 3.5)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump version and changelog (v0.3.1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* security: redact sensitive values from command output (PR #21)

type no longer echoes text (reports character count), cookie redacts
value with ****, header redacts Authorization/Cookie/X-API-Key/X-Auth-Token,
storage set drops value, forms redacts password fields. Prevents secrets
from persisting in LLM transcripts. 7 new tests.

Credit: fredluz (PR #21)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* security: path traversal prevention for screenshot/pdf/eval (PR #26)

Add validateOutputPath() for screenshot/pdf/responsive (restricts to
/tmp and cwd) and validateReadPath() for eval (blocks .. sequences and
absolute paths outside safe dirs). 7 new tests.

Credit: Jah-yee (PR #26)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-install Playwright Chromium in setup (PR #22)

Setup now verifies Playwright can launch Chromium, and auto-installs
it via `bunx playwright install chromium` if missing. Exits non-zero
if build or Chromium launch fails.

Credit: AkbarDevop (PR #22)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* security: fix path validation bypass, CORS restriction, cookie-import path check

- startsWith('/tmp') matched '/tmpevil' — now requires trailing slash
- CORS Access-Control-Allow-Origin changed from * to http://127.0.0.1:<port>
- cookie-import now validates file paths (was missing validateReadPath)
- 3 new tests for prefix collision and cookie-import path traversal

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review informational issues + add regression tests

- Add cookie-import to CHAIN_WRITE set for chain command routing
- Add path validation to snapshot -a -o output path
- Fix package.json version to match 0.3.1
- Use crypto.randomUUID() for temp DB paths (unpredictable filenames)
- Add regression tests for chain cookie-import and snapshot path validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add /qa, /setup-browser-cookies to README + update BROWSER.md

- Add /qa and /setup-browser-cookies to skills table, install/update/uninstall blurbs
- Add dedicated README sections for both new skills with usage examples
- Update demo workflow to show cookie import → QA → browse flow
- Update BROWSER.md: cookie import commands, new source files, test count (203)
- Update skill count from 6 to 8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: team-aware /retro v2.0 — per-person praise and growth opportunities

- Identify current user via git config, orient narrative as "you" vs teammates
- Add per-author metrics: commits, LOC, focus areas, commit type mix, sessions
- New "Your Week" section with personal deep-dive for whoever runs the command
- New "Team Breakdown" with per-person praise and growth opportunities
- Track AI-assisted commits via Co-Authored-By trailers
- Personal + team shipping streaks
- Tone: praise like a 1:1, growth like investment advice, never compare negatively

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add Conductor parallel sessions section to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
3d901066 — Garry Tan a month ago
Initial release — gstack v0.0.1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>