This document covers the command reference and internals of gstack's headless browser.
| Category | Commands | What for |
|---|---|---|
| Navigate | goto, back, forward, reload, url | Get to a page |
| Read | text, html, links, forms, accessibility | Extract content |
| Snapshot | snapshot [-i] [-c] [-d N] [-s sel] | Get refs for interaction |
| Interact | click, fill, select, hover, type, press, scroll, wait, viewport | Use the page |
| Inspect | js, eval, css, attrs, console, network, cookies, storage, perf | Debug and verify |
| Visual | screenshot, pdf, responsive | See what Claude sees |
| Compare | diff <url1> <url2> | Spot differences between environments |
| Tabs | tabs, tab, newtab, closetab | Multi-page workflows |
| Multi-step | chain (JSON from stdin) | Batch commands in one call |
All selector arguments accept CSS selectors or @ref after snapshot. 40+ commands total.
gstack's browser is a compiled CLI binary that talks to a persistent local Chromium daemon over HTTP. The CLI is a thin client — it reads a state file, sends a command, and prints the response to stdout. The server does the real work via Playwright.
    ┌─────────────────────────────────────────────────────────────────┐
    │  Claude Code                                                    │
    │                                                                 │
    │  "browse goto https://staging.myapp.com"                        │
    │       │                                                         │
    │       ▼                                                         │
    │  ┌──────────┐    HTTP POST     ┌──────────────┐                 │
    │  │ browse   │ ──────────────── │ Bun HTTP     │                 │
    │  │ CLI      │  localhost:9400  │ server       │                 │
    │  │          │  Bearer token    │              │                 │
    │  │ compiled │ ◄──────────────  │ Playwright   │──── Chromium    │
    │  │ binary   │  plain text      │ API calls    │    (headless)   │
    │  └──────────┘                  └──────────────┘                 │
    │   ~1ms startup                  persistent daemon               │
    │                                 auto-starts on first call       │
    │                                 auto-stops after 30 min idle    │
    └─────────────────────────────────────────────────────────────────┘
First call: CLI checks /tmp/browse-server.json for a running server. None found — it spawns bun run browse/src/server.ts in the background. The server launches headless Chromium via Playwright, picks a port (9400-9410), generates a bearer token, writes the state file, and starts accepting HTTP requests. This takes ~3 seconds.
Subsequent calls: CLI reads the state file, sends an HTTP POST with the bearer token, prints the response. ~100-200ms round trip.
Idle shutdown: After 30 minutes with no commands, the server shuts down and cleans up the state file. Next call restarts it automatically.
Crash recovery: If Chromium crashes, the server exits immediately (no self-healing — don't hide failure). The CLI detects the dead server on the next call and starts a fresh one.
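The four cases above reduce to one decision on the CLI side: either a usable server is recorded in the state file, or a new one has to be spawned. The following is a minimal sketch of that decision, not the actual `cli.ts` code. The `ServerState` field names, the `/command` path, and the JSON body shape are assumptions made for illustration.

```typescript
// Hypothetical shape of the state file; the real field names are not documented here.
interface ServerState {
  port: number;
  token: string;
}

type Action =
  | { kind: "spawn" } // no usable server: start one (~3 s)
  | { kind: "send"; url: string; headers: Record<string, string>; body: string };

// Decide what the CLI should do for one invocation.
// `state` is the parsed state file (null if missing); `alive` is the result
// of probing the recorded port.
function planCall(
  state: ServerState | null,
  alive: boolean,
  command: string,
  args: string[],
): Action {
  if (state === null || !alive) {
    // Covers the first call, restart after idle shutdown, and crash recovery.
    return { kind: "spawn" };
  }
  return {
    kind: "send",
    url: `http://localhost:${state.port}/command`, // path is an assumption
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${state.token}`,
    },
    body: JSON.stringify({ command, args }), // body shape is an assumption
  };
}
```

Treating a missing state file and a dead server identically keeps the client thin: it never tries to repair a server, it only starts a fresh one.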
    browse/
    ├── src/
    │   ├── cli.ts              # Thin client: reads state file, sends HTTP, prints response
    │   ├── server.ts           # Bun.serve HTTP server: routes commands to Playwright
    │   ├── browser-manager.ts  # Chromium lifecycle: launch, tabs, ref map, crash handling
    │   ├── snapshot.ts         # Accessibility tree → @ref assignment → Locator map
    │   ├── read-commands.ts    # Non-mutating commands (text, html, links, js, css, etc.)
    │   ├── write-commands.ts   # Mutating commands (click, fill, select, navigate, etc.)
    │   ├── meta-commands.ts    # Server management (status, stop, restart)
    │   └── buffers.ts          # Console + network log capture (in-memory + disk flush)
    ├── test/                   # Integration tests + HTML fixtures
    └── dist/
        └── browse              # Compiled binary (~58MB, Bun --compile)
The browser's key innovation is ref-based element selection, built on Playwright's accessibility tree API:
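1. page.locator(scope).ariaSnapshot() returns a YAML-like accessibility tree.
2. The snapshot parser assigns a ref (@e1, @e2, ...) to each element.
3. Each ref is mapped to a Playwright Locator (using getByRole + nth-child).
4. The ref-to-Locator map is stored on BrowserManager.
5. Later commands such as click @e3 look up the Locator and call locator.click().

No DOM mutation. No injected scripts. Just Playwright's native accessibility API.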
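To make the ref assignment concrete, here is a simplified sketch that works on the snapshot text alone. It is not the parser in `snapshot.ts`: the `RefEntry` type, the regular expression, and both function names are hypothetical, and it records only the role, name, and occurrence index needed to build a locator later.

```typescript
interface RefEntry {
  ref: string;  // "@e1", "@e2", ...
  role: string; // ARIA role from the snapshot line
  name: string; // accessible name, "" if the line has none
  nth: number;  // 0-based index among entries with the same role and name
}

// Assign refs to every `- role "name"` line of an aria-snapshot-like string.
function assignRefs(snapshot: string): Map<string, RefEntry> {
  const refs = new Map<string, RefEntry>();
  const seen = new Map<string, number>(); // "role\0name" -> occurrences so far
  const line = /^\s*- ([a-z]+)(?: "([^"]*)")?/;
  let counter = 0;
  for (const raw of snapshot.split("\n")) {
    const m = line.exec(raw);
    if (m === null || m[1] === "text") continue; // skip plain text nodes
    const role = m[1];
    const name = m[2] ?? "";
    const key = `${role}\u0000${name}`;
    const nth = seen.get(key) ?? 0;
    seen.set(key, nth + 1);
    counter += 1;
    const ref = `@e${counter}`;
    refs.set(ref, { ref, role, name, nth });
  }
  return refs;
}

// Selector arguments are either a ref or a plain CSS selector.
function resolveSelector(arg: string, refs: Map<string, RefEntry>): RefEntry | string {
  if (!arg.startsWith("@")) return arg; // treat as a CSS selector
  const entry = refs.get(arg);
  if (entry === undefined) throw new Error(`Unknown ref ${arg}; run snapshot first`);
  return entry;
}
```

With a real Playwright page, an entry could plausibly be turned into a locator with something like `page.getByRole(role, { name }).nth(nth)`; that mapping is an assumption about how the role and index are combined, not a description of the shipped code.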
Each server session generates a random UUID as a bearer token. The token is written to the state file (/tmp/browse-server.json) with chmod 600. Every HTTP request must include Authorization: Bearer <token>. This prevents other processes on the machine from controlling the browser.
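The check itself is small. The sketch below shows one way the server side could validate the header; `isAuthorized` is a hypothetical helper, not a function taken from `server.ts`.

```typescript
// Return true only when the Authorization header carries the expected token.
// `header` is the raw header value, or null when the request has none.
function isAuthorized(header: string | null, expected: string): boolean {
  if (header === null) return false;
  const prefix = "Bearer ";
  if (!header.startsWith(prefix)) return false;
  return header.slice(prefix.length) === expected;
}

// Map the check to an HTTP status: 401 for anything that is not an exact match.
function authStatus(header: string | null, expected: string): number {
  return isAuthorized(header, expected) ? 200 : 401;
}
```

Because the token lives only in a file readable by its owner, another local user has no way to build a matching header.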
The server hooks into Playwright's page.on('console') and page.on('response') events. All entries are kept in memory and flushed to disk every second:
- Console: /tmp/browse-console.log
- Network: /tmp/browse-network.log

The console and network commands read from the in-memory buffers, not disk.
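A minimal sketch of this buffering pattern follows, assuming a hypothetical `LogBuffer` class with an injected sink. It illustrates the in-memory plus periodic flush idea and is not the contents of `buffers.ts`.

```typescript
// In-memory log buffer that hands unflushed entries to a sink on demand.
class LogBuffer {
  private entries: string[] = [];
  private flushed = 0; // number of entries already passed to the sink
  private readonly sink: (lines: string[]) => void;

  constructor(sink: (lines: string[]) => void) {
    this.sink = sink;
  }

  // Called from the page event handlers.
  add(entry: string): void {
    this.entries.push(entry);
  }

  // What a read command would serve: memory only, no disk access.
  read(): readonly string[] {
    return this.entries;
  }

  // Pass along entries added since the last flush; meant to run once per second.
  flush(): void {
    if (this.flushed === this.entries.length) return;
    this.sink(this.entries.slice(this.flushed));
    this.flushed = this.entries.length;
  }
}
```

In a server, the sink would append lines to the log file (for example with `appendFileSync` from `node:fs`) and `flush` would be driven by `setInterval(() => buffer.flush(), 1000)`. Injecting the sink keeps the buffer easy to test without touching disk.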
Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs.
If CONDUCTOR_PORT is set (e.g., by Conductor), the browse port is derived deterministically:
browse_port = CONDUCTOR_PORT - 45600
| Workspace | CONDUCTOR_PORT | Browse port | State file |
|---|---|---|---|
| Workspace A | 55040 | 9440 | /tmp/browse-server-9440.json |
| Workspace B | 55041 | 9441 | /tmp/browse-server-9441.json |
| No Conductor | — | 9400 (scan) | /tmp/browse-server.json |
You can also set BROWSE_PORT directly.
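The derivation can be sketched as a pure function over the environment. The function names are hypothetical, and the precedence (an explicit `BROWSE_PORT` wins over `CONDUCTOR_PORT`) is an assumption; the text above does not state which takes priority.

```typescript
type Env = Record<string, string | undefined>;

const CONDUCTOR_OFFSET = 45600;

// Returns the port to bind, or 0 to mean "scan 9400-9410 for a free port".
function resolveBrowsePort(env: Env): number {
  // Assumption: an explicit BROWSE_PORT takes priority over CONDUCTOR_PORT.
  const explicit = Number.parseInt(env.BROWSE_PORT ?? "", 10);
  if (Number.isInteger(explicit) && explicit > 0) return explicit;

  const conductor = Number.parseInt(env.CONDUCTOR_PORT ?? "", 10);
  if (Number.isInteger(conductor)) return conductor - CONDUCTOR_OFFSET;

  return 0;
}

// State file path per the workspace table: one file per derived port
// under Conductor, the shared default otherwise.
function stateFileFor(env: Env): string {
  const conductor = Number.parseInt(env.CONDUCTOR_PORT ?? "", 10);
  return Number.isInteger(conductor)
    ? `/tmp/browse-server-${conductor - CONDUCTOR_OFFSET}.json`
    : "/tmp/browse-server.json";
}
```

For example, `resolveBrowsePort({ CONDUCTOR_PORT: "55040" })` yields 9440, matching Workspace A in the table.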
| Variable | Default | Description |
|---|---|---|
| BROWSE_PORT | 0 (auto-scan 9400-9410) | Fixed port for the HTTP server |
| CONDUCTOR_PORT | (none) | If set, browse port = this - 45600 |
| BROWSE_IDLE_TIMEOUT | 1800000 (30 min) | Idle shutdown timeout in ms |
| BROWSE_STATE_FILE | /tmp/browse-server.json | Path to state file |
| BROWSE_SERVER_SCRIPT | auto-detected | Path to server.ts |
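As an illustration of how `BROWSE_IDLE_TIMEOUT` might drive the idle shutdown, here is a small sketch with an injected clock. `parseIdleTimeout` and `IdleTracker` are hypothetical names, and falling back to the default on invalid input is an assumption.

```typescript
const DEFAULT_IDLE_MS = 1_800_000; // 30 minutes

// Parse the env value; fall back to the default when unset or invalid.
function parseIdleTimeout(raw: string | undefined): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : DEFAULT_IDLE_MS;
}

// Tracks the time of the last command. Timestamps are passed in so the
// logic can be exercised without real timers.
class IdleTracker {
  private lastActivity: number;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number, now: number) {
    this.timeoutMs = timeoutMs;
    this.lastActivity = now;
  }

  // Call on every incoming command.
  touch(now: number): void {
    this.lastActivity = now;
  }

  // True once the configured idle period has fully elapsed.
  isExpired(now: number): boolean {
    return now - this.lastActivity >= this.timeoutMs;
  }
}
```

A server could poll `isExpired(Date.now())` on a timer and, when it returns true, remove the state file and exit.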
| Tool | First call | Subsequent calls | Context overhead per call |
|---|---|---|---|
| Chrome MCP | ~5s | ~2-5s | ~2000 tokens (schema + protocol) |
| Playwright MCP | ~3s | ~1-3s | ~1500 tokens (schema + protocol) |
| gstack browse | ~3s | ~100-200ms | 0 tokens (plain text stdout) |
The context overhead difference compounds fast. In a 20-command browser session, MCP tools burn 30,000-40,000 tokens on protocol framing alone. gstack burns zero.
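The range comes straight from the per-call figures in the table multiplied by the number of commands. A short sketch of that arithmetic, using only the table's approximate numbers:

```typescript
// Approximate per-call context overhead from the comparison table, in tokens.
const overheadPerCall = {
  chromeMcp: 2000,
  playwrightMcp: 1500,
  gstackBrowse: 0,
};

// Total framing overhead for a session of `calls` commands.
function sessionOverhead(tokensPerCall: number, calls: number): number {
  return tokensPerCall * calls;
}

const calls = 20;
const low = sessionOverhead(overheadPerCall.playwrightMcp, calls); // 1500 * 20 = 30000
const high = sessionOverhead(overheadPerCall.chromeMcp, calls);    // 2000 * 20 = 40000
const gstack = sessionOverhead(overheadPerCall.gstackBrowse, calls); // 0
```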
MCP (Model Context Protocol) works well for remote services, but for local browser automation it adds pure overhead: protocol framing, tool schemas, and connection management.
gstack skips all of this. Compiled binary. Plain text in, plain text out. No protocol. No schema. No connection management.
The browser automation layer is built on Playwright by Microsoft. Playwright's accessibility tree API, locator system, and headless Chromium management are what make ref-based interaction possible. The snapshot system — assigning @ref labels to accessibility tree nodes and mapping them back to Playwright Locators — is built entirely on top of Playwright's primitives. Thank you to the Playwright team for building such a solid foundation.
    bun install              # install dependencies + Playwright Chromium
    bun test                 # run integration tests (~3s)
    bun run dev <cmd>        # run CLI from source (no compile)
    bun run build            # compile to browse/dist/browse
During development, use bun run dev instead of the compiled binary. It runs browse/src/cli.ts directly with Bun, so you get instant feedback without a compile step:
    bun run dev goto https://example.com
    bun run dev text
    bun run dev snapshot -i
    bun run dev click @e3
The compiled binary (bun run build) is only needed for distribution. It produces a single ~58MB executable at browse/dist/browse using Bun's --compile flag.
    bun test                         # run all tests
    bun test browse/test/commands    # run command integration tests only
    bun test browse/test/snapshot    # run snapshot tests only
Tests spin up a local HTTP server (browse/test/test-server.ts) serving HTML fixtures from browse/test/fixtures/, then exercise the CLI commands against those pages. Tests take ~3 seconds.
| File | Role |
|---|---|
| browse/src/cli.ts | Entry point. Reads /tmp/browse-server.json, sends HTTP to the server, prints response. |
| browse/src/server.ts | Bun HTTP server. Routes commands to the right handler. Manages idle timeout. |
| browse/src/browser-manager.ts | Chromium lifecycle: launch, tab management, ref map, crash detection. |
| browse/src/snapshot.ts | Parses Playwright's accessibility tree, assigns @ref labels, builds Locator map. |
| browse/src/read-commands.ts | Non-mutating commands: text, html, links, js, css, forms, etc. |
| browse/src/write-commands.ts | Mutating commands: goto, click, fill, select, scroll, etc. |
| browse/src/meta-commands.ts | Server management: status, stop, restart. |
| browse/src/buffers.ts | In-memory + disk capture for console and network logs. |
The active skill lives at ~/.claude/skills/gstack/. After making changes:
1. Pull the latest changes: cd ~/.claude/skills/gstack && git pull
2. Rebuild the binary: cd ~/.claude/skills/gstack && bun run build

Or copy the binary directly: cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse
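To add a new command:

1. Add the handler in read-commands.ts (non-mutating) or write-commands.ts (mutating).
2. Register the route in server.ts.
3. Add a test case in browse/test/commands.test.ts, with an HTML fixture if needed.
4. Run bun test to verify.
5. Run bun run build to compile.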
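The split between non-mutating and mutating handlers can be pictured as two lookup tables behind a single dispatcher. The sketch below is purely illustrative: the `Handler` signature, the two maps, and the `upper` command are hypothetical, and real handlers presumably receive a Playwright page and run asynchronously.

```typescript
// Hypothetical handler shape, simplified to a synchronous function.
type Handler = (args: string[]) => string;

const readCommands = new Map<string, Handler>();  // non-mutating
const writeCommands = new Map<string, Handler>(); // mutating

// Add a handler. "upper" is a made-up command used only for this example.
readCommands.set("upper", (args) => args.join(" ").toUpperCase());

// Route a command name to whichever table owns it.
function dispatch(command: string, args: string[]): string {
  const handler = readCommands.get(command) ?? writeCommands.get(command);
  if (handler === undefined) {
    throw new Error(`Unknown command: ${command}`);
  }
  return handler(args);
}
```

Keeping the two tables separate makes it obvious at review time whether a new command is expected to change page state.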