feat: skill prefix is now a persistent user choice (v0.12.11.0) (#571) * feat: make skill prefix a persistent, interactive user setting - Add --prefix flag alongside --no-prefix - Read/write skill_prefix from ~/.gstack/config.yaml (true/false) - Interactive prompt on first setup when no preference saved - Non-TTY environments default to flat names (no prefix) - Add cleanup_prefixed_claude_symlinks() for reverse direction - Fix gstack-config sed portability (mktemp+mv instead of BSD sed -i '') - Add SKILL_PREFIX to preamble output with namespace-aware instruction * test: add prefix config tests + README switching instructions 8 structural tests for persistent prefix setting: config reading, --prefix flag, config persistence, interactive prompt, TTY fallback, reverse cleanup, cleanup ordering, welcome. * chore: regenerate SKILL.md files with SKILL_PREFIX preamble * chore: bump version and changelog (v0.12.11.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: reframe changelog as feature, not mea culpa * docs: update CONTRIBUTING + CLAUDE.md for prefix-aware vendoring - CONTRIBUTING: vendoring now includes ./setup step for per-skill symlinks - CONTRIBUTING: prefix choice documented in contributor workflow + dev diagram - CONTRIBUTING: switching prefix mode section added - CLAUDE.md: vendored symlink awareness section covers prefix setting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
feat: community PRs — faster install, skill namespacing, uninstall, Codex fallback, Windows fix, Python patterns (v0.12.9.0) (#561) * fix: sync package.json version with VERSION file (0.12.7.0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: shallow clone for faster install (#484) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Python/async/SSRF patterns in review checklist (#531) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: namespace skill symlinks with gstack- prefix (#503) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add uninstall script (#323) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: office-hours Claude subagent fallback when Codex unavailable (#464) Updates generateCodexSecondOpinion resolver to always offer second opinion and fall back to Claude subagent when Codex is unavailable or errors. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: findPort() race condition via net.createServer (#490) Replaces Bun.serve() port probing with net.createServer() for proper async bind/close semantics. Fixes Windows EADDRINUSE race condition. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add tests for uninstall, setup prefix, and resolver fallback - Uninstall integration tests: syntax, flags, mock install layout, upgrade path - Setup prefix tests: gstack-* prefixing, --no-prefix, cleanup migration - Resolver tests: Claude subagent fallback in generated SKILL.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.12.9.0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: headed mode + sidebar agent + Chrome extension (v0.12.0) (#517) * feat: CDP connect — control real Chrome/Comet via Playwright Add `connectCDP()` to BrowserManager: connects to a running browser via Chrome DevTools Protocol. All existing browse commands work unchanged through Playwright's abstraction layer. - chrome-launcher.ts: browser discovery, CDP probe, auto-relaunch with rollback - browser-manager.ts: connectCDP(), mode guards (close/closeTab/recreateContext/handoff), auto-reconnect on browser restart, getRefMap() for extension API - server.ts: CDP branch in start(), /health gains mode field, /refs endpoint, idle timer only resets on /command (not passive endpoints) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: browse connect/disconnect/focus CLI commands - connect: pre-server command that discovers browser, starts server in CDP mode - disconnect: drops CDP connection, restarts in headless mode - focus: brings browser window to foreground via osascript (macOS) - status: now shows Mode: cdp | launched | headed - startServer() accepts extra env vars for CDP URL/port passthrough Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: CDP-aware skill templates — skip cookie import in real browser mode Skills now check `$B status` for CDP mode and skip: - /qa: cookie import prompt, user-agent override, headless workarounds - /design-review: cookie import for authenticated pages - /setup-browser-cookies: returns "not needed" in CDP mode Regenerated SKILL.md files from updated templates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: activity streaming — SSE endpoint for Chrome extension Side Panel Real-time browse command feed via Server-Sent Events: - activity.ts: ActivityEntry type, CircularBuffer (capacity 1000), privacy filtering (redacts passwords, auth tokens, sensitive URL params), cursor-based gap detection, async subscriber notification - server.ts: /activity/stream SSE, /activity/history REST, handleCommand instrumented with command_start/command_end events - 18 unit tests for filterArgs privacy, emitActivity, subscribe lifecycle Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Chrome extension Side Panel + Conductor API proposal Chrome extension (Manifest V3, sideload): - Side Panel with live activity feed, @ref overlays, dark terminal aesthetic - Background worker: health polling, SSE relay, ref fetching - Popup: port config, connection status, side panel launcher - Content script: floating ref panel with @ref badges Conductor API proposal (docs/designs/CONDUCTOR_SESSION_API.md): - SSE endpoint for full Claude Code session mirroring in Side Panel - Discovery via HTTP endpoint (not filesystem — extensions can't read files) TODOS.md: add $B watch, multi-agent tabs, cross-platform CDP, Web Store publishing. Mark CDP mode as shipped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: detect Conductor runtime, skip osascript quit for sandboxed apps macOS App Management blocks Electron apps (Conductor) from quitting other apps via osascript. Now detects the runtime environment: - terminal/claude-code/codex: can manage apps freely - conductor: prints manual restart instructions + polls for 60s detectRuntime() checks env vars and parent process. When Chrome needs restart but we can't quit it, prints step-by-step instructions and waits for the user to restart Chrome with --remote-debugging-port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: detect Conductor via actual env vars (CONDUCTOR_WORKSPACE_NAME) Previous detection checked CONDUCTOR_WORKSPACE_ID which doesn't exist. Conductor sets CONDUCTOR_WORKSPACE_NAME, CONDUCTOR_BIN_DIR, CONDUCTOR_PORT, and __CFBundleIdentifier=com.conductor.app. Check these FIRST because Conductor sessions also have ANTHROPIC_API_KEY (which was matching claude-code). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: connection status pill — floating indicator when gstack controls Chrome Small pill in bottom-right corner of every page: "● gstack · 3 refs" Shows when connected via CDP, fades to 30% opacity after 3s, full on hover. Disappears entirely when disconnected. Background worker now notifies content scripts on connect/disconnect state changes so the pill appears/disappears without polling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Chrome requires --user-data-dir for remote debugging Chrome refuses --remote-debugging-port without an explicit --user-data-dir. Add userDataDir to BrowserBinary registry (macOS Application Support paths) and pass it in both auto-launch and manual restart instructions. Fix double-quoting in CLI manual restart instructions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Chrome must be fully quit before launching with --remote-debugging-port Chrome refuses to enable CDP on its default profile when another instance is running (even with explicit --user-data-dir). The only reliable path: fully quit Chrome first, then relaunch with the flag. Updated instructions to emphasize this clearly with verification step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: bin/chrome-cdp — quit Chrome and relaunch with CDP in one command Quits Chrome gracefully, waits for full exit, relaunches with --remote-debugging-port, polls until CDP is ready. Usage: chrome-cdp [port] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use Playwright channel:chrome instead of broken connectOverCDP Playwright's connectOverCDP hangs with Chrome 146 due to CDP protocol version mismatch. Switch to channel:'chrome' which uses Playwright's native pipe protocol to launch the system Chrome binary directly. This is simpler and more reliable: - No CDP port discovery needed - No --remote-debugging-port or --user-data-dir hassles - $B connect just works — launches real Chrome headed window - All Playwright APIs (snapshot, click, fill) work unchanged bin/chrome-cdp updated with symlinked profile approach (kept for manual CDP use cases, but $B connect no longer needs it). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: green border + gstack label on controlled Chrome window Injects a 2px green border and small "gstack" label on every page loaded in the controlled Chrome window via context.addInitScript(). Users can instantly tell which Chrome window Claude controls. Also fixes close() for channel:chrome mode (uses browser.close() not browser.disconnect() which doesn't exist). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: cleanup chrome-launcher runtime detection, remove puppeteer-core dep Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): redesign controlled Chrome indicator Replace crude green border + label with polished indicator: - 2px shimmer gradient at top edge (green→cyan→green, 3s loop) - Floating pill bottom-right with frosted glass bg, fades to 25% opacity after 4s so it doesn't compete with page content - prefers-reduced-motion disables shimmer animation - Much more subtle — looks like a developer tool, not broken CSS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: document real browser mode + Chrome extension in BROWSER.md and README.md BROWSER.md: new sections for connect/disconnect/focus commands, Chrome extension Side Panel install, CDP-aware skills, activity streaming. Updated command reference table, key components, env vars, source map. README.md: updated /browse description, added "Real browser mode" to What's New section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: step-by-step Chrome extension install guide in BROWSER.md Replace terse bullet points with numbered walkthrough covering: developer mode toggle, load unpacked, macOS file picker tip (Cmd+Shift+G), pin extension, configure port, open side panel. Added troubleshooting section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Cmd+Shift+. tip for hidden folders in macOS file picker macOS hides folders starting with . by default. Added both shortcuts: Cmd+Shift+G (paste path directly) and Cmd+Shift+. (show hidden files). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: integrate hidden folder tips into the install flow naturally Move Cmd+Shift+G and Cmd+Shift+. tips inline with the file picker step instead of as a separate tip block after it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: auto-load Chrome extension when $B connect launches Chrome Extension auto-loads via --load-extension flag — no manual chrome://extensions install needed. findExtensionPath() checks repo root, global install, and dev paths. Also adds bin/gstack-extension helper for manual install in regular Chrome, and rewrites BROWSER.md install docs with auto-load as primary path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: /connect-chrome skill — one command to launch Chrome with Side Panel New skill that runs $B connect, verifies the connection, guides the user to open the Side Panel, and demos the live activity feed. Extension auto-loads via --load-extension so no manual chrome://extensions install needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use launchPersistentContext for Chrome extension loading Playwright's chromium.launch() silently ignores --load-extension. Switch to launchPersistentContext with ignoreDefaultArgs to remove --disable-extensions flag. Use bundled Chromium (real Chrome blocks unpacked extensions). Fixed port 34567 for CDP mode so the extension auto-connects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sync extension to DESIGN.md — amber accent, zinc neutrals, grain texture Import design system from gstack-website. Update all extension colors: green (#4ade80) → amber (#F59E0B/#FBBF24), zinc gray neutrals, grain texture overlay. Regenerate icons as amber "G" monogram on dark background. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar chat with Claude Code — icon opens side panel directly Replace popup flyout with direct side panel open on icon click. Primary UI is now a chat interface that sends messages to Claude Code via file queue. Activity/Refs tabs moved behind a debug toggle in the footer. Command bar with history, auto-poll for responses, amber design system. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar agent — Claude-powered chat backend via file queue Add /sidebar-command, /sidebar-response, and /sidebar-chat endpoints to the browse server. sidebar-agent.ts watches the command queue file, spawns claude -p with browse context for each message, and streams responses back to the sidebar chat. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate gstack pill overlay, hide crash restore bubble The addInitScript indicator and the extension's content script were both injecting bottom-right pills, causing duplicates. Remove the pill from addInitScript (extension handles it). Replace --restore-last-session with --hide-crash-restore-bubble to suppress the "Chromium didn't shut down correctly" dialog. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: state file authority — CDP server cannot be silently replaced Hardens the connect/disconnect lifecycle: - ensureServer() refuses to auto-start headless when CDP server is alive - $B connect does full cleanup: SIGTERM → 2s → SIGKILL, profile locks, state - shutdown() cleans Chromium SingletonLock/Socket/Cookie files - uncaughtException/unhandledRejection handlers do emergency cleanup This prevents the bug where a headless server overwrites the CDP server's state file, causing $B commands to hit the wrong browser. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar agent streaming events + session state management Enhance sidebar-agent.ts with: - Live streaming of claude -p events (tool_use, text, result) to sidebar - Session state file for BROWSE_STATE_FILE propagation to claude subprocess - Improved logging (stderr, exit codes, event types) - stdin.end() to prevent claude waiting for input - summarizeToolInput() with path shortening for compact sidebar display Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar chat UI — streaming events, agent status, reconnect retry Sidebar panel improvements: - Chat tab renders streaming agent events (tool_use, text, result) - Thinking dots animation while agent processes - Agent error display with styled error blocks - tryConnect() with 2s retry loop for initial connection - Debug tabs (Activity/Refs) hidden behind gear toggle - Clear chat button - Compact tool call display with path shortening Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: server-integrated sidebar agent with sessions and message queue Move the sidebar agent from a separate bun process into server.ts: - Agent spawns claude -p directly when messages arrive via /sidebar-command - In-memory chat buffer backed by per-session chat.jsonl on disk - Session manager: create, load, persist, list sessions - Message queue (cap 5) with agent status tracking (idle/processing/hung) - Stop/kill endpoints with queue dismiss support - /health now returns agent status + session info - All sidebar endpoints require Bearer auth - Agent killed on server shutdown - 120s timeout detects hung claude processes Eliminates: file-queue polling, separate sidebar-agent.ts process, stale auth tokens, state file conflicts between processes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: extension auth + token flow for server-integrated agent Update Chrome extension to use Bearer auth on all sidebar endpoints: - background.js captures auth token from /health, exposes via getToken msg - background.js sets openPanelOnActionClick for direct side panel access - sidepanel.js gets token from background, sends in all fetch headers - Health broadcasts include token so sidebar auto-authenticates - Removes popup from manifest — icon click opens side panel directly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: self-healing sidebar — reconnect banner, state machine, copy button Sidebar UI now handles disconnection gracefully: - Connection state machine: connected → reconnecting → dead - Amber pulsing banner during reconnect (2s retry, 30 attempts) - Red "Server offline" banner with Reconnect + Copy /connect-chrome buttons - Green "Reconnected" toast that fades after 3s on successful reconnect - Copy button lets user paste /connect-chrome into any Claude Code session Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: crash handling — save session, kill agent, distinct exit codes Hardened shutdown/crash behavior: - Browser disconnect exits with code 2 (distinct from crash code 1) - emergencyCleanup kills agent subprocess and saves session state - Clean shutdown saves session before exit (chat history persists) - Clear user message on browser disconnect: "Run $B connect to reconnect" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: worktree-per-session isolation for sidebar agent Each sidebar session gets an isolated git worktree so the agent's file operations don't conflict with the user's working directory: - createWorktree() creates detached HEAD worktree in ~/.gstack/worktrees/ - Falls back to main cwd for non-git repos or on creation failure - Handles collision cleanup from prior crashes - removeWorktree() cleans up on session switch and shutdown - worktreePath persisted in session.json Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): ISSUE-001 — disconnect blocked by CDP guard in ensureServer $B disconnect was routed through ensureServer() which refused to start a headless server when a CDP state file existed. Disconnect is now handled before ensureServer() (like connect), with force-kill + cleanup fallback when the CDP server is unresponsive. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve claude binary path for daemon-spawned agent The browse server runs as a daemon and may not inherit the user's shell PATH. Add findClaudeBin() that checks ~/.local/bin/claude (standard install location), which claude, and common system paths. Shows a clear error in the sidebar chat if claude CLI is not found. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve claude symlinks + check Conductor bundled binary posix_spawn fails on symlinks in compiled bun binaries. Now: - Checks Conductor app's bundled binary first (not a symlink) - Scans ~/.local/share/claude/versions/ for direct versioned binaries - Uses fs.realpathSync() to resolve symlinks before spawning Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: compiled bun binary cannot posix_spawn — use external agent process Compiled bun binaries fail posix_spawn on ALL executables (even /bin/bash). The server now writes to an agent queue file, and a separate non-compiled bun process (sidebar-agent.ts) reads the queue, spawns claude, and POSTs events back via /sidebar-agent/event. Changes: - server.ts: spawnClaude writes to queue file instead of spawning directly - server.ts: new /sidebar-agent/event endpoint for agent → server relay - server.ts: fix result event field name (event.text vs event.result) - sidebar-agent.ts: rewritten to poll queue file, relay events via HTTP - cli.ts: $B connect auto-starts sidebar-agent as non-compiled bun process Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: loading spinner on sidebar open while connecting to server Shows an amber spinner with "Connecting..." when the sidebar first opens, replacing the empty state. After the first successful /sidebar-chat poll: - If chat history exists: renders it immediately - If no history: shows the welcome message Prevents the jarring empty-then-populated flash on sidebar open. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: zero-friction side panel — auto-open on install, pill is clickable Three changes to eliminate manual side panel setup: - Auto-open side panel on extension install/update (onInstalled listener) - gstack pill (bottom-right) is now clickable — opens the side panel - Pill has pointer-events: auto so clicks always register (was: none) User no longer needs to find the puzzle piece icon, pin the extension, or know the side panel exists. It opens automatically on first launch and can be re-opened by clicking the floating gstack pill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: kill CDP naming, delete chrome-launcher.ts dead code The connectCDP() method and connectionMode: 'cdp' naming was a legacy artifact — real Chrome was tried but failed (silently blocks --load-extension), so the implementation already used Playwright's bundled Chromium via launchPersistentContext(). The naming was misleading. Changes: - Delete chrome-launcher.ts (361 LOC) — only import was in unreachable attemptReconnect() method - Delete dead attemptReconnect() and reconnecting field - Delete preExistingTabIds (was for protecting real Chrome tabs we never connect to) - Rename connectCDP() → launchHeaded() - Rename connectionMode: 'cdp' → 'headed' across all files - Replace BROWSE_CDP_URL/BROWSE_CDP_PORT env vars with BROWSE_HEADED=1 - Regenerate SKILL.md files for updated command descriptions - Move BrowserManager unit tests to browser-manager-unit.test.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: converge handoff into connect — extension loads on handoff Handoff now uses launchPersistentContext() with extension auto-loading, same as the connect/launchHeaded() path. This means when the agent gets stuck (2FA, CAPTCHA) and hands off to the user, the Chrome extension + side panel are available automatically. Before: handoff used chromium.launch() + newContext() — no extension After: handoff uses chromium.launchPersistentContext() — extension loads Also sets connectionMode to 'headed' and disables dialog auto-accept on handoff, matching connect behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: gate sidebar chat behind --chat flag $B connect (default): headed Chromium + extension with Activity + Refs tabs only. No separate agent spawned. Clean, no confusion. $B connect --chat: same + Chat tab with standalone claude -p agent. Shows experimental banner: "Standalone mode — this is a separate agent from your workspace." Implementation: - cli.ts: parse --chat, set BROWSE_SIDEBAR_CHAT env, conditionally spawn sidebar-agent - server.ts: gate /sidebar-* routes behind chatEnabled, return 403 when disabled, include chatEnabled in /health response - sidepanel.js: applyChatEnabled() hides/shows Chat tab + banner - background.js: forward chatEnabled from health response - sidepanel.html/css: experimental banner with amber styling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: file drop relay + $B inbox command Sidebar agent now writes structured messages to .context/sidebar-inbox/ when processing user input. The workspace agent can read these via $B inbox to see what the user reported from the browser. File drop format: .context/sidebar-inbox/{timestamp}-observation.json { type, timestamp, page: {url}, userMessage, sidebarSessionId } Atomic writes (tmp + rename) prevent partial reads. $B inbox --clear removes messages after display. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: $B watch — passive observation mode Claude enters read-only mode and captures periodic snapshots (every 5s) while the user browses. Mutation commands (click, fill, etc.) are blocked during watch. $B watch stop exits and returns a summary with the last snapshot. Requires headed mode ($B connect). This is the inverse of the scout pattern — the workspace agent watches through the browser instead of the sidebar relaying to it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add coverage for sidebar-agent, file-drop, and watch mode 33 new tests covering: - Sidebar agent queue parsing (valid/malformed/empty JSONL) - writeToInbox file drop (directory creation, atomic writes, JSON format) - Inbox command (display, sorting, --clear, malformed file handling) - Watch mode state machine (start/stop cycles, snapshots, duration) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: TODOS cleanup + Chrome vs Chromium exploration doc - Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED - Delete dead "cross-platform CDP browser discovery" TODO - Rename dependencies from "CDP connect" to "headed mode" - Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing the architecture exploration and decision to use Playwright Chromium Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Conductor Chrome sidebar integration design doc Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sidebar-agent validates cwd before spawning claude The queue entry may reference a worktree that was cleaned up between sessions. Now falls back to process.cwd() if the path doesn't exist, preventing silent spawn failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: gen-skill-docs resolver merge + preamble tier gate + plan file discovery The local RESOLVERS record in gen-skill-docs.ts was shadowing the imported canonical resolvers, causing stale test coverage and preamble generators to be used instead of the authoritative versions in resolvers/. Changes: - Merge imported RESOLVERS with local overrides (spread + override pattern) - Fix preamble tier gate: tier 1 skills no longer get AskUserQuestion format - Make plan file discovery host-agnostic (search multiple plan dirs) - Add missing E2E tier entries for ship/review plan completion tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: ungate sidebar agent + raise timeout to 5 minutes (v0.12.0) Sidebar chat is now always available in headed mode — no --chat flag needed. Agent tasks get 5 minutes instead of 2, enabling multi-page workflows like navigating directories and filling forms across pages. Changes: - cli.ts: remove --chat flag, always set BROWSE_SIDEBAR_CHAT=1, always spawn agent - server.ts: remove chatEnabled gate (403 response), raise AGENT_TIMEOUT_MS to 300s - sidebar-agent.ts: raise child process timeout from 120s to 300s Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: headed mode + sidebar agent documentation (v0.12.0) - README: sidebar agent section, personal automation example (school parent portal), two auth paths (manual login + cookie import), DevTools MCP mention - BROWSER.md: sidebar agent section with usage, timeout, session isolation, authentication, and random delay documentation - connect-chrome template: add sidebar chat onboarding step - CHANGELOG: v0.12.0 entry covering headed mode, sidebar agent, extension - VERSION: bump to 0.12.0.0 - TODOS: Chrome DevTools MCP integration as P0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files Generated from updated templates + resolver merge. Key changes: - Tier 1 skills no longer include AskUserQuestion format section - Ship/review skills now include coverage gate with thresholds - Connect-chrome skill includes sidebar chat onboarding step - Plan file discovery uses host-agnostic paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate Codex connect-chrome skill Updated preamble with proactive prompt and sidebar chat onboarding step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: network idle, state persistence, iframe support, chain pipe format (v0.12.1.0) (#516) * feat: network idle detection + chain pipe format - Upgrade click/fill/select from domcontentloaded to networkidle wait (2s timeout, best-effort). Catches XHR/fetch triggered by interactions. - Add pipe-delimited format to chain as JSON fallback: $B chain 'goto url | click @e5 | snapshot -ic' - Add post-loop networkidle wait in chain when last command was a write. - Frame-aware: commands use target (getActiveFrameOrPage) for locator ops, page-only ops (goto/back/forward/reload) guard against frame context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: $B state save/load + $B frame — new browse commands - state save/load: persist cookies + URLs to .gstack/browse-states/{name}.json File perms 0o600, name sanitized to [a-zA-Z0-9_-]. V1 skips localStorage (breaks on load-before-navigate). Load replaces session via closeAllPages(). - frame: switch command context to iframe via CSS selector, @ref, --name, or --url. 'frame main' returns to main frame. Execution target abstraction (getActiveFrameOrPage) across read-commands, snapshot, and write-commands. - Frame context cleared on tab switch, navigation, resume, and handoff. - Snapshot shows [Context: iframe src="..."] header when in frame. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add tests for network idle, chain pipe format, state, and frame - Network idle: click on fetch button waits for XHR, static click is fast - Chain pipe: pipe-delimited commands, quoted args, JSON still works - State: save/load round-trip, name sanitization, missing state error - Frame: switch to iframe + back, snapshot context header, fill in frame, goto-in-frame guard, usage error New fixtures: network-idle.html (fetch + static buttons), iframe.html (srcdoc) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: review fixes — iframe ref scoping, detached frame recovery, state validation - snapshot.ts: ref locators, cursor-interactive scan, and cursor locator now use target (frame-aware) instead of page — fixes @ref clicking in iframes - browser-manager.ts: getActiveFrameOrPage auto-recovers from detached frames via isDetached() check - meta-commands.ts: state load resets activeFrame, elementHandle disposed after contentFrame(), state file schema validation (cookies + pages arrays), filter empty pipe segments in chain tokenizer - write-commands.ts: upload command uses target.locator() for frame support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files + rebuild binary Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.12.1.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: Supabase telemetry security lockdown (v0.11.16.0) (#460) * fix: drop all anon RLS policies + revoke view access + add cache table Migration 002 locks down the Supabase telemetry backend: - Drops all SELECT, INSERT, UPDATE policies for the anon role - Explicitly revokes SELECT on crash_clusters and skill_sequences views - Drops stale error_message/failed_step columns (exist live but not in migration) - Creates community_pulse_cache table for server-side aggregation caching * feat: extend community-pulse with full dashboard data + server-side cache community-pulse now returns top skills, crash clusters, version distribution, and weekly active count in a single aggregated response. Results are cached in the community_pulse_cache table (1-hour TTL) to prevent DoS via repeated expensive queries. * fix: route all telemetry through edge functions, not PostgREST - gstack-telemetry-sync: POST to /functions/v1/telemetry-ingest instead of /rest/v1/telemetry_events. Removes sed field-renaming (edge function expects raw JSONL names). Parses inserted count — holds cursor if zero inserted. - gstack-update-check: POST to /functions/v1/update-check. - gstack-community-dashboard: calls community-pulse edge function instead of direct PostgREST queries. - config.sh: removes GSTACK_TELEMETRY_ENDPOINT, fixes misleading comment. * test: RLS smoke test + telemetry field name verification - verify-rls.sh: 9-check smoke test (5 reads + 3 inserts + 1 update) verifying anon key is fully locked out after migration. - telemetry.test.ts: verifies JSONL uses raw field names (v, ts, sessions) that the edge function expects, not Postgres column names. - README.md: fixes privacy claim to match actual RLS policy. * chore: bump version and changelog (v0.11.16.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: pre-landing review fixes — JSONB field order, version filter, RLS verification - Dashboard JSON parsing: use per-object grep instead of field-order-dependent regex (JSONB doesn't preserve key order) - Version distribution: filter to skill_run events only (was counting all types) - verify-rls.sh: only 401/403 count as PASS (not empty 200 or 5xx); add Authorization header to test as anon role properly - Remove dead empty loop in community-pulse * chore: untrack browse/dist binaries — 116MB of arm64-only Mach-O These compiled Bun binaries only work on arm64 macOS, and ./setup already rebuilds from source for every platform. They were tracked despite .gitignore due to being committed before the ignore rule. Untracking stops them from appearing as modified in every diff. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: tone down changelog — security hardening, not incident report * fix: keep INSERT policies for old client compat, preserve extra columns - Keep anon INSERT policies so pre-v0.11.16 clients can still sync telemetry via PostgREST while new clients use edge functions - Add error_message/failed_step columns to migration (reconcile repo with live schema) instead of dropping them - Security fix still lands: SELECT and UPDATE policies are dropped Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: sync package.json version with VERSION file (0.11.16.0) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fix: enforce Codex 1024-char description limit + auto-heal stale installs (v0.11.9.0) (#391) * fix: enforce 1024-char Codex description limit + auto-heal stale installs Build-time guard in gen-skill-docs.ts throws if any Codex description exceeds 1024 chars. Setup always regenerates .agents/ to prevent stale files. One-time migration in gstack-update-check deletes oversized SKILL.md files so they get regenerated on next setup/upgrade. * chore: bump version and changelog (v0.11.9.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fix: Codex compatibility — 1024-char cap, duplicate skills, repo-local installs, kiro support (v0.11.2.0) (#346) * fix: cap gstack skill descriptions for codex (#251) Compresses SKILL.md.tmpl root description to <1024 chars (Codex token limit). Adds description-length validation test. Includes /autoplan in compressed skill list (added since PR was branched). Co-authored-by: cweill <cweill@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: skip sidecar dir in Codex skill linking (#269) Adds guard to skip .agents/skills/gstack in link_codex_skill_dirs() — it's a runtime asset sidecar, not a standalone skill. Prevents duplicate skill discovery and symlink overwriting. Fixes #261 Co-authored-by: mvanhorn <mvanhorn@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: generate .agents directory at setup time instead of shipping duplicates (#308) Removes 14K+ lines of committed generated Codex skill files from git. .agents/ is now gitignored and generated at setup time via `bun run gen:skill-docs --host codex`. Updates CI workflow to validate generation instead of checking committed file freshness. Co-authored-by: cskwork <cskwork@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: avoid duplicate Codex skill discovery (#236) Adds migrate_direct_codex_install() to move old direct installs from ~/.codex/skills/gstack to ~/.gstack/repos/gstack. Adds create_codex_runtime_root() to expose only runtime assets (bin/, browse/, review files) via symlinks instead of symlinking the entire repo. Fixes #235 Co-authored-by: shichangs <shichangs@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: support repo-local Codex installs (#317) Changes gen-skill-docs.ts to use dynamic $GSTACK_ROOT/$GSTACK_BIN/$GSTACK_BROWSE variables in generated Codex preambles instead of hardcoded ~/.codex/ paths. Renames GSTACK_DIR → SOURCE_GSTACK_DIR/INSTALL_GSTACK_DIR throughout setup for clarity. Supports both global (~/.codex/skills/) and repo-local (.agents/skills/) Codex installs. Co-authored-by: pengwk <pengwk@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add --host kiro support to setup script (#309) Adds Kiro CLI as a supported agent platform. Setup detects kiro-cli, copies+sed-rewrites SKILL.md paths from Codex/Claude to Kiro format, and symlinks runtime assets (bin/, browse/). Co-authored-by: AnshulDesai <AnshulDesai@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add sidecar skip, GSTACK_ROOT, and kiro coverage (T1-T3) Adds 3 tests identified during CEO/Eng review: - T1: link_codex_skill_dirs() contains sidecar skip guard - T2: generated Codex preambles use dynamic $GSTACK_ROOT paths - T3: setup supports --host kiro with INSTALL_KIRO and sed rewrites Also fixes existing test to expect kiro in --host case statement. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: review fixes — ETHOS.md, runtime root, repo-local guard, kiro assets, upgrade paths Paranoid 4-pass review found 7 issues, all fixed: - Add ETHOS.md to create_codex_runtime_root - Clean old real dirs (not just symlinks) on upgrade - Skip runtime root for repo-local installs (prevent self-referential symlinks) - Add review/, ETHOS.md, gstack-upgrade/ to Kiro install - Update gstack-upgrade to detect ~/.gstack/repos/ and .agents/skills/ - Guard --host without value from silent exit - Fix Kiro sed patterns + timeout instruction in gen-skill-docs.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.11.2.0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: remove last tracked .agents/ file from git index Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: cweill <cweill@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: mvanhorn <mvanhorn@users.noreply.github.com> Co-authored-by: cskwork <cskwork@users.noreply.github.com> Co-authored-by: shichangs <shichangs@users.noreply.github.com> Co-authored-by: pengwk <pengwk@users.noreply.github.com> Co-authored-by: AnshulDesai <AnshulDesai@users.noreply.github.com>
feat: /retro global — cross-project AI coding retrospective (v0.10.2.0) (#316) * feat: gstack-global-discover — cross-tool AI session discovery Standalone script that scans Claude Code, Codex CLI, and Gemini CLI session directories, resolves each session's working directory to a git repo, deduplicates by normalized remote URL, and outputs structured JSON. - Reads only first 4-8KB of session files (avoids OOM on large transcripts) - Only counts JSONL files modified within the time window (accurate counts) - Week windows midnight-aligned like day windows for consistency - 16 tests covering URL normalization, CLI behavior, and output structure * feat: /retro global — cross-project retro using discovery engine Adds Global Retrospective Mode to the /retro skill. When invoked as `/retro global`, skips the repo-scoped retro and instead uses gstack-global-discover to find all AI coding sessions across all tools, then runs git log on each discovered repo for a unified cross-project retrospective with global shipping streak and context-switching metrics. * chore: bump version and changelog (v0.9.9.0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: sync documentation with shipped changes Update README /retro description to mention global mode. Add bin/ directory to CLAUDE.md project structure. * feat: /retro global adds per-project personal contributions breakdown Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files after main merge * chore: bump version and changelog (v0.10.2.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: test coverage catalog — shared audit across plan/ship/review (v0.10.1.0) (#259) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: /retro global shareable personal card — screenshot-ready stats Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate Codex/agents SKILL.md for retro shareable card Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: widen retro global card — never truncate repo names Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: retro global card — left border only, drop unreliable right border Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: community security + stability fixes (wave 1) (#325) * feat: add /cso skill — OWASP Top 10 + STRIDE security audit * fix: harden gstack-slug against shell injection via eval Whitelist safe characters (a-zA-Z0-9._-) in SLUG and BRANCH output to prevent shell metacharacter injection when used with eval. Only affects self-hosted git servers with lax naming rules — GitHub and GitLab enforce safe characters already. Defense-in-depth. * fix(security): sanitize gstack-slug output against shell injection The gstack-slug script is consumed via eval $(gstack-slug) throughout skill templates. If a git remote URL contains shell metacharacters like $(), backticks, or semicolons, they would be executed by eval. Fix: strip all characters except [a-zA-Z0-9._-] from both SLUG and BRANCH before output. This preserves normal values while neutralizing any injection payload in malicious remote URLs. Before: eval $(gstack-slug) with remote "foo/bar$(rm -rf /)" → executes rm After: eval $(gstack-slug) with remote "foo/bar$(rm -rf /)" → SLUG=foo-barrm-rf- * fix(security): redact sensitive values in storage command output The browse `storage` command dumps all localStorage and sessionStorage as JSON. This can expose tokens, API keys, JWTs, and session credentials in QA reports and agent transcripts. Fix: redact values where the key matches sensitive patterns (token, secret, key, password, auth, jwt, csrf) or the value starts with known credential prefixes (eyJ for JWT, sk- for Stripe, ghp_ for GitHub, etc.). Redacted values show length to aid debugging: [REDACTED — 128 chars] * fix(browse): kill old server before restart to prevent orphaned chromium processes When the health check fails or the server connection drops, `ensureServer()` and `sendCommand()` would call `startServer()` without first killing the previous server process. This left orphaned `chrome-headless-shell` renderer processes running at ~120% CPU each. After several reconnect cycles (e.g. pages that crash during hydration or trigger hard navigations via `window.location.href`), dozens of zombie chromium processes accumulate and exhaust system resources. Fix: call `killServer()` on the stale PID before spawning a new server in both the `ensureServer()` unhealthy path and the `sendCommand()` connection- lost retry path. Fixes #294 * Fix YAML linter error: nested mapping in compact sequence entries Having "Run: bun" inside a plain scalar is not allowed per YAML spec which states: Plain scalars must never contain the “: ” and “ #” character combinations. This simple fix switches to block scalars (|) to eliminate the ambiguity without changing runtime behavior. * fix(security): add Azure metadata endpoint to SSRF blocklist Add metadata.azure.internal to BLOCKED_METADATA_HOSTS alongside the existing AWS/GCP endpoints. Closes the coverage gap identified in #125. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add coverage for storage redaction Test key-based redaction (auth_token, api_key), value-based redaction (JWT prefix, GitHub PAT prefix), pass-through for normal keys, and length preservation in redacted output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add community PR triage process to CONTRIBUTING.md Document the wave-based PR triage pattern used for batching community contributions. References PR #205 (v0.8.3) as the original example. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: adjust test key names to avoid redaction pattern collision Rename testKey→testData and normalKey→displayName in storage tests to avoid triggering #238's SENSITIVE_KEY regex (which matches 'key'). Also generate Codex variant of /cso skill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.9.10.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: zero-noise /cso security audits with FP filtering (v0.11.0.0) Absorb Anthropic's security-review false positive filtering into /cso: - 17 hard exclusions (DOS, test files, log spoofing, SSRF path-only, regex injection, race conditions unless concrete, etc.) - 9 precedents (React XSS-safe, env vars trusted, client-side code doesn't need auth, shell scripts need concrete untrusted input path) - 8/10 confidence gate — below threshold = don't report - Independent sub-agent verification for each finding - Exploit scenario requirement per finding - Framework-aware analysis (Rails CSRF, React escaping, Angular sanitization) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: consolidate CHANGELOG — merge /cso launch + community wave into v0.11.0.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite README — lead with Karpathy quote, cut LinkedIn phrases, add /cso Opens with the revolution (Karpathy, Steinberger/OpenClaw), keeps credentials and LOC numbers, cuts filler phrases, adds hater bait, restores hiring block, removes bloated "What's new" section, adds /cso to skills table and install. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(cso): adversarial review fixes — FP filtering, prompt injection, language coverage - Exclusion #10: test files must verify not imported by non-test code - Exclusion #13: distinguish user-message AI input from system-prompt injection - Exclusion #14: ReDoS in user-input regex IS a real CVE class, don't exclude - Add anti-manipulation rule: ignore audit-influencing instructions in codebase - Fix confidence gate: remove contradictory 7-8 tier, hard cutoff at 8 - Fix verifier anchoring: send only file+line, not category/description - Add Go, PHP, Java, C#, Kotlin to grep patterns (was 4 languages, now 8) - Add GraphQL, gRPC, WebSocket endpoint detection to attack surface mapping Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(docs): correct skill counts, add /autoplan to README tables Skill count was wrong in 3 places (said 19+7=26, said 25, actual is 28). Added /autoplan to specialist table. Fixed troubleshooting skills list to include all skills added since v0.7.0. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(browse): DNS rebinding protection for SSRF blocklist validateNavigationUrl is now async — resolves hostname to IP and checks against blocked metadata IPs. Prevents DNS rebinding where evil.com initially resolves to a safe IP, then switches to 169.254.169.254. All callers updated to await. Tests updated for async assertions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(browse): lockfile prevents concurrent server start races Adds exclusive lockfile (O_CREAT|O_EXCL) around ensureServer to prevent TOCTOU race where two CLI invocations could both kill the old server and start new ones, leaving an orphaned chromium process. Second caller now waits for the first to finish starting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(browse): improve storage redaction — word-boundary keys + more value prefixes Key regex: use underscore/dot/hyphen boundaries instead of \b (which treats _ as word char). Now correctly redacts auth_token, session_token while skipping keyboardShortcuts, monkeyPatch, primaryKey. Value regex: add AWS (AKIA), Stripe (sk_live_, pk_live_), Anthropic (sk-ant-), Google (AIza), Sendgrid (SG.), Supabase (sbp_) prefixes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: migrate all remaining eval callers to source, fix stale CHANGELOG claim 5 templates and 2 bin scripts still used eval $(gstack-slug). All now use source <(gstack-slug). Updated gstack-slug comment to match. Fixed v0.8.3 CHANGELOG entry that falsely claimed eval was fully eliminated — it was the output sanitization that made it safe, not a calling convention change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(docs): add /autoplan to install instructions, regen skill docs The install instruction blocks and troubleshooting section were missing /autoplan. All three skill list locations now include the complete 28-skill set. Regenerated codex/agents SKILL.md files to match template changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.11.0.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs(cso): add disclaimer — not a substitute for professional security audits LLMs can miss subtle vulns and produce false negatives. For production systems with sensitive data, hire a real firm. /cso is a first pass, not your only line of defense. Disclaimer appended to every report. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Arun Kumar Thiagarajan <arunkt.bm14@gmail.com> Co-authored-by: Tyrone Robb <tyrone.robb@icloud.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Orkun Duman <orkun1675@gmail.com>
docs: v0.9.8.0 — deploy pipeline docs + pre-merge readiness gate (#306) * docs: v0.9.8.0 — deploy pipeline + E2E performance + pre-merge gate CHANGELOG: added v0.9.8.0 entry covering /land-and-deploy, /canary, /benchmark, /setup-deploy, /review perf pass, E2E model pinning, and 3 test fixes. README: added 4 new skills to tables and install instructions, updated specialist/tool counts (18+7), added deploy pipeline to "What's new" section. /land-and-deploy: added Step 3.5 pre-merge readiness gate that checks review dashboard, E2E results, free tests, and doc-release status before merging. Uses AskUserQuestion for explicit confirmation. VERSION: 0.9.7.0 → 0.9.8.0 TODOS: updated deploy pipeline to Completed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: comprehensive pre-merge readiness gate in /land-and-deploy Step 3.5 now checks 5 dimensions before allowing merge: 1. Review staleness — compares review commit hash against HEAD, flags if significant code changes happened after last review 2. Tests — runs free tests inline, checks today's E2E and LLM eval results from ~/.gstack-dev/evals/ 3. PR body accuracy — compares PR description against actual commits, flags missing features or stale descriptions 4. Document-release — checks if CHANGELOG/VERSION were updated when new features are present in the diff 5. Full readiness report — ASCII dashboard with warnings/blockers, explicit AskUserQuestion confirmation required before merge Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: Search Before Building — builder ethos + skill integrations (v0.9.5.0) (#298) * feat: ETHOS.md — gstack builder philosophy Standalone document capturing the four principles: The Golden Age, Boil the Lake, Search Before Building, and Build for Yourself. Introduces the three-layer knowledge framework (tried-and-true, new-and-popular, first-principles) and the Eureka Moment concept — when first-principles reasoning reveals conventional wisdom is wrong. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Search Before Building preamble section + CLAUDE.md Add generateSearchBeforeBuildingSection(ctx) to gen-skill-docs.ts. Every workflow skill now gets a compact router section covering: - Three layers of knowledge (tried-and-true, new-and-popular, first-principles) - Eureka moment format and jq-based JSONL logging - WebSearch fallback clause - ETHOS.md reference via ctx.paths.skillRoot resolver Also adds compact "Search before building" section to CLAUDE.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: skill-specific Search Before Building integrations 8 template changes: - /office-hours: Phase 2.75 Landscape Awareness (WebSearch + three-layer synthesis) - /plan-eng-review: Step 0 search check with layer provenance annotations - /investigate: external pattern search + search escalation on hypothesis failure - /plan-ceo-review: Landscape Check before scope challenge - /review: search-before-recommending for fix patterns - /qa-only: WebSearch in allowed-tools - /design-consultation: three-layer synthesis backport in Phase 2 Step 3 - /retro: eureka moment tracking from ~/.gstack/analytics/eureka.jsonl All search steps include WebSearch fallback clause. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: v0.9.5.0 — Builder Ethos (CHANGELOG + VERSION + TODOS) ETHOS.md + Search Before Building across all workflow skills. Deferred: first-time intro flow (blocked on blog post). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address Codex review — sanitize search, privacy gate, ETHOS.md sidecar Three fixes from adversarial Codex review: - /investigate: sanitize error messages before searching (strip hostnames, IPs, file paths, SQL, customer data). Skip search if unsanitizable. - /office-hours: add privacy gate before landscape search. Use generalized category terms, never the user's specific product name or stealth idea. - setup: link ETHOS.md into .agents/skills/gstack/ sidecar so workspace- local Codex sessions can find the builder philosophy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sanitize Phase 2 external pattern search in /investigate The Phase 2 external search also sent raw error messages to WebSearch. Apply same sanitization rule as Phase 3 search escalation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: sync documentation with shipped changes - ARCHITECTURE.md: preamble now handles 5 things (add Search Before Building) - CLAUDE.md: add ETHOS.md to project structure tree - README.md: add ETHOS.md to docs table Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: Windows support — Node.js server fallback for Playwright (#255) * fix: Windows support — Node.js server fallback for Playwright Setup hangs on Windows 11 because Bun's child_process can't handle Playwright's --remote-debugging-pipe (fd 3/4 pipe handles). Fall back to Node.js on Windows for both the setup verification and server runtime. macOS/Linux completely unaffected — all Windows code behind IS_WINDOWS / process.platform === 'win32' guards. Based on community PR #194 by @sozairali. Fixed sed -i portability (perl -pi -e) in build-node-server.sh for macOS compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: cross-platform path handling for Windows compatibility Replace hardcoded '/tmp' and 'dir + "/"' path checks with platform-aware constants from new platform.ts module. On macOS/Linux this evaluates identically ('/tmp', '/'); on Windows it uses os.tmpdir() and path.sep. Zero behavior change on Unix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add tests for Windows polyfill, platform constants, and Node server resolution Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: Windows support in README + CHANGELOG (v0.9.1.1) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.9.3.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: multi-agent support — gstack works on Codex, Gemini CLI, and Cursor (v0.9.0) (#226) * refactor: host-aware gen-skill-docs + --host codex generation Refactor gen-skill-docs.ts for multi-agent support: - Add Host type, HostPaths interface, HOST_PATHS config - Decompose generatePreamble() into 7 composable sub-functions - Replace all hardcoded .claude/skills/gstack paths with ctx.paths - Replace static findTemplates() list with dynamic filesystem scan - Add --host codex|agents flag (aliases, same output) - Add processTemplate host routing to .agents/skills/gstack-*/ - Add codexSkillName() with double-prefix prevention - Add transformFrontmatter() — keeps only name + description for Codex - Add extractHookSafetyProse() — converts hooks to inline advisory - Add body text path rewriting for remaining hardcoded paths - Exclude /codex skill from Codex generation (self-referential) Claude output is unchanged (verified via --dry-run). SKILL.md is an open standard: .agents/skills/ works on Codex, Gemini CLI, and Cursor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: generate Codex/Gemini/Cursor skills into .agents/skills/ Generated 21 skill files for the open SKILL.md standard: - Output: .agents/skills/gstack-*/SKILL.md (one per skill) - Frontmatter: name + description only (no allowed-tools/version) - No .claude/skills/ paths in any generated file - /codex skill excluded (Claude wrapper, self-referential on Codex) - Hook skills (careful/freeze/guard) get inline safety prose - Build script generates both hosts: bun run build Supported agents (all read .agents/skills/): - Codex CLI - Gemini CLI - Cursor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: dual-host setup + find-browse for Codex/Gemini/Cursor - setup: add --host codex|claude|auto flag, install to ~/.codex/skills/ when targeting Codex, auto-detect installed agents - find-browse: priority chain .codex > .agents > .claude (both workspace-local and global) - dev-setup/teardown: create .agents/skills/gstack symlinks for dev mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: Codex generation tests + CI + docs for multi-agent support Tests (28 new): - Codex output path routing, frontmatter validation (name+description only) - No .claude/skills/ path leaks in Codex output (regression guard) - /codex skill exclusion, hook→prose conversion, multiline YAML - --host agents alias, dynamic template discovery - Codex skill validation + $B command validation - find-browse priority chain verification - Replace static ALL_SKILLS list with dynamic filesystem scan CI: - Add Codex freshness check to skill-docs workflow Docs: - AGENTS.md: Codex-facing project instructions - README: multi-agent installation section - CONTRIBUTING: dual-host development workflow - CHANGELOG: v0.9.0 multi-agent support entry Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Codex E2E test harness — verify skills work on Codex CLI New test infrastructure: - CodexSessionRunner: spawns codex exec, parses JSONL stream, returns structured results (output, reasoning, toolCalls, tokens) - JSONL parser ported from Python (codex/SKILL.md.tmpl) to TypeScript - Temp HOME skill installation for Codex discovery testing E2E tests (gated behind EVALS=1 + codex + OPENAI_API_KEY): - codex-discover-skill: installs skill, verifies Codex finds it - codex-review-findings: runs gstack-review via Codex, validates output Integrates with existing eval infrastructure: - Diff-based test selection via touchfiles - Eval persistence via EvalCollector - bun run test:codex / test:codex:all convenience scripts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: bump VERSION to 0.9.0 to match CHANGELOG Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Codex sidecar paths + setup installs generated skills Two bugs found by Codex adversarial review: 1. Sidecar path mismatch: generated Codex skills referenced .agents/skills/gstack-review/checklist.md but setup creates sidecars at .agents/skills/gstack/review/. Fixed path rewriter to emit .agents/skills/gstack/review/ (matching setup layout). 2. Setup installed Claude-format source dirs for Codex global install instead of the generated Codex-format skills. Split link_skill_dirs into link_claude_skill_dirs (source dirs for Claude) and link_codex_skill_dirs (generated .agents/skills/ gstack-* dirs for Codex). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: comprehensive Codex path rewriting + setup install tests 17 new tests covering: - Sidecar path rewriting: .claude/skills/review → .agents/skills/gstack/review/ (catches the bug where checklist.md was unreachable at gstack-review/) - All 4 path rewrite rules tested individually across all skills - Greptile triage sidecar path correctness - Ship skill sidecar paths for pre-landing review - Claude output regression guard: zero Codex paths in any Claude skill - Setup script validation: separate link functions for Claude vs Codex, link_codex_skill_dirs reads from .agents/skills/, create_agents_sidecar links runtime assets (bin, browse, review, qa) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: regenerate Codex skills after investigate rename merge Remove stale gstack-debug, add gstack-investigate, regenerate all Codex skills to pick up changes merged from main (investigate rename, platform-agnostic templates, review helpers). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Codex E2E uses ~/.codex/ auth, not OPENAI_API_KEY - Remove OPENAI_API_KEY gate from test prerequisites - Copy real ~/.codex/ auth config into temp HOME so codex can authenticate - Increase review test timeout to 540s (codex does thorough 60+ tool call reviews) - Document in CLAUDE.md that Codex uses its own auth config Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: opt-in usage telemetry + community intelligence platform (v0.8.6) (#210) * feat: add gstack-telemetry-log and gstack-analytics scripts Local telemetry infrastructure for gstack usage tracking. gstack-telemetry-log appends JSONL events with skill name, duration, outcome, session ID, and platform info. Supports off/anonymous/community privacy tiers. gstack-analytics renders a personal usage dashboard from local data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add telemetry preamble injection + opt-in prompt + epilogue Extends generatePreamble() with telemetry start block (config read, timer, session ID, .pending marker), opt-in prompt (gated by .telemetry-prompted), and epilogue instructions for Claude to log events after skill completion. Adds 5 telemetry tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate all SKILL.md files with telemetry blocks Automated regeneration from gen-skill-docs.ts changes. All skills now include telemetry start block, opt-in prompt, and epilogue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Supabase schema, edge functions, and SQL views Telemetry backend infrastructure: telemetry_events table with RLS (insert-only), installations table for retention tracking, update_checks for install pings. Edge functions for update-check (version + ping), telemetry-ingest (batch insert), and community-pulse (weekly active count). SQL views for crash clustering and skill co-occurrence sequences. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add telemetry-sync, community-dashboard, and integration tests gstack-telemetry-sync: fire-and-forget JSONL → Supabase sync with privacy tier field stripping, batch limits, and cursor tracking. gstack-community-dashboard: CLI tool querying Supabase for skill popularity, crash clusters, and version distribution. 19 integration tests covering all telemetry scripts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: session-specific .pending markers + crash_clusters view fix Addresses Codex review findings: - .pending race condition: use .pending-$SESSION_ID instead of shared .pending file to prevent concurrent session interference - crash_clusters view: add total_occurrences and anonymous_occurrences columns since anonymous tier has no installation_id - Added test: own session pending marker is not finalized Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: dual-attempt update check with Supabase install ping Fires a parallel background curl to Supabase during the slow-path version fetch. Logs upgrade_prompted event only on fresh fetches (not cached replays) to avoid overcounting. GitHub remains the primary version source — Supabase ping is fire-and-forget. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: integrate telemetry usage stats into /retro output Retro now reads ~/.gstack/analytics/skill-usage.jsonl and includes gstack usage metrics (skill run counts, top skills, success rate) in the weekly retrospective output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: move 'Skill usage telemetry' to Completed in TODOS.md Implemented in this branch: local JSONL logging, opt-in prompt, privacy tiers, Supabase backend, community dashboard, /retro integration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: wire Supabase credentials and expose tables via Data API Add supabase/config.sh with project URL and publishable key (safe to commit — RLS restricts to INSERT only). Update telemetry-sync, community-dashboard, and update-check to source the config and include proper auth headers for the Supabase REST API. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add SELECT RLS policies to migration for community dashboard reads All telemetry data is anonymous (no PII), so public reads via the publishable key are safe. Needed for the community dashboard to query skill popularity, crash clusters, and version distribution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.8.6) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: analytics backward-compatible with old JSONL format Handle old-format events (no event_type field) alongside new format. Skip hook_fire events. Fix grep -c whitespace issues and unbound variable errors. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: map JSONL field names to Postgres columns in telemetry-sync Local JSONL uses short names (v, ts, sessions) but the Supabase table expects full names (schema_version, event_timestamp, concurrent_sessions). Add sed mapping during field stripping. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address Codex adversarial findings — cursor, opt-out, queries - Sync cursor now advances on HTTP 2xx (not grep for "inserted") - Update-check respects telemetry opt-out before pinging Supabase - Dashboard queries use correct view column names (total_occurrences) - Sync strips old-format "repo" field to prevent privacy leak Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Privacy & Telemetry section to README Transparent disclosure of what telemetry collects, what it never sends, how to opt out, and a link to the schema so users can verify. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: security hardening + issue triage (v0.8.3) (#205) * fix: check for bun before running setup (#147) Users without bun installed got a cryptic "command not found" error. Now prints a clear message with install instructions. Closes #147 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: block SSRF via URL validation in browse commands (#17) Adds validateNavigationUrl() that blocks non-HTTP(S) schemes (file://, javascript:, data:) and cloud metadata endpoints (169.254.169.254, metadata.google.internal). Applied to goto, diff, and newTab commands. Localhost and private IPs remain allowed for local dev QA. Closes #17 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: replace eval $(gstack-slug) with source <(...) (#133) Eliminates unnecessary use of eval across all skill templates and generated files. source <(...) has identical behavior without the shell injection surface. Also hardens gstack-diff-scope usage. Closes #133 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: rename /debug to /investigate to avoid Claude Code conflict (#190) Claude Code has a built-in /debug command that shadows the gstack skill. Renaming to /investigate which better reflects the systematic root-cause investigation methodology. Closes #190 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add unit tests for path validation helpers validateOutputPath() and validateReadPath() are security-critical functions with zero test coverage. Adds 14 tests covering safe paths, traversal attacks, and prefix collision edge cases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.8.3) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update /debug → /investigate references in docs CLAUDE.md, README.md, and docs/skills.md still referenced the old /debug skill name after the rename. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: harden URL validation against hostname bypasses (Codex P1) Codex review found that metadata IPs could be reached via hex (0xA9FEA9FE), decimal (2852039166), octal, trailing dot, and IPv6 bracket forms. Now normalizes hostnames before checking the blocklist and probes numeric IP representations via URL constructor. Also moves URL validation before page allocation in newTab() to prevent zombie tabs on rejection (Codex P3). 5 new test cases for bypass variants. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs: rewrite README + skills docs, auto-invoke /document-release (v0.8.4) (#207) * docs: add 6 missing skills to proactive suggestion list Add /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade to the root SKILL.md.tmpl proactive suggestion list so Claude suggests them at the appropriate workflow stages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add 6 new skill entries + browse handoff to docs - docs/skills.md: add /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade to skill table with deep-dive sections. Group safety skills into one "Safety & Guardrails" section. Add browse handoff subsection to /browse deep-dive. - BROWSER.md: add handoff/resume to command reference table + section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add power tools section + update skill lists in README - Update prose: "Fifteen specialists and six power tools" - Add power tools table after sprint specialists: /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade - Update all 4 skill list locations (install Step 1, Step 2, troubleshooting CLAUDE.md example) to include all 21 skills Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add v0.7-v0.8.2 features to README "What's new" section Add paragraphs for browse handoff, /codex multi-AI review, safety guardrails (/careful, /freeze, /guard), proactive skill suggestions, and /ship auto-invoking /document-release. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: auto-invoke /document-release after /ship PR creation Add Step 8.5 to /ship that automatically reads document-release/SKILL.md and executes the doc update workflow after creating the PR. This prevents documentation drift — /ship now keeps docs current without a separate command. Completes P1 TODO: "Auto-invoke /document-release from /ship" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.8.4) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs: frame skills as sprint process, rewrite /office-hours examples (#188) * docs: rewrite /office-hours examples with real session showing premise challenge and reframe Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: anonymize /office-hours examples — remove identifying details Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: tighten See it work example — keep reframe hook, compress details Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: soften user pain description in See it work example Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: reorder skills tables and sections to match sprint workflow Think → plan → review → test → ship → reflect → utilities. /office-hours is now first in both tables and on the page. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: frame skills as a sprint process, not a tool collection Think → Plan → Build → Review → Test → Ship → Reflect. Each skill feeds into the next. 10-15 parallel sprints is the practical max. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: founder discovery engine + /debug skill — v0.7.0 (#185) * feat: add escalation protocol to preamble — all skills get DONE/BLOCKED/NEEDS_CONTEXT Every skill now reports completion status (DONE, DONE_WITH_CONCERNS, BLOCKED, NEEDS_CONTEXT) and has escalation rules: 3 failed attempts → STOP, security uncertainty → STOP, scope exceeds verification → STOP. "It is always OK to stop and say 'this is too hard for me.'" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add verification gate to /ship (Step 6.5) — no push without fresh evidence Before pushing, re-verify tests if code changed during review fixes. Rationalization prevention: "Should work now" → RUN IT. "I'm confident" → Confidence is not evidence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add scope drift detection + verification of claims to /review Step 1.5: Before reviewing code quality, check if the diff matches stated intent. Flags scope creep and missing requirements (INFORMATIONAL). Step 5 addition: Every review claim must cite evidence — "this pattern is safe" needs a line reference, "tests cover this" needs a test name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: mandatory implementation alternatives + design doc lookup in /plan-ceo-review Step 0C-bis: Every plan must consider 2-3 approaches (minimal viable vs ideal architecture) before mode selection. RECOMMENDATION required. Pre-Review System Audit now checks ~/.gstack/projects/ for /brainstorm design docs (branch-filtered with fallback). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: design doc lookup in /plan-eng-review + fix branch name sanitization Step 0 now checks ~/.gstack/projects/ for /brainstorm design docs (branch-filtered with fallback, reads Supersedes: for revision context). Fix: branch names with '/' (e.g. garrytan/better-process) now get sanitized via tr '/' '-' in test plan artifact filenames. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: new /brainstorm and /debug skills /brainstorm: Socratic design exploration before planning. Context gathering, clarifying questions (smart-skip), related design discovery (keyword grep), premise challenge, forced alternatives, design doc artifact with lineage tracking (Supersedes: field). Writes to ~/.gstack/projects/$SLUG/. /debug: Systematic root-cause debugging. Iron Law: no fixes without root cause investigation. Pattern analysis, hypothesis testing with 3-strike escalation, structured DEBUG REPORT output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: structural tests for new skills + escalation protocol assertions Add brainstorm + debug to skillsWithUpdateCheck and skillsWithPreamble arrays. Add structural tests: brainstorm (Phase 1-6, Design Doc, Supersedes, Smart-skip), debug (Iron Law, Root Cause, Pattern Analysis, Hypothesis, DEBUG REPORT, 3-strike). Add escalation protocol tests (DONE_WITH_CONCERNS, BLOCKED, NEEDS_CONTEXT) for all preamble skills. Also: 2 new TODOs (design docs → Supabase sync, /plan-design-review skill), update CLAUDE.md project structure with new skill directories. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.6.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: rename /brainstorm → /office-hours across references Update CHANGELOG, CLAUDE.md, TODOS, design-consultation, plan-ceo-review, and gen-skill-docs to reference the new office-hours skill name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: YC Office Hours — dual-mode product diagnostic + builder brainstorm Rewrite /office-hours with two modes: Startup mode: six forcing questions (Demand Reality, Status Quo, Desperate Specificity, Narrowest Wedge, Observation & Surprise, Future-Fit) that push founders toward radical honesty about demand, users, and product decisions. Includes smart routing by product stage, intrapreneurship adaptation, and YC apply CTA for strong-signal founders. Builder mode: generative brainstorming for side projects, hackathons, learning, and open source. Enthusiastic collaborator tone, design thinking questions, no business interrogation. Mode is determined by an explicit question in Phase 1 — no guessing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add 14 assertions for YC Office Hours content coverage Validates dual-mode structure (Startup/Builder), all six forcing questions, builder brainstorming content, intrapreneurship adaptation, YC apply CTA, and operating principles for both modes. 192 tests total, all passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.6.1 - README.md: added /office-hours and /debug to skills table, updated skill count from 13 to 15, added both to install instructions - docs/skills.md: added /office-hours and /debug deep dive sections - CLAUDE.md: updated office-hours description to reflect dual-mode - CONTRIBUTING.md: updated skill count from 13 to 15 - CHANGELOG.md: added YC Office Hours and /debug entries to 0.6.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: founder discovery engine in /office-hours (v0.7.0) Turn /office-hours into a YC founder discovery engine. Every session now ends with three beats: signal reflection (specific callbacks to what the user said), "One more thing." transition, and a personal plea from Garry Tan with three tiers based on founder signal strength. Top tier uses AskUserQuestion to ask directly and opens ycombinator.com/apply?ref=gstack. Adds Phase 4.5 (Founder Signal Synthesis), "What I noticed about how you think" section to both design doc templates, anti-slop GOOD/BAD examples, and emotional targets per tier. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add validation assertions for founder discovery engine 8 new assertions covering: YC apply CTA with ref=gstack tracking, "What I noticed" design doc section, golden age framing, Garry Tan personal plea, founder signal synthesis phase, three-tier decision rubric, anti-slop GOOD/BAD examples, "One more thing" transition beat. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.7.0 VERSION: 0.6.4.1 → 0.7.0 CHANGELOG: new entry — Office Hours Gets Personal README: updated /office-hours and /plan-design-review descriptions docs/skills.md: updated /office-hours table + deep dive section TODOS.md: added /yc-prep skill TODO (P2) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate Install section, fix stale skills lists, deduplicate CHANGELOG entries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: interactive /plan-design-review + CEO invokes designer + 100% coverage (v0.6.4) (#149) * refactor: rename qa-design-review → design-review The "qa-" prefix was confusing — this is the live-site design audit with fix loop, not a QA-only report. Rename directory and update all references across docs, tests, scripts, and skill templates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: interactive /plan-design-review + CEO invokes designer Rewrite /plan-design-review from report-only grading to an interactive plan-fixer that rates each design dimension 0-10, explains what a 10 looks like, and edits the plan to get there. Parallel structure with /plan-ceo-review and /plan-eng-review — one issue = one AskUserQuestion. CEO review now detects UI scope and invokes the designer perspective when the plan has frontend/UX work, so you get design review automatically when it matters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: validation + touchfile entries for 100% coverage Add design-consultation to command/snapshot flag validation. Add 4 skills to contributor mode validation (plan-design-review, design-review, design-consultation, document-release). Add 2 templates to hardcoded branch check. Register touchfile entries for 10 new LLM-judge tests and 1 new E2E test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: LLM-judge for 10 skills + gstack-upgrade E2E Add LLM-judge quality evals for all uncovered skills using a DRY runWorkflowJudge helper with section marker guards. Add real E2E test for gstack-upgrade using mock git remote (replaces test.todo). Add plan-edit assertion to plan-design-review E2E. 14/15 skills now at full coverage. setup-browser-cookies remains deferred (needs real browser). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add bisect commit style to CLAUDE.md All commits should be single logical changes, split before pushing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.6.4.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs: restructure README for faster conversion (#146) * docs: restructure README for faster conversion Move install block to top (first scroll) for visitors who are already sold. Add "Who this is for" self-identification section and "Quick start: 10 minutes" funnel to guide first-time users. Condense install instructions to bold literal commands. Move "See it work" and "The team" below install. Relocate hiring box after "Come ride the wave" section. Strip redundant section dividers. Add governance framing to final CTA and new troubleshooting case for missing skills. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * docs: update "Who this is for" to lead with founders and new CC users Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
feat: Completeness Principle — Boil the Lake (v0.6.1) (#140) * feat: Completeness Principle — Boil the Lake (WIP, pre-merge) Add Completeness Principle to all skill preambles, dual-time estimates, compression table, anti-pattern gallery, Lake Score, and completeness gaps review category. VERSION/CHANGELOG will be rebased after merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update stale version reference in TODOS.md (v0.5.3 → v0.6.1) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update CHANGELOG date + README for v0.6.1 features - Add date to CHANGELOG 0.6.1 entry - Add Completeness Principle to README intro - Add SELECTIVE EXPANSION mode to CEO review section - Add test bootstrap mention to /ship section - Fix uninstall command missing design-consultation in project uninstall - Add "recommends shortcuts" and "no tests" to Without gstack list Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: split README into lean intro + docs/ directory (gh CLI pattern) README: 875 → 243 lines. Keeps intro, skill table, demo, install, and troubleshooting. All per-skill deep dives, Greptile integration guide, and contributor mode docs moved to docs/ directory. - docs/skills.md — full philosophy and examples for all 13 skills - docs/greptile.md — Greptile setup and triage workflow - docs/contributor-mode.md — how to enable and use contributor mode - README now links to docs/ via Documentation table - Updated skill table entries with latest features (fix-first, regression tests, test health, completeness gaps) - Updated demo transcript with AUTO-FIXED, coverage audit, regression test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remove "competitor" language, rewrite README in Garry's voice Replace "browses competitors" with "knows the landscape" / "what's out there" throughout all user-facing copy. Trim README from 243 to 167 lines — tighter, more opinionated, less listicle energy. Remove Completeness Principle from README top (it lives in CLAUDE.md and the skill preambles where Claude actually reads it). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite README in Garry's raw voice — AGI era, L8 factory, real stories The README now sounds like Garry, not a product page. Leads with the live experiment, the 16k LOC/day reality, the real-life coding stories (Austin, hospital bedside). Highlights the newest unlocks (design at the heart, /qa parallelism, smart review routing, test bootstrap). Closes with an open invitation — free MIT, fork it, let's all ride the wave together. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Garry's bonafides to README intro — Palantir, Posterous, YC, 600k LOC Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add real /retro numbers — 140k lines, 362 commits across 3 projects Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add "in the last 60 days" timeframe to 600k LOC claim Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add GitHub contribution graphs — 2026 vs 2013 side by side Same person, different era. 2013: 772 contributions building Bookface. 2026: 1,237 contributions and accelerating. The difference is the tooling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: clarify /retro stats are from last 7 days Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add designer/PM/eng manager roles to intro Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remove Josh/L8 reference from README Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: move demo up, make it dramatically more impressive Show the actual architecture diagram, auto-fixed issues, 100% coverage, regression test generation. Punch line: "That is not a copilot. That is a team." Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remove "My journey" section — intro already covers it Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: prefix all skill commands with You: in demo transcript Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: collapse You/Claude lines in demo — no gap between command and response Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: clarify plan mode flow in demo — approve, exit, Claude implements Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: move /ship to end of demo — review → QA → ship is the real flow Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add /plan-design-review to demo, tighten CEO response Shorter CEO reply, compressed eng diagram, added design audit with AI Slop score. Seven commands now: plan → eng → build → design → review → QA → ship. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: move design review before implementation — it's part of planning Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: reorder demo — design before eng, after CEO Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remove URL from /plan-design-review in demo Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add [...] annotations showing what actually happens at each step Each step now shows what the agent does under the hood: 8 expansion proposals cherry-picked, 80-item design audit, ASCII diagrams for every flow, 2400 lines written in 8 minutes, real browser QA, bug found and fixed. Makes the demo feel real, not abstract. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rename Contributor Mode to How to Contribute in docs table Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Coinbase, Instacart, Rippling to YC bonafides Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add "one or two people in a garage" to founder story Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add skill table to top of skills.md with anchor links Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: consolidate — roll contributor-mode into CONTRIBUTING, greptile into skills - docs/contributor-mode.md → merged into CONTRIBUTING.md (session awareness section) - docs/greptile.md → merged into docs/skills.md (Greptile integration section) - Reordered docs table: Skills > Architecture > Browser > Contributing > Changelog Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>