feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)
* feat: add parameterized resolver support to gen-skill-docs
Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}},
enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.
- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add INVOKE_SKILL resolver for composable skill loading
New composition.ts resolver module that emits prose instructing Claude
to read another skill's SKILL.md and follow it, skipping preamble
sections. Supports optional skip= parameter for additional sections.
Usage: {{INVOKE_SKILL:plan-ceo-review}} or
{{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: use frontmatter name: for skill symlinks and Codex paths
Patch all 3 name-derivation paths to read name: from SKILL.md
frontmatter instead of relying solely on directory basenames.
This enables directory names that differ from invocation names
(e.g., run-tests/ directory with name: test).
- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: extract CHANGELOG_WORKFLOW resolver from /ship
Move changelog generation logic into a reusable resolver. The resolver
is changelog-only (no version bump per Codex review recommendation).
Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback
Replace inline skill loading prose (read file, skip sections) with
{{INVOKE_SKILL:office-hours}} in the mid-session detection path.
The BENEFITS_FROM prerequisite offer is unchanged (separate use case).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL
Eliminate duplicated skip-list logic by having generateBenefitsFrom
call generateInvokeSkill internally. The wrapper (AskUserQuestion,
design doc re-check) stays in BENEFITS_FROM. The loading instructions
(read file, skip sections, error handling) come from INVOKE_SKILL.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args
12 new tests covering:
- INVOKE_SKILL: template placeholder, default skip list, error handling,
BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing,
no unresolved placeholders across all generated SKILL.md files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.7.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions
Three journey E2E tests (ideation, ship, debug) were failing because
Claude answered directly instead of invoking the Skill tool. Root cause:
skill descriptions in system-reminder are too weak to override Claude's
default behavior for tasks it can handle natively.
Fix has two parts:
1. CLAUDE.md routing rules in test workdir — Claude weighs project-level
instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate,
ship descriptions — reinforces the routing signal
10/10 journey tests now pass (was 7/10).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: one-time CLAUDE.md routing injection prompt
Add a preamble section that checks if the project's CLAUDE.md has
skill routing rules. If not (and user hasn't declined), asks once
via AskUserQuestion to inject a "## Skill routing" section.
Root cause: skill descriptions in system-reminder metadata are too
weak to reliably trigger proactive Skill tool invocation. CLAUDE.md
project instructions carry higher weight in Claude's decision making.
- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: annotated config file + routing injection tests
gstack-config now writes a documented header on first config creation
with every supported key explained (proactive, telemetry, auto_upgrade,
skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.).
Users can edit ~/.gstack/config.yaml directly, anytime.
Also fixes grep to use ^KEY: anchoring so commented header lines don't
shadow real config values.
Tests added:
- 7 new gstack-config tests (annotated header, no duplication, comment
safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks,
config reads, AskUserQuestion, decline persistence, routing rules)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases
Split our branch's changes into a new 0.13.9.0 entry instead of
jamming them into 0.13.7.0 which already landed on main as
"Community Wave."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: clarify branch-scoped VERSION/CHANGELOG after merging main
Add explicit rules: merging main doesn't mean adopting main's version.
Branch always gets its own entry on top with a higher version number.
Three-point checklist after every merge.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: put our 0.13.9.0 entry on top of CHANGELOG
Newest version goes on top. Our branch lands next, so our entry
must be above main's 0.13.8.0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: restore missing 0.13.7.0 Community Wave entry
Accidentally dropped the 0.13.7.0 entry when reordering.
All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add CHANGELOG integrity check rule
After any edit that moves/adds/removes entries, grep for version
headers and verify no gaps or duplicates before committing.
Prevents accidentally dropping entries during reordering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: community wave — 7 fixes, relink, sidebar Write, discoverability (v0.13.5.0) (#641)
* test: add 16 failing tests for 6 community fixes
Tests-first for all fixes in this PR wave:
- #594 discoverability: gstack tag in descriptions, 120-char first line
- #573 feature signals: ship/SKILL.md Step 4 detection
- #510 context warnings: no preemptive warnings in generated files
- #474 Safety Net: no find -delete in generated files
- #467 telemetry: JSONL writes gated by _TEL conditional
- #584 sidebar: Write in allowedTools, stderr capture
- #578 relink: prefixed/flat symlinks, cleanup, error, config hook
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: replace find -delete with find -exec rm for Safety Net (#474)
-delete is a non-POSIX extension that fails on Safety Net environments.
-exec rm {} + is POSIX-compliant and works everywhere.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gate local JSONL writes by telemetry setting (#467)
When telemetry is off, nothing is written anywhere — not just remote,
but local JSONL too. Clean trust contract: off means off everywhere.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove preemptive context warnings from plan-eng-review (#510)
The system handles context compaction automatically. Preemptive warnings
waste tokens and create false urgency. Skills should not warn about
context limits — just describe the compression priority order.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add (gstack) tag to skill descriptions for discoverability (#594)
Every SKILL.md.tmpl description now contains "gstack" on the last line,
making skills findable in Claude Code's command palette. First-line hooks
stay under 120 chars. Split ship description to fix wrapping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: auto-relink skill symlinks on prefix config change (#578)
New bin/gstack-relink creates prefixed (gstack-*) or flat symlinks
based on skill_prefix config. gstack-config auto-triggers relink
when skill_prefix changes. Setup guards against recursive calls
with GSTACK_SETUP_RUNNING env var.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add feature signal detection to version bump heuristic (#573)
/ship Step 4 now checks for feature signals (new routes, migrations,
test+source pairs, feat/ branches) when deciding version bumps.
PATCH requires no feature signals. MINOR asks the user if any signal
is detected or 500+ lines changed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sidebar Write tool, stderr capture, cross-platform URL opener (#584)
Add Write to sidebar allowedTools (both sidebar-agent.ts and server.ts).
Write doesn't expand attack surface beyond what Bash already provides.
Replace empty stderr handler with buffer capture for better error
diagnostics. New bin/gstack-open-url for cross-platform URL opening.
Does NOT include Search Before Building intro flow (deferred).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update sidebar-security test for Write tool addition
The fallback allowedTools string now includes Write, matching the
sidebar-agent.ts change from commit 68dc957.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.5.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent gstack-relink from double-prefixing gstack-upgrade
gstack-relink now checks if a skill directory is already named gstack-*
before prepending the prefix. Previously, setting skill_prefix=true would
create gstack-gstack-upgrade, breaking the /gstack-upgrade command.
Matches setup script behavior (setup:260) which already has this guard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: add double-prefix fix to changelog
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove .factory/ from git tracking and add to .gitignore
Generated Factory Droid skills are build output, same as .agents/.
They should not be committed to the repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: GStack Learns — per-project self-learning infrastructure (v0.13.4.0) (#622)
* feat: learnings + confidence resolvers — cross-skill memory infrastructure
Three new resolvers for the self-learning system:
- LEARNINGS_SEARCH: tells skills to load prior learnings before analysis
- LEARNINGS_LOG: tells skills to capture discoveries after completing work
- CONFIDENCE_CALIBRATION: adds 1-10 confidence scoring to all review findings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: learnings bin scripts — append-only JSONL read/write
gstack-learnings-log: validates JSON, auto-injects timestamp, appends to
~/.gstack/projects/$SLUG/learnings.jsonl. Append-only (no mutation).
gstack-learnings-search: reads/filters/dedupes learnings with confidence
decay (observed/inferred lose 1pt/30d), cross-project discovery, and
"latest winner" resolution per key+type.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: learnings count in preamble output
Every skill now prints "LEARNINGS: N entries loaded" during preamble,
making the compounding loop visible to the user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: integrate learnings + confidence into 9 skill templates
Add {{LEARNINGS_SEARCH}}, {{LEARNINGS_LOG}}, and {{CONFIDENCE_CALIBRATION}}
placeholders to review, ship, plan-eng-review, plan-ceo-review, office-hours,
investigate, retro, and cso templates. Regenerated all SKILL.md files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /learn skill — manage project learnings
New skill for reviewing, searching, pruning, and exporting what gstack
has learned across sessions. Commands: /learn, /learn search, /learn prune,
/learn export, /learn stats, /learn add.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: self-learning roadmap — 5-release design doc
Covers: R1 GStack Learns (v0.14), R2 Review Army (v0.15), R3 Smart Ceremony
(v0.16), R4 /autoship (v0.17), R5 Studio (v0.18). Inspired by Compound
Engineering, adapted to GStack's architecture.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: learnings bin script unit tests — 13 tests, free
Tests gstack-learnings-log (valid/invalid JSON, timestamp injection,
append-only) and gstack-learnings-search (dedup, type/query/limit filters,
confidence decay, user-stated no-decay, malformed JSONL skip).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.4.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: learnings resolver + bin script edge case tests — 21 new tests, free
Adds gen-skill-docs coverage for LEARNINGS_SEARCH, LEARNINGS_LOG, and
CONFIDENCE_CALIBRATION resolvers. Adds bin script edge cases: timestamp
preservation, special characters, files array, sort order, type grouping,
combined filtering, missing fields, confidence floor at 0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sync package.json version with VERSION file (0.13.4.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: gitignore .factory/ — generated output, not source
Same pattern as .claude/skills/ and .agents/. These SKILL.md files are
generated from .tmpl templates by gen:skill-docs --host factory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: /learn E2E — seed 3 learnings, verify agent surfaces them
Seeds N+1 query pattern, stale cache pitfall, and rubocop preference
into learnings.jsonl, then runs /learn and checks that at least 2/3
appear in the agent's output. Gate tier, ~$0.25/run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: Factory Droid compatibility — works across Claude Code, Codex, and Factory (v0.13.5.0) (#621)
* refactor: extract processExternalHost() shared helper for multi-host generation
Refactor the Codex-specific output routing block in gen-skill-docs.ts into
a shared processExternalHost() function. Both Codex and future external hosts
(Factory Droid) will use this helper for output routing, symlink loop detection,
frontmatter transformation, path rewrites, and metadata generation.
- Rename codexSkillName() to externalSkillName() everywhere
- Extract ExternalHostConfig interface with per-host settings
- Codex output is byte-identical (verified via --dry-run)
- Skip /codex skill for all non-Claude hosts (not just codex)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add Factory Droid host type, preamble, and co-author trailer
- Add 'factory' to Host union type with .factory/skills/gstack paths
- Extend preamble runtime root detection for Factory ($HOME/.factory/)
- Add GSTACK_DESIGN env var to preamble (was missing for Codex too)
- Add Factory Droid co-author trailer for git commits
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: Factory Droid generation, --host all, and host-aware frontmatter
- Add --host factory (alias: --host droid) to gen-skill-docs
- Add --host all: generates for claude, codex, and factory in one invocation
with fault-tolerant per-host error handling (only fails if claude fails)
- Factory frontmatter: name + description + user-invocable: true
- Factory sensitive skills: disable-model-invocation: true (from sensitive: field)
- Claude: strips sensitive: field from output (only Factory uses it)
- Factory tool name translation: Claude tool names → generic phrasing
- Replace chained gen:skill-docs calls with --host all in package.json build
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sensitive frontmatter for Factory Droid auto-invocation safety
Add sensitive: true to 6 skill templates with side effects that Factory
Droids shouldn't auto-invoke (ship, land-and-deploy, guard, careful,
freeze, unfreeze). The field is:
- Factory: emitted as disable-model-invocation: true
- Claude/Codex: stripped from output by transformFrontmatter()
Also fix Claude host path: call transformFrontmatter() for Claude to
strip the sensitive: field from Claude output.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: gstack-platform-detect binary for multi-host debugging
Bash script that prints a table of installed AI coding agents (Claude,
Codex, Factory Droid, Kiro) with versions, skill paths, and gstack
installation status. Useful for debugging multi-host setups.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: Factory Droid support in setup script
- Add factory to --host values (auto-detected via command -v droid)
- Add .factory/ skill doc generation step alongside .agents/
- Add create_factory_runtime_root() and link_factory_skill_dirs()
helpers mirroring the Codex equivalents
- Factory install section creates ~/.factory/skills/ with symlinks
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: Factory Droid awareness in skill-check and uninstall
- skill-check.ts: add Factory skills validation and freshness check
- gstack-uninstall: add Factory artifact cleanup (~/.factory/skills/gstack*
and per-project .factory/ sidecar)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: Factory Droid generation + --host all test suites
Add 13 new tests:
- Factory output paths, frontmatter (user-invocable, disable-model-invocation)
- Sensitive vs non-sensitive skill classification
- Path rewrites (no .claude/skills/ in Factory output)
- /codex skill exclusion, openai.yaml absence
- Factory keeps Codex integration blocks (for second opinions)
- --host droid alias, --dry-run freshness, preamble paths
- --host all generates for all 3 hosts
- Setup script host validation updated for factory
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: Factory Droid install instructions + CI freshness check
- README: add Factory Droid section with install instructions and
restart note (Factory requires restart to rescan skills)
- CI: add Factory skill doc freshness verification to skill-docs.yml
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: generated Factory Droid skill output (.factory/skills/)
29 skills generated for Factory Droid with:
- user-invocable: true on all skills
- disable-model-invocation: true on 6 sensitive skills
- .factory/skills/ paths (no .claude/skills/ references)
- $GSTACK_ROOT env vars for runtime root detection
- Tool name translation (Claude tool names → generic phrasing)
Committed to git for CI freshness checks and direct consumption.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: add Factory Droid P1 TODO for browse MCP server
Add 3 TODOs under new ## Factory Droid section:
- P1: Browse MCP server (Option B, deeper Factory integration)
- P3: .agent/skills/ dual output for cross-agent compatibility
- P3: Custom Droid definitions alongside skills
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.5.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: 6 critical fixes + community PR guardrails (v0.13.2.0) (#602)
* fix(security): commit bun.lock to pin dependency versions
Remove bun.lock from .gitignore and commit the lockfile. Every bun install
now uses exact pinned versions instead of resolving floating ^ ranges from
npm fresh. Closes the supply-chain vector from #566.
Co-Authored-By: boinger <boinger@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gstack-slug falls back to dirname/unknown when git context is absent
Add || true to git commands and fallback defaults so gstack-slug works
outside git repos. Prevents unbound variable crash that kills every
review skill when no git context exists.
Co-Authored-By: collinstraka-clov <collinstraka-clov@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: setup auto-selects default after 10s timeout to prevent CI hangs
Add -t 10 to the read command in the skill-prefix prompt. In CI, Docker,
and Conductor workspaces where a TTY exists but nobody is watching, the
prompt now auto-selects short names after 10 seconds instead of blocking
forever.
Co-Authored-By: stedfn <stedfn@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: browse CLI Windows lockfile — use string flag instead of numeric constants
Bun compiled binaries on Windows don't handle numeric fs.constants
correctly. The string flag 'wx' is semantically identical to
O_CREAT | O_EXCL | O_WRONLY per Node docs and works on all platforms.
Fixes #599
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add ~/.gstack/projects/ to plan file search path
/office-hours writes design docs to ~/.gstack/projects/$SLUG/ but /ship
and /review only searched ~/.claude/plans, ~/.codex/plans, and .gstack/plans.
Add the project-scoped directory as the first search location so plan
validation finds design docs created by the standard workflow.
Fixes #591
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: autoplan dual-voice — sequential foreground execution instead of broken parallel
Background subagents don't inherit tool permissions in Claude Code, so the
Claude subagent in dual-voice mode was silently failing on every invocation.
Every autoplan run was degrading to single-reviewer mode without warning.
Change all three phases (CEO, Design, Eng) from "simultaneously" to
sequential foreground execution: Claude subagent first (Agent tool,
foreground), then Codex (Bash). Both complete before the consensus table.
Fixes #497
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files from updated templates
Regenerated from autoplan/SKILL.md.tmpl (dual-voice fix) and
scripts/resolvers/review.ts (plan search path fix).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add community PR guardrails — protect ETHOS.md and voice
Add explicit CLAUDE.md rule requiring AskUserQuestion before accepting
any community PR that touches ETHOS.md, removes promotional material,
or changes Garry's voice. No exceptions, no auto-merging.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.2.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gen-skill-docs detects symlink loop, skips codex write that overwrites Claude SKILL.md
When .agents/skills/gstack is symlinked to the repo root (vendored dev mode),
gen-skill-docs --host codex was writing the Codex-transformed SKILL.md through
the symlink, overwriting the Claude version. This caused SKILL.md and
agents/openai.yaml to silently revert to Codex paths after every build.
Now detects when the codex output path resolves to the same real file as the
Claude output and skips the write. Content is still generated for token budget
tracking. The openai.yaml write is also skipped for the same symlink case.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve all 7 test failures — version sync, zsh glob guard, symlink-aware codex tests
1. package.json version synced with VERSION file (0.13.3.0)
2. design-shotgun/SKILL.md.tmpl: added setopt +o nomatch guard to
bash block with variant-*.png glob
3. Codex generation tests: skip skills where .agents/skills/{name}
is a symlink back to repo root (vendored dev mode). These can't
have proper codex content since gen-skill-docs skips the write
to avoid overwriting the Claude SKILL.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: boinger <boinger@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: collinstraka-clov <collinstraka-clov@users.noreply.github.com>
Co-authored-by: stedfn <stedfn@users.noreply.github.com>
feat: user sovereignty — AI models recommend, users decide (v0.13.2.0) (#603)
* feat: user sovereignty — AI models recommend, users decide
When Claude and Codex agree on a scope change, they now present it to the
user instead of auto-incorporating it. Adds User Sovereignty as the third
core principle in ETHOS.md. Fixes the cross-model tension template in
review.ts to present both perspectives neutrally instead of judging. Adds
User Challenge category to autoplan with proper contract updates (intro,
important rules, audit trail, gate handling). Adds Outside Voice Integration
Rule to CEO and eng review templates.
* chore: regenerate SKILL.md files from updated templates
* chore: bump version and changelog (v0.13.2.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: proper gstack description in openai.yaml + block Codex from rewriting it
Codex kept overwriting agents/openai.yaml with a browse-only description.
Two fixes: (1) better description covering full PM/dev/eng/CEO/QA scope,
(2) add agents/ to the filesystem boundary so Codex stops modifying it.
* chore: regenerate SKILL.md files with updated filesystem boundary
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fix: security audit compliance — credentials, telemetry, bun pin, untrusted warning (v0.12.12.0) (#574)
* fix: replace hardcoded credentials with env vars in documentation
Addresses Snyk W007 (HIGH). Replaces test@example.com/password123 with
$TEST_EMAIL/$TEST_PASSWORD env vars. Adds credential safety and cookie
safety notes.
* fix: make telemetry binary calls conditional on _TEL and binary existence
Addresses Socket's 14 MEDIUM findings for opaque telemetry binary.
Adds local JSONL fallback (always available, inspectable). Remote
binary only runs if _TEL != "off" and binary exists.
* fix: pin bun install to v1.3.10 with existence check
Addresses Snyk W012 (MEDIUM). Pins BUN_VERSION in browse.ts resolver,
Dockerfile.ci, and setup script error message. Adds command -v check
to skip install if bun already present.
* docs: add data flow documentation to review.ts
Addresses Socket HIGH finding (98% confidence). Documents what data
is sent to external review services and what is NOT sent.
* test: add audit compliance regression tests
6 tests enforce Snyk/Socket fixes stay in place: no hardcoded creds,
conditional telemetry, version-pinned bun, untrusted content warning,
data flow docs, all SKILL.md telemetry conditional.
* refactor: remove 2017 lines of dead code from gen-skill-docs.ts
The Placeholder Resolvers section (lines 77-2092) contained duplicate
functions that were superseded by scripts/resolvers/*.ts. The RESOLVERS
map from resolvers/index.ts is the sole resolution path. Verified: zero
call sites outside self-references.
* chore: regenerate SKILL.md files from updated templates
Reflects: conditional telemetry, version-pinned bun install,
untrusted content warning after Navigation commands.
* chore: bump version and changelog (v0.12.12.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: skill prefix is now a persistent user choice (v0.12.11.0) (#571)
* feat: make skill prefix a persistent, interactive user setting
- Add --prefix flag alongside --no-prefix
- Read/write skill_prefix from ~/.gstack/config.yaml (true/false)
- Interactive prompt on first setup when no preference saved
- Non-TTY environments default to flat names (no prefix)
- Add cleanup_prefixed_claude_symlinks() for reverse direction
- Fix gstack-config sed portability (mktemp+mv instead of BSD sed -i '')
- Add SKILL_PREFIX to preamble output with namespace-aware instruction
* test: add prefix config tests + README switching instructions
8 structural tests for persistent prefix setting:
config reading, --prefix flag, config persistence, interactive
prompt, TTY fallback, reverse cleanup, cleanup ordering, welcome.
* chore: regenerate SKILL.md files with SKILL_PREFIX preamble
* chore: bump version and changelog (v0.12.11.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: reframe changelog as feature, not mea culpa
* docs: update CONTRIBUTING + CLAUDE.md for prefix-aware vendoring
- CONTRIBUTING: vendoring now includes ./setup step for per-skill symlinks
- CONTRIBUTING: prefix choice documented in contributor workflow + dev diagram
- CONTRIBUTING: switching prefix mode section added
- CLAUDE.md: vendored symlink awareness section covers prefix setting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fix: Codex filesystem boundary — prevent skill-file prompt injection (v0.12.10.0) (#570)
* fix: add filesystem boundary to all codex prompts
Codex CLI can read files outside the repo root despite -s read-only.
It discovers ~/.claude/skills/ and ~/.agents/skills/, treats SKILL.md
files as instructions, and executes preamble scripts instead of
reviewing code. Fix: prepend a boundary instruction to all 11 codex
exec/review callsites across codex/SKILL.md.tmpl (3), autoplan/
SKILL.md.tmpl (3), and scripts/resolvers/review.ts (5). Add rabbit-
hole detection rule and 5 regression tests.
* chore: bump version and changelog (v0.12.10.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fix: zsh glob compatibility across all skill templates (v0.12.8.1) (#559)
* fix: replace zsh-incompatible raw globs with find-based alternatives and setopt guards
Zsh's NOMATCH option (on by default) causes raw globs like `*.yaml` and
`*deploy*` to throw errors when no files match, instead of silently expanding
to nothing as bash does. The preamble resolver already handled this correctly
with find, but 38 glob instances across 13 templates and 2 resolvers still
used raw shell globs.
Two fix approaches based on complexity:
- find-based replacement for cat/for/ls-with-pipes patterns (.github/workflows/)
- setopt +o nomatch guard for simple ls -t patterns (~/.gstack/, ~/.claude/)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files from updated templates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.12.8.1)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: add zsh glob safety test + fix 2 missed resolver globs
Adds a test that scans all generated SKILL.md bash blocks for raw glob
patterns and verifies they have either a find-based replacement or a
setopt +o nomatch guard. The test immediately caught 2 unguarded blocks
in review.ts (design doc re-check and plan file discovery).
Also syncs package.json version to 0.12.8.1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: resolve codex exec -C repo root eagerly to prevent wrong-project reviews (v0.12.6.0) (#549)
* refactor: remove 6 dead resolver function copies from gen-skill-docs.ts
These functions were moved to scripts/resolvers/{review,design}.ts but the
old copies in gen-skill-docs.ts were never deleted. They are defined but
never called — the RESOLVERS map from resolvers/index.ts is the live
dispatch. The dead copies had already diverged from the live versions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve codex exec -C repo root eagerly to prevent wrong-project reviews
When codex exec commands run in background bash tasks (e.g., Conductor
workspaces), $(git rev-parse --show-toplevel) evaluates in whatever cwd
the background shell inherits, which may be a different project. Fix by
resolving _REPO_ROOT once at the top of each bash block and referencing
the stored value in -C.
12 occurrences fixed across 4 source files:
- codex/SKILL.md.tmpl (3)
- autoplan/SKILL.md.tmpl (3)
- scripts/resolvers/review.ts (3)
- scripts/resolvers/design.ts (3)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: regression guard for codex exec inline git rev-parse in -C flag
Scans all .tmpl and resolver .ts source files for codex exec commands
that use inline $(git rev-parse --show-toplevel) in the -C flag. This
pattern causes wrong-project reviews in Conductor workspaces. The test
ensures nobody reintroduces the old pattern.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.12.6.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address adversarial review findings — codex review cwd, test scope, fail-loud
1. codex review commands now cd to $_REPO_ROOT (review doesn't support -C)
2. Autoplan codex commands converted from prose "Prerequisite" to fenced bash blocks
3. || pwd fallback replaced with hard fail — silent wrong-dir is worse than error
4. Regression test now scans all resolver .ts files + generated SKILL.md files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: harden regression test — Bun.Glob, SKILL.md scan, codex review check
Fixes three gaps found by adversarial review:
1. fs.readdirSync recursive hits ELOOP on .claude/skills/gstack symlink.
Switched to Bun.Glob with followSymlinks:false.
2. Generated SKILL.md files now scanned (not just .tmpl sources).
3. New test: codex review commands must not use inline git rev-parse
(codex review doesn't support -C, so cd "$_REPO_ROOT" is the fix).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552)
* fix(security): skip hidden directories in skill template discovery
discoverTemplates() scans subdirectories for SKILL.md.tmpl files but
only skips node_modules, .git, and dist. Hidden directories like
.claude/, .agents/, and .codex/ (which contain symlinked skill
installs) were being scanned, allowing a malicious .tmpl in a
symlinked skill to inject into the generation pipeline.
Fix: add !d.name.startsWith('.') to the subdirs() filter. This skips
all dot-prefixed directories, matching the standard convention that
hidden dirs are not source code.
* fix(security): sanitize telemetry JSONL inputs against injection
SKILL, OUTCOME, SESSION_ID, SOURCE, and EVENT_TYPE values go directly
into printf %s for JSONL output. If any contain double quotes,
backslashes, or newlines, the JSON breaks — or worse, injects
arbitrary fields.
Fix: strip quotes, backslashes, and control characters from all
string fields before JSONL construction via json_safe() helper.
* fix(security): validate JSON input in gstack-review-log
gstack-review-log appends its argument directly to a JSONL file with
no validation. Malformed or crafted input could corrupt the review log
or inject arbitrary content.
Fix: validate input is parseable JSON via python3 before appending.
Reject with exit 1 and stderr message if invalid.
* fix: treat relative dot-paths as file paths in screenshot command
Closes #495
* fix: use host-specific co-author trailer in /ship and /document-release
Codex-generated skills hardcoded a Claude co-author trailer in commit
messages. Users running gstack under Codex pushed commits attributed
to the wrong AI assistant.
Add {{CO_AUTHOR_TRAILER}} resolver that emits the correct trailer
based on ctx.host:
- claude: Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- codex: Co-Authored-By: OpenAI Codex <noreply@openai.com>
Replace hardcoded trailers in ship/SKILL.md.tmpl and
document-release/SKILL.md.tmpl with the resolver placeholder.
Fixes #282. Fixes #383.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: auto-upgrade marker no longer masks newer remote versions
When a just-upgraded-from marker persists across sessions, the update
check would write UP_TO_DATE to cache and exit immediately — never
fetching the remote VERSION. Users silently miss updates that landed
after their last upgrade.
Remove the early exit and premature cache write so the script falls
through to the remote check after consuming the marker. This ensures
JUST_UPGRADED is still emitted for the preamble, while also detecting
any newer versions available upstream.
Fixes #515
* fix: decouple doc generation from binary compilation in build script
The build script chains gen:skill-docs and bun build --compile with &&,
so a doc generation failure (e.g. missing Codex host config, template
error) prevents the browse binary from being compiled. Users end up
with a broken install where setup reports the binary is missing.
Replace && with ; for the two gen:skill-docs steps so they run
independently of the compilation chain. Doc generation errors are still
visible in stderr, but no longer block binary compilation.
Fixes #482
* fix: extend security sanitization + add 10 tests for merged community PRs
- Extend json_safe() to ERROR_CLASS and FAILED_STEP fields
- Improve ERROR_MESSAGE escaping to handle backslashes and newlines
- Replace python3 with bun for JSON validation in gstack-review-log
- Add 7 telemetry injection prevention tests
- Add 2 review-log JSON validation tests
- Add 1 discover-skills hidden directory filtering test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: stabilize flaky E2E tests (browse-basic, ship-base-branch, dashboard-via)
browse-basic: bump maxTurns 5→7 (agent reads PNG per SKILL.md instruction)
ship-base-branch: extract Step 0 only instead of full 1900-line ship/SKILL.md
dashboard-via: extract dashboard section only + increase timeout 90s→180s
Root cause: copying full SKILL.md files into test fixtures caused context bloat,
leading to timeouts and flaky turn limits. Extracting only the relevant section
cut dashboard-via from timing out at 240s to finishing in 38s.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add E2E fixture extraction rule to CLAUDE.md
Never copy full SKILL.md files into E2E test fixtures. Extract only
the section the test needs. Also: run targeted evals in foreground,
never pkill and restart mid-run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: stabilize journey-think-bigger routing test
Use exact trigger phrases from plan-ceo-review skill description
("think bigger", "expand scope", "ambitious enough") instead of
the ambiguous "thinking too small". Reduce maxTurns 5→3 to cut
cost per attempt ($0.12 vs $0.25). Test remains periodic tier
since LLM routing is inherently non-deterministic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* remove: delete journey-think-bigger routing test
Never passed reliably. Tests ambiguous routing ("think bigger" →
plan-ceo-review) but Claude legitimately answers directly instead
of invoking a skill. The other 10 journey tests cover routing
with clear, actionable signals.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.12.7.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Arun Kumar Thiagarajan <arunkt.bm14@gmail.com>
Co-authored-by: bluzername <bluzer@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Greg Jackson <gregario@users.noreply.github.com>
fix: Codex hang fixes — plan visibility, stdout buffering, reasoning effort (v0.12.4.0) (#536)
* fix: unbuffer Python stdout in codex --json streaming
Python fully buffers stdout when piped (not a TTY). The
`codex exec --json | python3 -c "..."` pattern meant zero output
visible until process exit — users saw nothing for 30+ minutes.
Add PYTHONUNBUFFERED=1 env var, python3 -u flag, and flush=True
to all print() calls in all three Python parser blocks (Challenge,
Consult new session, Consult resumed session).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: per-mode reasoning effort defaults, add --xhigh override
xhigh reasoning uses ~23x more tokens and causes 50+ minute hangs
on large context tasks (OpenAI issues #8545, #8402, #6931).
Per-mode defaults for /codex skill:
- Review: high (bounded diff, needs thoroughness)
- Challenge: high (adversarial but bounded by diff)
- Consult: medium (large context, interactive, needs speed)
Also changes all Outside Voice / adversarial codex invocations
across gstack (resolvers, gen-skill-docs) from xhigh to high.
Users can override with --xhigh flag when they want max reasoning.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: explicit plan content embedding for codex sandbox visibility
Codex runs sandboxed to repo root (-C) and cannot access
~/.claude/plans/. The template already instructed content embedding
but wasn't explicit enough — Claude sometimes shortcut to
referencing the file path, causing Codex to waste 10+ tool calls
searching before giving up.
Strengthen the instruction to make embedding unambiguous: "embed
FULL CONTENT, do NOT reference the file path." Also extract
referenced source file paths from the plan so Codex reads them
directly instead of discovering via rg/find.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add --xhigh reminder to challenge and consult modes
The --xhigh override was only documented in Step 2A (review).
Steps 2B (challenge) and 2C (consult) lacked the reminder,
so the flag would silently do nothing for those modes.
Found by adversarial review.
* chore: bump version and changelog (v0.12.4.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: /ship CHANGELOG and PR body now cover all branch commits (v0.12.4.0) (#535)
* fix: /ship CHANGELOG and PR body now cover all branch commits
Step 5 (CHANGELOG generation) restructured to force explicit commit
enumeration, theme grouping, and cross-check before writing. Step 8
(PR body) changed from "bullet points from CHANGELOG" to commit-by-commit
coverage with logical sections. Fixes recency bias that dropped early
commits on long branches.
* chore: bump version and changelog (v0.12.3.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
feat: voice directive for all skills (v0.12.3.0) (#520)
* feat: add voice directive to skill preamble with tiered context/concreteness/humor
Adds a Voice section to all skill preambles via the template resolver.
Three new subsections: context-dependent tone (YC partner / senior eng /
blog post), concreteness standard (exact commands, line numbers, real
numbers), and connect-to-user-outcomes guidance. Humor calibrated to dry
observations about software absurdity.
Includes eval test for voice directive presence and banned-word filtering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files with voice directive
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sync package.json version with VERSION file (0.12.2.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate connect-chrome SKILL.md with voice directive
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.12.3.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: headed mode + sidebar agent + Chrome extension (v0.12.0) (#517)
* feat: CDP connect — control real Chrome/Comet via Playwright
Add `connectCDP()` to BrowserManager: connects to a running browser via
Chrome DevTools Protocol. All existing browse commands work unchanged
through Playwright's abstraction layer.
- chrome-launcher.ts: browser discovery, CDP probe, auto-relaunch with rollback
- browser-manager.ts: connectCDP(), mode guards (close/closeTab/recreateContext/handoff),
auto-reconnect on browser restart, getRefMap() for extension API
- server.ts: CDP branch in start(), /health gains mode field, /refs endpoint,
idle timer only resets on /command (not passive endpoints)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: browse connect/disconnect/focus CLI commands
- connect: pre-server command that discovers browser, starts server in CDP mode
- disconnect: drops CDP connection, restarts in headless mode
- focus: brings browser window to foreground via osascript (macOS)
- status: now shows Mode: cdp | launched | headed
- startServer() accepts extra env vars for CDP URL/port passthrough
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: CDP-aware skill templates — skip cookie import in real browser mode
Skills now check `$B status` for CDP mode and skip:
- /qa: cookie import prompt, user-agent override, headless workarounds
- /design-review: cookie import for authenticated pages
- /setup-browser-cookies: returns "not needed" in CDP mode
Regenerated SKILL.md files from updated templates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: activity streaming — SSE endpoint for Chrome extension Side Panel
Real-time browse command feed via Server-Sent Events:
- activity.ts: ActivityEntry type, CircularBuffer (capacity 1000), privacy
filtering (redacts passwords, auth tokens, sensitive URL params),
cursor-based gap detection, async subscriber notification
- server.ts: /activity/stream SSE, /activity/history REST, handleCommand
instrumented with command_start/command_end events
- 18 unit tests for filterArgs privacy, emitActivity, subscribe lifecycle
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: Chrome extension Side Panel + Conductor API proposal
Chrome extension (Manifest V3, sideload):
- Side Panel with live activity feed, @ref overlays, dark terminal aesthetic
- Background worker: health polling, SSE relay, ref fetching
- Popup: port config, connection status, side panel launcher
- Content script: floating ref panel with @ref badges
Conductor API proposal (docs/designs/CONDUCTOR_SESSION_API.md):
- SSE endpoint for full Claude Code session mirroring in Side Panel
- Discovery via HTTP endpoint (not filesystem — extensions can't read files)
TODOS.md: add $B watch, multi-agent tabs, cross-platform CDP, Web Store publishing.
Mark CDP mode as shipped.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: detect Conductor runtime, skip osascript quit for sandboxed apps
macOS App Management blocks Electron apps (Conductor) from quitting
other apps via osascript. Now detects the runtime environment:
- terminal/claude-code/codex: can manage apps freely
- conductor: prints manual restart instructions + polls for 60s
detectRuntime() checks env vars and parent process. When Chrome needs
restart but we can't quit it, prints step-by-step instructions and
waits for the user to restart Chrome with --remote-debugging-port.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: detect Conductor via actual env vars (CONDUCTOR_WORKSPACE_NAME)
Previous detection checked CONDUCTOR_WORKSPACE_ID which doesn't exist.
Conductor sets CONDUCTOR_WORKSPACE_NAME, CONDUCTOR_BIN_DIR, CONDUCTOR_PORT,
and __CFBundleIdentifier=com.conductor.app. Check these FIRST because
Conductor sessions also have ANTHROPIC_API_KEY (which was matching claude-code).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: connection status pill — floating indicator when gstack controls Chrome
Small pill in bottom-right corner of every page: "● gstack · 3 refs"
Shows when connected via CDP, fades to 30% opacity after 3s, full on hover.
Disappears entirely when disconnected.
Background worker now notifies content scripts on connect/disconnect state
changes so the pill appears/disappears without polling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: Chrome requires --user-data-dir for remote debugging
Chrome refuses --remote-debugging-port without an explicit --user-data-dir.
Add userDataDir to BrowserBinary registry (macOS Application Support paths)
and pass it in both auto-launch and manual restart instructions.
Fix double-quoting in CLI manual restart instructions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: Chrome must be fully quit before launching with --remote-debugging-port
Chrome refuses to enable CDP on its default profile when another instance
is running (even with explicit --user-data-dir). The only reliable path:
fully quit Chrome first, then relaunch with the flag.
Updated instructions to emphasize this clearly with verification step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: bin/chrome-cdp — quit Chrome and relaunch with CDP in one command
Quits Chrome gracefully, waits for full exit, relaunches with
--remote-debugging-port, polls until CDP is ready. Usage: chrome-cdp [port]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use Playwright channel:chrome instead of broken connectOverCDP
Playwright's connectOverCDP hangs with Chrome 146 due to CDP protocol
version mismatch. Switch to channel:'chrome' which uses Playwright's
native pipe protocol to launch the system Chrome binary directly.
This is simpler and more reliable:
- No CDP port discovery needed
- No --remote-debugging-port or --user-data-dir hassles
- $B connect just works — launches real Chrome headed window
- All Playwright APIs (snapshot, click, fill) work unchanged
bin/chrome-cdp updated with symlinked profile approach (kept for
manual CDP use cases, but $B connect no longer needs it).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: green border + gstack label on controlled Chrome window
Injects a 2px green border and small "gstack" label on every page
loaded in the controlled Chrome window via context.addInitScript().
Users can instantly tell which Chrome window Claude controls.
Also fixes close() for channel:chrome mode (uses browser.close()
not browser.disconnect() which doesn't exist).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: cleanup chrome-launcher runtime detection, remove puppeteer-core dep
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style(design): redesign controlled Chrome indicator
Replace crude green border + label with polished indicator:
- 2px shimmer gradient at top edge (green→cyan→green, 3s loop)
- Floating pill bottom-right with frosted glass bg, fades to 25%
opacity after 4s so it doesn't compete with page content
- prefers-reduced-motion disables shimmer animation
- Much more subtle — looks like a developer tool, not broken CSS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: document real browser mode + Chrome extension in BROWSER.md and README.md
BROWSER.md: new sections for connect/disconnect/focus commands,
Chrome extension Side Panel install, CDP-aware skills, activity streaming.
Updated command reference table, key components, env vars, source map.
README.md: updated /browse description, added "Real browser mode" to
What's New section.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: step-by-step Chrome extension install guide in BROWSER.md
Replace terse bullet points with numbered walkthrough covering:
developer mode toggle, load unpacked, macOS file picker tip (Cmd+Shift+G),
pin extension, configure port, open side panel. Added troubleshooting section.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add Cmd+Shift+. tip for hidden folders in macOS file picker
macOS hides folders starting with . by default. Added both shortcuts:
Cmd+Shift+G (paste path directly) and Cmd+Shift+. (show hidden files).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: integrate hidden folder tips into the install flow naturally
Move Cmd+Shift+G and Cmd+Shift+. tips inline with the file picker
step instead of as a separate tip block after it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: auto-load Chrome extension when $B connect launches Chrome
Extension auto-loads via --load-extension flag — no manual chrome://extensions
install needed. findExtensionPath() checks repo root, global install, and dev
paths. Also adds bin/gstack-extension helper for manual install in regular
Chrome, and rewrites BROWSER.md install docs with auto-load as primary path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /connect-chrome skill — one command to launch Chrome with Side Panel
New skill that runs $B connect, verifies the connection, guides the user
to open the Side Panel, and demos the live activity feed. Extension auto-loads
via --load-extension so no manual chrome://extensions install needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use launchPersistentContext for Chrome extension loading
Playwright's chromium.launch() silently ignores --load-extension.
Switch to launchPersistentContext with ignoreDefaultArgs to remove
--disable-extensions flag. Use bundled Chromium (real Chrome blocks
unpacked extensions). Fixed port 34567 for CDP mode so the extension
auto-connects.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sync extension to DESIGN.md — amber accent, zinc neutrals, grain texture
Import design system from gstack-website. Update all extension colors:
green (#4ade80) → amber (#F59E0B/#FBBF24), zinc gray neutrals, grain
texture overlay. Regenerate icons as amber "G" monogram on dark background.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sidebar chat with Claude Code — icon opens side panel directly
Replace popup flyout with direct side panel open on icon click. Primary
UI is now a chat interface that sends messages to Claude Code via file
queue. Activity/Refs tabs moved behind a debug toggle in the footer.
Command bar with history, auto-poll for responses, amber design system.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sidebar agent — Claude-powered chat backend via file queue
Add /sidebar-command, /sidebar-response, and /sidebar-chat endpoints
to the browse server. sidebar-agent.ts watches the command queue file,
spawns claude -p with browse context for each message, and streams
responses back to the sidebar chat.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove duplicate gstack pill overlay, hide crash restore bubble
The addInitScript indicator and the extension's content script were both
injecting bottom-right pills, causing duplicates. Remove the pill from
addInitScript (extension handles it). Replace --restore-last-session with
--hide-crash-restore-bubble to suppress the "Chromium didn't shut down
correctly" dialog.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: state file authority — CDP server cannot be silently replaced
Hardens the connect/disconnect lifecycle:
- ensureServer() refuses to auto-start headless when CDP server is alive
- $B connect does full cleanup: SIGTERM → 2s → SIGKILL, profile locks, state
- shutdown() cleans Chromium SingletonLock/Socket/Cookie files
- uncaughtException/unhandledRejection handlers do emergency cleanup
This prevents the bug where a headless server overwrites the CDP server's
state file, causing $B commands to hit the wrong browser.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sidebar agent streaming events + session state management
Enhance sidebar-agent.ts with:
- Live streaming of claude -p events (tool_use, text, result) to sidebar
- Session state file for BROWSE_STATE_FILE propagation to claude subprocess
- Improved logging (stderr, exit codes, event types)
- stdin.end() to prevent claude waiting for input
- summarizeToolInput() with path shortening for compact sidebar display
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: sidebar chat UI — streaming events, agent status, reconnect retry
Sidebar panel improvements:
- Chat tab renders streaming agent events (tool_use, text, result)
- Thinking dots animation while agent processes
- Agent error display with styled error blocks
- tryConnect() with 2s retry loop for initial connection
- Debug tabs (Activity/Refs) hidden behind gear toggle
- Clear chat button
- Compact tool call display with path shortening
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: server-integrated sidebar agent with sessions and message queue
Move the sidebar agent from a separate bun process into server.ts:
- Agent spawns claude -p directly when messages arrive via /sidebar-command
- In-memory chat buffer backed by per-session chat.jsonl on disk
- Session manager: create, load, persist, list sessions
- Message queue (cap 5) with agent status tracking (idle/processing/hung)
- Stop/kill endpoints with queue dismiss support
- /health now returns agent status + session info
- All sidebar endpoints require Bearer auth
- Agent killed on server shutdown
- 120s timeout detects hung claude processes
Eliminates: file-queue polling, separate sidebar-agent.ts process,
stale auth tokens, state file conflicts between processes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: extension auth + token flow for server-integrated agent
Update Chrome extension to use Bearer auth on all sidebar endpoints:
- background.js captures auth token from /health, exposes via getToken msg
- background.js sets openPanelOnActionClick for direct side panel access
- sidepanel.js gets token from background, sends in all fetch headers
- Health broadcasts include token so sidebar auto-authenticates
- Removes popup from manifest — icon click opens side panel directly
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: self-healing sidebar — reconnect banner, state machine, copy button
Sidebar UI now handles disconnection gracefully:
- Connection state machine: connected → reconnecting → dead
- Amber pulsing banner during reconnect (2s retry, 30 attempts)
- Red "Server offline" banner with Reconnect + Copy /connect-chrome buttons
- Green "Reconnected" toast that fades after 3s on successful reconnect
- Copy button lets user paste /connect-chrome into any Claude Code session
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: crash handling — save session, kill agent, distinct exit codes
Hardened shutdown/crash behavior:
- Browser disconnect exits with code 2 (distinct from crash code 1)
- emergencyCleanup kills agent subprocess and saves session state
- Clean shutdown saves session before exit (chat history persists)
- Clear user message on browser disconnect: "Run $B connect to reconnect"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: worktree-per-session isolation for sidebar agent
Each sidebar session gets an isolated git worktree so the agent's file
operations don't conflict with the user's working directory:
- createWorktree() creates detached HEAD worktree in ~/.gstack/worktrees/
- Falls back to main cwd for non-git repos or on creation failure
- Handles collision cleanup from prior crashes
- removeWorktree() cleans up on session switch and shutdown
- worktreePath persisted in session.json
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(qa): ISSUE-001 — disconnect blocked by CDP guard in ensureServer
$B disconnect was routed through ensureServer() which refused to start a
headless server when a CDP state file existed. Disconnect is now handled
before ensureServer() (like connect), with force-kill + cleanup fallback
when the CDP server is unresponsive.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve claude binary path for daemon-spawned agent
The browse server runs as a daemon and may not inherit the user's shell
PATH. Add findClaudeBin() that checks ~/.local/bin/claude (standard
install location), which claude, and common system paths. Shows a clear
error in the sidebar chat if claude CLI is not found.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve claude symlinks + check Conductor bundled binary
posix_spawn fails on symlinks in compiled bun binaries. Now:
- Checks Conductor app's bundled binary first (not a symlink)
- Scans ~/.local/share/claude/versions/ for direct versioned binaries
- Uses fs.realpathSync() to resolve symlinks before spawning
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: compiled bun binary cannot posix_spawn — use external agent process
Compiled bun binaries fail posix_spawn on ALL executables (even /bin/bash).
The server now writes to an agent queue file, and a separate non-compiled
bun process (sidebar-agent.ts) reads the queue, spawns claude, and POSTs
events back via /sidebar-agent/event.
Changes:
- server.ts: spawnClaude writes to queue file instead of spawning directly
- server.ts: new /sidebar-agent/event endpoint for agent → server relay
- server.ts: fix result event field name (event.text vs event.result)
- sidebar-agent.ts: rewritten to poll queue file, relay events via HTTP
- cli.ts: $B connect auto-starts sidebar-agent as non-compiled bun process
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: loading spinner on sidebar open while connecting to server
Shows an amber spinner with "Connecting..." when the sidebar first opens,
replacing the empty state. After the first successful /sidebar-chat poll:
- If chat history exists: renders it immediately
- If no history: shows the welcome message
Prevents the jarring empty-then-populated flash on sidebar open.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: zero-friction side panel — auto-open on install, pill is clickable
Three changes to eliminate manual side panel setup:
- Auto-open side panel on extension install/update (onInstalled listener)
- gstack pill (bottom-right) is now clickable — opens the side panel
- Pill has pointer-events: auto so clicks always register (was: none)
User no longer needs to find the puzzle piece icon, pin the extension,
or know the side panel exists. It opens automatically on first launch
and can be re-opened by clicking the floating gstack pill.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: kill CDP naming, delete chrome-launcher.ts dead code
The connectCDP() method and connectionMode: 'cdp' naming was a legacy
artifact — real Chrome was tried but failed (silently blocks
--load-extension), so the implementation already used Playwright's
bundled Chromium via launchPersistentContext(). The naming was
misleading.
Changes:
- Delete chrome-launcher.ts (361 LOC) — only import was in unreachable
attemptReconnect() method
- Delete dead attemptReconnect() and reconnecting field
- Delete preExistingTabIds (was for protecting real Chrome tabs we
never connect to)
- Rename connectCDP() → launchHeaded()
- Rename connectionMode: 'cdp' → 'headed' across all files
- Replace BROWSE_CDP_URL/BROWSE_CDP_PORT env vars with BROWSE_HEADED=1
- Regenerate SKILL.md files for updated command descriptions
- Move BrowserManager unit tests to browser-manager-unit.test.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: converge handoff into connect — extension loads on handoff
Handoff now uses launchPersistentContext() with extension auto-loading,
same as the connect/launchHeaded() path. This means when the agent
gets stuck (2FA, CAPTCHA) and hands off to the user, the Chrome
extension + side panel are available automatically.
Before: handoff used chromium.launch() + newContext() — no extension
After: handoff uses chromium.launchPersistentContext() — extension loads
Also sets connectionMode to 'headed' and disables dialog auto-accept
on handoff, matching connect behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: gate sidebar chat behind --chat flag
$B connect (default): headed Chromium + extension with Activity + Refs
tabs only. No separate agent spawned. Clean, no confusion.
$B connect --chat: same + Chat tab with standalone claude -p agent.
Shows experimental banner: "Standalone mode — this is a separate
agent from your workspace."
Implementation:
- cli.ts: parse --chat, set BROWSE_SIDEBAR_CHAT env, conditionally
spawn sidebar-agent
- server.ts: gate /sidebar-* routes behind chatEnabled, return 403
when disabled, include chatEnabled in /health response
- sidepanel.js: applyChatEnabled() hides/shows Chat tab + banner
- background.js: forward chatEnabled from health response
- sidepanel.html/css: experimental banner with amber styling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: file drop relay + $B inbox command
Sidebar agent now writes structured messages to .context/sidebar-inbox/
when processing user input. The workspace agent can read these via
$B inbox to see what the user reported from the browser.
File drop format:
.context/sidebar-inbox/{timestamp}-observation.json
{ type, timestamp, page: {url}, userMessage, sidebarSessionId }
Atomic writes (tmp + rename) prevent partial reads. $B inbox --clear
removes messages after display.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: $B watch — passive observation mode
Claude enters read-only mode and captures periodic snapshots (every 5s)
while the user browses. Mutation commands (click, fill, etc.) are
blocked during watch. $B watch stop exits and returns a summary with
the last snapshot.
Requires headed mode ($B connect). This is the inverse of the scout
pattern — the workspace agent watches through the browser instead of
the sidebar relaying to it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add coverage for sidebar-agent, file-drop, and watch mode
33 new tests covering:
- Sidebar agent queue parsing (valid/malformed/empty JSONL)
- writeToInbox file drop (directory creation, atomic writes, JSON format)
- Inbox command (display, sorting, --clear, malformed file handling)
- Watch mode state machine (start/stop cycles, snapshots, duration)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: TODOS cleanup + Chrome vs Chromium exploration doc
- Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED
- Delete dead "cross-platform CDP browser discovery" TODO
- Rename dependencies from "CDP connect" to "headed mode"
- Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing
the architecture exploration and decision to use Playwright Chromium
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add Conductor Chrome sidebar integration design doc
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sidebar-agent validates cwd before spawning claude
The queue entry may reference a worktree that was cleaned up between
sessions. Now falls back to process.cwd() if the path doesn't exist,
preventing silent spawn failures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gen-skill-docs resolver merge + preamble tier gate + plan file discovery
The local RESOLVERS record in gen-skill-docs.ts was shadowing the imported
canonical resolvers, causing stale test coverage and preamble generators
to be used instead of the authoritative versions in resolvers/.
Changes:
- Merge imported RESOLVERS with local overrides (spread + override pattern)
- Fix preamble tier gate: tier 1 skills no longer get AskUserQuestion format
- Make plan file discovery host-agnostic (search multiple plan dirs)
- Add missing E2E tier entries for ship/review plan completion tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: ungate sidebar agent + raise timeout to 5 minutes (v0.12.0)
Sidebar chat is now always available in headed mode — no --chat flag needed.
Agent tasks get 5 minutes instead of 2, enabling multi-page workflows like
navigating directories and filling forms across pages.
Changes:
- cli.ts: remove --chat flag, always set BROWSE_SIDEBAR_CHAT=1, always spawn agent
- server.ts: remove chatEnabled gate (403 response), raise AGENT_TIMEOUT_MS to 300s
- sidebar-agent.ts: raise child process timeout from 120s to 300s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: headed mode + sidebar agent documentation (v0.12.0)
- README: sidebar agent section, personal automation example (school parent
portal), two auth paths (manual login + cookie import), DevTools MCP mention
- BROWSER.md: sidebar agent section with usage, timeout, session isolation,
authentication, and random delay documentation
- connect-chrome template: add sidebar chat onboarding step
- CHANGELOG: v0.12.0 entry covering headed mode, sidebar agent, extension
- VERSION: bump to 0.12.0.0
- TODOS: Chrome DevTools MCP integration as P0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files
Generated from updated templates + resolver merge. Key changes:
- Tier 1 skills no longer include AskUserQuestion format section
- Ship/review skills now include coverage gate with thresholds
- Connect-chrome skill includes sidebar chat onboarding step
- Plan file discovery uses host-agnostic paths
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate Codex connect-chrome skill
Updated preamble with proactive prompt and sidebar chat onboarding step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: network idle, state persistence, iframe support, chain pipe format (v0.12.1.0) (#516)
* feat: network idle detection + chain pipe format
- Upgrade click/fill/select from domcontentloaded to networkidle wait
(2s timeout, best-effort). Catches XHR/fetch triggered by interactions.
- Add pipe-delimited format to chain as JSON fallback:
$B chain 'goto url | click @e5 | snapshot -ic'
- Add post-loop networkidle wait in chain when last command was a write.
- Frame-aware: commands use target (getActiveFrameOrPage) for locator ops,
page-only ops (goto/back/forward/reload) guard against frame context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: $B state save/load + $B frame — new browse commands
- state save/load: persist cookies + URLs to .gstack/browse-states/{name}.json
File perms 0o600, name sanitized to [a-zA-Z0-9_-]. V1 skips localStorage
(breaks on load-before-navigate). Load replaces session via closeAllPages().
- frame: switch command context to iframe via CSS selector, @ref, --name, or
--url. 'frame main' returns to main frame. Execution target abstraction
(getActiveFrameOrPage) across read-commands, snapshot, and write-commands.
- Frame context cleared on tab switch, navigation, resume, and handoff.
- Snapshot shows [Context: iframe src="..."] header when in frame.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add tests for network idle, chain pipe format, state, and frame
- Network idle: click on fetch button waits for XHR, static click is fast
- Chain pipe: pipe-delimited commands, quoted args, JSON still works
- State: save/load round-trip, name sanitization, missing state error
- Frame: switch to iframe + back, snapshot context header, fill in frame,
goto-in-frame guard, usage error
New fixtures: network-idle.html (fetch + static buttons), iframe.html (srcdoc)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: review fixes — iframe ref scoping, detached frame recovery, state validation
- snapshot.ts: ref locators, cursor-interactive scan, and cursor locator
now use target (frame-aware) instead of page — fixes @ref clicking in iframes
- browser-manager.ts: getActiveFrameOrPage auto-recovers from detached frames
via isDetached() check
- meta-commands.ts: state load resets activeFrame, elementHandle disposed after
contentFrame(), state file schema validation (cookies + pages arrays),
filter empty pipe segments in chain tokenizer
- write-commands.ts: upload command uses target.locator() for frame support
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files + rebuild binary
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.12.1.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: review log architecture — close gaps, add attribution (v0.11.21.0) (#512)
* fix: review log architecture — close gaps, fix orphans, add attribution
- Ship Step 3.5 now logs its code review to the review log (via:"ship")
- Remove eng review gate — ship runs its own review in Step 3.5
- Dashboard Outside Voice row mapped to codex-plan-review
- Dashboard shows via source attribution (e.g., "via /autoplan")
- land-and-deploy checks all 8 review skill types (was 5)
- codex-review log gets commit field for staleness detection
- autoplan uses placeholder tokens instead of hardcoded "clean"
- Document autoplan-voices as audit-trail-only in review.ts
- E2E test for dashboard via attribution
* chore: bump version and changelog (v0.11.21.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
feat: GitLab support for /retro, /ship, and /document-release (v0.11.20.0) (#508)
* feat: multi-platform BASE_BRANCH_DETECT (GitHub + GitLab + GHE + git-native)
Update the shared BASE_BRANCH_DETECT resolver to support GitHub, GitLab,
GitHub Enterprise, self-hosted GitLab, and a git-native fallback chain.
Platform detection uses remote URL matching plus CLI auth status for
custom domains. Add glab issue create alternative in test failure triage.
Add 7 new test assertions covering GitLab CLI presence, git symbolic-ref
fallback, and platform-specific output in retro and ship generated files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: GitLab support in /retro — use shared BASE_BRANCH_DETECT resolver
Replace retro's custom gh-only default branch detection with the shared
BASE_BRANCH_DETECT resolver (DRY — same as 10 other skills). Update
PR/MR number extraction to match both GitHub #NNN and GitLab !NNN
patterns. Remove hardcoded github.com URL from the personal card footer.
Regenerate all SKILL.md files affected by the resolver update.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: GitLab MR creation in /ship + /document-release
Ship Step 1.5 now checks .gitlab-ci.yml for release workflows alongside
GitHub Actions. Step 8 routes to glab mr create on GitLab repos with
correct flag mapping (-b, -t, -d). Falls back to manual instructions
when no CLI is available. Document-release now reads MR body via
glab mr view -F json and updates via glab mr update on GitLab repos.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: add P2 TODO for land-and-deploy GitLab support
Track the remaining work to support GitLab in /land-and-deploy — MR
merge, CI polling, and deploy workflow detection using glab equivalents.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: adversarial review — GitLab gate, shell safety, MR prefix preservation
Three fixes from adversarial review:
1. land-and-deploy: add GitLab gate after Step 0 — prevents detection/
execution mismatch where agent detects GitLab but all subsequent
steps are GitHub-only
2. document-release: use heredoc for glab mr update body to avoid shell
metacharacter mangling ($, backticks, !) in MR descriptions
3. retro: preserve original #/! prefix in PR/MR number extraction —
GitLab !42 stays as !42, not incorrectly converted to #42
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve merge conflicts — deduplicate gen-skill-docs resolvers
The merge from main created duplicate RESOLVERS records in gen-skill-docs.ts
(inline functions shadowing the imported module versions). Removed the inline
duplicates so the modular resolvers from scripts/resolvers/ are used.
Also added missing E2E_TIERS entries for plan-completion/verification tests.
* chore: bump version and changelog (v0.11.20.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: universal 'one decision per question' AskUserQuestion rule (v0.11.12.1) (#427)
* feat: universal "one decision per question" rule for AskUserQuestion
Add item 5 to the shared AskUserQuestion Format in generateAskUserFormat():
"NEVER combine multiple independent decisions into a single AskUserQuestion."
Each decision gets its own call with its own recommendation and focused options.
Batching multiple calls in rapid succession is fine and often preferred.
This promotes a rule already enforced by 3 plan-review skills (eng, ceo, design)
to the universal baseline, covering all 23+ skills via the shared preamble.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: bump version and changelog (v0.11.12.1)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add missing OPENAI_SHORT_DESCRIPTION_LIMIT constant
The merge from main dropped this constant (defined in resolvers/codex-helpers.ts
on main's modular version, but needed inline in our monolithic version). Caused
CI check-freshness to fail on `--host codex` generation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update project documentation for v0.11.14.1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update project documentation for v0.11.16.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update project documentation for v0.11.18.1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add missing PLAN_COMPLETION_AUDIT resolvers to monolithic gen-skill-docs
The merge from main brought review/SKILL.md.tmpl with {{PLAN_COMPLETION_AUDIT_REVIEW}},
{{PLAN_COMPLETION_AUDIT_SHIP}}, and {{PLAN_VERIFICATION_EXEC}} placeholders, but the
local RESOLVERS map in the monolithic gen-skill-docs.ts didn't have entries for them.
Import the functions from scripts/resolvers/review.ts and register them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
feat: test coverage gate + plan completion audit + auto-verification (v0.11.13.0) (#428)
* feat: test coverage gate + plan completion audit + auto-verification
Three new gates in /ship and /review:
1. Test coverage gate: configurable thresholds (60%/80% default), hard stop
below minimum with user override
2. Plan completion audit: discovers plan file, extracts actionable items,
cross-references against diff, gates on NOT DONE items
3. Auto-verification: invokes /qa-only inline with plan's verification
section, conditional on localhost reachability
Also: coverage warning in /review, plan completion data in /retro,
shared plan file discovery helper (DRY), ship metrics logging.
* chore: regenerate SKILL.md files
* chore: bump version and changelog (v0.11.13.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>