M CHANGELOG.md => CHANGELOG.md +13 -0
@@ 1,5 1,18 @@
# Changelog
+## [0.8.5] - 2026-03-19
+
+### Fixed
+
+- **Review log no longer breaks on branch names with `/`.** Branch names like `garrytan/design-system` caused review log writes to fail because Claude Code runs multi-line bash blocks as separate shell invocations, losing variables between commands. New `gstack-review-log` and `gstack-review-read` atomic helpers encapsulate the entire operation in a single command.
+- **All skill templates are now platform-agnostic.** Removed Rails-specific patterns (`bin/test-lane`, `RAILS_ENV`, `.includes()`, `rescue StandardError`, etc.) from `/ship`, `/review`, `/plan-ceo-review`, and `/plan-eng-review`. The review checklist now shows examples for Rails, Node, Python, and Django side-by-side.
+- **`/ship` reads CLAUDE.md to discover test commands** instead of hardcoding `bin/test-lane` and `npm run test`. If no test commands are found, it asks the user and persists the answer to CLAUDE.md.
+
+### Added
+
+- **Platform-agnostic design principle** codified in CLAUDE.md — skills must read project config, never hardcode framework commands.
+- **`## Testing` section** in CLAUDE.md for `/ship` test command discovery.
+
## [0.8.4] - 2026-03-19
### Added
M CLAUDE.md => CLAUDE.md +23 -0
@@ 30,6 30,17 @@ on `git diff` against the base branch. Each test declares its file dependencies
llm-judge, gen-skill-docs) trigger all tests. Use `EVALS_ALL=1` or the `:all` script
variants to force all tests. Run `eval:select` to preview which tests would run.
+## Testing
+
+```bash
+bun test # run before every commit — free, <2s
+bun run test:evals # run before shipping — paid, diff-based (~$4/run max)
+```
+
+`bun test` runs skill validation, gen-skill-docs quality checks, and browse
+integration tests. `bun run test:evals` runs LLM-judge quality evals and E2E
+tests via `claude -p`. Both must pass before creating a PR.
+
## Project structure
```
@@ 79,6 90,18 @@ SKILL.md files are **generated** from `.tmpl` templates. To update docs:
To add a new browse command: add it to `browse/src/commands.ts` and rebuild.
To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild.
+## Platform-agnostic design
+
+Skills must NEVER hardcode framework-specific commands, file patterns, or directory
+structures. Instead:
+
+1. **Read CLAUDE.md** for project-specific config (test commands, eval commands, etc.)
+2. **If missing, AskUserQuestion** — let the user tell you or let gstack search the repo
+3. **Persist the answer to CLAUDE.md** so we never have to ask again
+
+This applies to test commands, eval commands, deploy commands, and any other
+project-specific behavior. The project owns its config; gstack reads it.
+
## Writing SKILL templates
SKILL.md.tmpl files are **prompt templates read by Claude**, not bash scripts.
M VERSION => VERSION +1 -1
@@ 1,1 1,1 @@
-0.8.4
+0.8.5
A bin/gstack-review-log => bin/gstack-review-log +9 -0
@@ 0,0 1,9 @@
+#!/usr/bin/env bash
+# gstack-review-log — atomically log a review result
+# Usage: gstack-review-log '{"skill":"...","timestamp":"...","status":"..."}'
+set -euo pipefail
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+eval $("$SCRIPT_DIR/gstack-slug" 2>/dev/null)
+GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
+mkdir -p "$GSTACK_HOME/projects/$SLUG"
+echo "$1" >> "$GSTACK_HOME/projects/$SLUG/$BRANCH-reviews.jsonl"
A bin/gstack-review-read => bin/gstack-review-read +12 -0
@@ 0,0 1,12 @@
+#!/usr/bin/env bash
+# gstack-review-read — read review log and config for dashboard
+# Usage: gstack-review-read
+set -euo pipefail
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+eval $("$SCRIPT_DIR/gstack-slug" 2>/dev/null)
+GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
+cat "$GSTACK_HOME/projects/$SLUG/$BRANCH-reviews.jsonl" 2>/dev/null || echo "NO_REVIEWS"
+echo "---CONFIG---"
+"$SCRIPT_DIR/gstack-config" get skip_eng_review 2>/dev/null || echo "false"
+echo "---HEAD---"
+git rev-parse --short HEAD 2>/dev/null || echo "unknown"
M codex/SKILL.md => codex/SKILL.md +1 -4
@@ 279,10 279,7 @@ CROSS-MODEL ANALYSIS:
7. Persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
-mkdir -p ~/.gstack/projects/"$SLUG"
-echo '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N}' >> ~/.gstack/projects/"$SLUG"/"$BRANCH_SLUG"-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N}'
```
Substitute: TIMESTAMP (ISO 8601), STATUS ("clean" if PASS, "issues_found" if FAIL),
M codex/SKILL.md.tmpl => codex/SKILL.md.tmpl +1 -4
@@ 126,10 126,7 @@ CROSS-MODEL ANALYSIS:
7. Persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
-mkdir -p ~/.gstack/projects/"$SLUG"
-echo '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N}' >> ~/.gstack/projects/"$SLUG"/"$BRANCH_SLUG"-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N}'
```
Substitute: TIMESTAMP (ISO 8601), STATUS ("clean" if PASS, "issues_found" if FAIL),
M design-review/SKILL.md => design-review/SKILL.md +2 -4
@@ 635,8 635,7 @@ Compare screenshots and observations across pages for:
**Project-scoped:**
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to: `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`
@@ 854,8 853,7 @@ Write the report to both local and project-scoped locations:
**Project-scoped:**
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`
M design-review/SKILL.md.tmpl => design-review/SKILL.md.tmpl +1 -2
@@ 220,8 220,7 @@ Write the report to both local and project-scoped locations:
**Project-scoped:**
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`
M office-hours/SKILL.md => office-hours/SKILL.md +1 -2
@@ 445,10 445,9 @@ Count the signals. You'll use this count in Phase 6 to determine which tier of c
Write the design document to the project directory.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
USER=$(whoami)
DATETIME=$(date +%Y%m%d-%H%M%S)
-mkdir -p ~/.gstack/projects/$SLUG
```
**Design lineage:** Before writing, check for existing design docs on this branch:
M office-hours/SKILL.md.tmpl => office-hours/SKILL.md.tmpl +1 -2
@@ 309,10 309,9 @@ Count the signals. You'll use this count in Phase 6 to determine which tier of c
Write the design document to the project directory.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
USER=$(whoami)
DATETIME=$(date +%Y%m%d-%H%M%S)
-mkdir -p ~/.gstack/projects/$SLUG
```
**Design lineage:** Before writing, check for existing design docs on this branch:
M plan-ceo-review/SKILL.md => plan-ceo-review/SKILL.md +16 -24
@@ 190,7 190,7 @@ Do NOT make any code changes. Do NOT start implementation. Your only job right n
## Prime Directives
1. Zero silent failures. Every failure mode must be visible — to the system, to the team, to the user. If a failure can happen silently, that is a critical defect in the plan.
-2. Every error has a name. Don't say "handle errors." Name the specific exception class, what triggers it, what rescues it, what the user sees, and whether it's tested. rescue StandardError is a code smell — call it out.
+2. Every error has a name. Don't say "handle errors." Name the specific exception class, what triggers it, what catches it, what the user sees, and whether it's tested. Catch-all error handling (e.g., catch Exception, rescue StandardError, except Exception) is a code smell — call it out.
3. Data flows have shadow paths. Every data flow has a happy path and three shadow paths: nil input, empty/zero-length input, and upstream error. Trace all four for every new flow.
4. Interactions have edge cases. Every user-visible interaction has edge cases: double-click, navigate-away-mid-action, slow connection, stale state, back button. Map them.
5. Observability is scope, not afterthought. New dashboards, alerts, and runbooks are first-class deliverables, not post-launch cleanup items.
@@ 248,8 248,8 @@ Run the following commands:
git log --oneline -30 # Recent history
git diff <base> --stat # What's already changed
git stash list # Any stashed work
-grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l
-find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files
+grep -r "TODO\|FIXME\|HACK\|XXX" -l --exclude-dir=node_modules --exclude-dir=vendor --exclude-dir=.git . | head -30
+git log --since=30.days --name-only --format="" | sort | uniq -c | sort -rn | head -20 # Recently touched files
```
Then read CLAUDE.md, TODOS.md, and any existing architecture docs.
@@ 362,8 362,7 @@ Rules:
After the opt-in/cherry-pick ceremony, write the plan to disk so the vision and decisions survive beyond this conversation. Only run this step for EXPANSION and SELECTIVE EXPANSION modes.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
```
Before writing, check for existing CEO plans in the ceo-plans/ directory. If any are >30 days old or their branch has been merged/deleted, offer to archive them:
@@ 478,24 477,24 @@ For every new method, service, or codepath that can fail, fill in this table:
```
METHOD/CODEPATH | WHAT CAN GO WRONG | EXCEPTION CLASS
-------------------------|-----------------------------|-----------------
- ExampleService#call | API timeout | Faraday::TimeoutError
+ ExampleService#call | API timeout | TimeoutError
| API returns 429 | RateLimitError
- | API returns malformed JSON | JSON::ParserError
- | DB connection pool exhausted| ActiveRecord::ConnectionTimeoutError
- | Record not found | ActiveRecord::RecordNotFound
+ | API returns malformed JSON | JSONParseError
+ | DB connection pool exhausted| ConnectionPoolExhausted
+ | Record not found | RecordNotFound
-------------------------|-----------------------------|-----------------
EXCEPTION CLASS | RESCUED? | RESCUE ACTION | USER SEES
-----------------------------|-----------|------------------------|------------------
- Faraday::TimeoutError | Y | Retry 2x, then raise | "Service temporarily unavailable"
+ TimeoutError | Y | Retry 2x, then raise | "Service temporarily unavailable"
RateLimitError | Y | Backoff + retry | Nothing (transparent)
- JSON::ParserError | N ← GAP | — | 500 error ← BAD
- ConnectionTimeoutError | N ← GAP | — | 500 error ← BAD
- ActiveRecord::RecordNotFound | Y | Return nil, log warning | "Not found" message
+ JSONParseError | N ← GAP | — | 500 error ← BAD
+ ConnectionPoolExhausted | N ← GAP | — | 500 error ← BAD
+ RecordNotFound | Y | Return nil, log warning | "Not found" message
```
Rules for this section:
-* `rescue StandardError` is ALWAYS a smell. Name the specific exceptions.
-* `rescue => e` with only `Rails.logger.error(e.message)` is insufficient. Log the full context: what was being attempted, with what arguments, for what user/request.
+* Catch-all error handling (`rescue StandardError`, `catch (Exception e)`, `except Exception`) is ALWAYS a smell. Name the specific exceptions.
+* Catching an error with only a generic log message is insufficient. Log the full context: what was being attempted, with what arguments, for what user/request.
* Every rescued error must either: retry with backoff, degrade gracefully with a user-visible message, or re-raise with added context. "Swallow and continue" is almost never acceptable.
* For each GAP (unrescued error that should be rescued): specify the rescue action and what the user should see.
* For LLM/AI service calls specifically: what happens when the response is malformed? When it's empty? When it hallucinates invalid JSON? When the model returns a refusal? Each of these is a distinct failure mode.
@@ 792,9 791,7 @@ If any AskUserQuestion goes unanswered, note it here. Never silently default.
After producing the Completion Summary above, persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"plan-ceo-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}'
```
Before running this command, substitute the placeholder values from the Completion Summary you just produced:
@@ 810,12 807,7 @@ Before running this command, substitute the placeholder values from the Completi
After completing the review, read the review log and config to display the dashboard.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-cat ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_REVIEWS"
-echo "---CONFIG---"
-~/.claude/skills/gstack/bin/gstack-config get skip_eng_review 2>/dev/null || echo "false"
-echo "---HEAD---"
-git rev-parse --short HEAD 2>/dev/null || echo "unknown"
+~/.claude/skills/gstack/bin/gstack-review-read
```
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
M plan-ceo-review/SKILL.md.tmpl => plan-ceo-review/SKILL.md.tmpl +15 -18
@@ 37,7 37,7 @@ Do NOT make any code changes. Do NOT start implementation. Your only job right n
## Prime Directives
1. Zero silent failures. Every failure mode must be visible — to the system, to the team, to the user. If a failure can happen silently, that is a critical defect in the plan.
-2. Every error has a name. Don't say "handle errors." Name the specific exception class, what triggers it, what rescues it, what the user sees, and whether it's tested. rescue StandardError is a code smell — call it out.
+2. Every error has a name. Don't say "handle errors." Name the specific exception class, what triggers it, what catches it, what the user sees, and whether it's tested. Catch-all error handling (e.g., catch Exception, rescue StandardError, except Exception) is a code smell — call it out.
3. Data flows have shadow paths. Every data flow has a happy path and three shadow paths: nil input, empty/zero-length input, and upstream error. Trace all four for every new flow.
4. Interactions have edge cases. Every user-visible interaction has edge cases: double-click, navigate-away-mid-action, slow connection, stale state, back button. Map them.
5. Observability is scope, not afterthought. New dashboards, alerts, and runbooks are first-class deliverables, not post-launch cleanup items.
@@ 95,8 95,8 @@ Run the following commands:
git log --oneline -30 # Recent history
git diff <base> --stat # What's already changed
git stash list # Any stashed work
-grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l
-find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files
+grep -r "TODO\|FIXME\|HACK\|XXX" -l --exclude-dir=node_modules --exclude-dir=vendor --exclude-dir=.git . | head -30
+git log --since=30.days --name-only --format="" | sort | uniq -c | sort -rn | head -20 # Recently touched files
```
Then read CLAUDE.md, TODOS.md, and any existing architecture docs.
@@ 209,8 209,7 @@ Rules:
After the opt-in/cherry-pick ceremony, write the plan to disk so the vision and decisions survive beyond this conversation. Only run this step for EXPANSION and SELECTIVE EXPANSION modes.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
```
Before writing, check for existing CEO plans in the ceo-plans/ directory. If any are >30 days old or their branch has been merged/deleted, offer to archive them:
@@ 325,24 324,24 @@ For every new method, service, or codepath that can fail, fill in this table:
```
METHOD/CODEPATH | WHAT CAN GO WRONG | EXCEPTION CLASS
-------------------------|-----------------------------|-----------------
- ExampleService#call | API timeout | Faraday::TimeoutError
+ ExampleService#call | API timeout | TimeoutError
| API returns 429 | RateLimitError
- | API returns malformed JSON | JSON::ParserError
- | DB connection pool exhausted| ActiveRecord::ConnectionTimeoutError
- | Record not found | ActiveRecord::RecordNotFound
+ | API returns malformed JSON | JSONParseError
+ | DB connection pool exhausted| ConnectionPoolExhausted
+ | Record not found | RecordNotFound
-------------------------|-----------------------------|-----------------
EXCEPTION CLASS | RESCUED? | RESCUE ACTION | USER SEES
-----------------------------|-----------|------------------------|------------------
- Faraday::TimeoutError | Y | Retry 2x, then raise | "Service temporarily unavailable"
+ TimeoutError | Y | Retry 2x, then raise | "Service temporarily unavailable"
RateLimitError | Y | Backoff + retry | Nothing (transparent)
- JSON::ParserError | N ← GAP | — | 500 error ← BAD
- ConnectionTimeoutError | N ← GAP | — | 500 error ← BAD
- ActiveRecord::RecordNotFound | Y | Return nil, log warning | "Not found" message
+ JSONParseError | N ← GAP | — | 500 error ← BAD
+ ConnectionPoolExhausted | N ← GAP | — | 500 error ← BAD
+ RecordNotFound | Y | Return nil, log warning | "Not found" message
```
Rules for this section:
-* `rescue StandardError` is ALWAYS a smell. Name the specific exceptions.
-* `rescue => e` with only `Rails.logger.error(e.message)` is insufficient. Log the full context: what was being attempted, with what arguments, for what user/request.
+* Catch-all error handling (`rescue StandardError`, `catch (Exception e)`, `except Exception`) is ALWAYS a smell. Name the specific exceptions.
+* Catching an error with only a generic log message is insufficient. Log the full context: what was being attempted, with what arguments, for what user/request.
* Every rescued error must either: retry with backoff, degrade gracefully with a user-visible message, or re-raise with added context. "Swallow and continue" is almost never acceptable.
* For each GAP (unrescued error that should be rescued): specify the rescue action and what the user should see.
* For LLM/AI service calls specifically: what happens when the response is malformed? When it's empty? When it hallucinates invalid JSON? When the model returns a refusal? Each of these is a distinct failure mode.
@@ 639,9 638,7 @@ If any AskUserQuestion goes unanswered, note it here. Never silently default.
After producing the Completion Summary above, persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"plan-ceo-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}'
```
Before running this command, substitute the placeholder values from the Completion Summary you just produced:
M plan-design-review/SKILL.md => plan-design-review/SKILL.md +2 -9
@@ 422,9 422,7 @@ If any AskUserQuestion goes unanswered, note it here. Never silently default to
After producing the Completion Summary above, persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"plan-design-review","timestamp":"TIMESTAMP","status":"STATUS","overall_score":N,"unresolved":N,"decisions_made":N,"commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"TIMESTAMP","status":"STATUS","overall_score":N,"unresolved":N,"decisions_made":N,"commit":"COMMIT"}'
```
Substitute values from the Completion Summary:
@@ 440,12 438,7 @@ Substitute values from the Completion Summary:
After completing the review, read the review log and config to display the dashboard.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-cat ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_REVIEWS"
-echo "---CONFIG---"
-~/.claude/skills/gstack/bin/gstack-config get skip_eng_review 2>/dev/null || echo "false"
-echo "---HEAD---"
-git rev-parse --short HEAD 2>/dev/null || echo "unknown"
+~/.claude/skills/gstack/bin/gstack-review-read
```
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
M plan-design-review/SKILL.md.tmpl => plan-design-review/SKILL.md.tmpl +1 -3
@@ 269,9 269,7 @@ If any AskUserQuestion goes unanswered, note it here. Never silently default to
After producing the Completion Summary above, persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"plan-design-review","timestamp":"TIMESTAMP","status":"STATUS","overall_score":N,"unresolved":N,"decisions_made":N,"commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"TIMESTAMP","status":"STATUS","overall_score":N,"unresolved":N,"decisions_made":N,"commit":"COMMIT"}'
```
Substitute values from the Completion Summary:
M plan-eng-review/SKILL.md => plan-eng-review/SKILL.md +4 -12
@@ 273,7 273,7 @@ Evaluate:
**STOP.** For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved.
### 3. Test review
-Make a diagram of all new UX, new data flow, new codepaths, and new branching if statements or outcomes. For each, note what is new about the features discussed in this branch and plan. Then, for each new item in the diagram, make sure there is a JS or Rails test.
+Make a diagram of all new UX, new data flow, new codepaths, and new branching if statements or outcomes. For each, note what is new about the features discussed in this branch and plan. Then, for each new item in the diagram, make sure there is a corresponding test.
For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in CLAUDE.md. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. Then use AskUserQuestion to confirm the eval scope with the user.
@@ 284,10 284,9 @@ For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in C
After producing the test diagram, write a test plan artifact to the project directory so `/qa` and `/qa-only` can consume it as primary test input (replacing the lossy git-diff heuristic):
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
USER=$(whoami)
DATETIME=$(date +%Y%m%d-%H%M%S)
-mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-plan-{datetime}.md`:
@@ 393,9 392,7 @@ Check the git log for this branch. If there are prior commits suggesting a previ
After producing the Completion Summary above, persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"plan-eng-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-eng-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}'
```
Substitute values from the Completion Summary:
@@ 411,12 408,7 @@ Substitute values from the Completion Summary:
After completing the review, read the review log and config to display the dashboard.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-cat ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_REVIEWS"
-echo "---CONFIG---"
-~/.claude/skills/gstack/bin/gstack-config get skip_eng_review 2>/dev/null || echo "false"
-echo "---HEAD---"
-git rev-parse --short HEAD 2>/dev/null || echo "unknown"
+~/.claude/skills/gstack/bin/gstack-review-read
```
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
M plan-eng-review/SKILL.md.tmpl => plan-eng-review/SKILL.md.tmpl +3 -6
@@ 137,7 137,7 @@ Evaluate:
**STOP.** For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved.
### 3. Test review
-Make a diagram of all new UX, new data flow, new codepaths, and new branching if statements or outcomes. For each, note what is new about the features discussed in this branch and plan. Then, for each new item in the diagram, make sure there is a JS or Rails test.
+Make a diagram of all new UX, new data flow, new codepaths, and new branching if statements or outcomes. For each, note what is new about the features discussed in this branch and plan. Then, for each new item in the diagram, make sure there is a corresponding test.
For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in CLAUDE.md. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. Then use AskUserQuestion to confirm the eval scope with the user.
@@ 148,10 148,9 @@ For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in C
After producing the test diagram, write a test plan artifact to the project directory so `/qa` and `/qa-only` can consume it as primary test input (replacing the lossy git-diff heuristic):
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
USER=$(whoami)
DATETIME=$(date +%Y%m%d-%H%M%S)
-mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-plan-{datetime}.md`:
@@ 257,9 256,7 @@ Check the git log for this branch. If there are prior commits suggesting a previ
After producing the Completion Summary above, persist the review result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"plan-eng-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-eng-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","commit":"COMMIT"}'
```
Substitute values from the Completion Summary:
M qa-only/SKILL.md => qa-only/SKILL.md +1 -2
@@ 502,8 502,7 @@ Write the report to both local and project-scoped locations:
**Project-scoped:** Write test outcome artifact for cross-session context:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`
M qa-only/SKILL.md.tmpl => qa-only/SKILL.md.tmpl +1 -2
@@ 73,8 73,7 @@ Write the report to both local and project-scoped locations:
**Project-scoped:** Write test outcome artifact for cross-session context:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`
M qa/SKILL.md => qa/SKILL.md +1 -2
@@ 874,8 874,7 @@ Write the report to both local and project-scoped locations:
**Project-scoped:** Write test outcome artifact for cross-session context:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`
M qa/SKILL.md.tmpl => qa/SKILL.md.tmpl +1 -2
@@ 277,8 277,7 @@ Write the report to both local and project-scoped locations:
**Project-scoped:** Write test outcome artifact for cross-session context:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`
M review/SKILL.md => review/SKILL.md +2 -7
@@ 294,9 294,7 @@ source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
6. **Log the result** for the Review Readiness Dashboard:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}'
```
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count, COMMIT = output of `git rev-parse --short HEAD`.
@@ 453,10 451,7 @@ Present the full output verbatim under a `CODEX SAYS (adversarial challenge):` h
**Only if a code review ran (user chose A or C):** Persist the Codex review result to the review log:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
-mkdir -p ~/.gstack/projects/"$SLUG"
-echo '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}' >> ~/.gstack/projects/"$SLUG"/"$BRANCH_SLUG"-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}'
```
Substitute: STATUS ("clean" if PASS, "issues_found" if FAIL), GATE ("pass" or "fail").
M review/SKILL.md.tmpl => review/SKILL.md.tmpl +1 -4
@@ 267,10 267,7 @@ Present the full output verbatim under a `CODEX SAYS (adversarial challenge):` h
**Only if a code review ran (user chose A or C):** Persist the Codex review result to the review log:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
-mkdir -p ~/.gstack/projects/"$SLUG"
-echo '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}' >> ~/.gstack/projects/"$SLUG"/"$BRANCH_SLUG"-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}'
```
Substitute: STATUS ("clean" if PASS, "issues_found" if FAIL), GATE ("pass" or "fail").
M review/checklist.md => review/checklist.md +7 -7
@@ 35,16 35,16 @@ Be terse. For each issue: one line describing the problem, one line with the fix
### Pass 1 — CRITICAL
#### SQL & Data Safety
-- String interpolation in SQL (even if values are `.to_i`/`.to_f` — use `sanitize_sql_array` or Arel)
+- String interpolation in SQL (even if values are `.to_i`/`.to_f` — use parameterized queries (Rails: sanitize_sql_array/Arel; Node: prepared statements; Python: parameterized queries))
- TOCTOU races: check-then-set patterns that should be atomic `WHERE` + `update_all`
-- `update_column`/`update_columns` bypassing validations on fields that have or should have constraints
-- N+1 queries: `.includes()` missing for associations used in loops/views (especially avatar, attachments)
+- Bypassing model validations for direct DB writes (Rails: update_column; Django: QuerySet.update(); Prisma: raw queries)
+- N+1 queries: Missing eager loading (Rails: .includes(); SQLAlchemy: joinedload(); Prisma: include) for associations used in loops/views
#### Race Conditions & Concurrency
-- Read-check-write without uniqueness constraint or `rescue RecordNotUnique; retry` (e.g., `where(hash:).first` then `save!` without handling concurrent insert)
-- `find_or_create_by` on columns without unique DB index — concurrent calls can create duplicates
+- Read-check-write without uniqueness constraint or catch duplicate key error and retry (e.g., `where(hash:).first` then `save!` without handling concurrent insert)
+- find-or-create without unique DB index — concurrent calls can create duplicates
- Status transitions that don't use atomic `WHERE old_status = ? UPDATE SET new_status` — concurrent updates can skip or double-apply transitions
-- `html_safe` on user-controlled data (XSS) — check any `.html_safe`, `raw()`, or string interpolation into `html_safe` output
+- Unsafe HTML rendering (Rails: .html_safe/raw(); React: dangerouslySetInnerHTML; Vue: v-html; Django: |safe/mark_safe) on user-controlled data (XSS)
#### LLM Output Trust Boundary
- LLM-generated values (emails, URLs, names) written to DB or passed to mailers without format validation. Add lightweight guards (`EMAIL_REGEXP`, `URI.parse`, `.strip`) before persisting.
@@ 141,7 141,7 @@ the agent auto-fixes a finding or asks the user.
```
AUTO-FIX (agent fixes without asking): ASK (needs human judgment):
├─ Dead code / unused variables ├─ Security (auth, XSS, injection)
-├─ N+1 queries (missing .includes()) ├─ Race conditions
+├─ N+1 queries (missing eager loading) ├─ Race conditions
├─ Stale comments contradicting code ├─ Design decisions
├─ Magic numbers → named constants ├─ Large fixes (>20 lines)
├─ Missing LLM output validation ├─ Enum completeness
M scripts/gen-skill-docs.ts => scripts/gen-skill-docs.ts +3 -11
@@ 590,9 590,7 @@ source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
6. **Log the result** for the Review Readiness Dashboard:
\`\`\`bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}'
\`\`\`
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count, COMMIT = output of \`git rev-parse --short HEAD\`.`;
@@ 850,8 848,7 @@ Compare screenshots and observations across pages for:
**Project-scoped:**
\`\`\`bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
+source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
\`\`\`
Write to: \`~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md\`
@@ 940,12 937,7 @@ function generateReviewDashboard(_ctx: TemplateContext): string {
After completing the review, read the review log and config to display the dashboard.
\`\`\`bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-cat ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_REVIEWS"
-echo "---CONFIG---"
-~/.claude/skills/gstack/bin/gstack-config get skip_eng_review 2>/dev/null || echo "false"
-echo "---HEAD---"
-git rev-parse --short HEAD 2>/dev/null || echo "unknown"
+~/.claude/skills/gstack/bin/gstack-review-read
\`\`\`
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between \`plan-design-review\` (full visual audit) and \`design-review-lite\` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
M ship/SKILL.md => ship/SKILL.md +3 -13
@@ 213,12 213,7 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
After completing the review, read the review log and config to display the dashboard.
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-cat ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_REVIEWS"
-echo "---CONFIG---"
-~/.claude/skills/gstack/bin/gstack-config get skip_eng_review 2>/dev/null || echo "false"
-echo "---HEAD---"
-git rev-parse --short HEAD 2>/dev/null || echo "unknown"
+~/.claude/skills/gstack/bin/gstack-review-read
```
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
@@ 714,9 709,7 @@ source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
6. **Log the result** for the Review Readiness Dashboard:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}'
```
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count, COMMIT = output of `git rev-parse --short HEAD`.
@@ 811,10 804,7 @@ Present the full output verbatim under a `CODEX SAYS:` header. Check for `[P1]`
to determine pass/fail gate. Persist the result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE"}' >> ~/.gstack/projects/$SLUG/$BRANCH_SLUG-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE"}'
```
If GATE is FAIL, use AskUserQuestion: "Codex found critical issues. Ship anyway?"
M ship/SKILL.md.tmpl => ship/SKILL.md.tmpl +1 -4
@@ 428,10 428,7 @@ Present the full output verbatim under a `CODEX SAYS:` header. Check for `[P1]`
to determine pass/fail gate. Persist the result:
```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
-BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
-mkdir -p ~/.gstack/projects/$SLUG
-echo '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE"}' >> ~/.gstack/projects/$SLUG/$BRANCH_SLUG-reviews.jsonl
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE"}'
```
If GATE is FAIL, use AskUserQuestion: "Codex found critical issues. Ship anyway?"
M test/gen-skill-docs.test.ts => test/gen-skill-docs.test.ts +1 -1
@@ 354,7 354,7 @@ describe('REVIEW_DASHBOARD resolver', () => {
for (const skill of REVIEW_SKILLS) {
test(`review dashboard appears in ${skill} generated file`, () => {
const content = fs.readFileSync(path.join(ROOT, skill, 'SKILL.md'), 'utf-8');
- expect(content).toContain('reviews.jsonl');
+ expect(content).toContain('gstack-review');
expect(content).toContain('REVIEW READINESS DASHBOARD');
});
}
M test/skill-validation.test.ts => test/skill-validation.test.ts +1 -1
@@ 1167,7 1167,7 @@ describe('Codex skill', () => {
test('codex/SKILL.md contains review log persistence', () => {
const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
expect(content).toContain('codex-review');
- expect(content).toContain('reviews.jsonl');
+ expect(content).toContain('gstack-review-log');
});
test('codex/SKILL.md uses which for binary discovery, not hardcoded path', () => {