M CHANGELOG.md => CHANGELOG.md +11 -0
@@ 1,5 1,16 @@
# Changelog
+## [0.11.21.0] - 2026-03-26
+
+### Fixed
+
+- **`/autoplan` reviews now count toward the ship readiness gate.** When `/autoplan` ran full CEO + Design + Eng reviews, `/ship` still showed "0 runs" for Eng Review because autoplan-logged entries weren't being read correctly. Now the dashboard shows source attribution (e.g., "CLEAR (PLAN via /autoplan)") so you can see exactly which tool satisfied each review.
+- **`/ship` no longer tells you to "run /review first."** Ship runs its own pre-landing review in Step 3.5 — asking you to run the same review separately was redundant. The gate is removed; ship just does it.
+- **`/land-and-deploy` now checks all 8 review types.** Previously missed `review`, `adversarial-review`, and `codex-plan-review` — if you only ran `/review` (not `/plan-eng-review`), land-and-deploy wouldn't see it.
+- **Dashboard Outside Voice row now works.** Was showing "0 runs" even after outside voices ran in `/plan-ceo-review` or `/plan-eng-review`. Now correctly maps to `codex-plan-review` entries.
+- **`/codex review` now tracks staleness.** Added the `commit` field to codex review log entries so the dashboard can detect when a codex review is outdated.
+- **`/autoplan` no longer hardcodes "clean" status.** Review log entries from autoplan used to always record `status:"clean"` even when issues were found. Now uses proper placeholder tokens that Claude substitutes with real values.
+
## [0.11.20.0] - 2026-03-26
### Added
M VERSION => VERSION +1 -1
@@ 1,1 1,1 @@
-0.11.20.0
+0.11.21.0
M autoplan/SKILL.md => autoplan/SKILL.md +6 -6
@@ 929,24 929,24 @@ AskUserQuestion options:
## Completion: Write Review Logs
-On approval, write 3 separate review log entries so /ship's dashboard recognizes them:
+On approval, write 3 separate review log entries so /ship's dashboard recognizes them.
+Replace TIMESTAMP, STATUS, and N with actual values from each review phase.
+STATUS is "clean" if no unresolved issues, "issues_open" otherwise.
```bash
COMMIT=$(git rev-parse --short HEAD 2>/dev/null)
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"'"$TIMESTAMP"'","status":"clean","unresolved":0,"critical_gaps":0,"mode":"SELECTIVE_EXPANSION","via":"autoplan","commit":"'"$COMMIT"'"}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"'"$TIMESTAMP"'","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"SELECTIVE_EXPANSION","via":"autoplan","commit":"'"$COMMIT"'"}'
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-eng-review","timestamp":"'"$TIMESTAMP"'","status":"clean","unresolved":0,"critical_gaps":0,"issues_found":0,"mode":"FULL_REVIEW","via":"autoplan","commit":"'"$COMMIT"'"}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-eng-review","timestamp":"'"$TIMESTAMP"'","status":"STATUS","unresolved":N,"critical_gaps":N,"issues_found":N,"mode":"FULL_REVIEW","via":"autoplan","commit":"'"$COMMIT"'"}'
```
If Phase 2 ran (UI scope):
```bash
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"'"$TIMESTAMP"'","status":"clean","unresolved":0,"via":"autoplan","commit":"'"$COMMIT"'"}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"'"$TIMESTAMP"'","status":"STATUS","unresolved":N,"via":"autoplan","commit":"'"$COMMIT"'"}'
```
-Replace field values with actual counts from the review.
-
Dual voice logs (one per phase that ran):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"autoplan-voices","timestamp":"'"$TIMESTAMP"'","status":"STATUS","source":"SOURCE","phase":"ceo","via":"autoplan","consensus_confirmed":N,"consensus_disagree":N,"commit":"'"$COMMIT"'"}'
M autoplan/SKILL.md.tmpl => autoplan/SKILL.md.tmpl +6 -6
@@ 584,24 584,24 @@ AskUserQuestion options:
## Completion: Write Review Logs
-On approval, write 3 separate review log entries so /ship's dashboard recognizes them:
+On approval, write 3 separate review log entries so /ship's dashboard recognizes them.
+Replace TIMESTAMP, STATUS, and N with actual values from each review phase.
+STATUS is "clean" if no unresolved issues, "issues_open" otherwise.
```bash
COMMIT=$(git rev-parse --short HEAD 2>/dev/null)
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"'"$TIMESTAMP"'","status":"clean","unresolved":0,"critical_gaps":0,"mode":"SELECTIVE_EXPANSION","via":"autoplan","commit":"'"$COMMIT"'"}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"'"$TIMESTAMP"'","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"SELECTIVE_EXPANSION","via":"autoplan","commit":"'"$COMMIT"'"}'
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-eng-review","timestamp":"'"$TIMESTAMP"'","status":"clean","unresolved":0,"critical_gaps":0,"issues_found":0,"mode":"FULL_REVIEW","via":"autoplan","commit":"'"$COMMIT"'"}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-eng-review","timestamp":"'"$TIMESTAMP"'","status":"STATUS","unresolved":N,"critical_gaps":N,"issues_found":N,"mode":"FULL_REVIEW","via":"autoplan","commit":"'"$COMMIT"'"}'
```
If Phase 2 ran (UI scope):
```bash
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"'"$TIMESTAMP"'","status":"clean","unresolved":0,"via":"autoplan","commit":"'"$COMMIT"'"}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"'"$TIMESTAMP"'","status":"STATUS","unresolved":N,"via":"autoplan","commit":"'"$COMMIT"'"}'
```
-Replace field values with actual counts from the review.
-
Dual voice logs (one per phase that ran):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"autoplan-voices","timestamp":"'"$TIMESTAMP"'","status":"STATUS","source":"SOURCE","phase":"ceo","via":"autoplan","consensus_confirmed":N,"consensus_disagree":N,"commit":"'"$COMMIT"'"}'
M codex/SKILL.md => codex/SKILL.md +1 -1
@@ 423,7 423,7 @@ CROSS-MODEL ANALYSIS:
7. Persist the review result:
```bash
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N,"findings_fixed":N}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N,"findings_fixed":N,"commit":"'"$(git rev-parse --short HEAD)"'"}'
```
Substitute: TIMESTAMP (ISO 8601), STATUS ("clean" if PASS, "issues_found" if FAIL),
M codex/SKILL.md.tmpl => codex/SKILL.md.tmpl +1 -1
@@ 127,7 127,7 @@ CROSS-MODEL ANALYSIS:
7. Persist the review result:
```bash
-~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N,"findings_fixed":N}'
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE","findings":N,"findings_fixed":N,"commit":"'"$(git rev-parse --short HEAD)"'"}'
```
Substitute: TIMESTAMP (ISO 8601), STATUS ("clean" if PASS, "issues_found" if FAIL),
M land-and-deploy/SKILL.md => land-and-deploy/SKILL.md +3 -2
@@ 447,7 447,8 @@ Collect evidence for each check below. Track warnings (yellow) and blockers (red
```
Parse the output. For each review skill (plan-eng-review, plan-ceo-review,
-plan-design-review, design-review-lite, codex-review):
+plan-design-review, design-review-lite, codex-review, review, adversarial-review,
+codex-plan-review):
1. Find the most recent entry within the last 7 days.
2. Extract its `commit` field.
@@ 594,7 595,7 @@ Use AskUserQuestion:
- C) Merge anyway — I understand the risks (Completeness: 3/10)
If the user chooses B: **STOP.** List exactly what needs to be done:
-- If reviews are stale: "Re-run /plan-eng-review (or /review) to review current code."
+- If reviews are stale: "Re-run `/plan-eng-review`, `/review`, or `/autoplan` to review current code."
- If E2E not run: "Run `bun run test:e2e` to verify."
- If docs not updated: "Run /document-release to update documentation."
- If PR body stale: "Update the PR body to reflect current changes."
M land-and-deploy/SKILL.md.tmpl => land-and-deploy/SKILL.md.tmpl +3 -2
@@ 134,7 134,8 @@ Collect evidence for each check below. Track warnings (yellow) and blockers (red
```
Parse the output. For each review skill (plan-eng-review, plan-ceo-review,
-plan-design-review, design-review-lite, codex-review):
+plan-design-review, design-review-lite, codex-review, review, adversarial-review,
+codex-plan-review):
1. Find the most recent entry within the last 7 days.
2. Extract its `commit` field.
@@ 281,7 282,7 @@ Use AskUserQuestion:
- C) Merge anyway — I understand the risks (Completeness: 3/10)
If the user chooses B: **STOP.** List exactly what needs to be done:
-- If reviews are stale: "Re-run /plan-eng-review (or /review) to review current code."
+- If reviews are stale: "Re-run `/plan-eng-review`, `/review`, or `/autoplan` to review current code."
- If E2E not run: "Run `bun run test:e2e` to verify."
- If docs not updated: "Run /document-release to update documentation."
- If PR body stale: "Update the PR body to reflect current changes."
M plan-ceo-review/SKILL.md => plan-ceo-review/SKILL.md +7 -1
@@ 1262,7 1262,13 @@ After completing the review, read the review log and config to display the dashb
~/.claude/skills/gstack/bin/gstack-review-read
```
-Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
+Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review.
+
+**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before.
+
+Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer.
+
+Display:
```
+====================================================================+
M plan-design-review/SKILL.md => plan-design-review/SKILL.md +7 -1
@@ 768,7 768,13 @@ After completing the review, read the review log and config to display the dashb
~/.claude/skills/gstack/bin/gstack-review-read
```
-Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
+Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review.
+
+**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before.
+
+Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer.
+
+Display:
```
+====================================================================+
M plan-eng-review/SKILL.md => plan-eng-review/SKILL.md +7 -1
@@ 870,7 870,13 @@ After completing the review, read the review log and config to display the dashb
~/.claude/skills/gstack/bin/gstack-review-read
```
-Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
+Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review.
+
+**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before.
+
+Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer.
+
+Display:
```
+====================================================================+
M scripts/resolvers/review.ts => scripts/resolvers/review.ts +7 -1
@@ 9,7 9,13 @@ After completing the review, read the review log and config to display the dashb
~/.claude/skills/gstack/bin/gstack-review-read
\`\`\`
-Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between \`review\` (diff-scoped pre-landing review) and \`plan-eng-review\` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between \`adversarial-review\` (new auto-scaled) and \`codex-review\` (legacy). For Design Review, show whichever is more recent between \`plan-design-review\` (full visual audit) and \`design-review-lite\` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
+Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between \`review\` (diff-scoped pre-landing review) and \`plan-eng-review\` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between \`adversarial-review\` (new auto-scaled) and \`codex-review\` (legacy). For Design Review, show whichever is more recent between \`plan-design-review\` (full visual audit) and \`design-review-lite\` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent \`codex-plan-review\` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review.
+
+**Source attribution:** If the most recent entry for a skill has a \\\`"via"\\\` field, append it to the status label in parentheses. Examples: \`plan-eng-review\` with \`via:"autoplan"\` shows as "CLEAR (PLAN via /autoplan)". \`review\` with \`via:"ship"\` shows as "CLEAR (DIFF via /ship)". Entries without a \`via\` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before.
+
+Note: \`autoplan-voices\` and \`design-outside-voices\` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer.
+
+Display:
\`\`\`
+====================================================================+
M ship/SKILL.md => ship/SKILL.md +21 -19
@@ 364,7 364,13 @@ After completing the review, read the review log and config to display the dashb
~/.claude/skills/gstack/bin/gstack-review-read
```
-Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
+Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review.
+
+**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before.
+
+Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer.
+
+Display:
```
+====================================================================+
@@ 403,26 409,15 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
If the Eng Review is NOT "CLEAR":
-1. **Check for a prior override on this branch:**
- ```bash
- eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
- grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
- ```
- If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
+Print: "No prior eng review found — ship will run its own pre-landing review in Step 3.5."
-2. **If no override exists,** use AskUserQuestion:
- - Show that Eng Review is missing or has open issues
- - RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
- - Options: A) Ship anyway B) Abort — run /review or /plan-eng-review first C) Change is too small to need eng review
- - If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block
- - For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
+Check diff size: `git diff <base>...HEAD --stat | tail -1`. If the diff is >200 lines, add: "Note: This is a large diff. Consider running `/plan-eng-review` or `/autoplan` for architecture-level review before shipping."
-3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
- ```bash
- eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
- echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
- ```
- Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
+If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block.
+
+For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
+
+Continue to Step 1.5 — do NOT block or ask. Ship runs its own review in Step 3.5.
---
@@ 1340,6 1335,13 @@ Present Codex output under a `CODEX (design):` header, merged with the checklist
If no issues found: `Pre-Landing Review: No issues found.`
+9. Persist the review result to the review log:
+```bash
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"commit":"'"$(git rev-parse --short HEAD)"'","via":"ship"}'
+```
+Substitute TIMESTAMP (ISO 8601), STATUS ("clean" if no issues, "issues_found" otherwise),
+and N values from the summary counts above. The `via:"ship"` distinguishes from standalone `/review` runs.
+
Save the review output — it goes into the PR body in Step 8.
---
M ship/SKILL.md.tmpl => ship/SKILL.md.tmpl +14 -18
@@ 64,26 64,15 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
If the Eng Review is NOT "CLEAR":
-1. **Check for a prior override on this branch:**
- ```bash
- {{SLUG_EVAL}}
- grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
- ```
- If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
+Print: "No prior eng review found — ship will run its own pre-landing review in Step 3.5."
-2. **If no override exists,** use AskUserQuestion:
- - Show that Eng Review is missing or has open issues
- - RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
- - Options: A) Ship anyway B) Abort — run /review or /plan-eng-review first C) Change is too small to need eng review
- - If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block
- - For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
+Check diff size: `git diff <base>...HEAD --stat | tail -1`. If the diff is >200 lines, add: "Note: This is a large diff. Consider running `/plan-eng-review` or `/autoplan` for architecture-level review before shipping."
-3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
- ```bash
- {{SLUG_EVAL}}
- echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
- ```
- Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
+If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block.
+
+For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
+
+Continue to Step 1.5 — do NOT block or ask. Ship runs its own review in Step 3.5.
---
@@ 275,6 264,13 @@ Review the diff for structural issues that tests don't catch.
If no issues found: `Pre-Landing Review: No issues found.`
+9. Persist the review result to the review log:
+```bash
+~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"commit":"'"$(git rev-parse --short HEAD)"'","via":"ship"}'
+```
+Substitute TIMESTAMP (ISO 8601), STATUS ("clean" if no issues, "issues_found" otherwise),
+and N values from the summary counts above. The `via:"ship"` distinguishes from standalone `/review` runs.
+
Save the review output — it goes into the PR body in Step 8.
---
M test/helpers/touchfiles.ts => test/helpers/touchfiles.ts +1 -0
@@ 79,6 79,7 @@ export const E2E_TOUCHFILES: Record<string, string[]> = {
// Ship
'ship-base-branch': ['ship/**', 'bin/gstack-repo-mode'],
'ship-local-workflow': ['ship/**', 'scripts/gen-skill-docs.ts'],
+ 'review-dashboard-via': ['ship/**', 'scripts/resolvers/review.ts', 'codex/**', 'autoplan/**', 'land-and-deploy/**'],
'ship-plan-completion': ['ship/**', 'scripts/gen-skill-docs.ts'],
'ship-plan-verification': ['ship/**', 'scripts/gen-skill-docs.ts'],
M test/skill-e2e-review.test.ts => test/skill-e2e-review.test.ts +113 -0
@@ 529,6 529,119 @@ Analyze the git history and produce the narrative report as described in the SKI
}, 420_000);
});
+// --- Review Dashboard Via Attribution E2E ---
+
+describeIfSelected('Review Dashboard Via Attribution', ['review-dashboard-via'], () => {
+ let dashDir: string;
+
+ beforeAll(() => {
+ dashDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-dashboard-via-'));
+ const run = (cmd: string, args: string[], cwd = dashDir) =>
+ spawnSync(cmd, args, { cwd, stdio: 'pipe', timeout: 5000 });
+
+ // Create git repo with feature branch
+ run('git', ['init', '-b', 'main']);
+ run('git', ['config', 'user.email', 'test@test.com']);
+ run('git', ['config', 'user.name', 'Test']);
+
+ fs.writeFileSync(path.join(dashDir, 'app.ts'), 'console.log("v1");\n');
+ run('git', ['add', 'app.ts']);
+ run('git', ['commit', '-m', 'initial']);
+
+ run('git', ['checkout', '-b', 'feature/dashboard-test']);
+ fs.writeFileSync(path.join(dashDir, 'app.ts'), 'console.log("v2");\n');
+ run('git', ['add', 'app.ts']);
+ run('git', ['commit', '-m', 'feat: update']);
+
+ // Get HEAD commit for review entries
+ const headResult = spawnSync('git', ['rev-parse', '--short', 'HEAD'], { cwd: dashDir, stdio: 'pipe' });
+ const commit = headResult.stdout.toString().trim();
+
+ // Pre-populate review log with autoplan-sourced entries
+ // gstack-review-read reads from ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+ // For the test, we'll write a mock gstack-review-read script that returns our test data
+ const timestamp = new Date().toISOString().replace(/\.\d{3}Z$/, 'Z');
+ const reviewData = [
+ `{"skill":"plan-eng-review","timestamp":"${timestamp}","status":"clean","unresolved":0,"critical_gaps":0,"issues_found":0,"mode":"FULL_REVIEW","via":"autoplan","commit":"${commit}"}`,
+ `{"skill":"plan-ceo-review","timestamp":"${timestamp}","status":"clean","unresolved":0,"critical_gaps":0,"mode":"SELECTIVE_EXPANSION","via":"autoplan","commit":"${commit}"}`,
+ `{"skill":"codex-plan-review","timestamp":"${timestamp}","status":"clean","source":"codex","commit":"${commit}"}`,
+ ].join('\n');
+
+ // Write a mock gstack-review-read that returns our test data
+ const mockBinDir = path.join(dashDir, '.mock-bin');
+ fs.mkdirSync(mockBinDir, { recursive: true });
+ fs.writeFileSync(path.join(mockBinDir, 'gstack-review-read'), [
+ '#!/usr/bin/env bash',
+ `echo '${reviewData.split('\n').join("'\necho '")}'`,
+ 'echo "---CONFIG---"',
+ 'echo "false"',
+ 'echo "---HEAD---"',
+ `echo "${commit}"`,
+ ].join('\n'));
+ fs.chmodSync(path.join(mockBinDir, 'gstack-review-read'), 0o755);
+
+ // Copy ship skill
+ fs.copyFileSync(path.join(ROOT, 'ship', 'SKILL.md'), path.join(dashDir, 'ship-SKILL.md'));
+ });
+
+ afterAll(() => {
+ try { fs.rmSync(dashDir, { recursive: true, force: true }); } catch {}
+ });
+
+ testConcurrentIfSelected('review-dashboard-via', async () => {
+ const mockBinDir = path.join(dashDir, '.mock-bin');
+
+ const result = await runSkillTest({
+ prompt: `Read ship-SKILL.md. You only need to run the Review Readiness Dashboard section.
+
+Instead of running ~/.claude/skills/gstack/bin/gstack-review-read, run this mock: ${mockBinDir}/gstack-review-read
+
+Parse the output and display the dashboard table. Pay attention to:
+1. The "via" field in entries — show source attribution (e.g., "via /autoplan")
+2. The codex-plan-review entry — it should populate the Outside Voice row
+3. Since Eng Review IS clear, there should be NO gate blocking — just display the dashboard
+
+Skip the preamble, lake intro, telemetry, and all other ship steps.
+Write the dashboard output to ${dashDir}/dashboard-output.md`,
+ workingDirectory: dashDir,
+ maxTurns: 12,
+ timeout: 90_000,
+ testName: 'review-dashboard-via',
+ runId,
+ });
+
+ logCost('/ship dashboard-via', result);
+ recordE2E(evalCollector, '/ship review dashboard via attribution', 'Dashboard via field', result);
+ expect(result.exitReason).toBe('success');
+
+ // Check dashboard output for via attribution
+ const dashPath = path.join(dashDir, 'dashboard-output.md');
+ const allOutput = [
+ result.output || '',
+ ...result.toolCalls.map(tc => tc.output || ''),
+ ].join('\n').toLowerCase();
+
+ // Verify via attribution appears somewhere (conversation or file)
+ let dashContent = '';
+ if (fs.existsSync(dashPath)) {
+ dashContent = fs.readFileSync(dashPath, 'utf-8').toLowerCase();
+ }
+ const combined = allOutput + dashContent;
+
+ // Should mention autoplan attribution
+ expect(combined).toMatch(/autoplan/);
+ // Should show eng review as CLEAR (it has a clean entry)
+ expect(combined).toMatch(/clear/i);
+ // Should NOT contain AskUserQuestion gate (no blocking)
+ const gateQuestions = result.toolCalls.filter(tc =>
+ tc.tool === 'mcp__conductor__AskUserQuestion' ||
+ (tc.tool === 'AskUserQuestion')
+ );
+ // Ship dashboard should not gate when eng review is clear
+ expect(gateQuestions).toHaveLength(0);
+ }, 120_000);
+});
+
// Module-level afterAll — finalize eval collector after all tests complete
afterAll(async () => {
await finalizeEvalCollector(evalCollector);