~cytrogen/gstack

de20228c2ca1f3f32dbbada82a08f1229834a7b5 — Garry Tan 13 days ago 25e971b
fix: /ship CHANGELOG and PR body now cover all branch commits (v0.12.4.0) (#535)

* fix: /ship CHANGELOG and PR body now cover all branch commits

Step 5 (CHANGELOG generation) restructured to force explicit commit
enumeration, theme grouping, and cross-check before writing. Step 8
(PR body) changed from "bullet points from CHANGELOG" to commit-by-commit
coverage with logical sections. Fixes recency bias that dropped early
commits on long branches.

* chore: bump version and changelog (v0.12.3.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
5 files changed, 84 insertions(+), 11 deletions(-)

M CHANGELOG.md
M TODOS.md
M VERSION
M ship/SKILL.md
M ship/SKILL.md.tmpl
M CHANGELOG.md => CHANGELOG.md +9 -0
@@ 1,5 1,14 @@
# Changelog

## [0.12.4.0] - 2026-03-26 — Full Commit Coverage in /ship

When you ship a branch with 12 commits spanning performance work, dead code removal, and test infra, the PR should mention all three. It wasn't. The CHANGELOG and PR summary biased toward whatever happened most recently, silently dropping earlier work.

### Fixed

- **/ship Step 5 (CHANGELOG):** Now forces explicit commit enumeration before writing. You list every commit, group by theme, write the entry, then cross-check that every commit maps to a bullet. No more recency bias.
- **/ship Step 8 (PR body):** Changed from "bullet points from CHANGELOG" to explicit commit-by-commit coverage. Groups commits into logical sections. Excludes the VERSION/CHANGELOG metadata commit (bookkeeping, not a change). Every substantive commit must appear somewhere.

## [0.12.3.0] - 2026-03-26 — Voice Directive: Every Skill Sounds Like a Builder

Every gstack skill now has a voice. Not a personality, not a persona, but a consistent set of instructions that make Claude sound like someone who shipped code today and cares whether the thing works for real users. Direct, concrete, sharp. Names the file, the function, the command. Connects technical work to what the user actually experiences.

M TODOS.md => TODOS.md +12 -0
@@ 221,6 221,18 @@ Linux cookie import shipped in v0.11.11.0 (Wave 3). Supports Chrome, Chromium, B
**Priority:** P2
**Depends on:** None (BASE_BRANCH_DETECT multi-platform resolver is already done)

### Multi-commit CHANGELOG completeness eval

**What:** Add a periodic E2E eval that creates a branch with 5+ commits spanning 3+ themes (features, cleanup, infra), runs /ship's Step 5 CHANGELOG generation, and verifies the CHANGELOG mentions all themes.

**Why:** The bug fixed in v0.11.22 (garrytan/ship-full-commit-coverage) showed that /ship's CHANGELOG generation biased toward recent commits on long branches. The prompt fix adds a cross-check, but no test exercises the multi-commit failure mode. The existing `ship-local-workflow` E2E only uses a single-commit branch.

**Context:** Would be a `periodic` tier test (~$4/run, non-deterministic since it tests LLM instruction-following). Setup: create bare remote, clone, add 5+ commits across different themes on a feature branch, run Step 5 via `claude -p`, verify CHANGELOG output covers all themes. Pattern: `ship-local-workflow` in `test/skill-e2e-workflow.test.ts`.

**Effort:** M
**Priority:** P3
**Depends on:** None

### Ship log — persistent record of /ship runs

**What:** Append structured JSON entry to `.gstack/ship-log.json` at end of every /ship run (version, date, branch, PR URL, review findings, Greptile stats, todos completed, test results).

M VERSION => VERSION +1 -1
@@ 1,1 1,1 @@
0.12.3.0
0.12.4.0

M ship/SKILL.md => ship/SKILL.md +31 -5
@@ 1590,10 1590,26 @@ High-confidence findings (agreed on by multiple sources) should be prioritized f

1. Read `CHANGELOG.md` header to know the format.

2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones):
   - Use `git log <base>..HEAD --oneline` to see every commit being shipped
   - Use `git diff <base>...HEAD` to see the full diff against the base branch
   - The CHANGELOG entry must be comprehensive of ALL changes going into the PR
2. **First, enumerate every commit on the branch:**
   ```bash
   git log <base>..HEAD --oneline
   ```
   Copy the full list. Count the commits. You will use this as a checklist.

3. **Read the full diff** to understand what each commit actually changed:
   ```bash
   git diff <base>...HEAD
   ```

4. **Group commits by theme** before writing anything. Common themes:
   - New features / capabilities
   - Performance improvements
   - Bug fixes
   - Dead code removal / cleanup
   - Infrastructure / tooling / tests
   - Refactoring

5. **Write the CHANGELOG entry** covering ALL groups:
   - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
   - Categorize changes into applicable sections:
     - `### Added` — new features


@@ 1604,6 1620,11 @@ High-confidence findings (agreed on by multiple sources) should be prioritized f
   - Insert after the file header (line 5), dated today
   - Format: `## [X.Y.Z.W] - YYYY-MM-DD`

6. **Cross-check:** Compare your CHANGELOG entry against the commit list from step 2.
   Every commit must map to at least one bullet point. If any commit is unrepresented,
   add it now. If the branch has N commits spanning K themes, the CHANGELOG must
   reflect all K themes.

**Do NOT ask the user to describe changes.** Infer from the diff and commit history.

---


@@ 1741,7 1762,12 @@ The PR/MR body should contain these sections:

```
## Summary
<bullet points from CHANGELOG>
<Summarize ALL changes being shipped. Run `git log <base>..HEAD --oneline` to enumerate
every commit. Exclude the VERSION/CHANGELOG metadata commit (that's this PR's bookkeeping,
not a substantive change). Group the remaining commits into logical sections (e.g.,
"**Performance**", "**Dead Code Removal**", "**Infrastructure**"). Every substantive commit
must appear in at least one section. If a commit's work isn't reflected in the summary,
you missed it.>

## Test Coverage
<coverage diagram from Step 3.4, or "All new code paths have test coverage.">

M ship/SKILL.md.tmpl => ship/SKILL.md.tmpl +31 -5
@@ 339,10 339,26 @@ For each classified comment:

1. Read `CHANGELOG.md` header to know the format.

2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones):
   - Use `git log <base>..HEAD --oneline` to see every commit being shipped
   - Use `git diff <base>...HEAD` to see the full diff against the base branch
   - The CHANGELOG entry must be comprehensive of ALL changes going into the PR
2. **First, enumerate every commit on the branch:**
   ```bash
   git log <base>..HEAD --oneline
   ```
   Copy the full list. Count the commits. You will use this as a checklist.

3. **Read the full diff** to understand what each commit actually changed:
   ```bash
   git diff <base>...HEAD
   ```

4. **Group commits by theme** before writing anything. Common themes:
   - New features / capabilities
   - Performance improvements
   - Bug fixes
   - Dead code removal / cleanup
   - Infrastructure / tooling / tests
   - Refactoring

5. **Write the CHANGELOG entry** covering ALL groups:
   - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
   - Categorize changes into applicable sections:
     - `### Added` — new features


@@ 353,6 369,11 @@ For each classified comment:
   - Insert after the file header (line 5), dated today
   - Format: `## [X.Y.Z.W] - YYYY-MM-DD`

6. **Cross-check:** Compare your CHANGELOG entry against the commit list from step 2.
   Every commit must map to at least one bullet point. If any commit is unrepresented,
   add it now. If the branch has N commits spanning K themes, the CHANGELOG must
   reflect all K themes.

**Do NOT ask the user to describe changes.** Infer from the diff and commit history.

---


@@ 490,7 511,12 @@ The PR/MR body should contain these sections:

```
## Summary
<bullet points from CHANGELOG>
<Summarize ALL changes being shipped. Run `git log <base>..HEAD --oneline` to enumerate
every commit. Exclude the VERSION/CHANGELOG metadata commit (that's this PR's bookkeeping,
not a substantive change). Group the remaining commits into logical sections (e.g.,
"**Performance**", "**Dead Code Removal**", "**Infrastructure**"). Every substantive commit
must appear in at least one section. If a commit's work isn't reflected in the summary,
you missed it.>

## Test Coverage
<coverage diagram from Step 3.4, or "All new code paths have test coverage.">