name: retro
version: 2.0.0
description: |
  Weekly engineering retrospective. Analyzes commit history, work patterns,
  and code quality metrics with persistent history and trend tracking.
  Team-aware: breaks down per-person contributions with praise and growth areas.
allowed-tools:

  • Bash
  • Read
  • Write
  • Glob
  • AskUserQuestion

#Update Check (run first)

_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
[ -n "$_UPD" ] && echo "$_UPD"

If output shows UPGRADE_AVAILABLE <old> <new>: read ~/.claude/skills/gstack/gstack-upgrade/SKILL.md and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, touch ~/.gstack/last-update-check if no). If JUST_UPGRADED <from> <to>: tell user "Running gstack v{to} (just updated!)" and continue.

#/retro — Weekly Engineering Retrospective

Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier.

#User-invocable

When the user types /retro, run this skill.

#Arguments

  • /retro — default: last 7 days
  • /retro 24h — last 24 hours
  • /retro 14d — last 14 days
  • /retro 30d — last 30 days
  • /retro compare — compare current window vs prior same-length window
  • /retro compare 14d — compare with explicit window

#Instructions

Parse the argument to determine the time window. Default to 7 days if no argument given. Use --since="N days ago", --since="N hours ago", or --since="N weeks ago" (for w units) for git log queries. All times should be reported in Pacific time (use TZ=America/Los_Angeles when converting timestamps).

Argument validation: If the argument doesn't match a number followed by d, h, or w, the word compare, or compare followed by a number and d/h/w, show this usage and stop:

Usage: /retro [window]
  /retro              — last 7 days (default)
  /retro 24h          — last 24 hours
  /retro 14d          — last 14 days
  /retro 30d          — last 30 days
  /retro compare      — compare this period vs prior period
  /retro compare 14d  — compare with explicit window
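
The window parsing described above could be sketched as follows. `retro_since` is a hypothetical helper name, not part of gstack, and the sketch assumes the caller has already peeled off a leading `compare` keyword.

```shell
# Hypothetical helper: map a /retro window argument to a git --since value.
# Prints the value on success; returns 1 for anything the usage text rejects.
retro_since() {
  arg="${1:-7d}"                 # no argument means the 7-day default
  num="${arg%[dhw]}"             # strip one trailing unit letter
  unit="${arg#"$num"}"           # what was stripped: d, h, w, or empty
  case "$num" in
    ''|*[!0-9]*) return 1 ;;     # the count must be digits only
  esac
  case "$unit" in
    d) echo "$num days ago" ;;
    h) echo "$num hours ago" ;;
    w) echo "$num weeks ago" ;;
    *) return 1 ;;               # missing or unknown unit
  esac
}
```

A caller might run `since=$(retro_since "$1") || show_usage`, where `show_usage` is likewise a placeholder for printing the usage text.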

#Step 1: Gather Raw Data

First, fetch origin and identify the current user:

git fetch origin main --quiet
# Identify who is running the retro
git config user.name
git config user.email

The name returned by git config user.name is "you" — the person reading this retro. All other authors are teammates. Use this to orient the narrative: "your" commits vs teammate contributions.

Run ALL of these git commands in parallel (they are independent):

# 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions
git log origin/main --since="<window>" --format="%H|%aN|%ae|%ai|%s" --shortstat

# 2. Per-commit test vs total LOC breakdown with author
#    Each commit block starts with COMMIT:<hash>|<author>, followed by numstat lines.
#    Separate test files (matching test/|spec/|__tests__/) from production files.
git log origin/main --since="<window>" --format="COMMIT:%H|%aN" --numstat

# 3. Commit timestamps for session detection and hourly distribution (with author)
#    %ad with --date=iso-local honors TZ; %ai would keep each author's own UTC offset
TZ=America/Los_Angeles git log origin/main --since="<window>" --date=iso-local --format="%at|%aN|%ad|%s" | sort -n

# 4. Files most frequently changed (hotspot analysis)
git log origin/main --since="<window>" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn

# 5. PR numbers from commit messages (extract #NNN patterns)
git log origin/main --since="<window>" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/'

# 6. Per-author file hotspots (who touches what)
git log origin/main --since="<window>" --format="AUTHOR:%aN" --name-only

# 7. Per-author commit counts (quick summary)
git shortlog origin/main --since="<window>" -sn --no-merges

# 8. Greptile triage history (if available)
cat ~/.gstack/greptile-history.md 2>/dev/null || true
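
As a rough illustration of how the numstat output from command 2 can be split into test and production insertions, here is a minimal sketch. `test_loc_ratio` is a hypothetical name, and the sketch assumes the ratio means test insertions divided by total insertions.

```shell
# Hypothetical helper: read `git log --numstat` output on stdin and print
# "<test insertions> <total insertions> <ratio in percent>".
test_loc_ratio() {
  awk -F'\t' '
    NF == 3 && $1 ~ /^[0-9]+$/ {          # numstat rows only; skips COMMIT: lines and binary ("-") rows
      total += $1
      if ($3 ~ /test\/|spec\/|__tests__\//) t += $1   # same path patterns as the comment in command 2
    }
    END {
      pct = (total > 0) ? int(100 * t / total + 0.5) : 0
      printf "%d %d %d\n", t, total, pct
    }'
}
```

Piping command 2 into `test_loc_ratio` gives the team-wide figure; filtering the input by the author on each `COMMIT:` line first would give a per-person ratio.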

#Step 2: Compute Metrics

Calculate and present these metrics in a summary table:

| Metric | Value |
|--------|-------|
| Commits to main | N |
| Contributors | N |
| PRs merged | N |
| Total insertions | N |
| Total deletions | N |
| Net LOC added | N |
| Test LOC (insertions) | N |
| Test LOC ratio | N% |
| Version range | vX.Y.Z.W → vX.Y.Z.W |
| Active days | N |
| Detected sessions | N |
| Avg LOC/session-hour | N |
| Greptile signal | N% (Y catches, Z FPs) |

Then show a per-author leaderboard immediately below:

Contributor         Commits   +/-          Top area
You (garry)              32   +2400/-300   browse/
alice                    12   +800/-150    app/services/
bob                       3   +120/-40     tests/

Sort by commits descending. The current user (from git config user.name) always appears first, labeled "You (name)".

Greptile signal (if history exists): Read ~/.gstack/greptile-history.md (fetched in Step 1, command 8). Filter entries within the retro time window by date. Count entries by type: fix, fp, already-fixed. Compute signal ratio: (fix + already-fixed) / (fix + already-fixed + fp). If no entries exist in the window or the file doesn't exist, skip the Greptile metric row. Skip unparseable lines silently.
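
The signal ratio can be checked with a few lines of integer arithmetic. This sketch assumes the three counts have already been tallied from the history file (its line format is not specified here), and `greptile_signal` is a hypothetical name.

```shell
# Hypothetical helper: $1 = fix count, $2 = already-fixed count, $3 = fp count.
# Prints the signal as a whole percent; returns 1 when there are no entries.
greptile_signal() {
  valid=$(( $1 + $2 ))
  all=$(( valid + $3 ))
  [ "$all" -gt 0 ] || return 1                   # nothing in the window: skip the metric row
  echo $(( (100 * valid + all / 2) / all ))      # integer division, rounded to nearest
}
```

With 3 fixes, 2 already-fixed, and 1 false positive, `greptile_signal 3 2 1` prints 83.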

#Step 3: Commit Time Distribution

Show an hourly histogram in Pacific time as a bar chart:

Hour  Commits  ████████████████
 00:    4      ████
 07:    5      █████
 ...

Identify and call out:

  • Peak hours
  • Dead zones
  • Whether pattern is bimodal (morning/evening) or continuous
  • Late-night coding clusters (after 10pm)
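
One possible way to build the histogram from the command 3 output is sketched below. `hour_histogram` is a hypothetical name, and the sketch assumes the third `|`-separated field is an ISO-style date such as `2026-03-08 22:15:03 -0800`.

```shell
# Hypothetical helper: read "epoch|author|date|subject" lines on stdin and
# print one row per hour that has commits.
hour_histogram() {
  awk -F'|' '
    { count[substr($3, 12, 2)]++ }        # characters 12-13 of "YYYY-MM-DD HH:MM:SS ..." are the hour
    END {
      for (h = 0; h < 24; h++) {
        key = sprintf("%02d", h)
        if (!(key in count)) continue     # hours with no commits are left out
        bar = ""
        for (i = 0; i < count[key]; i++) bar = bar "█"
        printf " %s:  %3d      %s\n", key, count[key], bar
      }
    }'
}
```

Peak hours and dead zones can then be read off the row counts, and rows at 22 or later mark the late-night clusters.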

#Step 4: Work Session Detection

Detect sessions using a 45-minute gap threshold between consecutive commits. For each session, report:

  • Start/end time (Pacific)
  • Number of commits
  • Duration in minutes

Classify sessions:

  • Deep sessions (50+ min)
  • Medium sessions (20-49 min)
  • Micro sessions (<20 min, typically single-commit fire-and-forget)

Calculate:

  • Total active coding time (sum of session durations)
  • Average session length
  • LOC per hour of active time
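
The gap-based grouping can be sketched with awk over sorted epoch timestamps, such as the first field of command 3. `detect_sessions` is a hypothetical name; the sketch treats a session of exactly 50 minutes as deep and does not separate authors.

```shell
# Hypothetical helper: read sorted epoch seconds (or "epoch|..." lines) on stdin
# and print "<start> <end> <commits> <minutes> <class>" per session.
detect_sessions() {
  awk -F'|' -v gap=2700 '
    # gap is the 45-minute threshold in seconds
    function emit(   mins, kind) {
      mins = int((last - start) / 60)
      kind = (mins >= 50) ? "deep" : ((mins >= 20) ? "medium" : "micro")
      print start, last, n, mins, kind
    }
    NR > 1 && $1 - last > gap { emit(); n = 0 }   # a long pause closes the current session
    n == 0 { start = $1 }                         # first commit of a new session
    { last = $1; n++ }
    END { if (n > 0) emit() }'
}
```

Summing the fourth column gives total active minutes, which feeds the average session length and LOC-per-hour figures.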

#Step 5: Commit Type Breakdown

Categorize by conventional commit prefix (feat/fix/refactor/test/chore/docs). Show as percentage bar:

feat:     20  (40%)  ████████████████████
fix:      27  (54%)  ███████████████████████████
refactor:  2  ( 4%)  ██

Flag if fix ratio exceeds 50% — this signals a "ship fast, fix fast" pattern that may indicate review gaps.
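
A minimal sketch of the prefix tally, reading commit subjects on stdin (for example from `git log origin/main --since="<window>" --format="%s"`). `commit_type_mix` is a hypothetical name; accepting an optional scope such as `fix(api):` and a `!` marker, and grouping everything else under `other`, are assumptions.

```shell
# Hypothetical helper: print "<type> <count> <percent>" per commit type,
# largest count first.
commit_type_mix() {
  awk '
    {
      total++
      if ($0 ~ /^(feat|fix|refactor|test|chore|docs)(\([^)]*\))?!?:/) {
        type = $0
        sub(/[(!:].*/, "", type)          # keep only the leading type word
        count[type]++
      } else {
        count["other"]++                  # subjects without a recognized prefix
      }
    }
    END {
      for (type in count)
        printf "%s %d %d\n", type, count[type], int(100 * count[type] / total + 0.5)
    }' | sort -k2,2nr -k1,1
}
```

If the `fix` row reports more than 50 in the third column, that is the ratio to flag.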

#Step 6: Hotspot Analysis

Show top 10 most-changed files. Flag:

  • Files changed 5+ times (churn hotspots)
  • Test files vs production files in the hotspot list
  • VERSION/CHANGELOG frequency (version discipline indicator)

#Step 7: PR Size Distribution

From commit diffs, estimate PR sizes and bucket them:

  • Small (<100 LOC)
  • Medium (100-499 LOC)
  • Large (500-1499 LOC)
  • XL (1500+ LOC) — flag these with file counts
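
The bucketing is simple enough to express as a lookup. In this sketch `pr_size_bucket` is a hypothetical name, and a PR of exactly 500 LOC is counted as Large and one of exactly 1500 LOC as XL.

```shell
# Hypothetical helper: $1 = total LOC changed in the PR; prints the bucket name.
pr_size_bucket() {
  loc=$1
  if   [ "$loc" -lt 100 ];  then echo "Small"
  elif [ "$loc" -lt 500 ];  then echo "Medium"
  elif [ "$loc" -lt 1500 ]; then echo "Large"
  else                           echo "XL"      # flag these with file counts
  fi
}
```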

#Step 8: Focus Score + Ship of the Week

Focus score: Calculate the percentage of commits touching the single most-changed top-level directory (e.g., app/services/, app/views/). Higher score = deeper focused work. Lower score = scattered context-switching. Report as: "Focus score: 62% (app/services/)"
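
One way to compute the focus score is sketched below. It assumes input in the shape of `git log --format="COMMIT:%H" --name-only`, counts a commit once per directory it touches, and takes the directory depth as an optional argument (1 for a strict top-level directory, 2 to reproduce paths like `app/services/`). `focus_score` is a hypothetical name.

```shell
# Hypothetical helper: $1 = directory depth (default 1); stdin = COMMIT: lines
# followed by the file paths each commit changed.
focus_score() {
  awk -v depth="${1:-1}" '
    /^COMMIT:/ { commits++; next }
    NF == 0    { next }
    {
      n = split($0, part, "/")
      if (n <= depth) next                 # file sits above the chosen depth: no directory to credit
      dir = part[1]
      for (i = 2; i <= depth; i++) dir = dir "/" part[i]
      if (seen[commits, dir]++ == 0) touched[dir]++   # count each commit once per directory
    }
    END {
      for (dir in touched)
        if (touched[dir] > best || (touched[dir] == best && dir < top)) { best = touched[dir]; top = dir }
      if (commits > 0 && best > 0)
        printf "Focus score: %d%% (%s/)\n", int(100 * best / commits + 0.5), top
    }'
}
```

Ties between directories are broken alphabetically here, which is an arbitrary choice.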

Ship of the week: Auto-identify the single highest-LOC PR in the window. Highlight it:

  • PR number and title
  • LOC changed
  • Why it matters (infer from commit messages and files touched)

#Step 9: Team Member Analysis

For each contributor (including the current user), compute:

  1. Commits and LOC — total commits, insertions, deletions, net LOC
  2. Areas of focus — which directories/files they touched most (top 3)
  3. Commit type mix — their personal feat/fix/refactor/test breakdown
  4. Session patterns — when they code (their peak hours), session count
  5. Test discipline — their personal test LOC ratio
  6. Biggest ship — their single highest-impact commit or PR in the window

For the current user ("You"): This section gets the deepest treatment. Include all the detail from the solo retro — session analysis, time patterns, focus score. Frame it in first person: "Your peak hours...", "Your biggest ship..."

For each teammate: Write 2-3 sentences covering what they worked on and their pattern. Then:

  • Praise (1-2 specific things): Anchor in actual commits. Not "great work" — say exactly what was good. Examples: "Shipped the entire auth middleware rewrite in 3 focused sessions with 45% test coverage", "Every PR under 200 LOC — disciplined decomposition."
  • Opportunity for growth (1 specific thing): Frame as a leveling-up suggestion, not criticism. Anchor in actual data. Examples: "Test ratio was 12% this week — adding test coverage to the payment module before it gets more complex would pay off", "5 fix commits on the same file suggest the original PR could have used a review pass."

If only one contributor (solo repo): Skip the team breakdown and proceed as before — the retro is personal.

If there are Co-Authored-By trailers: Parse Co-Authored-By: lines in commit messages. Credit those authors for the commit alongside the primary author. Note AI co-authors (e.g., noreply@anthropic.com) but do not include them as team members — instead, track "AI-assisted commits" as a separate metric.
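
Counting AI-assisted commits could look like the sketch below. It assumes input produced by `git log origin/main --since="<window>" --format="COMMIT:%H%n%B"` and matches trailers against a single substring, defaulting to the `noreply@anthropic.com` address mentioned above. `ai_assisted_count` is a hypothetical name.

```shell
# Hypothetical helper: $1 = substring identifying an AI co-author (optional);
# stdin = "COMMIT:<hash>" lines, each followed by that commit's message.
ai_assisted_count() {
  awk -v pat="${1:-noreply@anthropic.com}" '
    /^COMMIT:/ { flagged = 0; next }       # a new commit starts: reset the per-commit flag
    !flagged && tolower($0) ~ /^co-authored-by:/ && index(tolower($0), tolower(pat)) {
      flagged = 1; count++                 # count each commit at most once
    }
    END { print count + 0 }'
}
```

Human co-authors would need a separate pass that credits them alongside the primary author.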

#Step 10: Week-over-Week Trends

If the time window is 14 days or more, split into weekly buckets and show trends:

  • Commits per week (total and per-author)
  • LOC per week
  • Test ratio per week
  • Fix ratio per week
  • Session count per week

#Step 11: Streak Tracking

Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak:

# Team streak: all unique commit dates (Pacific time), no hard cutoff
# format-local applies TZ; plain format: would keep each author's own UTC offset
TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format-local:"%Y-%m-%d" | sort -u

# Personal streak: only the current user's commits
TZ=America/Los_Angeles git log origin/main --author="<user_name>" --format="%ad" --date=format-local:"%Y-%m-%d" | sort -u

This queries the full history, so streaks of any length are reported accurately. Display both:

  • "Team shipping streak: 47 consecutive days"
  • "Your shipping streak: 32 consecutive days"
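
The backward count can be done without platform-specific `date` flags by converting each date to a Julian day number. In this sketch `streak_days` is a hypothetical name, today's date is passed in explicitly, and a day with no commit today yields a streak of 0 (a strict reading of "going back from today").

```shell
# Hypothetical helper: $1 = today as YYYY-MM-DD; stdin = commit dates, one per line.
streak_days() {
  sort -ru | awk -F'-' -v today="$1" '
    function jdn(y, m, d,   a) {            # Julian day number, so date gaps become integer differences
      a = int((14 - m) / 12); y = y + 4800 - a; m = m + 12 * a - 3
      return d + int((153 * m + 2) / 5) + 365 * y + int(y / 4) - int(y / 100) + int(y / 400) - 32045
    }
    BEGIN { split(today, t, "-"); expect = jdn(t[1], t[2], t[3]) }
    {
      day = jdn($1, $2, $3)
      if (day > expect) next                # ignore dates after "today"
      if (day < expect) exit                # first missing day ends the streak
      streak++; expect--
    }
    END { print streak + 0 }'
}
```

Either streak command can be piped in, for example `... | sort -u | streak_days "$(TZ=America/Los_Angeles date +%Y-%m-%d)"`.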

#Step 12: Load History & Compare

Before saving the new snapshot, check for prior retro history:

ls -t .context/retros/*.json 2>/dev/null

If prior retros exist: Load the most recent one using the Read tool. Calculate deltas for key metrics and include a Trends vs Last Retro section:

                    Last        Now         Delta
Test ratio:         22%    →    41%         ↑19pp
Sessions:           10     →    14          ↑4
LOC/hour:           200    →    350         ↑75%
Fix ratio:          54%    →    30%         ↓24pp (improving)
Commits:            32     →    47          ↑47%
Deep sessions:      3      →    5           ↑2

If no prior retros exist: Skip the comparison section and append: "First retro recorded — run again next week to see trends."

#Step 13: Save Retro History

After computing all metrics (including streak) and loading any prior history for comparison, save a JSON snapshot:

mkdir -p .context/retros

Determine the next sequence number for today (substitute the actual date for $(TZ=America/Los_Angeles date +%Y-%m-%d)):

# Count existing retros for today to get next sequence number
today=$(TZ=America/Los_Angeles date +%Y-%m-%d)
existing=$(ls .context/retros/${today}-*.json 2>/dev/null | wc -l | tr -d ' ')
next=$((existing + 1))
# Save as .context/retros/${today}-${next}.json

Use the Write tool to save the JSON file with this schema:

{
  "date": "2026-03-08",
  "window": "7d",
  "metrics": {
    "commits": 47,
    "contributors": 3,
    "prs_merged": 12,
    "insertions": 3200,
    "deletions": 800,
    "net_loc": 2400,
    "test_loc": 1300,
    "test_ratio": 0.41,
    "active_days": 6,
    "sessions": 14,
    "deep_sessions": 5,
    "avg_session_minutes": 42,
    "loc_per_session_hour": 350,
    "feat_pct": 0.40,
    "fix_pct": 0.30,
    "peak_hour": 22,
    "ai_assisted_commits": 32
  },
  "authors": {
    "Garry Tan": { "commits": 32, "insertions": 2400, "deletions": 300, "test_ratio": 0.41, "top_area": "browse/" },
    "Alice": { "commits": 12, "insertions": 800, "deletions": 150, "test_ratio": 0.35, "top_area": "app/services/" }
  },
  "version_range": ["1.16.0.0", "1.16.1.0"],
  "streak_days": 47,
  "tweetable": "Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 41% tests, 12 PRs, peak: 10pm",
  "greptile": {
    "fixes": 3,
    "fps": 1,
    "already_fixed": 2,
    "signal_pct": 83
  }
}

Note: Only include the greptile field if ~/.gstack/greptile-history.md exists and has entries within the time window. If no history data is available, omit the field entirely.

#Step 14: Write the Narrative

Structure the output as:


Tweetable summary (first line, before everything else):

Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 41% tests, 12 PRs, peak: 10pm | Streak: 47d

#Engineering Retro: [date range]

#Summary Table

(from Step 2)

#Trends vs Last Retro

(from Step 12, loaded before save; skip if first retro)

#Time & Session Patterns

(from Steps 3-4)

Narrative interpreting what the team-wide patterns mean:

  • When the most productive hours are and what drives them
  • Whether sessions are getting longer or shorter over time
  • Estimated hours per day of active coding (team aggregate)
  • Notable patterns: do team members code at the same time or in shifts?

#Shipping Velocity

(from Steps 5-7)

Narrative covering:

  • Commit type mix and what it reveals
  • PR size discipline (are PRs staying small?)
  • Fix-chain detection (sequences of fix commits on the same subsystem)
  • Version bump discipline

#Code Quality Signals

  • Test LOC ratio trend
  • Hotspot analysis (are the same files churning?)
  • Any XL PRs that should have been split
  • Greptile signal ratio and trend (if history exists): "Greptile: X% signal (Y valid catches, Z false positives)"

#Focus & Highlights

(from Step 8)

  • Focus score with interpretation
  • Ship of the week callout

#Your Week (personal deep-dive)

(from Step 9, for the current user only)

This is the section the user cares most about. Include:

  • Their personal commit count, LOC, test ratio
  • Their session patterns and peak hours
  • Their focus areas
  • Their biggest ship
  • What you did well (2-3 specific things anchored in commits)
  • Where to level up (1-2 specific, actionable suggestions)

#Team Breakdown

(from Step 9, for each teammate — skip if solo repo)

For each teammate (sorted by commits descending), write a section:

#[Name]
  • What they shipped: 2-3 sentences on their contributions, areas of focus, and commit patterns
  • Praise: 1-2 specific things they did well, anchored in actual commits. Be genuine — what would you actually say in a 1:1? Examples:
    • "Cleaned up the entire auth module in 3 small, reviewable PRs — textbook decomposition"
    • "Added integration tests for every new endpoint, not just happy paths"
    • "Fixed the N+1 query that was causing 2s load times on the dashboard"
  • Opportunity for growth: 1 specific, constructive suggestion. Frame as investment, not criticism. Examples:
    • "Test coverage on the payment module is at 8% — worth investing in before the next feature lands on top of it"
    • "3 of the 5 PRs were 800+ LOC — breaking these up would catch issues earlier and make review easier"
    • "All commits land between 1-4am — sustainable pace matters for code quality long-term"

AI collaboration note: If many commits have Co-Authored-By AI trailers (e.g., Claude, Copilot), note the AI-assisted commit percentage as a team metric. Frame it neutrally — "N% of commits were AI-assisted" — without judgment.

#Top 3 Team Wins

Identify the 3 highest-impact things shipped in the window across the whole team. For each:

  • What it was
  • Who shipped it
  • Why it matters (product/architecture impact)

#3 Things to Improve

Specific, actionable, anchored in actual commits. Mix personal and team-level suggestions. Phrase as "to get even better, the team could..."

#3 Habits for Next Week

Small, practical, realistic. Each must be something that takes <5 minutes to adopt. At least one should be team-oriented (e.g., "review each other's PRs same-day").

#Week-over-Week Trends

(if applicable, from Step 10)


#Compare Mode

When the user runs /retro compare (or /retro compare 14d):

  1. Compute metrics for the current window (default 7d) using --since="7 days ago"
  2. Compute metrics for the immediately prior same-length window using both --since and --until to avoid overlap (e.g., --since="14 days ago" --until="7 days ago" for a 7d window)
  3. Show a side-by-side comparison table with deltas and arrows
  4. Write a brief narrative highlighting the biggest improvements and regressions
  5. Save only the current-window snapshot to .context/retros/ (same as a normal retro run); do not persist the prior-window metrics.
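
The two non-overlapping windows from steps 1 and 2 can be derived from a count and a unit. `compare_windows` is a hypothetical name, and the sketch only prints the git options rather than running any query.

```shell
# Hypothetical helper: $1 = window length, $2 = unit word (days, hours, weeks).
# Prints the git options for the current window and the prior same-length window.
compare_windows() {
  n=$1; unit=$2
  echo "current: --since=\"$n $unit ago\""
  echo "prior: --since=\"$((2 * n)) $unit ago\" --until=\"$n $unit ago\""
}
```

For `/retro compare 14d`, `compare_windows 14 days` yields a prior window of 28 to 14 days ago.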

#Tone

  • Encouraging but candid, no coddling
  • Specific and concrete — always anchor in actual commits/code
  • Skip generic praise ("great job!") — say exactly what was good and why
  • Frame improvements as leveling up, not criticism
  • Praise should feel like something you'd actually say in a 1:1 — specific, earned, genuine
  • Growth suggestions should feel like investment advice — "this is worth your time because..." not "you failed at..."
  • Never compare teammates against each other negatively. Each person's section stands on its own.
  • Keep total output around 3000-4500 words (slightly longer to accommodate team sections)
  • Use markdown tables and code blocks for data, prose for narrative
  • Output directly to the conversation — do NOT write to filesystem (except the .context/retros/ JSON snapshot)

#Important Rules

  • ALL narrative output goes directly to the user in the conversation. The ONLY file written is the .context/retros/ JSON snapshot.
  • Use origin/main for all git queries (not local main which may be stale)
  • Convert all timestamps to Pacific time for display (use TZ=America/Los_Angeles)
  • If the window has zero commits, say so and suggest a different window
  • Round LOC/hour to nearest 50
  • Treat merge commits as PR boundaries
  • Do not read CLAUDE.md or other docs — this skill is self-contained
  • On first run (no prior retros), skip comparison sections gracefully