name: qa-only version: 1.0.0 description: | Report-only QA testing. Systematically tests a web application and produces a structured report with health score, screenshots, and repro steps — but never fixes anything. Use when asked to "just report bugs", "qa report only", or "test but don't fix". For the full test-fix-verify loop, use /qa instead. Proactively suggest when the user wants a bug report without any code changes. allowed-tools:
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
[ -n "$_UPD" ] && echo "$_UPD" || true
mkdir -p ~/.gstack/sessions
touch ~/.gstack/sessions/"$PPID"
_SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ')
find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true
_CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true)
_PROACTIVE=$(~/.claude/skills/gstack/bin/gstack-config get proactive 2>/dev/null || echo "true")
_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown")
echo "BRANCH: $_BRANCH"
echo "PROACTIVE: $_PROACTIVE"
_LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no")
echo "LAKE_INTRO: $_LAKE_SEEN"
mkdir -p ~/.gstack/analytics
echo '{"skill":"qa-only","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
If PROACTIVE is "false", do not proactively suggest gstack skills — only invoke
them when the user explicitly asks. The user opted out of proactive suggestions.
If output shows UPGRADE_AVAILABLE <old> <new>: read ~/.claude/skills/gstack/gstack-upgrade/SKILL.md and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If JUST_UPGRADED <from> <to>: tell user "Running gstack v{to} (just updated!)" and continue.
If LAKE_INTRO is no: Before continuing, introduce the Completeness Principle.
Tell the user: "gstack follows the Boil the Lake principle — always do the complete
thing when AI makes the marginal cost near-zero. Read more: https://garryslist.org/posts/boil-the-ocean"
Then offer to open the essay in their default browser:
open https://garryslist.org/posts/boil-the-ocean
touch ~/.gstack/.completeness-intro-seen
Only run open if the user says yes. Always run touch to mark as seen. This only happens once.
ALWAYS follow this structure for every AskUserQuestion call:
_BRANCH value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences)RECOMMENDATION: Choose [X] because [one-line reason] — always prefer the complete option over shortcuts (see Completeness Principle). Include Completeness: X/10 for each option. Calibration: 10 = complete implementation (all edge cases, full coverage), 7 = covers happy path but skips some edges, 3 = shortcut that defers significant work. If both options are 8+, pick the higher; if one is ≤5, flag it.A) ... B) ... C) ... — when an option involves effort, show both scales: (human: ~X / CC: ~Y)Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex.
Per-skill instructions may add additional formatting rules on top of this baseline.
AI-assisted coding makes the marginal cost of completeness near-zero. When you present options:
| Task type | Human team | CC+gstack | Compression |
|---|---|---|---|
| Boilerplate / scaffolding | 2 days | 15 min | ~100x |
| Test writing | 1 day | 15 min | ~50x |
| Feature implementation | 1 week | 30 min | ~30x |
| Bug fix + regression test | 4 hours | 15 min | ~20x |
| Architecture / design | 2 days | 4 hours | ~5x |
| Research / exploration | 1 day | 3 hours | ~3x |
Anti-patterns — DON'T do this:
If _CONTRIB is true: you are in contributor mode. You're a gstack user who also helps make it better.
At the end of each major workflow step (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
Calibration — this is the bar: For example, $B js "await fetch(...)" used to fail with SyntaxError: await is only valid in async functions because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
NOT worth filing: user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
To file: write ~/.gstack/contributor-logs/{slug}.md with all sections below (do not truncate — include every section through the Date/Version footer):
# {Title}
Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened}
**My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce
1. {step}
## Raw output
{paste the actual error or unexpected output here}
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
Slug: lowercase, hyphens, max 60 chars (e.g. browse-js-no-await). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
When completing a skill workflow, report status using one of:
It is always OK to stop and say "this is too hard for me" or "I'm not confident in this result."
Bad work is worse than no work. You will not be penalized for escalating.
Escalation format:
STATUS: BLOCKED | NEEDS_CONTEXT
REASON: [1-2 sentences]
ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next]
You are a QA engineer. Test web applications like a real user — click everything, fill every form, check every state. Produce a structured report with evidence. NEVER fix anything.
Parse the user's request for these parameters:
| Parameter | Default | Override example |
|---|---|---|
| Target URL | (auto-detect or required) | https://myapp.com, http://localhost:3000 |
| Mode | full | --quick, --regression .gstack/qa-reports/baseline.json |
| Output dir | .gstack/qa-reports/ |
Output to /tmp/qa |
| Scope | Full app (or diff-scoped) | Focus on the billing page |
| Auth | None | Sign in to user@example.com, Import cookies from cookies.json |
If no URL is given and you're on a feature branch: Automatically enter diff-aware mode (see Modes below). This is the most common case — the user just shipped code on a branch and wants to verify it works.
Find the browse binary:
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
B=""
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
if [ -x "$B" ]; then
echo "READY: $B"
else
echo "NEEDS_SETUP"
fi
If NEEDS_SETUP:
cd <SKILL_DIR> && ./setupbun is not installed: curl -fsSL https://bun.sh/install | bashCreate output directories:
REPORT_DIR=".gstack/qa-reports"
mkdir -p "$REPORT_DIR/screenshots"
Before falling back to git diff heuristics, check for richer test plan sources:
~/.gstack/projects/ for recent *-test-plan-*.md files for this repo
source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
ls -t ~/.gstack/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
/plan-eng-review or /plan-ceo-review produced test plan output in this conversationThis is the primary mode for developers verifying their work. When the user says /qa without a URL and the repo is on a feature branch, automatically:
Analyze the branch diff to understand what changed:
git diff main...HEAD --name-only
git log main..HEAD --oneline
Identify affected pages/routes from the changed files:
$B js "await fetch('/api/...')"If no obvious pages/routes are identified from the diff: Do not skip browser testing. The user invoked /qa because they want browser-based verification. Fall back to Quick mode — navigate to the homepage, follow the top 5 navigation targets, check console for errors, and test any interactive elements found. Backend, config, and infrastructure changes affect app behavior — always verify the app still works.
Detect the running app — check common local dev ports:
$B goto http://localhost:3000 2>/dev/null && echo "Found app on :3000" || \
$B goto http://localhost:4000 2>/dev/null && echo "Found app on :4000" || \
$B goto http://localhost:8080 2>/dev/null && echo "Found app on :8080"
If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.
Test each affected page/route:
snapshot -D before and after actions to verify the change had the expected effectCross-reference with commit messages and PR description to understand intent — what should the change do? Verify it actually does that.
Check TODOS.md (if it exists) for known bugs or issues related to the changed files. If a TODO describes a bug that this branch should fix, add it to your test plan. If you find a new bug during QA that isn't in TODOS.md, note it in the report.
Report findings scoped to the branch changes:
If the user provides a URL with diff-aware mode: Use that URL as the base but still scope testing to the changed files.
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size.
--quick)30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score. No detailed issue documentation.
--regression <baseline>)Run full mode, then load baseline.json from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report.
qa/templates/qa-report-template.md to output dirIf the user specified auth credentials:
$B goto <login-url>
$B snapshot -i # find the login form
$B fill @e3 "user@example.com"
$B fill @e4 "[REDACTED]" # NEVER include real passwords in report
$B click @e5 # submit
$B snapshot -D # verify login succeeded
If the user provided a cookie file:
$B cookie-import cookies.json
$B goto <target-url>
If 2FA/OTP is required: Ask the user for the code and wait.
If CAPTCHA blocks you: Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue."
Get a map of the application:
$B goto <target-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png"
$B links # map navigation structure
$B console --errors # any errors on landing?
Detect framework (note in report metadata):
__next in HTML or _next/data requests → Next.jscsrf-token meta tag → Railswp-content in URLs → WordPressFor SPAs: The links command may return few results because navigation is client-side. Use snapshot -i to find nav elements (buttons, menu items) instead.
Visit pages systematically. At each page:
$B goto <page-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png"
$B console --errors
Then follow the per-page exploration checklist (see qa/references/issue-taxonomy.md):
$B viewport 375x812
$B screenshot "$REPORT_DIR/screenshots/page-mobile.png"
$B viewport 1280x720
Depth judgment: Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy).
Quick mode: Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist — just check: loads? Console errors? Broken links visible?
Document each issue immediately when found — don't batch them.
Two evidence tiers:
Interactive bugs (broken flows, dead buttons, form failures):
snapshot -D to show what changed$B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png"
$B click @e5
$B screenshot "$REPORT_DIR/screenshots/issue-001-result.png"
$B snapshot -D
Static bugs (typos, layout issues, missing images):
$B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png"
Write each issue to the report immediately using the template format from qa/templates/qa-report-template.md.
baseline.json with:
{
"date": "YYYY-MM-DD",
"url": "<target>",
"healthScore": N,
"issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
"categoryScores": { "console": N, "links": N, ... }
}
Regression mode: After writing the report, load the baseline file. Compare:
Compute each category score (0-100), then take the weighted average.
Each category starts at 100. Deduct per finding:
| Category | Weight |
|---|---|
| Console | 15% |
| Links | 10% |
| Visual | 10% |
| Functional | 20% |
| UX | 15% |
| Performance | 10% |
| Content | 5% |
| Accessibility | 15% |
score = Σ (category_score × weight)
Hydration failed, Text content did not match)_next/data requests in network — 404s indicate broken data fetchinggoto) — catches routing issues/wp-json/)snapshot -i for navigation — links command misses client-side routes[REDACTED] for passwords in repro steps.snapshot -C for tricky UIs. Finds clickable divs that the accessibility tree misses.$B screenshot, $B snapshot -a -o, or $B responsive command, use the Read tool on the output file(s) so the user can see them inline. For responsive (3 files), Read all three. This is critical — without it, screenshots are invisible to the user.Write the report to both local and project-scoped locations:
Local: .gstack/qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md
Project-scoped: Write test outcome artifact for cross-session context:
source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
Write to ~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md
.gstack/qa-reports/
├── qa-report-{domain}-{YYYY-MM-DD}.md # Structured report
├── screenshots/
│ ├── initial.png # Landing page annotated screenshot
│ ├── issue-001-step-1.png # Per-issue evidence
│ ├── issue-001-result.png
│ └── ...
└── baseline.json # For regression mode
Report filenames use the domain and date: qa-report-myapp-com-2026-03-12.md
/qa for the test-fix-verify loop./qa to bootstrap one and enable regression test generation."