name: qa
version: 2.0.0
description: |
  Systematically QA test a web application. Use when asked to "qa", "QA", "test this site", "find bugs", "dogfood", or review quality. Generates a smart test plan with per-page risk scoring, lets you choose depth (Quick/Standard/Exhaustive), then executes with evidence. Reports persist to ~/.gstack/projects/ with history tracking and PR integration.
allowed-tools:
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
[ -n "$_UPD" ] && echo "$_UPD" || true
If output shows UPGRADE_AVAILABLE <old> <new>: read ~/.claude/skills/gstack/gstack-upgrade/SKILL.md and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, touch ~/.gstack/last-update-check if no). If JUST_UPGRADED <from> <to>: tell user "Running gstack v{to} (just updated!)" and continue.
You are a QA engineer. Test web applications like a real user — click everything, fill every form, check every state. Produce a structured report with evidence.
Parse the user's request for these parameters:
| Parameter | Default | Override example |
|---|---|---|
| Target URL | (auto-detect or required) | https://myapp.com, http://localhost:3000 |
| Tier | (ask user) | --quick, --exhaustive |
| Output dir | ~/.gstack/projects/{slug}/qa-reports/ | Output to /tmp/qa |
| Scope | Full app (or diff-scoped) | Focus on the billing page |
| Auth | None | Sign in to user@example.com, Import cookies from cookies.json |
If no URL is given and you're on a feature branch: Automatically enter diff-aware mode (see Phase 3).
Find the browse binary:
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
B=""
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
if [ -x "$B" ]; then
echo "READY: $B"
else
echo "NEEDS_SETUP"
fi
If NEEDS_SETUP:
Run: cd <SKILL_DIR> && ./setup
If bun is not installed: curl -fsSL https://bun.sh/install | bash
Set up report directory (persistent, global):
REMOTE_SLUG=$(browse/bin/remote-slug 2>/dev/null || ~/.claude/skills/gstack/browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
REPORT_DIR="$HOME/.gstack/projects/$REMOTE_SLUG/qa-reports"
mkdir -p "$REPORT_DIR/screenshots"
Gather git context for report metadata:
BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown")
COMMIT_SHA=$(git rev-parse --short HEAD 2>/dev/null || echo "unknown")
COMMIT_DATE=$(git log -1 --format=%Y-%m-%d 2>/dev/null || echo "unknown")
PR_INFO=$(gh pr view --json number,url 2>/dev/null || echo "")
Copy qa/templates/qa-report-template.md to the output dir.
If the user specified auth credentials:
$B goto <login-url>
$B snapshot -i # find the login form
$B fill @e3 "user@example.com"
$B fill @e4 "[REDACTED]" # NEVER include real passwords in report
$B click @e5 # submit
$B snapshot -D # verify login succeeded
If the user provided a cookie file:
$B cookie-import cookies.json
$B goto <target-url>
If 2FA/OTP is required: Ask the user for the code and wait.
If CAPTCHA blocks you: Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue."
Get a map of the application:
$B goto <target-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png"
$B links # map navigation structure
$B console --errors # any errors on landing?
Detect framework (note in report metadata):
__next in HTML or _next/data requests → Next.js
csrf-token meta tag → Rails
wp-content in URLs → WordPress
For SPAs: The links command may return few results because navigation is client-side. Use snapshot -i to find nav elements (buttons, menu items) instead.
If on a feature branch (diff-aware mode):
git diff main...HEAD --name-only
git log main..HEAD --oneline
Identify affected pages/routes from changed files using the Risk Heuristics below. Also:
Detect the running app — check common local dev ports:
# Stop at the first port that responds, so only one "Found app" line is printed
for PORT in 3000 4000 8080; do
  $B goto "http://localhost:$PORT" 2>/dev/null && echo "Found app on :$PORT" && break
done
If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.
Cross-reference with commit messages and PR description to understand intent — what should the change do? Verify it actually does that.
Based on recon results, generate a structured test plan with three tiers. Each tier is a superset of the one above it.
Risk Heuristics (use these to assign per-page depth):
| Changed File Pattern | Risk | Recommended Depth |
|---|---|---|
| Form/payment/auth/checkout files | HIGH | Exhaustive |
| Controller/route with mutations (POST/PUT/DELETE) | HIGH | Exhaustive |
| Config/env/deployment files | HIGH | Exhaustive on affected pages |
| API endpoint handlers | MEDIUM | Standard + request validation |
| View/template/component files | MEDIUM | Standard |
| Model/service with business logic | MEDIUM | Standard |
| CSS/style-only changes | LOW | Quick |
| Docs/readme/comments only | LOW | Quick |
| Test files only | SKIP | Not tested via QA |
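The table above could be approximated with a small path classifier. This is a minimal sketch only: the function name `risk_for_file` and every glob pattern are illustrative assumptions, not part of the skill, and path matching alone cannot tell whether a controller performs mutations, so anything unmatched falls back to MEDIUM.

```shell
# Hypothetical helper: map a changed file path to a risk level.
# The patterns are illustrative guesses and will need tuning per project.
risk_for_file() {
  local path
  # Lowercase the path so matching is case-insensitive.
  path=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$path" in
    *test*|*spec*)                      echo "SKIP" ;;
    *.md|docs/*)                        echo "LOW" ;;
    *.css|*.scss)                       echo "LOW" ;;
    *payment*|*auth*|*checkout*|*form*) echo "HIGH" ;;
    *.env*|*config*|*deploy*)           echo "HIGH" ;;
    *)                                  echo "MEDIUM" ;;
  esac
}
```

One possible use is to pipe the output of `git diff main...HEAD --name-only` through this function line by line and take the highest risk seen for each affected page.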
Output the test plan in this format:
## Test Plan — {app-name}
Branch: {branch} | Commit: {sha} | PR: #{number}
Pages found: {N} | Affected by diff: {N}
### Quick (~{estimated}s)
1. / (homepage) — smoke check
2. /dashboard — loads, no console errors
...
### Standard (~{estimated}min)
1-N. Above, plus:
N+1. /checkout — fill payment form, submit, verify flow
...
### Exhaustive (~{estimated}min)
1-N. Above, plus:
N+1. /checkout — empty, invalid, boundary inputs
N+2. All pages at 3 viewports (375px, 768px, 1280px)
...
Time estimates: Base on page count. Quick: ~3s per page. Standard: ~30-60s per page. Exhaustive: ~2-3min per page.
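As a rough illustration of the per-page figures above, the estimate for each tier can be computed from the page count. The helper name `estimate_seconds` and its output format (a single number or a `low-high` range, in seconds) are assumptions made for this sketch.

```shell
# Hypothetical helper: rough duration in seconds for a tier and page count.
# Uses the per-page figures quoted above (3s, 30-60s, 2-3min).
estimate_seconds() {
  local tier=$1 pages=$2
  case "$tier" in
    quick)      echo "$((pages * 3))" ;;
    standard)   echo "$((pages * 30))-$((pages * 60))" ;;
    exhaustive) echo "$((pages * 120))-$((pages * 180))" ;;
    *)          echo "unknown tier: $tier" >&2; return 1 ;;
  esac
}
```

For a 10-page app this gives 30 seconds for Quick and a 300-600 second range for Standard, which can then be formatted into the `~{estimated}` placeholders of the plan.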
Ask the user which tier to run:
Use AskUserQuestion with these options:
Quick (~{time}): smoke test, {N} pages
Standard (~{time}): full test, {N} pages, per-page checklist
Exhaustive (~{time}): everything, 3 viewports, edge inputs, auth boundaries
The user may also type a custom response (the "Other" option). If they do, parse their edits (e.g., "skip /billing, add /admin, make checkout exhaustive"), rebuild the plan, show the updated plan, and confirm before executing.
CLI flag shortcuts:
--quick → skip the question, pick Quick
--exhaustive → skip the question, pick Exhaustive
Save the test plan to $REPORT_DIR/test-plan-{YYYY-MM-DD}.md before execution begins.
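The flag shortcuts above amount to a simple scan of the request's arguments. The sketch below assumes the request has already been split into words; the function name `tier_from_args` and the `ask` sentinel (meaning "fall back to AskUserQuestion") are hypothetical.

```shell
# Hypothetical helper: choose a tier from CLI-style flags.
# Prints "quick" or "exhaustive", or "ask" when no shortcut flag is present.
tier_from_args() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --quick)      echo "quick"; return 0 ;;
      --exhaustive) echo "exhaustive"; return 0 ;;
    esac
  done
  echo "ask"
}
```

The first matching flag wins, so a request containing both flags resolves to whichever appears first; that tie-break is a choice made for this sketch, not something the skill specifies.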
Run the chosen tier. Visit pages in the order specified by the test plan.
Everything in Quick, plus:
$B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png"
Work through the per-page checklist (see qa/references/issue-taxonomy.md), including a mobile viewport check:
$B viewport 375x812
$B screenshot "$REPORT_DIR/screenshots/page-mobile.png"
$B viewport 1280x720
Depth judgment: Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy).
Everything in Standard, plus:
Edge inputs, including injection-style strings (<script>alert(1)</script>, '; DROP TABLE users--)
Document each issue immediately when found; don't batch them.
Two evidence tiers:
Interactive bugs (broken flows, dead buttons, form failures):
Use snapshot -D to show what changed:
$B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png"
$B click @e5
$B screenshot "$REPORT_DIR/screenshots/issue-001-result.png"
$B snapshot -D
Static bugs (typos, layout issues, missing images):
$B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png"
Write each issue to the report immediately using the template format from qa/templates/qa-report-template.md.
Compute health score using the rubric below
Write "Top 3 Things to Fix" — the 3 highest-severity issues
Write console health summary — aggregate all console errors seen across pages
Update severity counts in the summary table
Fill in report metadata — date, duration, pages visited, screenshot count, framework, tier
Save baseline — write baseline.json with:
{
"date": "YYYY-MM-DD",
"url": "<target>",
"healthScore": N,
"tier": "Standard",
"issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
"categoryScores": { "console": N, "links": N, ... }
}
Update the QA run index — append a row to $REPORT_DIR/index.md:
If the file doesn't exist, create it with the header:
# QA Run History — {owner/repo}
| Date | Branch | PR | Tier | Score | Issues | Report |
|------|--------|----|------|-------|--------|--------|
Then append:
| {DATE} | {BRANCH} | #{PR} | {TIER} | {SCORE}/100 | {COUNT} ({breakdown}) | [report](./{filename}) |
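The create-then-append behaviour described above can be sketched as a single function. The name `append_index_row` and its three-argument signature (index path, heading line, table row) are assumptions for illustration; the column header is copied from the template above.

```shell
# Hypothetical helper: append one run to index.md.
# Writes the heading and table header only when the file does not exist yet.
append_index_row() {
  local index_file=$1 heading=$2 row=$3
  if [ ! -f "$index_file" ]; then
    {
      printf '%s\n' "$heading"
      printf '%s\n' "| Date | Branch | PR | Tier | Score | Issues | Report |"
      printf '%s\n' "|------|--------|----|------|-------|--------|--------|"
    } > "$index_file"
  fi
  printf '%s\n' "$row" >> "$index_file"
}
```

A call would pass `"$REPORT_DIR/index.md"`, the heading line from the template, and the filled-in row. Checking for the file before writing keeps repeated runs from duplicating the header.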
Output completion summary:
QA complete: {emoji} {SCORE}/100 | {N} issues ({breakdown}) | {N} pages tested in {DURATION}
Report: file://{absolute-path-to-report}
Health emoji: 90+ green, 70-89 yellow, <70 red.
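The thresholds above map directly onto a three-way comparison. In this sketch the function name `health_label` is hypothetical, and it prints the colour name rather than an emoji so the output stays plain ASCII.

```shell
# Hypothetical helper: translate a 0-100 health score into a colour label.
# Thresholds follow the rule above: 90+ green, 70-89 yellow, below 70 red.
health_label() {
  local score=$1
  if [ "$score" -ge 90 ]; then
    echo "green"
  elif [ "$score" -ge 70 ]; then
    echo "yellow"
  else
    echo "red"
  fi
}
```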
Auto-open preference — read ~/.gstack/config.json:
If autoOpenQaReport is not set, ask via AskUserQuestion: "Open QA report in your browser when done?" with options ["Yes, always open", "No, just show the link"]. Save the answer to ~/.gstack/config.json.
If autoOpenQaReport is true, run open "{report-path}" (macOS).
To turn auto-open off later, set autoOpenQaReport in config.json to false.
PR comment: if gh pr view succeeded earlier (there's an open PR):
Ask via AskUserQuestion: "Post QA summary to PR #{number}?" with options ["Yes, post comment", "No, skip"].
If yes, post via:
gh pr comment {NUMBER} --body "$(cat <<'EOF'
## QA Report — {emoji} {SCORE}/100
**Tier:** {TIER} | **Pages tested:** {N} | **Duration:** {DURATION}
### Issues Found
- **{SEVERITY}** — {title}
...
[Full report](file://{path})
EOF
)"
Regression mode: If --regression <baseline> was specified, load the baseline file after writing the report and compare its healthScore, issues, and categoryScores against the current run.
Compute each category score (0-100), then take the weighted average. If a category was not tested (e.g., no pages had forms to test), score it 100 (no evidence of issues).
When severity is ambiguous, default to the lower severity (e.g., if unsure between High and Medium, pick Medium).
Each category starts at 100. Deduct per distinct finding (a finding = one specific defect on one specific page):
| Category | Weight |
|---|---|
| Console | 15% |
| Links | 10% |
| Visual | 10% |
| Functional | 20% |
| UX | 15% |
| Performance | 10% |
| Content | 5% |
| Accessibility | 15% |
score = Σ (category_score × weight)
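The weighted average above can be computed with integer arithmetic. This is a sketch under stated assumptions: the function name `health_score` is hypothetical, the eight arguments must be given in the table's order, and rounding to the nearest whole point is a choice made here rather than something the rubric specifies.

```shell
# Hypothetical helper: weighted health score from eight category scores (0-100).
# Argument order follows the weights table:
# console links visual functional ux performance content accessibility
health_score() {
  if [ "$#" -ne 8 ]; then
    echo "expected 8 category scores" >&2
    return 1
  fi
  # Weights are in percent, so the weighted sum is scaled by 100.
  total=$(( $1*15 + $2*10 + $3*10 + $4*20 + $5*15 + $6*10 + $7*5 + $8*15 ))
  # Divide by 100, rounding to the nearest whole point.
  echo $(( (total + 50) / 100 ))
}
```

For example, a run with Console at 80, Functional at 60, and every other category at 100 yields a weighted sum of 8900 and a score of 89.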
Next.js: check the console for hydration errors (Hydration failed, Text content did not match)
Check _next/data requests in network: 404s indicate broken data fetching
Navigate by clicking links (not only goto): catches routing issues
WordPress: check the REST API (/wp-json/)
Use snapshot -i for navigation: links command misses client-side routes
Use [REDACTED] for passwords in repro steps.
Use snapshot -C for tricky UIs. Finds clickable divs that the accessibility tree misses.
~/.gstack/projects/{remote-slug}/qa-reports/
├── index.md # QA run history with links
├── test-plan-{YYYY-MM-DD}.md # Approved test plan
├── qa-report-{domain}-{YYYY-MM-DD}.md # Structured report
├── baseline.json # For regression mode
└── screenshots/
    ├── initial.png # Landing page annotated screenshot
    ├── issue-001-step-1.png # Per-issue evidence
    ├── issue-001-result.png
    └── ...
Report filenames use the domain and date: qa-report-myapp-com-2026-03-12.md
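One way to derive that filename from the target URL is sketched below. The function name `report_filename` is hypothetical, and turning a port separator into a hyphen (so `localhost:3000` becomes `localhost-3000`) is an assumption; the source only shows the `myapp.com` case.

```shell
# Hypothetical helper: build the report filename from a URL and a date.
report_filename() {
  local url=$1 date=$2 domain
  # Drop the scheme, drop the path, then turn dots and colons into hyphens.
  domain=$(printf '%s' "$url" | sed -e 's|^[a-zA-Z]*://||' -e 's|/.*$||' -e 's|[.:]|-|g')
  echo "qa-report-${domain}-${date}.md"
}
```

Calling it with `https://myapp.com` and `2026-03-12` reproduces the example filename above.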