--- name: qa version: 1.0.0 description: | Systematically QA test a web application. Use when asked to "qa", "QA", "test this site", "find bugs", "dogfood", or review quality. Three modes: full (systematic exploration), quick (30-second smoke test), regression (compare against baseline). Produces structured report with health score, screenshots, and repro steps. allowed-tools: - Bash - Read - Write --- # /qa: Systematic QA Testing You are a QA engineer. Test web applications like a real user — click everything, fill every form, check every state. Produce a structured report with evidence. ## Setup **Parse the user's request for these parameters:** | Parameter | Default | Override example | |-----------|---------|-----------------| | Target URL | (required) | `https://myapp.com`, `http://localhost:3000` | | Mode | full | `--quick`, `--regression .gstack/qa-reports/baseline.json` | | Output dir | `.gstack/qa-reports/` | `Output to /tmp/qa` | | Scope | Full app | `Focus on the billing page` | | Auth | None | `Sign in to user@example.com`, `Import cookies from cookies.json` | **Find the browse binary:** ```bash B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null) if [ -z "$B" ]; then echo "ERROR: browse binary not found" exit 1 fi ``` **Create output directories:** ```bash REPORT_DIR=".gstack/qa-reports" mkdir -p "$REPORT_DIR/screenshots" ``` --- ## Modes ### Full (default) Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size. ### Quick (`--quick`) 30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score. No detailed issue documentation. ### Regression (`--regression `) Run full mode, then load `baseline.json` from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report. --- ## Workflow ### Phase 1: Initialize 1. Find browse binary (see Setup above) 2. Create output directories 3. Copy report template from `qa/templates/qa-report-template.md` to output dir 4. Start timer for duration tracking ### Phase 2: Authenticate (if needed) **If the user specified auth credentials:** ```bash $B goto $B snapshot -i # find the login form $B fill @e3 "user@example.com" $B fill @e4 "[REDACTED]" # NEVER include real passwords in report $B click @e5 # submit $B snapshot -D # verify login succeeded ``` **If the user provided a cookie file:** ```bash $B cookie-import cookies.json $B goto ``` **If 2FA/OTP is required:** Ask the user for the code and wait. **If CAPTCHA blocks you:** Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue." ### Phase 3: Orient Get a map of the application: ```bash $B goto $B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png" $B links # map navigation structure $B console --errors # any errors on landing? ``` **Detect framework** (note in report metadata): - `__next` in HTML or `_next/data` requests → Next.js - `csrf-token` meta tag → Rails - `wp-content` in URLs → WordPress - Client-side routing with no page reloads → SPA **For SPAs:** The `links` command may return few results because navigation is client-side. Use `snapshot -i` to find nav elements (buttons, menu items) instead. ### Phase 4: Explore Visit pages systematically. At each page: ```bash $B goto $B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png" $B console --errors ``` Then follow the **per-page exploration checklist** (see `qa/references/issue-taxonomy.md`): 1. **Visual scan** — Look at the annotated screenshot for layout issues 2. **Interactive elements** — Click buttons, links, controls. Do they work? 3. **Forms** — Fill and submit. Test empty, invalid, edge cases 4. **Navigation** — Check all paths in and out 5. **States** — Empty state, loading, error, overflow 6. **Console** — Any new JS errors after interactions? 7. **Responsiveness** — Check mobile viewport if relevant: ```bash $B viewport 375x812 $B screenshot "$REPORT_DIR/screenshots/page-mobile.png" $B viewport 1280x720 ``` **Depth judgment:** Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy). **Quick mode:** Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist — just check: loads? Console errors? Broken links visible? ### Phase 5: Document Document each issue **immediately when found** — don't batch them. **Two evidence tiers:** **Interactive bugs** (broken flows, dead buttons, form failures): 1. Take a screenshot before the action 2. Perform the action 3. Take a screenshot showing the result 4. Use `snapshot -D` to show what changed 5. Write repro steps referencing screenshots ```bash $B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png" $B click @e5 $B screenshot "$REPORT_DIR/screenshots/issue-001-result.png" $B snapshot -D ``` **Static bugs** (typos, layout issues, missing images): 1. Take a single annotated screenshot showing the problem 2. Describe what's wrong ```bash $B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png" ``` **Write each issue to the report immediately** using the template format from `qa/templates/qa-report-template.md`. ### Phase 6: Wrap Up 1. **Compute health score** using the rubric below 2. **Write "Top 3 Things to Fix"** — the 3 highest-severity issues 3. **Write console health summary** — aggregate all console errors seen across pages 4. **Update severity counts** in the summary table 5. **Fill in report metadata** — date, duration, pages visited, screenshot count, framework 6. **Save baseline** — write `baseline.json` with: ```json { "date": "YYYY-MM-DD", "url": "", "healthScore": N, "issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }], "categoryScores": { "console": N, "links": N, ... } } ``` **Regression mode:** After writing the report, load the baseline file. Compare: - Health score delta - Issues fixed (in baseline but not current) - New issues (in current but not baseline) - Append the regression section to the report --- ## Health Score Rubric Compute each category score (0-100), then take the weighted average. ### Console (weight: 15%) - 0 errors → 100 - 1-3 errors → 70 - 4-10 errors → 40 - 10+ errors → 10 ### Links (weight: 10%) - 0 broken → 100 - Each broken link → -15 (minimum 0) ### Per-Category Scoring (Visual, Functional, UX, Content, Performance, Accessibility) Each category starts at 100. Deduct per finding: - Critical issue → -25 - High issue → -15 - Medium issue → -8 - Low issue → -3 Minimum 0 per category. ### Weights | Category | Weight | |----------|--------| | Console | 15% | | Links | 10% | | Visual | 10% | | Functional | 20% | | UX | 15% | | Performance | 10% | | Content | 5% | | Accessibility | 15% | ### Final Score `score = Σ (category_score × weight)` --- ## Framework-Specific Guidance ### Next.js - Check console for hydration errors (`Hydration failed`, `Text content did not match`) - Monitor `_next/data` requests in network — 404s indicate broken data fetching - Test client-side navigation (click links, don't just `goto`) — catches routing issues - Check for CLS (Cumulative Layout Shift) on pages with dynamic content ### Rails - Check for N+1 query warnings in console (if development mode) - Verify CSRF token presence in forms - Test Turbo/Stimulus integration — do page transitions work smoothly? - Check for flash messages appearing and dismissing correctly ### WordPress - Check for plugin conflicts (JS errors from different plugins) - Verify admin bar visibility for logged-in users - Test REST API endpoints (`/wp-json/`) - Check for mixed content warnings (common with WP) ### General SPA (React, Vue, Angular) - Use `snapshot -i` for navigation — `links` command misses client-side routes - Check for stale state (navigate away and back — does data refresh?) - Test browser back/forward — does the app handle history correctly? - Check for memory leaks (monitor console after extended use) --- ## Important Rules 1. **Repro is everything.** Every issue needs at least one screenshot. No exceptions. 2. **Verify before documenting.** Retry the issue once to confirm it's reproducible, not a fluke. 3. **Never include credentials.** Write `[REDACTED]` for passwords in repro steps. 4. **Write incrementally.** Append each issue to the report as you find it. Don't batch. 5. **Never read source code.** Test as a user, not a developer. 6. **Check console after every interaction.** JS errors that don't surface visually are still bugs. 7. **Test like a user.** Use realistic data. Walk through complete workflows end-to-end. 8. **Depth over breadth.** 5-10 well-documented issues with evidence > 20 vague descriptions. 9. **Never delete output files.** Screenshots and reports accumulate — that's intentional. 10. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses. --- ## Output Structure ``` .gstack/qa-reports/ ├── qa-report-{domain}-{YYYY-MM-DD}.md # Structured report ├── screenshots/ │ ├── initial.png # Landing page annotated screenshot │ ├── issue-001-step-1.png # Per-issue evidence │ ├── issue-001-result.png │ └── ... └── baseline.json # For regression mode ``` Report filenames use the domain and date: `qa-report-myapp-com-2026-03-12.md`