~cytrogen/gstack (1b317aae9ae959d55bda347e428248b05f1f5a9a): TODO.md

# TODO: gstack roadmap

[ ] Annotated screenshots (--annotate flag, numbered labels on elements mapped to refs)
[ ] Snapshot diffing (compare before/after accessibility trees, verify actions worked)
[ ] Dialog handling (dialog accept/dismiss — prevents browser lockup)
[ ] File upload (upload )
[ ] Cursor-interactive elements (-C flag, detect divs with cursor:pointer/onclick/tabindex)
[ ] Element state checks (is visible/enabled/checked )
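
A rough sketch of how snapshot diffing could work, assuming a snapshot is plain text with one accessibility-tree node per line. That format, the `SnapshotDiff` shape, and the `diffSnapshots` name are assumptions for illustration, not existing gstack code:

```typescript
// Hypothetical result shape for a before/after comparison.
interface SnapshotDiff {
  added: string[];   // lines present only in the "after" snapshot
  removed: string[]; // lines present only in the "before" snapshot
  changed: boolean;  // true when the action had a visible effect
}

// Count occurrences of each non-empty, trimmed line so duplicates are handled.
function countLines(snapshot: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const raw of snapshot.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;
    counts.set(line, (counts.get(line) ?? 0) + 1);
  }
  return counts;
}

// Order-insensitive line diff: enough to verify that an action changed something.
function diffSnapshots(before: string, after: string): SnapshotDiff {
  const beforeCounts = countLines(before);
  const afterCounts = countLines(after);
  const added: string[] = [];
  const removed: string[] = [];

  afterCounts.forEach((n, line) => {
    const extra = n - (beforeCounts.get(line) ?? 0);
    for (let i = 0; i < extra; i++) added.push(line);
  });
  beforeCounts.forEach((n, line) => {
    const missing = n - (afterCounts.get(line) ?? 0);
    for (let i = 0; i < missing; i++) removed.push(line);
  });

  return { added, removed, changed: added.length > 0 || removed.length > 0 };
}
```

If `changed` is false after a click or fill, the action probably did not take effect. A real implementation may want an order-aware diff instead of this multiset comparison.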

[ ] SKILL.md — 6-phase workflow: Initialize → Authenticate → Orient → Explore → Document → Wrap up
[ ] Issue taxonomy reference (7 categories: visual, functional, UX, content, performance, console, accessibility)
[ ] Severity classification (critical/high/medium/low)
[ ] Exploration checklist per page
[ ] Report template (structured markdown with per-issue evidence)
[ ] Repro-first philosophy: every issue gets evidence before moving on
[ ] Two evidence tiers: interactive bugs (video + step-by-step screenshots), static bugs (single annotated screenshot)
[ ] Video recording (record start/stop for WebM capture via Playwright)
[ ] Key guidance: 5-10 well-documented issues per session, depth over breadth, write incrementally
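
One possible shape for the taxonomy, severity levels, and report template, sketched as types plus a renderer. The category and severity values come from the items above; the `Issue` fields, function names, and Markdown layout are assumptions:

```typescript
const CATEGORIES = [
  "visual", "functional", "UX", "content", "performance", "console", "accessibility",
] as const;
const SEVERITIES = ["critical", "high", "medium", "low"] as const;

type Category = typeof CATEGORIES[number];
type Severity = typeof SEVERITIES[number];

// Hypothetical issue record; field names are illustrative only.
interface Issue {
  title: string;
  category: Category;
  severity: Severity;
  steps: string[];    // reproduction steps
  evidence: string[]; // paths to screenshots or videos
}

// Render one issue as Markdown. Repro-first: refuse issues without evidence.
function renderIssue(issue: Issue, index: number): string {
  if (issue.evidence.length === 0) {
    throw new Error(`Issue "${issue.title}" has no evidence attached`);
  }
  const lines = [
    `## ${index}. ${issue.title}`,
    `- Category: ${issue.category}`,
    `- Severity: ${issue.severity}`,
    "",
    "Steps to reproduce:",
    ...issue.steps.map((step, i) => `${i + 1}. ${step}`),
    "",
    "Evidence:",
    ...issue.evidence.map((path) => `- ${path}`),
  ];
  return lines.join("\n");
}

// Render all issues, most severe first.
function renderReport(issues: Issue[]): string {
  const sorted = [...issues].sort(
    (a, b) => SEVERITIES.indexOf(a.severity) - SEVERITIES.indexOf(b.severity),
  );
  return sorted.map((issue, i) => renderIssue(issue, i + 1)).join("\n\n");
}
```

Calling `renderReport` after each documented issue would fit the "write incrementally" guidance, since the output can be regenerated from the list at any time.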

[ ] Sessions (isolated browser instances with separate cookies/storage/history)
[ ] State persistence (save/load cookies + localStorage to JSON files)
[ ] Auth vault (encrypted credential storage, referenced by name, LLM never sees passwords)
[ ] retro + browse: deployment health tracking
  - Screenshot production state
  - Check perf metrics (page load times)
  - Count console errors across key pages
  - Track trends over retro window
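
A minimal sketch of the trend-tracking step, assuming one health sample is recorded per check and that dates are ISO `YYYY-MM-DD` strings. `HealthSample`, `HealthTrend`, and `summarizeTrend` are hypothetical names, and comparing only the first and last sample in the window is a simplification:

```typescript
// Hypothetical record of one deployment health check.
interface HealthSample {
  date: string;          // ISO date, e.g. "2025-01-06"
  pageLoadMs: number;    // page load time for a key page
  consoleErrors: number; // console errors counted across key pages
}

interface HealthTrend {
  pageLoadDeltaMs: number;   // positive means slower at the end of the window
  consoleErrorDelta: number; // positive means more errors at the end of the window
  regressed: boolean;
}

// Compare the earliest and latest samples inside the retro window.
// Returns null when there are fewer than two samples to compare.
function summarizeTrend(
  samples: HealthSample[],
  windowStart: string,
  windowEnd: string,
): HealthTrend | null {
  // ISO dates in the same format sort correctly as plain strings.
  const inWindow = samples
    .filter((s) => s.date >= windowStart && s.date <= windowEnd)
    .sort((a, b) => a.date.localeCompare(b.date));
  if (inWindow.length < 2) return null;

  const first = inWindow[0];
  const last = inWindow[inWindow.length - 1];
  const pageLoadDeltaMs = last.pageLoadMs - first.pageLoadMs;
  const consoleErrorDelta = last.consoleErrors - first.consoleErrors;
  return {
    pageLoadDeltaMs,
    consoleErrorDelta,
    regressed: pageLoadDeltaMs > 0 || consoleErrorDelta > 0,
  };
}
```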

- The browser is the nervous system: every skill should be able to see, interact with, and verify the web
- Skills are the product; the browser enables them
- One repo, one install, entire AI engineering workflow
- A Bun-compiled binary matches Rust CLI performance for this use case (the bottleneck is Chromium, not CLI parsing)
- Accessibility tree snapshots use ~200-400 tokens vs ~3000-5000 for the full DOM, which is critical for AI context efficiency
- Locator map approach for refs: store a Map<string, Locator> on BrowserManager, with no DOM mutation and no CSP issues
- Snapshot scoping (-i, -c, -d, -s flags) is critical for performance on large pages
- All new commands follow the existing pattern: add to the command set, add a switch case, return a string
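
The locator map and the command pattern above might look roughly like this. It is a sketch under assumptions, not the actual gstack source: `LocatorLike` is a structural stand-in for Playwright's `Locator` (only `click()` is used), and the ref format `e1`, the `refs` command, and all method names are hypothetical:

```typescript
// Stand-in for Playwright's Locator; a real Locator satisfies this shape.
interface LocatorLike {
  click(): Promise<void>;
}

// Step 1 of the pattern: register the command name in the command set.
const COMMANDS = new Set(["click", "refs"]);

class BrowserManager {
  // Refs live in memory only, so the page DOM is never mutated.
  private refs = new Map<string, LocatorLike>();

  // Replace the ref map, e.g. after taking a fresh snapshot.
  setRefs(entries: Array<[string, LocatorLike]>): void {
    this.refs = new Map(entries);
  }

  resolveRef(ref: string): LocatorLike {
    const locator = this.refs.get(ref);
    if (!locator) {
      throw new Error(`Unknown ref "${ref}"; take a new snapshot first`);
    }
    return locator;
  }

  listRefs(): string[] {
    return Array.from(this.refs.keys());
  }
}

// Steps 2 and 3: one switch case per command, each returning a string.
async function handleCommand(
  manager: BrowserManager,
  command: string,
  args: string[],
): Promise<string> {
  if (!COMMANDS.has(command)) return `Unknown command: ${command}`;
  switch (command) {
    case "click": {
      const ref = args[0];
      if (!ref) return "Usage: click <ref>";
      await manager.resolveRef(ref).click();
      return `Clicked ${ref}`;
    }
    case "refs":
      return manager.listRefs().join("\n");
    default:
      return `Unhandled command: ${command}`;
  }
}
```

Keeping the map on the manager means refs go stale when the page changes, so the error message nudges the caller to re-snapshot rather than guessing at a selector.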