~cytrogen/gstack

0ac7ef4e81a26b71f70e597c9f15c86e8e94bd8c — Garry Tan a month ago 7d26666
fix: harden planted-bug eval prompt for reliable form testing

Phase 3 was too vague ("click every nav link"), causing the agent to
wander instead of systematically testing form fields. It now explicitly
directs the agent to fill every input, clear it, try invalid values,
then submit and check the console. Added a Phase 4 finalize step to
ensure the report is updated with all findings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 file changed, 9 insertions(+), 4 deletions(-)

M test/skill-e2e.test.ts
M test/skill-e2e.test.ts => test/skill-e2e.test.ts +9 -4
@@ -450,12 +450,17 @@ Write every bug you found so far. Format each as:
 - Severity: high / medium / low
 - Evidence: what you observed

-PHASE 3 — Interactive testing (click links, fill forms, test edge cases):
-- Click every nav link, check for broken routes/404s
-- Fill and submit forms with valid AND invalid data (empty fields, bad email, etc.)
-- Run $B console --errors after each action
+PHASE 3 — Interactive testing (systematic form + edge case testing):
+- For EVERY input field on the page: fill it, clear it, try invalid values
+- Specifically test: empty fields, invalid email formats, extra-long text, clearing numeric fields
+- Submit the form and immediately run $B console --errors
+- Click every link/button and check for broken behavior
 - After finding more bugs, UPDATE ${reportPath} with new findings

+PHASE 4 — Finalize report:
+- UPDATE ${reportPath} with ALL bugs found across all phases
+- Include console errors, form validation issues, visual overflow, missing attributes
+
 CRITICAL RULES:
 - ONLY test the page at ${targetUrl} — do not navigate to other sites
 - Write the report file in PHASE 2 before doing interactive testing
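
The revised Phase 3 names four input classes to try: empty fields, invalid email formats, extra-long text, and cleared numeric fields. As a rough illustration of what that could mean per field, here is a minimal TypeScript sketch. Everything in it (the `FieldKind` type, the `edgeCaseValues` helper, the sample values, and the 5000-character length) is hypothetical and does not come from the commit; the real eval leaves the choice of values to the agent.

```typescript
// Hypothetical illustration only: none of these names or values appear in the commit.
type FieldKind = "text" | "email" | "number";

// Arbitrary length chosen to stand in for "extra-long text".
const LONG_TEXT = "x".repeat(5000);

// Returns values to try in one field, in order: a valid fill first,
// then an empty string (the "clear it" step), then invalid or extreme values.
function edgeCaseValues(kind: FieldKind): string[] {
  switch (kind) {
    case "email":
      return ["user@example.com", "", "not-an-email", "user@", LONG_TEXT];
    case "number":
      return ["42", "", "abc", LONG_TEXT];
    case "text":
      return ["hello", "", LONG_TEXT];
  }
}

// Print how many values each field kind would be exercised with.
for (const kind of ["text", "email", "number"] as const) {
  console.log(kind, edgeCaseValues(kind).length);
}
```

Putting the valid value first and the empty string second mirrors the "fill it, clear it" order in the prompt, so a bug triggered by clearing a previously filled field (for example a numeric field) would surface before the invalid values are tried.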