From 0ac7ef4e81a26b71f70e597c9f15c86e8e94bd8c Mon Sep 17 00:00:00 2001
From: Garry Tan
Date: Sat, 14 Mar 2026 13:28:18 -0500
Subject: [PATCH] fix: harden planted-bug eval prompt for reliable form testing

Phase 3 was too vague ("click every nav link") causing the agent to
wander instead of systematically testing form fields. Now explicitly
directs: fill every input, clear it, try invalid values, submit and
check console. Added Phase 4 finalize step to ensure report is updated
with all findings.

Co-Authored-By: Claude Opus 4.6
---
 test/skill-e2e.test.ts | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/test/skill-e2e.test.ts b/test/skill-e2e.test.ts
index be3c6ad232b992802ec68c4f1b51e74cf8c498fd..758f0d3f8fb3709772cb6b0b21e717d886a78cb1 100644
--- a/test/skill-e2e.test.ts
+++ b/test/skill-e2e.test.ts
@@ -450,12 +450,17 @@ Write every bug you found so far. Format each as:
 - Severity: high / medium / low
 - Evidence: what you observed
 
-PHASE 3 — Interactive testing (click links, fill forms, test edge cases):
-- Click every nav link, check for broken routes/404s
-- Fill and submit forms with valid AND invalid data (empty fields, bad email, etc.)
-- Run $B console --errors after each action
+PHASE 3 — Interactive testing (systematic form + edge case testing):
+- For EVERY input field on the page: fill it, clear it, try invalid values
+- Specifically test: empty fields, invalid email formats, extra-long text, clearing numeric fields
+- Submit the form and immediately run $B console --errors
+- Click every link/button and check for broken behavior
 - After finding more bugs, UPDATE ${reportPath} with new findings
 
+PHASE 4 — Finalize report:
+- UPDATE ${reportPath} with ALL bugs found across all phases
+- Include console errors, form validation issues, visual overflow, missing attributes
+
 CRITICAL RULES:
 - ONLY test the page at ${targetUrl} — do not navigate to other sites
 - Write the report file in PHASE 2 before doing interactive testing