~cytrogen/gstack

846269e3b1f1cccf90cdc7946dec5b9a56e0fd38 — Garry Tan 6 days ago 4fc64f7
feat: voice-friendly skill triggers for AquaVoice (v0.14.6.0) (#732)

* feat: voice-friendly skill triggers for speech-to-text input

Add voice-triggers YAML field to 10 SKILL.md.tmpl files with natural-language
aliases (e.g. "see-so" for /cso, "tech review" for /plan-eng-review).
gen-skill-docs preprocesses voice triggers before transformFrontmatter,
folding them into the description and stripping the field from output.
Includes unit tests, README voice input section, and CONTRIBUTING.md update.

* chore: bump version and changelog (v0.14.6.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
M CHANGELOG.md => CHANGELOG.md +11 -0
@@ 1,5 1,16 @@
# Changelog

## [0.15.2.0] - 2026-04-02 — Voice-Friendly Skill Triggers

Say "run a security check" instead of remembering `/cso`. Skills now have voice-friendly trigger phrases that work with AquaVoice, Whisper, and other speech-to-text tools. No more fighting with acronyms that get transcribed wrong ("CSO" -> "CEO" -> wrong skill).

### Added

- **Voice triggers for 10 skills.** Each skill gets natural-language aliases baked into its description. "see-so", "security review", "tech review", "code x", "speed test" and more. The right skill activates even when speech-to-text mangles the command name.
- **`voice-triggers:` YAML field in templates.** Structured authoring: add aliases to any `.tmpl` frontmatter, and `gen-skill-docs` folds them into the description during generation. Clean source, clean output.
- **Voice input section in README.** New users know skills work with voice from day one.
- **`voice-triggers` documented in CONTRIBUTING.md.** Frontmatter contract updated so contributors know the field exists.

## [0.15.1.0] - 2026-04-01 — Design Without Shotgun

You can now run `/design-html` without having to run `/design-shotgun` first. The skill detects what design context exists (CEO plans, design review artifacts, approved mockups) and asks how you want to proceed. Start from a plan, a description, or a provided PNG, not just an approved mockup.

M CONTRIBUTING.md => CONTRIBUTING.md +1 -1
@@ 254,7 254,7 @@ bun run build
| Aspect | Claude | Codex |
|--------|--------|-------|
| Output directory | `{skill}/SKILL.md` | `.agents/skills/gstack-{skill}/SKILL.md` (generated at setup, gitignored) |
| Frontmatter | Full (name, description, allowed-tools, hooks, version) | Minimal (name + description only) |
| Frontmatter | Full (name, description, voice-triggers, allowed-tools, hooks, version) | Minimal (name + description only) |
| Paths | `~/.claude/skills/gstack` | `$GSTACK_ROOT` (`.agents/skills/gstack` in a repo, otherwise `~/.codex/skills/gstack`) |
| Hook skills | `hooks:` frontmatter (enforced by Claude) | Inline safety advisory prose (advisory only) |
| `/codex` skill | Included (Claude wraps codex exec) | Excluded (self-referential) |

M README.md => README.md +6 -0
@@ 103,6 103,12 @@ cd ~/gstack && ./setup --host factory

Skills install to `~/.factory/skills/gstack-*/`. Restart `droid` to rescan skills, then type `/qa` to get started.

### Voice input (AquaVoice, Whisper, etc.)

gstack skills have voice-friendly trigger phrases. Say what you want naturally —
"run a security check", "test the website", "do an engineering review" — and the
right skill activates. You don't need to remember slash command names or acronyms.

## See it work

```

M VERSION => VERSION +1 -1
@@ 1,1 1,1 @@
0.15.1.0
0.15.2.0

M autoplan/SKILL.md => autoplan/SKILL.md +1 -0
@@ 11,6 11,7 @@ description: |
  automatically", or "make the decisions for me".
  Proactively suggest when the user has a plan file and wants to run the full review
  gauntlet without answering 15-30 intermediate questions. (gstack)
  Voice triggers (speech-to-text aliases): "auto plan", "automatic review".
benefits-from: [office-hours]
allowed-tools:
  - Bash

M autoplan/SKILL.md.tmpl => autoplan/SKILL.md.tmpl +3 -0
@@ 11,6 11,9 @@ description: |
  automatically", or "make the decisions for me".
  Proactively suggest when the user has a plan file and wants to run the full review
  gauntlet without answering 15-30 intermediate questions. (gstack)
voice-triggers:
  - "auto plan"
  - "automatic review"
benefits-from: [office-hours]
allowed-tools:
  - Bash
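
The two autoplan diffs above show the whole transformation for one skill: the three-line `voice-triggers:` block in the template becomes a single sentence appended to the generated description. Below is a minimal sketch of that formatting step, mirroring the template literal used in `processVoiceTriggers` in `scripts/gen-skill-docs.ts` later in this commit. The helper name `renderVoiceLine` is hypothetical and does not exist in the repository.

```typescript
// Hypothetical helper (not in the repo): formats trigger phrases the same way
// processVoiceTriggers builds its `voiceLine` string.
function renderVoiceLine(triggers: string[]): string {
  const quoted = triggers.map((t) => `"${t}"`).join(", ");
  return `Voice triggers (speech-to-text aliases): ${quoted}.`;
}

// The two aliases listed in autoplan/SKILL.md.tmpl
const autoplanLine = renderVoiceLine(["auto plan", "automatic review"]);
console.log(autoplanLine);
```

The resulting string is the same sentence that the diff adds to `autoplan/SKILL.md`, minus the two-space indentation of the YAML block scalar.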

M benchmark/SKILL.md => benchmark/SKILL.md +1 -0
@@ 8,6 8,7 @@ description: |
  Compares before/after on every PR. Tracks performance trends over time.
  Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals",
  "bundle size", "load time". (gstack)
  Voice triggers (speech-to-text aliases): "speed test", "check performance".
allowed-tools:
  - Bash
  - Read

M benchmark/SKILL.md.tmpl => benchmark/SKILL.md.tmpl +3 -0
@@ 8,6 8,9 @@ description: |
  Compares before/after on every PR. Tracks performance trends over time.
  Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals",
  "bundle size", "load time". (gstack)
voice-triggers:
  - "speed test"
  - "check performance"
allowed-tools:
  - Bash
  - Read

M codex/SKILL.md => codex/SKILL.md +1 -0
@@ 8,6 8,7 @@ description: |
  your code. Consult: ask codex anything with session continuity for follow-ups.
  The "200 IQ autistic developer" second opinion. Use when asked to "codex review",
  "codex challenge", "ask codex", "second opinion", or "consult codex". (gstack)
  Voice triggers (speech-to-text aliases): "code x", "code ex", "get another opinion".
allowed-tools:
  - Bash
  - Read

M codex/SKILL.md.tmpl => codex/SKILL.md.tmpl +4 -0
@@ 8,6 8,10 @@ description: |
  your code. Consult: ask codex anything with session continuity for follow-ups.
  The "200 IQ autistic developer" second opinion. Use when asked to "codex review",
  "codex challenge", "ask codex", "second opinion", or "consult codex". (gstack)
voice-triggers:
  - "code x"
  - "code ex"
  - "get another opinion"
allowed-tools:
  - Bash
  - Read

M connect-chrome/SKILL.md => connect-chrome/SKILL.md +1 -0
@@ 7,6 7,7 @@ description: |
  action in real time. The extension shows a live activity feed in the Side Panel.
  Use when asked to "connect chrome", "open chrome", "real browser", "launch chrome",
  "side panel", or "control my browser".
  Voice triggers (speech-to-text aliases): "show me the browser".
allowed-tools:
  - Bash
  - Read

M connect-chrome/SKILL.md.tmpl => connect-chrome/SKILL.md.tmpl +2 -0
@@ 7,6 7,8 @@ description: |
  action in real time. The extension shows a live activity feed in the Side Panel.
  Use when asked to "connect chrome", "open chrome", "real browser", "launch chrome",
  "side panel", or "control my browser".
voice-triggers:
  - "show me the browser"
allowed-tools:
  - Bash
  - Read

M cso/SKILL.md => cso/SKILL.md +1 -0
@@ 9,6 9,7 @@ description: |
  Two modes: daily (zero-noise, 8/10 confidence gate) and comprehensive (monthly deep
  scan, 2/10 bar). Trend tracking across audit runs.
  Use when: "security audit", "threat model", "pentest review", "OWASP", "CSO review". (gstack)
  Voice triggers (speech-to-text aliases): "see-so", "see so", "security review", "security check", "vulnerability scan", "run security".
allowed-tools:
  - Bash
  - Read

M cso/SKILL.md.tmpl => cso/SKILL.md.tmpl +7 -0
@@ 9,6 9,13 @@ description: |
  Two modes: daily (zero-noise, 8/10 confidence gate) and comprehensive (monthly deep
  scan, 2/10 bar). Trend tracking across audit runs.
  Use when: "security audit", "threat model", "pentest review", "OWASP", "CSO review". (gstack)
voice-triggers:
  - "see-so"
  - "see so"
  - "security review"
  - "security check"
  - "vulnerability scan"
  - "run security"
allowed-tools:
  - Bash
  - Read

M design-html/SKILL.md => design-html/SKILL.md +1 -0
@@ 11,6 11,7 @@ description: |
  for each design type. Use when: "finalize this design", "turn this into HTML",
  "build me a page", "implement this design", or after any planning skill.
  Proactively suggest when user has approved a design or has a plan ready. (gstack)
  Voice triggers (speech-to-text aliases): "build the design", "code the mockup", "make it real".
allowed-tools:
  - Bash
  - Read

M design-html/SKILL.md.tmpl => design-html/SKILL.md.tmpl +4 -0
@@ 11,6 11,10 @@ description: |
  for each design type. Use when: "finalize this design", "turn this into HTML",
  "build me a page", "implement this design", or after any planning skill.
  Proactively suggest when user has approved a design or has a plan ready. (gstack)
voice-triggers:
  - "build the design"
  - "code the mockup"
  - "make it real"
allowed-tools:
  - Bash
  - Read

M gstack-upgrade/SKILL.md => gstack-upgrade/SKILL.md +1 -0
@@ 5,6 5,7 @@ description: |
  Upgrade gstack to the latest version. Detects global vs vendored install,
  runs the upgrade, and shows what's new. Use when asked to "upgrade gstack",
  "update gstack", or "get latest version".
  Voice triggers (speech-to-text aliases): "upgrade the tools", "update the tools", "gee stack upgrade", "g stack upgrade".
allowed-tools:
  - Bash
  - Read

M gstack-upgrade/SKILL.md.tmpl => gstack-upgrade/SKILL.md.tmpl +5 -0
@@ 5,6 5,11 @@ description: |
  Upgrade gstack to the latest version. Detects global vs vendored install,
  runs the upgrade, and shows what's new. Use when asked to "upgrade gstack",
  "update gstack", or "get latest version".
voice-triggers:
  - "upgrade the tools"
  - "update the tools"
  - "gee stack upgrade"
  - "g stack upgrade"
allowed-tools:
  - Bash
  - Read

M package.json => package.json +1 -1
@@ 1,6 1,6 @@
{
  "name": "gstack",
  "version": "0.15.1.0",
  "version": "0.15.2.0",
  "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
  "license": "MIT",
  "type": "module",

M plan-eng-review/SKILL.md => plan-eng-review/SKILL.md +1 -0
@@ 9,6 9,7 @@ description: |
  "review the architecture", "engineering review", or "lock in the plan".
  Proactively suggest when the user has a plan or design doc and is about to
  start coding — to catch architecture issues before implementation. (gstack)
  Voice triggers (speech-to-text aliases): "tech review", "technical review", "plan engineering review".
benefits-from: [office-hours]
allowed-tools:
  - Read

M plan-eng-review/SKILL.md.tmpl => plan-eng-review/SKILL.md.tmpl +4 -0
@@ 9,6 9,10 @@ description: |
  "review the architecture", "engineering review", or "lock in the plan".
  Proactively suggest when the user has a plan or design doc and is about to
  start coding — to catch architecture issues before implementation. (gstack)
voice-triggers:
  - "tech review"
  - "technical review"
  - "plan engineering review"
benefits-from: [office-hours]
allowed-tools:
  - Read

M qa-only/SKILL.md => qa-only/SKILL.md +1 -0
@@ 8,6 8,7 @@ description: |
  fixes anything. Use when asked to "just report bugs", "qa report only", or
  "test but don't fix". For the full test-fix-verify loop, use /qa instead.
  Proactively suggest when the user wants a bug report without any code changes. (gstack)
  Voice triggers (speech-to-text aliases): "bug report", "just check for bugs".
allowed-tools:
  - Bash
  - Read

M qa-only/SKILL.md.tmpl => qa-only/SKILL.md.tmpl +3 -0
@@ 8,6 8,9 @@ description: |
  fixes anything. Use when asked to "just report bugs", "qa report only", or
  "test but don't fix". For the full test-fix-verify loop, use /qa instead.
  Proactively suggest when the user wants a bug report without any code changes. (gstack)
voice-triggers:
  - "bug report"
  - "just check for bugs"
allowed-tools:
  - Bash
  - Read

M qa/SKILL.md => qa/SKILL.md +1 -0
@@ 11,6 11,7 @@ description: |
  or asks "does this work?". Three tiers: Quick (critical/high only),
  Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores,
  fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only. (gstack)
  Voice triggers (speech-to-text aliases): "quality check", "test the app", "run QA".
allowed-tools:
  - Bash
  - Read

M qa/SKILL.md.tmpl => qa/SKILL.md.tmpl +4 -0
@@ 11,6 11,10 @@ description: |
  or asks "does this work?". Three tiers: Quick (critical/high only),
  Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores,
  fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only. (gstack)
voice-triggers:
  - "quality check"
  - "test the app"
  - "run QA"
allowed-tools:
  - Bash
  - Read

M scripts/gen-skill-docs.ts => scripts/gen-skill-docs.ts +71 -3
@@ 132,6 132,63 @@ function extractNameAndDescription(content: string): { name: string; description
  return { name, description };
}

// ─── Voice Trigger Processing ────────────────────────────────

/**
 * Extract voice-triggers YAML list from frontmatter.
 * Returns an array of trigger strings, or [] if no voice-triggers field.
 */
function extractVoiceTriggers(content: string): string[] {
  const fmStart = content.indexOf('---\n');
  if (fmStart !== 0) return [];
  const fmEnd = content.indexOf('\n---', fmStart + 4);
  if (fmEnd === -1) return [];
  const frontmatter = content.slice(fmStart + 4, fmEnd);

  const triggers: string[] = [];
  let inVoice = false;
  for (const line of frontmatter.split('\n')) {
    if (/^voice-triggers:/.test(line)) { inVoice = true; continue; }
    if (inVoice) {
      const m = line.match(/^\s+-\s+"(.+)"$/);
      if (m) triggers.push(m[1]);
      else if (!/^\s/.test(line)) break;
    }
  }
  return triggers;
}

/**
 * Preprocess voice triggers: fold voice-triggers YAML field into description,
 * then strip the field from frontmatter. Must run BEFORE transformFrontmatter
 * and extractNameAndDescription so all hosts see the updated description.
 */
function processVoiceTriggers(content: string): string {
  const triggers = extractVoiceTriggers(content);
  if (triggers.length === 0) return content;

  // Strip voice-triggers block from frontmatter
  content = content.replace(/^voice-triggers:\n(?:\s+-\s+"[^"]*"\n?)*/m, '');

  // Get current description (after stripping voice-triggers, so it's clean)
  const { description } = extractNameAndDescription(content);
  if (!description) return content;

  // Build new description with voice triggers appended
  const voiceLine = `Voice triggers (speech-to-text aliases): ${triggers.map(t => `"${t}"`).join(', ')}.`;
  const newDescription = description + '\n' + voiceLine;

  // Replace old indented description with new in frontmatter
  const oldIndented = description.split('\n').map(l => `  ${l}`).join('\n');
  const newIndented = newDescription.split('\n').map(l => `  ${l}`).join('\n');
  content = content.replace(oldIndented, newIndented);

  return content;
}

// Export for testing
export { extractVoiceTriggers, processVoiceTriggers };

const OPENAI_SHORT_DESCRIPTION_LIMIT = 120;

function condenseOpenAIShortDescription(description: string): string {


@@ 163,8 220,10 @@ policy:
 */
function transformFrontmatter(content: string, host: Host): string {
  if (host === 'claude') {
    // Strip sensitive: field from Claude output (only Factory uses it)
    return content.replace(/^sensitive:\s*true\n/m, '');
    // Strip fields not used by Claude: sensitive (Factory-only), voice-triggers (folded into description by preprocessing)
    content = content.replace(/^sensitive:\s*true\n/m, '');
    content = content.replace(/^voice-triggers:\n(?:\s+-\s+"[^"]*"\n?)*/m, '');
    return content;
  }

  const fmStart = content.indexOf('---\n');


@@ 364,13 423,22 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
    throw new Error(`Unresolved placeholders in ${relTmplPath}: ${remaining.join(', ')}`);
  }

  // Preprocess voice triggers: fold into description, strip field from frontmatter.
  // Must run BEFORE transformFrontmatter so all hosts see the updated description,
  // and BEFORE extractedDescription is used by external host metadata.
  content = processVoiceTriggers(content);

  // Re-extract description AFTER voice trigger preprocessing so Codex openai.yaml
  // metadata gets the updated description with voice triggers included.
  const postProcessDescription = extractNameAndDescription(content).description;

  // For Claude: strip sensitive: field (only Factory uses it)
  // For external hosts: route output, transform frontmatter, rewrite paths
  let symlinkLoop = false;
  if (host === 'claude') {
    content = transformFrontmatter(content, host);
  } else {
    const result = processExternalHost(content, tmplContent, host, skillDir, extractedDescription, ctx, extractedName || undefined);
    const result = processExternalHost(content, tmplContent, host, skillDir, postProcessDescription, ctx, extractedName || undefined);
    content = result.content;
    outputPath = result.outputPath;
    symlinkLoop = result.symlinkLoop;
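
One detail of `processVoiceTriggers` above may be worth a second look. The final `content.replace(oldIndented, newIndented)` passes the new description as a replacement string, and JavaScript expands sequences such as `$$` and `$&` inside replacement strings. None of the description text visible in this diff contains those sequences, so the generated output here is unaffected. The sketch below only illustrates the general behavior and one possible guard. `replaceLiteral` and the sample strings are hypothetical, not part of the commit.

```typescript
// Hypothetical helper (not in the commit): replaces the first occurrence of
// `search` with `replacement`, treating `replacement` as literal text.
function replaceLiteral(content: string, search: string, replacement: string): string {
  // A function replacer's return value is inserted verbatim, so "$" patterns
  // are not expanded.
  return content.replace(search, () => replacement);
}

// Made-up description text containing "$$".
const oldText = "  Budget tool.";
const newText = "  Budget tool.\n  Costs $$ per run.";

// String replacement: "$$" is an escape that collapses to a single "$".
const viaString = oldText.replace(oldText, newText);
// Function replacement: the text is inserted unchanged.
const viaFunction = replaceLiteral(oldText, oldText, newText);

console.log(viaString === newText);   // false
console.log(viaFunction === newText); // true
```

Whether this matters depends on what skill descriptions are allowed to contain; for plain prose descriptions the existing call behaves the same as the guarded one.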

M test/gen-skill-docs.test.ts => test/gen-skill-docs.test.ts +48 -0
@@ 2581,3 2581,51 @@ describe('gen-skill-docs prefix warning (#620/#578)', () => {
    }
  });
});

describe('voice-triggers processing', () => {
  const { extractVoiceTriggers, processVoiceTriggers } = require('../scripts/gen-skill-docs') as {
    extractVoiceTriggers: (content: string) => string[];
    processVoiceTriggers: (content: string) => string;
  };

  test('extractVoiceTriggers parses valid YAML list', () => {
    const content = `---\nname: cso\ndescription: |\n  Security audit.\nvoice-triggers:\n  - "see-so"\n  - "security review"\n---\nBody`;
    const triggers = extractVoiceTriggers(content);
    expect(triggers).toEqual(['see-so', 'security review']);
  });

  test('extractVoiceTriggers returns [] when no field present', () => {
    const content = `---\nname: qa\ndescription: |\n  QA testing.\n---\nBody`;
    expect(extractVoiceTriggers(content)).toEqual([]);
  });

  test('processVoiceTriggers appends voice triggers to description', () => {
    const content = `---\nname: cso\ndescription: |\n  Security audit. (gstack)\nvoice-triggers:\n  - "see-so"\n  - "security review"\n---\nBody`;
    const result = processVoiceTriggers(content);
    expect(result).toContain('Voice triggers (speech-to-text aliases): "see-so", "security review".');
  });

  test('processVoiceTriggers strips voice-triggers field from output', () => {
    const content = `---\nname: cso\ndescription: |\n  Security audit. (gstack)\nvoice-triggers:\n  - "see-so"\n---\nBody`;
    const result = processVoiceTriggers(content);
    expect(result).not.toContain('voice-triggers:');
  });

  test('processVoiceTriggers returns content unchanged when no voice-triggers', () => {
    const content = `---\nname: qa\ndescription: |\n  QA testing.\n---\nBody`;
    expect(processVoiceTriggers(content)).toBe(content);
  });

  test('generated CSO SKILL.md contains voice triggers in description', () => {
    const content = fs.readFileSync(path.join(ROOT, 'cso', 'SKILL.md'), 'utf-8');
    expect(content).toContain('"see-so"');
    expect(content).toContain('Voice triggers (speech-to-text aliases):');
  });

  test('generated CSO SKILL.md does NOT contain raw voice-triggers field', () => {
    const content = fs.readFileSync(path.join(ROOT, 'cso', 'SKILL.md'), 'utf-8');
    const fmEnd = content.indexOf('\n---', 4);
    const frontmatter = content.slice(0, fmEnd);
    expect(frontmatter).not.toContain('voice-triggers:');
  });
});
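
The tests above cover double-quoted list items only. Reading the regex in `extractVoiceTriggers`, entries written with single quotes or without quotes would not match and appear to be skipped without an error. The sketch below exercises that case using a standalone copy of the function from this diff, so it runs without the repository. The sample frontmatter is made up for illustration.

```typescript
// Standalone copy of extractVoiceTriggers from scripts/gen-skill-docs.ts in
// this commit, reproduced so the example runs outside the repo.
function extractVoiceTriggers(content: string): string[] {
  const fmStart = content.indexOf("---\n");
  if (fmStart !== 0) return [];
  const fmEnd = content.indexOf("\n---", fmStart + 4);
  if (fmEnd === -1) return [];
  const frontmatter = content.slice(fmStart + 4, fmEnd);

  const triggers: string[] = [];
  let inVoice = false;
  for (const line of frontmatter.split("\n")) {
    if (/^voice-triggers:/.test(line)) { inVoice = true; continue; }
    if (inVoice) {
      const m = line.match(/^\s+-\s+"(.+)"$/);
      if (m) triggers.push(m[1]);
      else if (!/^\s/.test(line)) break; // next top-level key ends the list
    }
  }
  return triggers;
}

// Made-up frontmatter mixing three quoting styles.
const sample = [
  "---",
  "name: demo",
  "voice-triggers:",
  '  - "speed test"',
  "  - 'check performance'",
  "  - page speed",
  "allowed-tools:",
  "  - Bash",
  "---",
  "Body",
].join("\n");

const parsed = extractVoiceTriggers(sample);
console.log(parsed); // only the double-quoted entry is captured
```

Every template touched by this commit uses double quotes, so the generated files are consistent with this behavior. If the strictness is intentional, a test asserting it would document the contract for contributors.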