Prompt Optimization Workflow

Systematically improve AI enhancement quality.

A practical guide for advanced users who want to systematically improve AI enhancement quality using the Preview panel's built-in tools.

Prerequisites

Preview mode enabled (Settings → General → Preview)
At least one LLM provider configured (Settings → LLM)
An enhancement mode selected (Settings → AI)

The Preview Panel at a Glance

The preview panel is rendered using WKWebView with a modern HTML/CSS/JS interface that adapts to light and dark mode automatically.

┌─────────────────────────────────────────────────────┐
│  ASR   [STT model ▾]  10.6s  ☐ Punc   Play ▶  Save │  ← Raw speech recognition
│  ┌─────────────────────────────────────────────┐    │
│  │ 根据目前程序的preview界面你能想到用户...       │    │
│  └─────────────────────────────────────────────┘    │
│                                                     │
│  [ Off ] [纠错润色] [翻译为英文] [润色+翻译EN] [...]  │  ← Mode switcher (⌘1–⌘9)
│                                                     │
│  AI  [provider/model ▾]  Tokens: 897 (↑878 ↓19)    │
│                                ☐ thinking  🧠 Prompt│  ← Enhancement controls
│  ┌─────────────────────────────────────────────┐    │
│  │ AI enhanced result (read-only)               │    │
│  └─────────────────────────────────────────────┘    │
│                                                     │
│  Final Result (editable)                  Translate ↗│
│  ┌─────────────────────────────────────────────┐    │
│  │ Your final text — edit here before confirm   │    │
│  └─────────────────────────────────────────────┘    │
│                                                     │
│  [History ▾]                  [Cancel]  [Confirm ⏎]  │
└─────────────────────────────────────────────────────┘

Core Tools for Prompt Optimization

1. Play ▶ — Verify What You Actually Said

Before blaming ASR or AI, play back the recording. This tells you which layer the problem is in:

What you hear	Diagnosis	Action
You misspoke or were unclear	Source problem	Re-record with clearer speech
Speech was fine, ASR text is wrong	ASR problem	Switch ASR engine, enable Punc, or try a different model
ASR text is correct, AI output is wrong	Prompt problem	Inspect and iterate on the prompt (see below)

2. Prompt Button — See the Actual System Prompt

Click Prompt ⓘ after enhancement completes to see the exact system prompt sent to the LLM. This is the single most important tool for optimization — you cannot improve what you cannot see.

What to look for:

Is the instruction clear and specific enough?
Does it mention that input comes from ASR (which may contain errors)?
Does it have an output-only rule to prevent commentary?
Are there edge cases the prompt doesn't cover?

3. Mode Switcher (⌘1–⌘9) — Compare Modes Instantly

Press ⌘1 through ⌘9 to switch modes without re-recording. Results are cached — switching back shows the previous result instantly (marked [cached]).

Rapid mode switching is debounced (300 ms). If you press ⌘3 then immediately ⌘5, only the ⌘5 enhancement request is sent to the LLM. Any in-flight streaming enhancement from the previous mode is cancelled immediately, so no tokens are wasted on intermediate switches.

Use this to:

Compare how different prompts handle the same input
A/B test a new mode against existing ones
Find which mode handles specific content types best

Switch models from the AI dropdown to see how the same prompt performs across different LLMs. Each (mode, model, thinking) combination is cached independently.

5. Thinking Mode — Observe AI Reasoning

Enable the thinking checkbox, then click the 🧠 (brain) button after enhancement to read the AI's reasoning process. This reveals:

Why the AI made specific word choices
Where the AI is confused or uncertain
Whether the AI understood your intent correctly

This is invaluable for diagnosing prompt issues — if the AI's reasoning is wrong, the prompt needs clarification.

6. Preview History — Browse Previous Results

Click the History button at the bottom of the panel to open a dropdown listing the last 10 preview records (stored in memory for the current session). Each entry shows an action icon, timestamp, enhancement mode, and a text preview.

Click any entry to load that record back into the panel — ASR text, enhanced text, final text, and audio playback buttons are all restored. This lets you revisit earlier results without re-recording, which is useful for comparing how a prompt edit affected the same input across sessions.

7. Final Result — Correct and Train

Edit the Final Result to fix any remaining issues before confirming. Each edit is recorded with a user_corrected flag, which feeds into the vocabulary system over time.

When you click Confirm or Cancel, any in-flight streaming enhancement is cancelled immediately to save tokens. You do not need to wait for the AI to finish before confirming your text.

Step-by-Step Optimization Workflow

Phase 1: Identify the Problem

Record a representative sentence.
Play ▶ the recording to confirm you said what you meant.
Read the ASR text — is the transcription accurate?
Read the AI result — did the enhancement improve or damage the text?
Click Prompt — read the system prompt that produced this result.

Phase 2: Diagnose

Use the controls to narrow down the issue:

Symptom	Diagnostic step	Likely cause
AI adds unwanted commentary	Check Prompt — missing output-only rule	Prompt needs "Output only... without explanation"
AI over-corrects correct text	Check Prompt — too aggressive instruction	Prompt needs "Preserve original text when correct"
AI misunderstands domain terms	Enable thinking, read 🧠	Prompt needs domain context or examples
Good with one model, bad with another	Switch models via dropdown	Prompt too model-dependent; add more explicit rules
Works for short input, fails for long	Test both via re-recording	Prompt needs length-aware handling

Phase 3: Edit the Mode

Open ~/.config/WenZi/enhance_modes/<mode_id>.md in a text editor. Or use Settings → AI → select mode → edit.
Make targeted changes based on your diagnosis.
Restart the app (or reload config) to load the updated prompt.
Re-record the same sentence and compare results.

Phase 4: Validate

⌘1–⌘9 to switch between the updated mode and other modes — is the updated one better?
Switch LLM models — does the improvement hold across models?
Test edge cases — try short input, long input, mixed languages, noisy speech.
Enable thinking — does the AI's reasoning now match your intent?

Phase 5: Refine Over Time

Correct Final Results consistently — the vocabulary system learns from your edits.
Review History (click History dropdown) — browse previous results and look for patterns in what the AI gets wrong.
Build vocabulary (Settings → AI → Build Vocabulary) — domain terms accumulate automatically.

Practical Examples

Example 1: AI Adds Explanation After Translation

Problem: The "翻译为英文" mode outputs "Translation: ..." with a prefix.

Diagnosis: Click Prompt → the prompt says "translate to English" but doesn't explicitly forbid commentary.

Fix: Add to the prompt:

Output only the translated text.
Do not add any prefix, label, or explanation.

Example 2: AI Corrects a Name Incorrectly

Problem: You say "找萍萍确认一下" but AI changes it to "找平平确认一下".

Diagnosis: Enable thinking → the AI treats "萍萍" as an ASR error for "平平" because it lacks context.

Fix options: - Edit Final Result to "萍萍" → the correction is logged and builds vocabulary over time - Enable Conversation History (Settings → AI) so the AI sees prior confirmed uses of "萍萍" - Manually build vocabulary after accumulating corrections

Example 3: Mode Works on One Model But Not Another

Problem: "命令行大神" produces clean shell commands on GPT-4 but adds markdown fences on a local model.

Diagnosis: Switch models via dropdown, compare outputs. Click Prompt to check.

Fix: Make the prompt more explicit:

Output only the raw command.
Do not wrap in markdown code blocks or backticks.
Do not add any explanation.

Quick Reference

Shortcut / Button	Purpose in optimization
Play ▶	Verify source audio — is the problem in your speech?
Punc	Toggle punctuation — does it improve ASR accuracy?
⌘1–⌘9	A/B test modes on the same audio
AI dropdown	A/B test models with the same prompt
Prompt ⓘ	Read the actual prompt sent to the LLM
☐ Thinking	Enable AI reasoning trace
🧠	Read the AI's reasoning — diagnose misunderstandings
Final Result	Correct errors — trains vocabulary over time
Translate ↗	Cross-check with Google Translate
History ▾	Browse last 10 preview records — compare results across prompt edits

Tips

One change at a time. When editing a prompt, change one thing, then test. Multiple changes make it hard to know what helped.
Save good test cases. Use Save to export recordings that expose prompt issues. Replay them after editing the prompt to verify the fix.
Use numbered rules. LLMs follow structured prompts with numbered rules more consistently than paragraph instructions.
Always mention ASR context. Include "The user's input comes from ASR and may contain recognition errors" in your prompts — it significantly improves error tolerance.
Check token counts. The Tokens display (↑prompt ↓completion) helps you judge prompt efficiency. A prompt with high ↑ and low ↓ may be too verbose.
Confirm early if satisfied. Clicking Confirm or Cancel immediately cancels any in-flight enhancement stream, saving tokens. You do not need to wait for the AI to finish generating.

← Enhancement Mode Examples

Ready-to-use enhancement mode templates.

Provider & Model Guide →

Configure ASR and LLM providers.

Prompt Optimization Workflow

Table of Contents

← Enhancement Mode Examples

Provider & Model Guide →