Agent Mode

Agent Mode is an opt-in setting that gives capable AI models a different runtime inside Whittl: the planner is skipped (the AI classifies its own intent), the tool loop's cap rises to as many as 50 rounds, session memory persists across prompts, and a real bash tool becomes available for running shell commands inside your project directory.

Think of it as the difference between asking a developer to change one line versus handing a developer the task and letting them work until it's done.

When to use Agent Mode

Agent Mode shines when the task has these characteristics:

  • The scope is unclear upfront. "Add user authentication to this app" means database, routes, login UI, session handling, password hashing — the agent discovers the scope as it goes.
  • Multiple related changes need to happen. Agent Mode works across files naturally; default mode generates or edits one "thing" at a time.
  • The task benefits from running and checking. Agent Mode can run your code, see errors, edit, re-run, verify — a tight feedback loop within one prompt.

Agent Mode is overkill when:

  • You know exactly what you want changed ("change the theme color to navy")
  • The change is one file, one function
  • Cost sensitivity is high and the task is simple

For targeted edits, default mode with its surgical editing is cheaper and faster.

Turning it on

Edit → Preferences → AI Generation → Agent Mode

Or the cog icon next to the backend dropdown → AI Generation tab → Agent Mode toggle.

Agent Mode is not free

A round on a premium model typically costs $0.05–$0.15, so an agent running for 20+ rounds on Claude Sonnet can cost $1–3 per task. Agent Mode raises the round ceiling from 5 (default mode's auto-fix) to 20 (Tier-A) or 50 (Tier-S), depending on the model.

Whittl's safeguards (oscillation guard, read-only bailout, Stop button) still fire, but Agent Mode sessions can get expensive if you run several in a row. Watch the token/cost display in the status bar.
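The arithmetic behind those figures can be sketched in a few lines. This is only a back-of-the-envelope estimate using the per-round range quoted above; the function name is hypothetical and not part of Whittl, and real costs depend on token counts per round.

```python
# Per-round cost range for a premium model, in cents (figures quoted in the text).
ROUND_COST_CENTS = (5, 15)


def estimate_task_cost(rounds: int) -> tuple[float, float]:
    """Return a rough (low, high) dollar estimate for a task of `rounds` rounds."""
    low_cents, high_cents = ROUND_COST_CENTS
    return rounds * low_cents / 100, rounds * high_cents / 100


# Compare the default auto-fix ceiling with the two Agent Mode ceilings.
for rounds in (5, 20, 50):
    low, high = estimate_task_cost(rounds)
    print(f"{rounds:>2} rounds: ${low:.2f} to ${high:.2f}")
```

A 20-round run lands in the $1–3 range mentioned above; a run that uses the full 50-round cap would be proportionally higher.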

Model tiers

Not all models qualify for full Agent Mode. Whittl classifies each available model into one of three tiers based on its demonstrated agentic ability:

Tier-S (full Agent Mode, up to 50 rounds)

  • anthropic/claude-opus-4
  • anthropic/claude-sonnet-4-5 (the sweet spot for most users)
  • openai/gpt-4o
  • google/gemini-2.5-pro
  • qwen/qwen3.5-plus (surprisingly capable for the price)
  • deepseek/deepseek-v3.2

Tier-A (capped Agent Mode, up to 20 rounds)

  • anthropic/claude-haiku-4-5
  • openai/gpt-4o-mini
  • google/gemini-2.5-flash
  • deepseek/deepseek-v3
  • meta-llama/llama-3.3-70b-instruct

Tier-B (no Agent Mode — treated as default-mode-only)

  • Smaller models (< 30B parameters)
  • Models with known instruction-following issues
  • Gemma 3 (its tool-use is unreliable enough that Whittl deny-lists it from Agent Mode specifically)

If you toggle Agent Mode on while a Tier-B model is selected, Whittl shows a notice and falls back to default mode. Switch to a Tier-A or Tier-S model to actually use Agent Mode.
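As a rough illustration of how the tier lists map to round caps, here is a minimal sketch. The function and variable names are hypothetical, and treating any unlisted model as Tier-B is an assumption made for the example; Whittl's actual classification logic is internal.

```python
# Model IDs copied from the tier lists above.
TIER_S = {
    "anthropic/claude-opus-4",
    "anthropic/claude-sonnet-4-5",
    "openai/gpt-4o",
    "google/gemini-2.5-pro",
    "qwen/qwen3.5-plus",
    "deepseek/deepseek-v3.2",
}
TIER_A = {
    "anthropic/claude-haiku-4-5",
    "openai/gpt-4o-mini",
    "google/gemini-2.5-flash",
    "deepseek/deepseek-v3",
    "meta-llama/llama-3.3-70b-instruct",
}
ROUND_CAPS = {"S": 50, "A": 20}


def classify_tier(model_id: str) -> str:
    """Return "S", "A", or "B". Unlisted models count as Tier-B (assumption)."""
    if model_id in TIER_S:
        return "S"
    if model_id in TIER_A:
        return "A"
    return "B"


def agent_round_cap(model_id: str, agent_mode_on: bool):
    """Return the Agent Mode round cap, or None when default mode applies."""
    if not agent_mode_on:
        return None
    return ROUND_CAPS.get(classify_tier(model_id))


print(agent_round_cap("anthropic/claude-sonnet-4-5", True))
print(agent_round_cap("openai/gpt-4o-mini", True))
print(agent_round_cap("example/unlisted-model", True))
```

Returning None for Tier-B mirrors the fallback described above: the toggle may be on, but the run behaves as default mode.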

What changes under the hood

When Agent Mode is on AND the selected model is Tier-S/A:

Planner is skipped

Default mode runs a cheap AI classifier ("is this a code modification or a question?") before the main generation. Agent Mode skips this and lets the model decide its own next action inside the tool loop, which saves a round of API calls per prompt.

Higher tool-loop cap

Default mode caps tool-use rounds at 5 for auto-fix and 7–10 for modifications. Agent Mode raises this to 50 (Tier-S) or 20 (Tier-A). The agent can call edit_code, syntax_check, run_code, read_code, and bash repeatedly until it judges the task done or hits the cap.
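The effect of the cap can be shown with a minimal sketch of a capped loop. Everything here is illustrative: `run_tool_loop` and the `step` callback are hypothetical stand-ins for one model round, not Whittl's real loop.

```python
from typing import Callable


def run_tool_loop(step: Callable[[int], bool], round_cap: int) -> tuple[int, bool]:
    """Call `step` once per round until it reports completion or the cap is hit.

    Returns (rounds_used, finished).
    """
    for round_number in range(1, round_cap + 1):
        if step(round_number):
            return round_number, True
    return round_cap, False


def finishes_on_round_seven(round_number: int) -> bool:
    """Stand-in for a task that needs seven rounds of tool use."""
    return round_number >= 7


print(run_tool_loop(finishes_on_round_seven, round_cap=5))   # (5, False)
print(run_tool_loop(finishes_on_round_seven, round_cap=50))  # (7, True)
```

The same seven-round task is cut off under a 5-round ceiling but completes under the Tier-S cap, and the loop still stops as soon as the task is done rather than spending all 50 rounds.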

Session memory

Default mode treats each prompt as independent. Agent Mode remembers what the agent did on the previous prompt within the same project. Follow-ups like "now make that also work on startup" land on an agent that understands "that."

Session memory resets on:

  • Project switch
  • Explicit "clear chat" action
  • App restart
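One way to picture these reset rules is a small in-process store keyed to the current project. This is a sketch under assumptions: the `SessionMemory` class and its methods are hypothetical, and Whittl's actual storage is not documented here.

```python
class SessionMemory:
    """Keeps (prompt, summary) pairs for one project at a time.

    The data lives only in this object, so an app restart (a fresh
    instance) starts with empty memory.
    """

    def __init__(self) -> None:
        self.project = None
        self.history = []

    def record(self, project: str, prompt: str, summary: str) -> None:
        # Switching projects discards what was remembered about the old one.
        if project != self.project:
            self.clear()
            self.project = project
        self.history.append((prompt, summary))

    def clear(self) -> None:
        # Corresponds to the explicit "clear chat" action.
        self.history = []


memory = SessionMemory()
memory.record("editor", "Add a Recent files menu", "Added MRU submenu")
memory.record("editor", "Now make that also work on startup", "Loaded list at launch")
print(len(memory.history))  # 2

memory.record("other-project", "Fix the parser", "Patched tokenizer")
print(len(memory.history))  # 1
```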

bash tool

Agent Mode unlocks a bash tool that runs shell commands in the project directory. The AI uses this for things like:

  • Running pytest to verify its changes
  • Running pip install to add a dependency
  • Inspecting the filesystem beyond what list_files shows
  • Running the actual app to see output

bash safety model

The bash tool has a three-tier safety model:

  • Whitelisted commands (ls, cat, grep, python --version, pip list, pytest, etc.) run without prompting.
  • Ask-before-run commands (pip install, anything that modifies the project) surface a confirmation prompt you have to click through.
  • Deny-listed commands (rm -rf, sudo, network operations, anything path-escaping) are blocked outright.

The whitelist, asklist, and denylist are configured internally by Whittl and will be user-editable in v2.5+. For now, they are deliberately conservative.
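A minimal sketch of how a three-way classification like this could work is shown below. The lists are partial and partly assumed: `curl` and `wget` stand in for "network operations", unknown commands default to "ask", and the path-escape check is deliberately crude. Whittl's real rules are internal and may differ; a production check would also need to handle shell operators such as `;` and `&&`.

```python
import shlex

# Commands named in the text as safe to run without prompting.
ALLOW_PROGRAMS = {"ls", "cat", "grep", "pytest"}
ALLOW_PREFIXES = {("python", "--version"), ("pip", "list")}
# "sudo" is from the text; curl and wget are assumed examples of network tools.
DENY_PROGRAMS = {"sudo", "curl", "wget"}


def escapes_project(arg: str) -> bool:
    """Crude check for arguments that point outside the project directory."""
    return arg.startswith(("/", "~")) or ".." in arg.split("/")


def classify_command(command: str) -> str:
    """Return "allow", "ask", or "deny" for a shell command string."""
    tokens = shlex.split(command)
    if not tokens:
        return "deny"
    program, args = tokens[0], tokens[1:]
    if program in DENY_PROGRAMS:
        return "deny"
    if program == "rm" and any(a in ("-rf", "-fr") for a in args):
        return "deny"
    if any(escapes_project(a) for a in args):
        return "deny"
    if tuple(tokens[:2]) in ALLOW_PREFIXES or program in ALLOW_PROGRAMS:
        return "allow"
    # Anything unrecognized needs confirmation (assumption for this sketch).
    return "ask"


for cmd in ("pytest -q", "pip install requests", "rm -rf build", "cat ../notes.txt"):
    print(f"{cmd!r}: {classify_command(cmd)}")
```

Deny checks run first so that an allowed program with a path-escaping argument is still blocked.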

Safeguards that remain active

Agent Mode raises the ceiling but doesn't remove the floor. All of these still fire:

  • Oscillation guard. If the agent bounces between the same two errors 3+ times across a sliding 6-entry window, the cycle aborts with a message. Tracked even across 50-round sessions.
  • Read-only bailout. If the agent spends 3+ consecutive rounds only reading code without editing, the cycle aborts.
  • Hard round cap. The tier-based cap (20 or 50) is enforced regardless of model state.
  • Stop button. Persists across rounds during an active agent cycle. Click once to cancel the queued round.
  • Error fingerprinting. Same error type + file + line repeated too often triggers early abort.

These exist specifically because Agent Mode is where runaway-cost scenarios would otherwise happen.
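Two of these guards are simple enough to sketch. The classes below are hypothetical and encode one reading of the rules above: the oscillation guard trips when the last 6 error fingerprints (error type, file, line) consist of exactly two distinct errors seen at least 3 times each, and a "read-only round" is one whose tool calls are all read-type tools. The set of read-type tools is an assumption built from tool names mentioned on this page.

```python
from collections import Counter, deque

# Tools treated as read-only for this sketch (assumed grouping).
READ_TOOLS = {"read_code", "search_code", "list_files"}


class OscillationGuard:
    """Trips when two errors alternate within a sliding window."""

    def __init__(self, window: int = 6, min_repeats: int = 3) -> None:
        self.recent = deque(maxlen=window)
        self.min_repeats = min_repeats

    def observe(self, error_type: str, file: str, line: int) -> bool:
        """Record an error fingerprint; return True if the cycle should abort."""
        self.recent.append((error_type, file, line))
        counts = Counter(self.recent)
        return len(counts) == 2 and all(
            n >= self.min_repeats for n in counts.values()
        )


class ReadOnlyBailout:
    """Trips after `limit` consecutive rounds that only read code."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.streak = 0

    def observe(self, tools_used: list) -> bool:
        """Record one round's tool calls; return True if the cycle should abort."""
        read_only = bool(tools_used) and all(t in READ_TOOLS for t in tools_used)
        self.streak = self.streak + 1 if read_only else 0
        return self.streak >= self.limit


guard = OscillationGuard()
errors = [("NameError", "app.py", 12), ("TypeError", "app.py", 30)] * 3
print([guard.observe(*e) for e in errors])
```

With the alternating pair of errors fed in, only the sixth observation returns True, which is the point at which each error has appeared three times inside the window.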

Running an Agent Mode task

Example: add a "Recent files" menu to a text editor app that doesn't have one.

  1. Open the project in Whittl.
  2. Toggle Agent Mode on.
  3. Select a Tier-S model (e.g. claude-sonnet-4-5).
  4. Type the prompt:

    Add a "Recent files" submenu under File. It should show the last 10 opened files, with most recent at top. Persist the list to ~/.myapp/recent.json. Clear button at the bottom.
    
  5. Hit Generate.

The agent will typically:

  • Read the existing File menu code (read_code)
  • Locate where the submenu should slot in (search_code)
  • Edit the menu construction code (edit_code)
  • Add a new RecentFilesManager class (create_file)
  • Wire up signal connections
  • Run the app to verify (bash: python main.py &) and check for errors
  • If errors surface, edit-and-retry up to the round cap
  • Report what it did and stop

Cost on Sonnet for a task like this: typically $0.20–$0.60. Runs in 30–90 seconds.
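For a sense of what the prompt is asking for, here is a minimal sketch of a class along the lines of RecentFilesManager. It is not actual agent output: the method names and JSON layout are assumptions, the menu wiring is omitted, and the demo writes to a temporary directory rather than ~/.myapp/recent.json.

```python
import json
import tempfile
from pathlib import Path


class RecentFilesManager:
    """Most-recently-used file list, persisted as a JSON array of paths."""

    def __init__(self, store: Path, limit: int = 10) -> None:
        self.store = store
        self.limit = limit
        self.paths = self._load()

    def _load(self) -> list:
        # A missing or corrupt file is treated as an empty list.
        try:
            data = json.loads(self.store.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return []
        if not isinstance(data, list):
            return []
        return [p for p in data if isinstance(p, str)][: self.limit]

    def _save(self) -> None:
        self.store.parent.mkdir(parents=True, exist_ok=True)
        self.store.write_text(json.dumps(self.paths, indent=2), encoding="utf-8")

    def add(self, path: str) -> None:
        """Move `path` to the top of the list, trimming to the limit."""
        if path in self.paths:
            self.paths.remove(path)
        self.paths.insert(0, path)
        del self.paths[self.limit:]
        self._save()

    def clear(self) -> None:
        """Back the Clear button: empty the list and persist the change."""
        self.paths = []
        self._save()


# Demo against a throwaway directory instead of the user's home folder.
with tempfile.TemporaryDirectory() as tmp:
    manager = RecentFilesManager(Path(tmp) / "recent.json")
    for name in ("a.txt", "b.txt", "a.txt"):
        manager.add(name)
    print(manager.paths)  # ['a.txt', 'b.txt']
```

In the real app the store path would be Path.home() / ".myapp" / "recent.json", as the prompt specifies.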

Running multiple prompts in one session

Session memory means follow-ups work naturally:

User:   Add a Recent files menu under File. [Agent does the thing.]
Agent:  Done. Added Recent files with 10-item MRU list persisted to ~/.myapp/recent.json.

User:   Now make each recent entry also show a preview of the file.
Agent:  [Agent knows "each recent entry" means the Recent files menu it just built.
         Modifies the existing RecentFilesManager to include a preview snippet.]

The agent sees the full session history and your previous prompts as part of its context. You don't have to re-specify context you've already established.

Troubleshooting

Agent won't stop, Stop button isn't responsive

This was a v2.2 bug. Fixed in v2.3 — the Stop button now genuinely cancels queued rounds. If you're on v2.3+ and seeing this, force-quit Whittl (it won't corrupt your project) and report a bug.

Agent keeps running the same command and failing

The oscillation guard should fire after 3+ repeats, but occasionally the error shape changes slightly between attempts and the guard misses it. Click Stop, then reword the prompt with more specifics about what you want.

Agent took 47 rounds and my credit balance dropped a lot

This is rare but real on hard tasks with Tier-S models. Two mitigations:

  1. Use Tier-A (20-round cap) for tasks you suspect are complex but don't require Opus-tier reasoning.
  2. Break the task into smaller pieces. Instead of "add user authentication," try "add a login form → add a user model → add session handling → wire them together" as four separate prompts.

Agent Mode is greyed out

Your current model is Tier-B. Switch to a Tier-A or Tier-S model to unlock Agent Mode.

What's next

  • Choosing a Backend — which backend's model tiers you care about
  • Auto-fix Rules — the underlying safeguard system that's reused in Agent Mode
  • Skills System — how Agent Mode builds on Whittl's compounding knowledge layer