Last updated: April 2026
v2.2.0 is the biggest architectural change since v1.0. The headline: Whittl now treats the AI as an agent with real tools, not a text generator. Edits go through schema-enforced tool calls instead of text parsing. Output costs dropped 40-75% per modification. Autofix iterates until the code works.
edit_code calls instead of file regeneration: 40-75% fewer output tokens per change, up to 97% on large files. Theme color changes that took 83s now finish in 10-15s.

v2.3 is code-complete and in final testing. It's the polish-and-prove release. The generation pipeline from v2.2 is solid, so this one focuses on two things: making the app look and feel like a single designed product, and making Agent Mode a real option for the models that can handle it. Every new feature is paired with safeguards so cheap models don't burn tokens flailing at problems they can't solve.
[Tools] [Thinks] [Long] [Vision] chips per model instead of opaque S/A/B letters. Concrete capabilities you can actually verify, not made-up tier grades.

Here's the idea that made everything else click.
Every AI coding tool on the market is a chat window over someone else's model. Cursor, v0.dev, Claude Code, and bolt.new are all wrappers, so their quality ceiling is the model's ceiling.
Whittl is quietly different. It already has 75+ auto-fix rules, a skills library, oscillation and round-cap guards, a tool executor, and custom validators. That's a real knowledge layer sitting between the user and the model. Weak models get more uplift from the layer because they make more fixable mistakes. v2.4 makes that layer a first-class product concept.
Unify autofix rules, skills, validators, and anti-patterns behind a single engine. Version the rule set independently of the Whittl binary so rules can update weekly without a full release. Positioning shifts: Whittl is not “an AI chat window that makes apps.” It's the knowledge layer that makes any model better at building Python desktop apps.
A curated, signed bundle of community-sourced rules that every Whittl installation pulls on startup. Like a virus-definitions update, but for code generation. Starts as download-only. I seed the library, users benefit from the updates. Opt-in contribution comes later.
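To make the "signed bundle" idea concrete, here is a minimal sketch of what verify-before-load could look like. This is not Whittl's actual implementation: a real release would use an asymmetric signature (e.g. ed25519) rather than a bare digest, and `load_bundle` and the bundle fields are hypothetical names.

```python
import hashlib
import json

def load_bundle(raw: bytes, expected_digest: str) -> dict:
    """Parse a Commons rules bundle only if its SHA-256 digest matches.

    In a real signed-bundle scheme the digest would itself be covered by a
    signature verified against a public key pinned in the Whittl binary.
    """
    actual = hashlib.sha256(raw).hexdigest()
    if actual != expected_digest:
        raise ValueError("bundle digest mismatch; refusing to load")
    return json.loads(raw)

# Simulate a tiny bundle and its published digest.
bundle_bytes = json.dumps({"version": "2026.04.1", "rules": []}).encode()
digest = hashlib.sha256(bundle_bytes).hexdigest()
bundle = load_bundle(bundle_bytes, digest)
```

The important property is fail-closed: a tampered or truncated download is rejected before any rule in it can influence generation.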
Small “Streaming…” indicator on the tab of the file currently being generated, so you know which file is being written during multi-file generation.
Add flet_version metadata to projects (defaults to 0.28, ignored by current code). Infrastructure ready for when Flet 1.0 stabilizes. See “Longer term” for why the full Flet upgrade is deferred.
Hit an auto-fix that worked? Optional “Submit to Commons” button shows you the exact JSON payload before anything leaves your machine. Your code, prompts, and project details never leave, only the structured fix rule, with your explicit consent. Submissions go through a review queue; approved rules ship in the next Commons bundle. Every contribution makes every Whittl installation smarter.
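As an illustration of "only the structured fix rule" leaving your machine, a submission payload might look something like this. The field names here are made up for the example, not Whittl's actual schema:

```json
{
  "rule_id": "qt-enum-namespace",
  "category": "framework",
  "match": "Qt\\.AlignCenter",
  "replacement": "Qt.AlignmentFlag.AlignCenter",
  "description": "PyQt6 moved enums into scoped namespaces",
  "whittl_version": "2.4.0"
}
```

Note what is absent: no source files, no prompts, no project metadata, matching the privacy claim above.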
Every generation, regardless of model, runs through a fast local pass: security issues (path traversal, eval/exec, SQL injection), framework gotchas (Qt enum syntax, Flet device mismatches, CSS vs QSS), performance anti-patterns (blocking I/O on UI thread). Runs in parallel with code-apply, costs no tokens, works on every backend.
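The eval/exec check in that local pass can be done with nothing but Python's own `ast` module, which is why it costs zero tokens. A hedged sketch of just that one rule (not Whittl's actual rule engine, and `find_eval_exec` is a hypothetical name):

```python
import ast

def find_eval_exec(source: str) -> list[int]:
    """Return the line numbers of bare eval()/exec() calls in the source.

    Walks the parsed AST, so it ignores comments and strings that merely
    mention the words, unlike a regex-based scan.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in {"eval", "exec"}):
            flagged.append(node.lineno)
    return flagged

sample = "x = eval(user_input)\nprint(x)\n"
hits = find_eval_exec(sample)
```

Because it runs on the parsed tree, this pass can execute in parallel with code-apply and works identically no matter which model produced the code.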
Cheap-then-escalate. Haiku generates, Whittl Layer fixes common mistakes, test runs. Only if it still fails does Whittl escalate to Sonnet. Internal testing: Haiku with the Layer matches bare-Sonnet on ~70% of typical tasks at ~1/15th the cost.
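The cheap-then-escalate loop above can be sketched in a few lines. Everything here is a stand-in: `generate`, `apply_layer_fixes`, and `run_tests` are hypothetical hooks, not Whittl's real internals; the shape of the control flow is the point.

```python
def build(prompt, generate, apply_layer_fixes, run_tests,
          models=("haiku", "sonnet")):
    """Try the cheapest model first; escalate only if tests still fail.

    The Layer fixes run locally between generation and testing, so a
    cheap model's common mistakes are repaired before we pay for a
    more expensive attempt.
    """
    for model in models:
        code = generate(model, prompt)
        code = apply_layer_fixes(code)   # local rules, zero tokens
        if run_tests(code):
            return model, code
    raise RuntimeError("all models in the escalation chain failed")
```

The claimed economics follow directly: if the Haiku-plus-Layer pass succeeds ~70% of the time, Sonnet is only billed for the remaining 30%.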
Named bundles of (model + tools + skills + prompt) pickable from a chat-header dropdown. Ships with starters: Game Builder (pygame-focused, longer round cap, Sonnet), Quick Edit (Haiku, planner skipped, fast iteration), Debug Only (read-only tools, explains instead of fixes), Code Review (Opus, critiques architecture). Users can create and share their own as portable .md files.
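A portable profile file for the "Quick Edit" starter might look roughly like this. The front-matter keys are illustrative guesses, not a documented Whittl format:

```markdown
---
name: Quick Edit
model: haiku
planner: skipped
tools: [edit_code, read_file]
skills: [fast-iteration]
---
You are editing an existing Flet app. Prefer minimal diffs over rewrites.
```

Keeping the prompt body as plain markdown below the front matter is what makes the file shareable and human-reviewable.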
Every generation that passes tests becomes a validated example. Future similar prompts retrieve these as in-context examples to whichever model you're using. All local, never uploads. Your own successful work makes future Whittl runs on similar tasks more reliable.
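The retrieval step can be illustrated with a deliberately naive sketch: rank stored validated examples by keyword overlap with the new prompt. A real implementation would likely use embeddings; `retrieve` and the store layout are assumptions for the example.

```python
def retrieve(prompt: str, store: list[dict], k: int = 2) -> list[dict]:
    """Return the k stored examples whose prompts best overlap the new one."""
    words = set(prompt.lower().split())
    return sorted(
        store,
        key=lambda ex: len(words & set(ex["prompt"].lower().split())),
        reverse=True,
    )[:k]

# Local store of past generations that passed their tests.
store = [
    {"prompt": "todo app with dark theme", "code": "..."},
    {"prompt": "pygame snake game", "code": "..."},
    {"prompt": "todo list with sqlite", "code": "..."},
]
hits = retrieve("build a todo app", store, k=2)
```

The retrieved examples are then prepended as in-context demonstrations, so the benefit applies to whichever backend model is selected.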
Every time Whittl's autofix corrects something, that's a signal about what your chosen model gets wrong. Whittl remembers per-model patterns and prepends reminders to future prompts for that model: “Note: you historically emit X in these cases, try Y.” Weak models get custom augmentations based on their own history. The Layer compounds with use.
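A minimal sketch of that per-model memory, assuming a simple counter of fix-rule hits per model (class and method names are hypothetical, not Whittl's API):

```python
from collections import Counter

class ModelMemory:
    """Track which autofix rules fire for each model and turn the
    most frequent ones into prompt reminders."""

    def __init__(self):
        self.fixes: dict[str, Counter] = {}

    def record(self, model: str, rule_id: str) -> None:
        self.fixes.setdefault(model, Counter())[rule_id] += 1

    def augment(self, model: str, prompt: str, top: int = 2) -> str:
        counts = self.fixes.get(model)
        if not counts:
            return prompt  # no history yet: leave the prompt untouched
        notes = "; ".join(rule for rule, _ in counts.most_common(top))
        return f"Note: past generations needed fixes for: {notes}.\n{prompt}"
```

Each autofix hit feeds `record`, and `augment` runs before every new prompt, which is the compounding loop the paragraph describes.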
Expanded from the original “screenshot to app” idea. Drop a Figma PNG, paste HTML/CSS from a web app, or point Whittl at a URL. Get a native Python desktop app. Every other AI tool outputs more web. Whittl uniquely converts design artifacts into runnable software you own. Take your bolt.new prototype and make it useful.
Current Flet pin (0.28.3) is intentional. Flet 0.29 and 1.0 alpha introduced breaking API changes that invalidate the auto-fix rules, system prompts, templates, and APK tooling Whittl depends on. Migration is ~3-4 weeks of focused work.
The plan: wait for Flet 1.0 stable GA, add dual-version support so existing projects stay on 0.28 while new ones opt into 1.0, build a Migration Agent that upgrades old projects one-click. Commons accelerates 1.0 rule calibration via community contributions. Same release probably includes Flet web export since 1.0's web story is much better than 0.28's.
The strategic bet: most AI coding tools treat the model as the whole product. Their quality ceiling moves with the model. When a new Claude ships, they get better. When a competitor ships a slightly better wrapper, they have nothing to fight back with.
Whittl's bet is different. The Layer plus accumulated community contributions compound over time, independent of any specific model. The moats are narrower and more honest than “our model is better.”
The goal isn't to be a smarter chat window. The goal is to be the tool that makes every model good at building Python desktop apps, so users can pick the model that fits their budget and still get professional results. Haiku with the Whittl Layer matches bare-Sonnet on typical multi-file Python app generation in internal testing. That's a value proposition no other tool can honestly make.