From Screenshot to Desktop App¶
This is the complete workflow recipe from attached screenshot to shipped executable. The feature page for Screenshot to App explains the how; this page puts it in the context of a real project end to end.
Estimated time¶
About 30-60 minutes for a finished, polished app starting from one screenshot. Breakdown:
- Initial generation: 1 minute
- Iteration to match the screenshot: 10-30 minutes (5-10 prompts)
- Polish and assets: 10-20 minutes
- Build: 2-5 minutes
Total cost on mid-tier backends: $0.10-$0.50 for the whole flow.
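As a quick sanity check on the estimate, the per-stage ranges above can be summed. This is only arithmetic over the figures already listed; the stage labels are chosen here for illustration.

```python
# Per-stage time ranges in minutes, copied from the breakdown above.
STAGES = {
    "initial generation": (1, 1),
    "iteration": (10, 30),
    "polish and assets": (10, 20),
    "build": (2, 5),
}


def total_range(stages):
    """Sum the low ends and the high ends of every stage's range."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


low, high = total_range(STAGES)
print(f"Estimated total: {low}-{high} minutes")
```

The stages add up to 23-56 minutes, which is consistent with the rounded "about 30-60 minutes" headline.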
Step 1: Prepare your screenshot¶
Capture at native resolution¶
Use a real screenshot capture, not a thumbnail. 1920×1080 or larger is ideal. The AI needs to read the UI text clearly; scaled-down screenshots compress text to unreadable pixel soup.
Windows: press Win+Shift+S to open the built-in Snipping Tool. Pick Rectangular snip and grab the full window.
Linux: the tool depends on your desktop environment. GNOME: the Print Screen key. KDE: Spectacle. Flameshot works well across distros.
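If you want to confirm a capture's size before attaching it, a PNG's dimensions can be read from its header with the standard library alone. The sketch below assumes the screenshot is saved as PNG and treats 1920×1080 as a rule-of-thumb threshold, not a hard requirement; the function names are illustrative and not part of Whittl.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_WIDTH, MIN_HEIGHT = 1920, 1080  # the "ideal" size named above


def png_dimensions(header: bytes) -> tuple[int, int]:
    """Read (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR stores width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def is_large_enough(width: int, height: int) -> bool:
    """True when the image meets the rule-of-thumb minimum size."""
    return width >= MIN_WIDTH and height >= MIN_HEIGHT


def check_screenshot(path: str) -> bool:
    """Open a PNG on disk and check its size against the threshold."""
    with open(path, "rb") as f:
        return is_large_enough(*png_dimensions(f.read(24)))
```

A snip of a single window can fall below the threshold and still be a fine reference, as long as it was captured at native resolution and the text is readable.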
What makes a good reference¶
- Full window or full screen — not a zoomed-in crop that hides context
- All controls visible — menu, buttons, form fields, status bar
- Text that's readable (you can read it → AI can read it)
- One clear layout — don't combine multiple screens into a collage
What makes a bad reference¶
- Blurry or heavily compressed images: the text has already lost its clarity by the time the image is uploaded
- Dark-on-dark or low-contrast themes — harder for the model to extract control boundaries
- Mobile screenshots for desktop targets — the AI will honor the mobile layout even if you want desktop
- Watermarks or stamps over the UI — confuses the model
Step 2: Set up the project¶
- + New in the Projects panel
- Name it after what you're building
- Target: Desktop (PySide6) (unless you specifically want CustomTkinter or Flet)
- Click Create
An empty project opens.
Step 3: Pick a vision-capable backend¶
Whichever of these you have configured:
- Claude direct (Sonnet or Haiku): highest fidelity
- OpenRouter with google/gemini-2.5-flash-lite: cheapest that works
- OpenRouter with anthropic/claude-haiku-4-5: mid-tier, solid quality
- Gemini direct: native vision, good for large contexts
See Choosing a Backend and Screenshot to App for the tradeoffs.
Step 4: First prompt¶
Attach the screenshot (drag-drop, paste, or file picker). Then write a short prompt:
Rebuild this as a PySide6 desktop app with the same layout.
Keep the code to a single file if it fits cleanly; otherwise split
into ui/ and core/ folders.
Don't over-specify. You want the AI to make its initial pass. You'll iterate from there.
Click Generate.
What to expect from the first output¶
- Overall layout approximately correct. Sidebar on the left if you had a sidebar, etc.
- Control types correctly identified. Buttons are buttons, dropdowns are dropdowns.
- Colors approximately matching. The AI extracts an approximate palette but exact hex codes need follow-up.
- Text content mostly correct. Transcribed from the image, usually accurate.
What NOT to expect¶
- Pixel-perfect spacing and margins. You'll iterate these.
- Icons. The AI sees icon shapes but names generic ones. Plan to add your own via assets/.
- Working functionality. The AI implements obvious things (a Settings button opens a settings dialog) but complex behavior isn't in the screenshot.
Step 5: Run it and see what's wrong¶
Click Test Run. The app window appears.
Compare it side-by-side with your reference screenshot. What are the three biggest differences?
Common first-iteration gaps:
- The layout is close but a specific region is too narrow or wide
- A particular control uses the wrong widget (QComboBox instead of a radio group, etc.)
- Colors are in the wrong zone of the palette (too beige, should be more navy)
- The scroll behavior isn't what you want
- A button doesn't have the click handler you expected
Step 6: Iterate structurally first, polish second¶
Don't jump to "make the buttons orange." First nail structure:
Prompt 1 (structure):
"The sidebar is too narrow. Make it 220px wide and the main content
should fill the rest."
Prompt 2 (behavior):
"The 'New File' button should open a file picker dialog. Save the
picked path to the recent-files list."
Prompt 3 (behavior):
"Add a status bar at the bottom showing the current file name and
unsaved indicator."
Structure and behavior settle first. Polish (colors, fonts, icons) is last.
Each prompt is 1-3 sentences. Be specific about WHAT changes and WHERE.
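For orientation, here is roughly the kind of PySide6 code the three prompts above ask for. It is a hand-written sketch, not actual generator output: the function names, the `MAX_RECENT` cap, and the widget choices are all assumptions made for illustration.

```python
from pathlib import Path

SIDEBAR_WIDTH = 220  # from Prompt 1
MAX_RECENT = 10      # assumed cap; the prompts do not specify one


def add_recent(recent, path, limit=MAX_RECENT):
    """Return a new recent-files list with `path` moved to the front."""
    return ([path] + [p for p in recent if p != path])[:limit]


def status_text(path, unsaved):
    """Status-bar text: file name plus a marker for unsaved changes."""
    name = Path(path).name if path else "Untitled"
    return f"{name} *" if unsaved else name


def build_window():
    """Build the main window. Call only after a QApplication exists."""
    # Imported lazily so the helpers above work without Qt installed.
    from PySide6.QtWidgets import (
        QFileDialog, QHBoxLayout, QListWidget, QMainWindow,
        QPushButton, QTextEdit, QVBoxLayout, QWidget,
    )

    window = QMainWindow()
    recent = []

    # Prompt 1: fixed-width sidebar; main content takes the remaining space.
    sidebar = QWidget()
    sidebar.setFixedWidth(SIDEBAR_WIDTH)
    side_layout = QVBoxLayout(sidebar)
    new_file = QPushButton("New File")
    recent_list = QListWidget()
    side_layout.addWidget(new_file)
    side_layout.addWidget(recent_list)

    editor = QTextEdit()
    central = QWidget()
    row = QHBoxLayout(central)
    row.addWidget(sidebar)
    row.addWidget(editor, 1)  # stretch factor 1: fills the rest of the row
    window.setCentralWidget(central)

    # Prompt 3: status bar with file name and unsaved indicator.
    window.statusBar().showMessage(status_text(None, False))

    # Prompt 2: file picker that feeds the recent-files list.
    def pick_file():
        path, _ = QFileDialog.getOpenFileName(window, "New File")
        if path:  # an empty string means the dialog was cancelled
            recent[:] = add_recent(recent, path)
            recent_list.clear()
            recent_list.addItems(recent)
            window.statusBar().showMessage(status_text(path, False))

    new_file.clicked.connect(pick_file)
    return window
```

To try it, create a `QApplication`, keep a reference to the window returned by `build_window()`, call `show()` on it, and then run `app.exec()`. Reading code at this level helps you phrase the next prompt in terms of the widget or layout that actually needs to change.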
Step 7: Polish details¶
Once structure is right, polish:
Prompt 4 (colors):
"Match the button color in my screenshot — it's a warm tan, roughly
hex #C9A876."
Prompt 5 (typography):
"Use a larger font for section headers. The main title should be
noticeably bigger than body text."
Prompt 6 (spacing):
"Add more padding inside the settings panel — currently controls
are too close to the edges."
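Color and typography prompts like these usually end up as a Qt Style Sheet. A minimal sketch of what that might look like is below; the `mainTitle` object name and the point sizes are assumptions for illustration, and the generated code may well organize its styling differently.

```python
BUTTON_COLOR = "#C9A876"  # warm tan from Prompt 4


def hex_to_rgb(color):
    """Convert '#RRGGBB' into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def build_stylesheet(button_color=BUTTON_COLOR, body_pt=10, title_pt=18):
    """Return a Qt Style Sheet string covering Prompts 4 and 5.

    The sizes are placeholder values; 'mainTitle' is a hypothetical
    object name that the title label would need to be given.
    """
    return (
        f"QPushButton {{ background-color: {button_color}; }}\n"
        f"QLabel {{ font-size: {body_pt}pt; }}\n"
        f"QLabel#mainTitle {{ font-size: {title_pt}pt; font-weight: bold; }}\n"
    )


print(build_stylesheet())
```

The string would be applied with `app.setStyleSheet(build_stylesheet())`, after calling `setObjectName("mainTitle")` on the title label. Padding requests like Prompt 6 are often handled on the layout instead, for example `layout.setContentsMargins(16, 16, 16, 16)` on the settings panel's layout.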
Step 8: Add assets¶
Drop your logo into assets/:
- Assets panel (tab next to Code)
- Drag your logo file → Assets panel
- Back to chat:
Use assets/logo.png in the app header, sized to 48px tall, with
the app name as text to the right of it.
Repeat for icons, backgrounds, sound effects, whatever your app needs.
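The logo prompt above typically translates into a scaled `QPixmap` inside a horizontal layout. The sketch below shows one way this could look; `build_header` and `scaled_size` are hypothetical names, and the relative `assets/logo.png` path is assumed to resolve against the app's working directory.

```python
LOGO_HEIGHT = 48  # from the prompt above


def scaled_size(width, height, target_height=LOGO_HEIGHT):
    """Size after scaling to target_height while keeping the aspect ratio."""
    return round(width * target_height / height), target_height


def build_header(app_name, logo_path="assets/logo.png"):
    """Logo on the left, app name to its right. Needs a running QApplication."""
    # Imported lazily so scaled_size works without Qt installed.
    from PySide6.QtCore import Qt
    from PySide6.QtGui import QPixmap
    from PySide6.QtWidgets import QHBoxLayout, QLabel, QWidget

    header = QWidget()
    row = QHBoxLayout(header)

    logo = QLabel()
    pixmap = QPixmap(logo_path)
    if not pixmap.isNull():  # a missing or unreadable file yields a null pixmap
        logo.setPixmap(pixmap.scaledToHeight(LOGO_HEIGHT, Qt.SmoothTransformation))

    row.addWidget(logo)
    row.addWidget(QLabel(app_name))
    row.addStretch(1)  # keep logo and name pinned to the left
    return header
```

`scaled_size` is only there to show the arithmetic: a 400×200 logo comes out at 96×48, which is what `scaledToHeight` does internally while preserving the aspect ratio.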
Step 9: Test on the real thing¶
Close the Test Run window and re-open it. This simulates the cold launches your users will experience.
Click through every feature you expected to work. Note anything that crashes or doesn't do what you want. Add to the iteration list.
Use Debugging a Crash for any errors that surface.
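If you have the generated code on disk, repeated cold launches can also be scripted as a rough smoke test. The sketch below uses only the standard library and assumes the app's entry point is a file named `main.py`, which may not match your project. It catches startup crashes only and does not replace clicking through the features.

```python
import subprocess
import sys


def launches_cleanly(cmd, alive_for=3.0):
    """Start `cmd` and report whether it survives `alive_for` seconds.

    A process that is still running after the grace period counts as a
    successful launch and is then terminated. A process that exits early
    counts as clean only if its exit code is zero.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=alive_for)
    except subprocess.TimeoutExpired:
        proc.terminate()
        proc.wait()
        return True
    return proc.returncode == 0


def repeated_cold_launch(cmd, runs=3, alive_for=3.0):
    """Launch the app several times in a row; every launch must succeed."""
    return all(launches_cleanly(cmd, alive_for) for _ in range(runs))
```

A typical call would be `repeated_cold_launch([sys.executable, "main.py"])`. Anything that fails here goes on the iteration list along with the issues you find by hand.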
Step 10: Build¶
Once the app works and looks right, build a distributable:
- Download (F3) in the preview panel bar
- Build dialog → leave defaults
- Click Build
- Wait 1-3 minutes
- Build output folder opens
See Building an Installer for what you get, how to distribute, and common build issues.
Real-world example¶
From the v2.3 field test log:
User's prompt: "make me an app that looks and functions like this" with an attached music player screenshot.
Generation: Qwen3.5-Plus via OpenRouter, 116 seconds, 9,313 tokens, ~$0.01
Output: 20,640-char single-file PySide6 app with player controls, library view, now-playing display, seek bar, volume, metadata sidebar. Mostly working on first launch.
Iterations: ~4 follow-ups over 10 minutes to refine layout and color palette. Each iteration: 30 seconds, ~$0.01.
Total: ~$0.05, 20 minutes, from screenshot to working music player.
Your mileage varies by model and project complexity, but the shape of the workflow is consistent.
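The totals in this example follow from the per-step figures. The snippet below simply redoes that arithmetic with the numbers from the field test log; costs are kept in whole cents to avoid floating-point rounding.

```python
# Figures from the field test log above.
GENERATION_CENTS = 1      # ~$0.01 for the first generation
ITERATION_CENTS = 1       # ~$0.01 per follow-up
ITERATIONS = 4
GENERATION_SECONDS = 116
ITERATION_SECONDS = 30

total_cents = GENERATION_CENTS + ITERATIONS * ITERATION_CENTS
model_seconds = GENERATION_SECONDS + ITERATIONS * ITERATION_SECONDS

print(f"cost: ~${total_cents / 100:.2f}")
print(f"model time: {model_seconds} s (~{model_seconds / 60:.0f} min)")
```

The cost comes out at about $0.05, matching the log. Model time is only about 4 of the 20 minutes, so most of the session was presumably spent testing the app and writing prompts.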
Tips that compound over time¶
- Save good projects as templates. When you build a great login + main-view + settings scaffold, template it. Next screenshot-to-app run starts from that scaffold instead of blank.
- Accumulate skills. If you repeatedly prompt "make scrollbars tan-colored" because the AI keeps defaulting to grey, write a skill once and it applies automatically forever. See Skills System.
- Reference your own past work. You can drop a screenshot of an earlier Whittl-generated app as a reference: "Use a similar sidebar to this one I made before."
What's next¶
- Screenshot to App feature — the feature-level reference
- Iterating on Generated Code — the iteration loop in depth
- Building an Installer — the last step