Gemini in Whittl

Google's Gemini is the best free-tier backend for Whittl and the leader for long-context work. 2.5 Pro handles context windows over 1M tokens, 2.5 Flash has a generous free tier suitable for hobbyist use, and every modern model supports vision natively.

Getting a key

  1. Go to aistudio.google.com/apikey.
  2. Sign in with a Google account.
  3. Click Create API key. Pick or create a Google Cloud project.
  4. Copy the key (starts with AIza).

Add it in Whittl via Edit → Preferences → API Keys → Gemini API Key.

Picking a model

2.5 Pro

The quality tier. Strong on complex code, handles huge context (1M+ tokens). Use Pro for:

  • Very large codebases where you want to send everything to the AI
  • Complex architecture or refactoring tasks
  • Tasks where Claude Sonnet feels expensive

Free tier: Yes, but rate-limited (~5 requests/min, 25 requests/day). Paid tier: fair mid-tier pricing.

2.5 Flash

The workhorse. Fast, with a generous free tier, and handles typical Whittl tasks well. Use Flash for:

  • Most generation and modification work
  • When you're burning through iterations and don't want to pay per request
  • Projects that fit the free tier quota

Free tier: Yes, very generous — suitable for most hobbyist use. Automatic prompt caching: 75% discount on cached content.

2.5 Flash-Lite

Cheapest option in the Gemini family. Use Flash-Lite for:

  • Quick iterations where you don't need Flash's full quality
  • Very high-volume workflows where even cents add up
  • Vision tasks (Flash-Lite handles screenshots fine)

3 Flash

Newer, typically better quality than 2.5 Flash at similar cost. Worth trying once it's been stable in your workflows for a week.

Switching models

The Gemini model dropdown next to the backend selector chooses the variant. Settings persist per project.

Whittl defaults to 2.5 Flash because it hits the sweet spot of quality, cost, and automatic caching.

Free tier quotas

Google's free tier gives you (as of mid-2026, check ai.google.dev/pricing):

  • 2.5 Flash: 15 requests/min, 1M tokens/day, 1,500 requests/day
  • 2.5 Flash-Lite: 30 requests/min, 1M tokens/day, 1,500 requests/day
  • 2.5 Pro: 5 requests/min, 250K tokens/day, 25 requests/day (paid tier is much less limited)
  • 3 Flash: similar to 2.5 Flash

For casual Whittl use (a project or two per day, normal iteration), the free tier covers everything. Heavy usage on complex projects may hit the daily cap, at which point you'll need to upgrade.
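As a rough sanity check, you can compare a day's expected usage against the caps above. The numbers below are the free-tier caps quoted in this page for 2.5 Flash; Google may change them, so treat them as assumptions:

```python
# Hypothetical helper: check a day's workload against the free-tier caps
# quoted above for 2.5 Flash (1,500 requests/day, 1M tokens/day).
# These caps are assumptions copied from the table; verify against
# ai.google.dev/pricing before relying on them.
def fits_free_tier(requests_per_day: int, tokens_per_day: int,
                   request_cap: int = 1_500, token_cap: int = 1_000_000) -> bool:
    return requests_per_day <= request_cap and tokens_per_day <= token_cap

# A casual session: ~60 requests averaging ~8K tokens each.
print(fits_free_tier(60, 60 * 8_000))    # 480K tokens/day fits comfortably
print(fits_free_tier(200, 200 * 8_000))  # 1.6M tokens/day exceeds the token cap
```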

Automatic prompt caching on Flash

Gemini 2.5 Flash automatically caches repeated prompt prefixes at a 75% discount. In Whittl's long-session workflows (iterate on one project across 10+ prompts), this means:

  • First request: full price for system prompt + tool defs + code context
  • Subsequent requests: 25% of that price for the same content

You don't configure anything — it's automatic on Flash.
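To see what the discount means in practice, here is a small cost sketch: the cached prefix is billed at 25% of the normal input rate, fresh tokens at full rate. The per-million-token price used below is a placeholder for illustration, not Gemini's actual rate:

```python
def input_cost(cached_tokens: int, fresh_tokens: int,
               price_per_mtok: float, cache_discount: float = 0.75) -> float:
    """Input cost for one request: cached prefix billed at (1 - discount)
    of the normal input rate, fresh tokens at the full rate."""
    cached = cached_tokens / 1e6 * price_per_mtok * (1 - cache_discount)
    fresh = fresh_tokens / 1e6 * price_per_mtok
    return cached + fresh

# Placeholder price of $1.00 per million input tokens (illustrative only).
first = input_cost(0, 100_000, 1.0)      # first request: nothing cached yet
later = input_cost(100_000, 2_000, 1.0)  # later turn: 100K-token prefix cached
print(f"first: ${first:.4f}, later: ${later:.4f}")
```

The later turns cost roughly a quarter of the first one for the shared prefix, which is why long single-project sessions get cheaper over time.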

Vision

Every modern Gemini model supports vision natively. Drop a screenshot into chat; it works.

Notes:

  • Gemini's vision quality on UI screenshots is very close to Claude's
  • 2.5 Pro has the largest image context (multi-image prompts work well)
  • Flash-Lite is the cheapest vision-capable option across all backends

Long context (the 1M+ case)

Gemini 2.5 Pro handles context windows over 1 million tokens. That's roughly a whole mid-sized codebase in a single prompt.

Practical impact:

  • Whittl's smart routing kicks in less often on Gemini Pro — you can just send all the files
  • Large monolithic projects that Claude would struggle with fit cleanly in Pro's window
  • Multi-turn sessions with lots of history don't hit context limits

The caveat: latency scales with context length. A 500K-token prompt takes 15-30 seconds to process even before generation starts.

Safety filters

Gemini has stricter safety filters than Claude. Occasionally:

  • A prompt mentioning a security topic triggers a refusal
  • A screenshot containing certain visual content causes the model to decline
  • Certain framing ("build me a tool that monitors user activity") is blocked

If you hit a safety filter, reword the prompt with more benign language. If it persists, switch backends temporarily — Claude is more permissive on code-related requests.
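If you are debugging raw API responses rather than working through Whittl, a safety block shows up as a candidate whose `finishReason` is `SAFETY`, or as a `promptFeedback.blockReason` when the prompt itself is blocked. A minimal check over the response JSON (field names per the public Gemini REST API; the sample responses below are fabricated for illustration):

```python
def blocked_by_safety(response: dict) -> bool:
    # A blocked prompt reports promptFeedback.blockReason; a blocked
    # completion reports finishReason == "SAFETY" on a candidate.
    if response.get("promptFeedback", {}).get("blockReason") == "SAFETY":
        return True
    return any(c.get("finishReason") == "SAFETY"
               for c in response.get("candidates", []))

# Fabricated sample responses:
ok = {"candidates": [{"finishReason": "STOP"}]}
blocked = {"candidates": [{"finishReason": "SAFETY"}]}
print(blocked_by_safety(ok), blocked_by_safety(blocked))
```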

Rate limits on paid tier

Once you move past the free tier (pay-as-you-go billing enabled on your Google Cloud project):

  • 2.5 Pro: elevated limits, scales with project billing
  • 2.5 Flash / Flash-Lite: very high limits, rarely hit in practice

Gemini pricing and free tier

Stay on the free tier

2.5 Flash's free tier is generous enough that you can run a Whittl project for days without needing paid access. Structure your work around it.

Use Flash-Lite for pure iteration

When you're doing 20 small edits in a row, Flash-Lite is cheaper than Flash and adequate for the task.

Lean on caching

Long sessions get cheaper over time via the automatic 75% discount on cached content. Don't start fresh projects for every task.

When NOT to use Gemini

  • Your prompts trigger safety filters. Switch to a more permissive backend.
  • You need the best quality at any price. Claude Opus > Gemini 2.5 Pro on pure coding benchmarks.
  • Agent Mode on hard tasks. Gemini Pro can do it, but Claude Sonnet is more reliable at multi-round tool use.

Troubleshooting

400 API key not valid but the key looks right

The most common cause: the API key was created against a Google Cloud project that doesn't have the Generative Language API enabled. In Google Cloud Console, ensure "Generative Language API" is enabled for your project.
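A quick way to check both the key and the API enablement outside Whittl is to call the public `models` endpoint directly; an invalid key reproduces the same 400 error. A sketch using only the standard library (endpoint per the Gemini REST API; nothing runs until you call it with a real key):

```python
import json
import urllib.request

BASE = "https://generativelanguage.googleapis.com/v1beta"

def models_url(api_key: str) -> str:
    # Builds the list-models request URL for the given key.
    return f"{BASE}/models?key={api_key}"

def list_models(api_key: str) -> list[str]:
    # Raises urllib.error.HTTPError on failure, e.g. 400 for an
    # invalid key or a project missing the Generative Language API.
    with urllib.request.urlopen(models_url(api_key)) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]
```

If `list_models` succeeds, the key and project are set up correctly and the problem is elsewhere; if it raises a 400, fix the key or the API enablement first.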

Hit daily quota on free tier

You've used your free allocation. Either wait until the next day (Pacific time reset), or upgrade to pay-as-you-go in Google Cloud Console.

Candidate blocked: SAFETY errors

Safety filter tripped. Reword the prompt. If it persists on legitimate requests, file feedback via AI Studio.

Gemini seems slower than benchmarks suggest

Time of day and region. Pacific peak hours are slower. European users sometimes see higher latency than US users. Not a Whittl issue.

What's next

  • Screenshot to App — Gemini 2.5 Flash-Lite is a top pick for this workflow
  • Multi-file Projects — Gemini Pro's long context pairs with multi-file naturally
  • OpenRouter — alternative route to Gemini if you want consolidated billing