Gemini in Whittl

Google's Gemini is the best free-tier backend for Whittl and the leader for long-context work. 2.5 Pro handles context windows over 1M tokens, 2.5 Flash has a generous free tier suitable for hobbyist use, and every modern model supports vision natively.

Getting a key

  1. Go to aistudio.google.com/apikey.
  2. Sign in with a Google account.
  3. Click Create API key. Pick or create a Google Cloud project.
  4. Copy the key (starts with AIza).

Add it in Whittl via Edit → Preferences → API Keys → Gemini API Key.

Picking a model

2.5 Pro

The quality tier. Strong on complex code, handles huge context (1M+ tokens). Use Pro for:

  • Very large codebases where you want to send everything to the AI
  • Complex architecture or refactoring tasks
  • Tasks where Claude Sonnet feels expensive

Free tier: Yes, but rate-limited (~5 requests/min, 25 requests/day). Paid tier: fair mid-tier pricing.

2.5 Flash

The workhorse. Fast, with a generous free tier, and handles typical Whittl tasks well. Use Flash for:

  • Most generation and modification work
  • When you're burning through iterations and don't want to pay per request
  • Projects that fit the free tier quota

Free tier: Yes, very generous — suitable for most hobbyist use. Automatic prompt caching: 75% discount on cached content.

2.5 Flash-Lite

Cheapest option in the Gemini family. Use Flash-Lite for:

  • Quick iterations where you don't need Flash's full quality
  • Very high-volume workflows where even cents add up
  • Vision tasks (Flash-Lite handles screenshots fine)

3 Flash

Newer, typically better quality than 2.5 Flash at similar cost. Worth trying once it's been stable in your workflows for a week.

Switching models

The Gemini model dropdown next to the backend selector chooses the variant. Settings persist per project.

Whittl defaults to 2.5 Flash because it hits the sweet spot of quality, cost, and automatic caching.

Free tier quotas

Google's free tier gives you (as of mid-2026, check ai.google.dev/pricing):

  • 2.5 Flash: 15 requests/min, 1M tokens/day, 1,500 requests/day
  • 2.5 Flash-Lite: 30 requests/min, 1M tokens/day, 1,500 requests/day
  • 2.5 Pro: 5 requests/min, 250K tokens/day, 25 requests/day (paid tier is much less limited)
  • 3 Flash: similar to 2.5 Flash

For casual Whittl use (a project or two per day, normal iteration), the free tier covers everything. Heavy usage on complex projects may hit the daily cap, at which point you'll need to upgrade.
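As a rough sanity check, you can compare a day's expected usage against the caps above. The numbers below are the free-tier caps quoted in this page for 2.5 Flash; Google may change them, so treat them as assumptions:

```python
# Hypothetical helper: check a day's workload against the free-tier caps
# quoted above for 2.5 Flash (1,500 requests/day, 1M tokens/day).
# These caps are assumptions copied from the table; verify against
# ai.google.dev/pricing before relying on them.
def fits_free_tier(requests_per_day: int, tokens_per_day: int,
                   request_cap: int = 1_500, token_cap: int = 1_000_000) -> bool:
    return requests_per_day <= request_cap and tokens_per_day <= token_cap

# A casual session: ~60 requests averaging ~8K tokens each.
print(fits_free_tier(60, 60 * 8_000))    # 480K tokens/day fits comfortably
print(fits_free_tier(200, 200 * 8_000))  # 1.6M tokens/day exceeds the token cap
```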

Automatic prompt caching on Flash

Gemini 2.5 Flash automatically caches repeated prompt prefixes at a 75% discount. In Whittl's long-session workflows (iterate on one project across 10+ prompts), this means:

  • First request: full price for system prompt + tool defs + code context
  • Subsequent requests: 25% of that price for the same content

You don't configure anything — it's automatic on Flash.
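To see what the discount means in practice, here is a small cost sketch: the cached prefix is billed at 25% of the normal input rate, fresh tokens at full rate. The per-million-token price used below is a placeholder for illustration, not Gemini's actual rate:

```python
def input_cost(cached_tokens: int, fresh_tokens: int,
               price_per_mtok: float, cache_discount: float = 0.75) -> float:
    """Input cost for one request: cached prefix billed at (1 - discount)
    of the normal input rate, fresh tokens at the full rate."""
    cached = cached_tokens / 1e6 * price_per_mtok * (1 - cache_discount)
    fresh = fresh_tokens / 1e6 * price_per_mtok
    return cached + fresh

# Placeholder price of $1.00 per million input tokens (illustrative only).
first = input_cost(0, 100_000, 1.0)      # first request: nothing cached yet
later = input_cost(100_000, 2_000, 1.0)  # later turn: 100K-token prefix cached
print(f"first: ${first:.4f}, later: ${later:.4f}")
```

The later turns cost roughly a quarter of the first one for the shared prefix, which is why long single-project sessions get cheaper over time.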

Vision

Every modern Gemini model supports vision natively. Drop a screenshot into chat; it works.

Notes:

  • Gemini's vision quality on UI screenshots is very close to Claude's
  • 2.5 Pro has the largest image context (multi-image prompts work well)
  • Flash-Lite is the cheapest vision-capable option across all backends

Long context (the 1M+ case)

Gemini 2.5 Pro handles context windows over 1 million tokens. That's roughly a whole mid-sized codebase in a single prompt.

Practical impact:

  • Whittl's smart routing kicks in less often on Gemini Pro — you can just send all the files
  • Large monolithic projects that Claude would struggle with fit cleanly in Pro's window
  • Multi-turn sessions with lots of history don't hit context limits

The caveat: latency scales with context length. A 500K-token prompt takes 15-30 seconds to process even before generation starts.

Safety filters

Gemini has stricter safety filters than Claude. Occasionally:

  • A prompt mentioning a security topic triggers a refusal
  • A screenshot containing certain visual content causes the model to decline
  • Certain framing ("build me a tool that monitors user activity") is blocked

If you hit a safety filter, reword the prompt with more benign language. If it persists, switch backends temporarily — Claude is more permissive on code-related requests.
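If you are debugging raw API responses rather than working through Whittl, a safety block shows up as a candidate whose `finishReason` is `SAFETY`, or as a `promptFeedback.blockReason` when the prompt itself is blocked. A minimal check over the response JSON (field names per the public Gemini REST API; the sample responses below are fabricated for illustration):

```python
def blocked_by_safety(response: dict) -> bool:
    # A blocked prompt reports promptFeedback.blockReason; a blocked
    # completion reports finishReason == "SAFETY" on a candidate.
    if response.get("promptFeedback", {}).get("blockReason") == "SAFETY":
        return True
    return any(c.get("finishReason") == "SAFETY"
               for c in response.get("candidates", []))

# Fabricated sample responses:
ok = {"candidates": [{"finishReason": "STOP"}]}
blocked = {"candidates": [{"finishReason": "SAFETY"}]}
print(blocked_by_safety(ok), blocked_by_safety(blocked))
```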

Rate limits on paid tier

Once you move past the free tier (pay-as-you-go billing enabled on your Google Cloud project):

  • 2.5 Pro: elevated limits, scales with project billing
  • 2.5 Flash / Flash-Lite: very high limits, rarely hit in practice

Gemini pricing and free tier

Stay on the free tier

2.5 Flash's free tier is generous enough that you can run a Whittl project for days without needing paid access. Structure your work around it.

Use Flash-Lite for pure iteration

When you're doing 20 small edits in a row, Flash-Lite is cheaper than Flash and adequate for the task.

Lean on caching

Long sessions get cheaper over time via the automatic 75% discount on cached content. Don't start fresh projects for every task.

When NOT to use Gemini

  • Your prompts trigger safety filters. Switch to a more permissive backend.
  • You need the best quality at any price. Claude Opus > Gemini 2.5 Pro on pure coding benchmarks.
  • Agent Mode on hard tasks. Gemini Pro can do it, but Claude Sonnet is more reliable at multi-round tool use.

Troubleshooting

400 API key not valid but the key looks right

The most common cause: the API key was created against a Google Cloud project that doesn't have the Generative Language API enabled. In Google Cloud Console, ensure "Generative Language API" is enabled for your project.
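A quick way to check both the key and the API enablement outside Whittl is to call the public `models` endpoint directly; an invalid key reproduces the same 400 error. A sketch using only the standard library (endpoint per the Gemini REST API; nothing runs until you call it with a real key):

```python
import json
import urllib.request

BASE = "https://generativelanguage.googleapis.com/v1beta"

def models_url(api_key: str) -> str:
    # Builds the list-models request URL for the given key.
    return f"{BASE}/models?key={api_key}"

def list_models(api_key: str) -> list[str]:
    # Raises urllib.error.HTTPError on failure, e.g. 400 for an
    # invalid key or a project missing the Generative Language API.
    with urllib.request.urlopen(models_url(api_key)) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]
```

If `list_models` succeeds, the key and project are set up correctly and the problem is elsewhere; if it raises a 400, fix the key or the API enablement first.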

Hit daily quota on free tier

You've used your free allocation. Either wait until the next day (Pacific time reset), or upgrade to pay-as-you-go in Google Cloud Console.

Candidate blocked: SAFETY errors

Safety filter tripped. Reword the prompt. If it persists on legitimate requests, file feedback via AI Studio.

Gemini seems slower than benchmarks suggest

Time of day and region. Pacific peak hours are slower. European users sometimes see higher latency than US users. Not a Whittl issue.

What's next

  • Screenshot to App — Gemini 2.5 Flash-Lite is a top pick for this workflow
  • Multi-file Projects — Gemini Pro's long context pairs with multi-file naturally
  • OpenRouter — alternative route to Gemini if you want consolidated billing