PickForge
Widget-level AI context for Flutter.
Local-first. Open source. Built for people who ship.
We do not ship 12 things halfway. We ship a few things all the way — local-first, open source, sharp around the edges.
Widget-level AI context for Flutter.
Pick a widget in your running Flutter app. PickForge dispatches its full context (source, ancestor chain, screenshots) to Claude Code, Codex, or OpenCode in a new terminal. Local-first. Open source.
Local Linux dictation with AI cleanup.
Hotkey to record. Local whisper.cpp transcribes. A fast model cleans it up. The finished text pastes into whatever you were typing in.
Local AI usage visibility from the tray.
Watch quota windows, provider state, and local model activity without opening a dashboard. PickGauge stays quiet until something needs attention.
Benchmark AI coding models on real Dart and Flutter tasks.
Run codegen and agentic tracks across providers. Hidden verifiers, repeated trials, human-review queues. Stable leaderboards you can defend.
| # | Model | Provider | Pass@1 | Speed | Reliability | Score | Trend |
|---|---|---|---|---|---|---|---|
| 01 | GPT 5.3 Codex Spark | Codex | 91.2% | 82 | 96 | 81 | ▲ climbing |
| 02 | Claude Sonnet 4.5 | Anthropic | 88.7% | 76 | 94 | 79 | |
| 03 | DeepSeek V4 Flash | DeepSeek | 84.1% | 95 | 88 | 85 | |
| 04 | Qwen 3 Coder 32B | Ollama Local | 76.4% | 64 | 81 | 76 | |
| 05 | OpenCode Go | OpenCode | 72.9% | 88 | 78 | 73 | |
Every product in the studio. Add a row to products.ts and the grid grows — no redesign required.
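As a rough illustration of the "add a row" idea, here is a minimal sketch of what a data-driven product list could look like. The `Product` interface, its field names, and the `renderGrid` helper are assumptions made for this example, not the actual contents of products.ts. Only the two products named on this page are listed.

```typescript
// Hypothetical row shape; the real products.ts may use different fields.
interface Product {
  name: string;
  tagline: string;
  summary: string;
}

// Each entry is one card in the grid. Adding a product means adding an entry.
const products: Product[] = [
  {
    name: "PickForge",
    tagline: "Widget-level AI context for Flutter.",
    summary: "Pick a widget in a running app and send its context to a coding agent.",
  },
  {
    name: "PickGauge",
    tagline: "Local AI usage visibility from the tray.",
    summary: "Watch quota windows, provider state, and local model activity.",
  },
];

// Hypothetical renderer: builds one card label per row, so the grid size
// always follows the data and no layout code changes when a row is added.
function renderGrid(rows: readonly Product[]): string[] {
  return rows.map((p) => `${p.name}: ${p.tagline}`);
}
```

Because the grid is derived from the array, appending a new entry is the only change needed to show another card.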
If you're building a developer tool, a Flutter app that needs AI that actually understands the widget tree, or a benchmarking rig that holds its weights — we should talk.
Your code, your machine, your rules. We build tools that earn their network calls — never the other way around.
MIT licensed. Public roadmaps. Public source. Public reasoning. If we can't ship it open, we don't ship it.
We live in the widget tree. We know what a real Dart benchmark looks like, and we know the difference between a 90 and a 91.
Designed for the person running it at 2am, not the slide deck on Monday. Sharp UX, dense info, no ceremony.