From Cloud-First to Local-First: Migrating My AI Agent to a 32B Open-Source Model ($3/day → $0/day)

Source: DEV Community
Yesterday my AI agent cost me $3 to run. Today it costs $0. Not because I stopped using it — I use it more than ever. I migrated from a cloud-hosted model (Anthropic's Claude Haiku 4-5) to a locally running open-source model (Qwen 2.5-32B via Ollama) on my MacBook Pro M3 Pro. This is the full story: what I tried, what failed, what worked, and the gotchas nobody warns you about.

The Starting Point

Before migration:

- Main agent: Claude Haiku 4-5 (Anthropic cloud)
- Context window: 200,000 tokens
- Cost: ~$3/day for active use ($0.80/M input, $4/M output)
- Privacy: every prompt, every file read, every tool output → sent to Anthropic's servers
- Latency: 200-500 ms per request (network round-trip)
- Uptime: dependent on Anthropic's API availability

The agent runs 24/7, handling orchestration, file management, cron jobs, subagent delegation, and memory management. At $3/day, that's $90/month just for th
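To see how the per-token rates add up to that daily figure, here's a rough cost model. The rates are the ones above; the daily token volumes are illustrative assumptions chosen to match the ~$3/day observation, not measurements from my logs:

```python
# Rough cost model for the cloud setup. Rates are from the article;
# the example token volumes are assumptions, not measured traffic.
INPUT_RATE = 0.80 / 1_000_000   # $ per input token (Claude Haiku 4-5)
OUTPUT_RATE = 4.00 / 1_000_000  # $ per output token

def daily_cost(input_tokens: int, output_tokens: int) -> float:
    """Cloud API cost in dollars for one day of agent traffic."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical day: ~2.5M input tokens, ~250K output tokens
cost = daily_cost(2_500_000, 250_000)
print(f"${cost:.2f}/day, ${cost * 30:.0f}/month")  # $3.00/day, $90/month
```

An always-on agent is input-heavy (every tool output and file read flows back in as context), which is why input volume dominates the bill even at a quarter of the output rate.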