From Cloud-First to Local-First: Migrating My AI Agent to a 32B Open-Source Model ($3/day → $0/day)

Source: DEV Community
Yesterday my AI agent cost me $3 to run. Today it costs $0. Not because I stopped using it — I use it more than ever. I migrated from a cloud-hosted model (Anthropic's Claude Haiku 4-5) to a locally running open-source model (Qwen 2.5-32B via Ollama) on my MacBook Pro M3 Pro. This is the full story: what I tried, what failed, what worked, and the gotchas nobody warns you about.

The Starting Point

Before migration:

- Main agent: Claude Haiku 4-5 (Anthropic cloud)
- Context window: 200,000 tokens
- Cost: ~$3/day for active use ($0.80/M input, $4/M output)
- Privacy: every prompt, every file read, every tool output → sent to Anthropic's servers
- Latency: 200-500 ms per request (network round-trip)
- Uptime: dependent on Anthropic's API availability

The agent runs 24/7, handling orchestration, file management, cron jobs, subagent delegation, and memory management. At $3/day, that's $90/month just for th
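To see how the per-token rates add up to that daily figure, here's a rough cost model. The rates are the ones above; the daily token volumes are illustrative assumptions chosen to match the ~$3/day observation, not measurements from my logs:

```python
# Rough cost model for the cloud setup. Rates are from the article;
# the example token volumes are assumptions, not measured traffic.
INPUT_RATE = 0.80 / 1_000_000   # $ per input token (Claude Haiku 4-5)
OUTPUT_RATE = 4.00 / 1_000_000  # $ per output token

def daily_cost(input_tokens: int, output_tokens: int) -> float:
    """Cloud API cost in dollars for one day of agent traffic."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical day: ~2.5M input tokens, ~250K output tokens
cost = daily_cost(2_500_000, 250_000)
print(f"${cost:.2f}/day, ${cost * 30:.0f}/month")  # $3.00/day, $90/month
```

An always-on agent is input-heavy (every tool output and file read flows back in as context), which is why input volume dominates the bill even at a quarter of the output rate.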