Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?

1143 points · 491 comments on HN

Points and comments are a snapshot, not live.

An HN user asks whether anyone has fully replaced Claude or GPT with a local model for daily coding work.

The post asks whether developers have fully transitioned from cloud-based coding assistants like Claude or GPT to local models for daily use, not just for side experiments. The author requests details on setup and performance, such as tokens per second.
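Tokens per second, the figure the post asks for, is generated tokens divided by wall-clock time. A minimal Python sketch of that arithmetic, assuming a hypothetical `generate` callable that yields one token at a time (an illustration only, not any particular runtime's API):

```python
import time
from typing import Callable, Iterable


def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Throughput: generated tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


def measure_throughput(generate: Callable[[], Iterable[str]]) -> float:
    """Time a streaming generator and return its tokens per second.

    `generate` is a hypothetical stand-in for whatever call streams
    tokens from a local model; it is assumed to yield one token at a time.
    """
    start = time.perf_counter()
    count = sum(1 for _ in generate())  # consume the stream, counting tokens
    elapsed = time.perf_counter() - start
    return tokens_per_second(count, elapsed)
```

For example, 380 tokens generated in 2 seconds works out to 190 tok/s. Timing the whole stream this way also includes any prompt-processing delay before the first token, so figures from different setups are only comparable if they were measured the same way.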

What commenters are saying

Several commenters have tried local models or are interested in doing so. The prevailing view is that cloud models remain faster and more capable for interactive coding, although local setups are workable with significant hardware investment. One user runs DeepSeek V4 Flash on 2x RTX Pro 6000 Blackwell GPUs, reports 190-980 tok/s, and provides a detailed cost analysis. Another finds that a 27B model matches Haiku or Sonnet on some tasks. Most, however, report much slower speeds (such as 0.7 tok/s on consumer hardware) and missing enterprise tooling. A few joke about wetware substrates.