Kimi K2.7-Code: open-source coding model with better token efficiency

438 points · 230 comments on HN

Points and comments are a snapshot, not live.

Moonshot AI releases Kimi K2.7-Code, a coding model with 32B activated parameters that uses about 30% fewer thinking tokens than K2.6.

Kimi K2.7-Code is a Mixture-of-Experts model with 1 trillion total parameters, 32 billion activated parameters, and a 256K-token context length. It uses approximately 30% fewer thinking tokens than K2.6 while improving performance on coding tasks. On Kimi Code Bench v2, it scores 62.0 versus K2.6's 50.9 but trails GPT-4.5 (69.0) and Claude Opus 4.8 (67.4). On the agentic benchmarks MCP-Atlas and MCPMark-Verified, it scores 76.0 and 81.1 respectively. The model supports image and video input, includes native INT4 quantization, and is available via Moonshot's API or through the vLLM and SGLang inference engines. It is released under a Modified MIT license.

What commenters are saying

Commenters question the cost advantage. While Kimi costs roughly one-fifth the price of Anthropic's Opus, users report that Claude performs noticeably better on real-world coding tasks despite benchmark claims of near-parity. One commenter notes that on DeepSWE (which the commenter describes as an ungamed benchmark), Opus 4.6 and even GPT-4-mini beat Kimi K2.6 soundly, and argues that the total token cost of a task, not the per-token price, matters more than advertised rates. Data residency concerns may favor US providers. Some users successfully run open-weight models locally or via cheaper providers, though many report reverting to Claude for consistency and reduced mental overhead.
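The total-cost argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the dollar prices and token counts are hypothetical, and only the roughly 1:5 price ratio comes from the discussion. It shows how a per-token discount shrinks once a model needs more tokens (longer reasoning, retries) to finish the same task.

```python
def total_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one task: tokens consumed times the per-token price."""
    return tokens / 1_000_000 * price_per_million


def break_even_token_ratio(cheap_price: float, expensive_price: float) -> float:
    """How many times more tokens the cheaper model can use before it costs the same."""
    return expensive_price / cheap_price


# Hypothetical prices in dollars per million tokens; only the 1:5 ratio is from the discussion.
CHEAP_PRICE = 5.0
EXPENSIVE_PRICE = 25.0

# Hypothetical token counts for the same task.
expensive_tokens = 200_000
cheap_tokens = 600_000  # assumption: the cheaper model needs 3x the tokens

expensive_cost = total_cost(expensive_tokens, EXPENSIVE_PRICE)
cheap_cost = total_cost(cheap_tokens, CHEAP_PRICE)

# Effective price ratio per task, as opposed to the advertised per-token ratio.
effective_ratio = cheap_cost / expensive_cost
break_even = break_even_token_ratio(CHEAP_PRICE, EXPENSIVE_PRICE)

print(f"cheap model: ${cheap_cost:.2f}, expensive model: ${expensive_cost:.2f}")
print(f"effective cost ratio per task: {effective_ratio:.2f}")
print(f"break-even token multiple: {break_even:.1f}x")
```

Under these made-up numbers the cheaper model ends up at about three-fifths of the per-task cost rather than one-fifth, and the advantage disappears entirely if it needs five times the tokens. Real figures depend on the actual prices and on measured token usage per task.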