Did Claude increase bugs in rsync?

alexispurslane.github.io · On Hacker News (2026-06-06)

458 points · 461 comments on HN

Points and comments are a snapshot, not live.

A statistical analysis finds that Claude-assisted rsync releases show no unusual increase in bugs relative to the historical distribution.

An analysis of rsync bug data from v2.4.6 to v3.4.3 examines whether Claude-assisted releases introduced more bugs than usual. The study tracks 36 releases, using severity-weighted bugs per 10 commits (sev/10c) as the metric. Only two releases contain Claude commits: v3.4.2 (9 Claude commits, 0.00 sev/10c) and v3.4.3 (28 Claude commits, 3.29 sev/10c). An exact permutation test yields p = 0.46, meaning a randomly selected pair of releases scores as badly or worse 46% of the time; Fisher's exact test gives p = 0.74. The historical mean (2.95 sev/10c) is about 1.8 times the mean of the Claude releases (1.65 sev/10c), and neither Claude release qualifies as a statistical outlier. The methodology was guided by a statistician and automated to avoid hallucination, and a fully reproducible pipeline is available for verification.
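For readers unfamiliar with exact permutation tests, the idea can be sketched as follows. This is a minimal illustration, not the article's pipeline: the function name `exact_permutation_p` is hypothetical, and the first four scores in `scores` are made-up placeholders. Only the last two values (0.00 and 3.29 sev/10c) come from the summary above. The test enumerates every possible pair of releases and asks what fraction of pairs has a mean at least as high as the flagged pair.

```python
from itertools import combinations
from math import comb


def exact_permutation_p(scores, flagged):
    """One-sided exact permutation p-value.

    Returns (p, n_groups): the fraction of all equally sized groups of
    releases whose mean score is at least as high (as bad) as the mean
    of the flagged group, and the number of groups enumerated.
    """
    k = len(flagged)
    observed = sum(scores[i] for i in flagged) / k
    n_groups = 0
    as_bad_or_worse = 0
    for group in combinations(range(len(scores)), k):
        n_groups += 1
        mean = sum(scores[i] for i in group) / k
        # Small tolerance so ties with the observed mean count as "as bad".
        if mean >= observed - 1e-12:
            as_bad_or_worse += 1
    return as_bad_or_worse / n_groups, n_groups


# HYPOTHETICAL toy data: the first four values are invented placeholders.
# Only the last two (0.00 and 3.29 sev/10c) are taken from the summary.
scores = [1.0, 4.0, 2.5, 6.0, 0.00, 3.29]
flagged = (4, 5)  # indices of the two Claude-assisted releases

p_value, n_pairs = exact_permutation_p(scores, flagged)
print(f"{n_pairs} pairs enumerated, p = {p_value:.3f}")

# If pairs are drawn from all 36 releases, the enumeration covers this many:
pairs_in_full_dataset = comb(36, 2)
```

Because the toy scores are invented, the p-value printed here says nothing about rsync. The point is only the mechanics: with 36 releases there are 630 possible pairs, few enough to enumerate exhaustively rather than sample.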

What commenters are saying

The thread centers on whether severity weighting and commit-based metrics properly capture the concern. Top comments argue that the analysis omits meaningful context: several replies cite specific severe regressions in 3.4.3 (broken incremental backups, CPU spikes) and note that using commits as the denominator can obscure deterioration if commit volume increases while actual bugs per unit of work worsen. One high-ranked reply points to ChatGPT's assessment that the analysis is technically competent but draws stronger conclusions than the confounded metrics warrant. Defenders counter that the original criticism was itself blunt and that shifting to more nuanced metrics is progress. A maintainer comment indicates that v3.4.3 drew attention from fresh code review, creating visibility bias. Alternative implementations (openrsync) are mentioned as having their own trade-offs.
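The denominator objection raised by commenters can be made concrete with a small arithmetic sketch. All numbers below are hypothetical and chosen only to show the effect; the helper `sev_per_10_commits` is an illustrative name, not something from the article's pipeline.

```python
def sev_per_10_commits(severity_points, commits):
    """Severity-weighted bugs per 10 commits."""
    return 10 * severity_points / commits


# HYPOTHETICAL releases: total severity rises by 50%, but the work is split
# into three times as many commits, so the per-commit rate falls by half.
before = sev_per_10_commits(severity_points=6, commits=20)
after = sev_per_10_commits(severity_points=9, commits=60)

print(f"before: {before:.2f} sev/10c, after: {after:.2f} sev/10c")
```

In this made-up case the metric improves even though users would see more severe bugs in absolute terms, which is the scenario the critics describe.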