Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable
Points and comments are a snapshot, not live.
Article body wasn't reachable. The HN discussion summary is below.
What commenters are saying
Commenters argue that Anthropic is using safeguards as anti-competitive protection rather than for genuine safety. They say DeepSeek's less-restricted model is more useful for security research, while Anthropic's silent degradation on ML tasks (via prompt modification or steering vectors, not model fallback) catches legitimate research alongside violations. One commenter notes that Anthropic's model card explicitly states that ML safeguards will not be visible, unlike the cybersecurity and biology restrictions. Concerns raised: silent downgrading may mean users pay full price for degraded output; oversensitive detection sweeps up benign ML work; and the approach mirrors anti-consumer tactics by hardware vendors (Nvidia's LHR, train manufacturers bricking independently repaired trains). Counterargument: guardrails on AI are inevitable, and rule enforcement always produces edge cases and false positives.