GPT-2: Too Dangerous To Release (2019)

naokishibuya.github.io (2022-12-30) · On Hacker News (2026-06-10)

277 points · 121 comments on HN

Points and comments are a snapshot, not live.

OpenAI withheld GPT-2's largest model in 2019 over misuse concerns, then released it nine months later after observing no strong evidence of harm.

GPT-2 was a scaled-up successor to GPT-1: its largest version had 1.5 billion parameters and was trained on 40GB of web text, achieving state-of-the-art results across multiple benchmarks. OpenAI initially refused to release the full model, citing the risk of malicious applications. Nine months later it released the model anyway and published five key findings: humans find GPT-2 outputs convincing; the model can be fine-tuned for misuse; detection is challenging (a RoBERTa-based detector reached roughly 95% detection rates); no strong evidence of misuse had emerged; and standards for studying bias were needed.

The author reflects that GPT-2 now seems less dangerous in hindsight, especially compared to ChatGPT. However, newer concerns persist: students using AI for homework, hard-to-detect cheating, and broader misuse prevention challenges that grow with model capability.

What commenters are saying

The thread splits between those who see OpenAI's "responsible disclosure" as marketing that turns safety concerns into a business advantage, and those who defend the need to take AI risks seriously. The skeptical camp argues that the company profits from regulation that constrains competitors, while its own hosted access enables misuse anyway. The other side counters that real harms have occurred: AI-generated spam has flooded the internet, students cheat, propaganda spreads, and legitimate jobs in translation and coding have been disrupted. One commenter notes that the broader enshittification of the internet predates LLMs but has accelerated dramatically since; another observes that preventing misuse at scale is nearly impossible when bad actors have access to hosted models.