AI outperforms law professors in Stanford Law study
Stanford study finds law professors rate AI responses higher than peer-written answers in 75% of contract law comparisons.
A Stanford Law School study led by Professor Julian Nyarko collected nearly 3,000 blind comparisons from 16 law professors across U.S. law schools, each pitting an AI-generated answer to a contract law question against a professor-written one. AI responses won 75% of matchups and were flagged as pedagogically harmful only 3.5% of the time, versus 12% for peer answers. The study tested whether large language models could serve as effective tutors in a field that requires judgment and nuanced reasoning rather than factual recall. Across multiple evaluation methods, AI systems performed comparably to the best human instructor. Nyarko cautioned against wholesale adoption, noting that the question of responsible implementation remains open, but suggested that blanket skepticism may be unwarranted.
What the HN community is saying
Commenters raised concerns about hallucinated case law and whether LLMs can reliably handle real-world legal work, where mistakes have serious consequences. Several noted that AI's superior performance in the study may reflect writing quality and communication skill rather than legal reasoning, and questioned whether metrics like professor preference adequately measure legal competence. Some argued the study shows promise for AI as a tool for expert lawyers who can verify its output, but warned against unsupervised use by non-experts. Discussion also touched on the broader risk of skill atrophy when humans rely on AI that is right 80-90% of the time, in contrast to software development, where testing and static analysis provide structural safeguards.