Adam Unikowsky has a fascinating Substack article about running SCOTUS briefs through Claude 3 (h/t Tyler Cowen at Marginal Revolution). Here’s a taste:
. . . I decided to do a little more empirical testing of AI’s legal ability. Specifically, I downloaded the briefs in every Supreme Court merits case that has been decided so far this Term, inputted them into Claude 3 Opus (the best version of Claude), and then asked a few follow-up questions. (Although I used Claude for this exercise, one would likely get similar results with GPT-4.)
The results were otherworldly. Claude is fully capable of acting as a Supreme Court Justice right now. When used as a law clerk, Claude is easily as insightful and accurate as human clerks, while towering over humans in efficiency.
Let’s start with the easiest thing I asked Claude to do: adjudicate Supreme Court cases. Claude consistently decides cases correctly. When it gets the case “wrong”—meaning, decides it differently from how the Supreme Court decided it—its disposition is invariably reasonable…
Of the 37 merits cases decided so far this Term, Claude decided 27 in the same way the Supreme Court did. In the other 10 (such as Campos-Chaves), I frequently was more persuaded by Claude’s analysis than the Supreme Court’s. A few of the cases Claude got “wrong” were not Claude’s fault, such as DeVillier v. Texas, in which the Court issued a narrow remand without deciding the question presented.
Way more at the link.
This is mind-blowing. A few thoughts.
First, Adam Unikowsky is smarter than I am.
Second, I’ve run similar conversations with ChatGPT about live cases in my state-court appellate practice (most recently using 4o). I’m not going to dump the results here for reasons that should be obvious. But generally speaking, I was less ready to swap out an associate or clerk for the AI. Now, the cases in Adam’s SCOTUS dataset are going to be more thoroughly briefed than your run-of-the-mill state court appeal. They are also going to deal with legal questions at a greater level of abstraction than an appeal to an intermediate appellate court.

All that said, with my questions, the AI would sometimes deliver results that could look facially plausible to someone who was unfamiliar with the case, but might not persuade someone who’d been working in the weeds. If you’ve played with Lexis AI, it’s a similar feeling: The output is legitimately amazing, but a real live associate would deliver better work product today. I have zero confidence that this will still be true in six months, at least with the ChatGPT opinions (who knows how long it takes Lexis to evolve). And, of course, the associate would take orders of magnitude longer and cost the client vastly more money.
Third, is it possible that Claude is a better lawyer than ChatGPT?
Fourth, I have not unleashed ChatGPT on an expert opinion, but that sounds legit fascinating. As would using the AI to outline a cross-examination based on the opinion and a bunch of old transcripts.