AI Code Leak Exposes the Fault Lines of Copyright

By Diego Freire on April 9, 2026

One of the most valuable AI companies in the world may have just given away one of its crown jewels, and it immediately turned to copyright law to limit the damage. On March 31, 2026, Anthropic accidentally exposed the source code for Claude Code, one of its most valuable AI products. The company issued Digital Millennium Copyright Act (DMCA) takedown requests that removed over 8,000 copies of the leaked code from GitHub. However, within hours of the leak, and before those takedown requests could be processed, a developer used AI to translate the entire codebase into a new Python repository that became the fastest-growing in GitHub history. This all comes at a time when AI companies are relying heavily on fair use as a defense in mounting copyright infringement lawsuits over the use of copyrighted works to train their models, and as courts have made clear that copyright law does not recognize AI as an author. This article examines how copyright law applies to AI-generated works and what the Claude Code leak reveals about the tensions at the heart of AI and intellectual property.

The leak occurred when Anthropic inadvertently included a JavaScript source map file, intended for internal debugging, in a public software release of Claude Code.[1] Before most people were even awake, the discovery was posted on social media, and within hours the codebase had been mirrored across GitHub. The leaked code revealed the internal framework that makes Claude Code work and the proprietary architecture that gives Anthropic a competitive advantage.[2] While the mirrors spread, a developer used an AI tool to port the code to Python and pushed the resulting “claw-code” repository before sunrise; it became the fastest-growing repository in GitHub history, hitting 100,000 stars in a single day. Shortly after discovering the leak, Anthropic issued DMCA takedowns removing over 8,000 copies, but the Python rewrite remains accessible and, as of this writing, is being actively developed by the community. The episode raises multiple copyright questions, including:
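For readers unfamiliar with the mechanism: a JavaScript source map is an ordinary JSON file, and when its optional `sourcesContent` field is populated, it embeds the complete original source of every file that was bundled. The sketch below uses a hypothetical miniature map (the file names and contents are invented for illustration, not taken from the actual leak) to show how trivially original source can be recovered once such a file becomes public.

```python
import json

# Hypothetical, minimal source map; real maps are far larger.
# The optional "sourcesContent" array embeds the full original
# source of each file listed in "sources".
sample_map = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["src/agent/loop.ts", "src/memory/store.ts"],
    "sourcesContent": [
        "export function agentLoop() { /* proprietary logic */ }",
        "export class MemoryStore { /* proprietary logic */ }",
    ],
    "mappings": "AAAA",
})

def extract_sources(map_text: str) -> dict[str, str]:
    """Recover original files from a source map's sourcesContent."""
    m = json.loads(map_text)
    return dict(zip(m.get("sources", []), m.get("sourcesContent") or []))

recovered = extract_sources(sample_map)
for path, code in recovered.items():
    print(f"{path}: {len(code)} chars recovered")
```

No decompilation or reverse engineering is needed; anyone who downloads the map file can reconstruct the files byte for byte, which is why shipping a debug source map in a public release is equivalent to publishing the source itself.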

Does Claude Code Have Copyright Protection and Qualify for DMCA Takedowns?

The central question is whether Anthropic can claim copyright over code that is reportedly 90% AI-generated. The answer may be yes, but with significant caveats. Under U.S. copyright law, only works created by human authors qualify for protection. This means AI-generated works, standing alone, are not protected under copyright law. However, this does not mean that a work loses protection merely because AI was involved in its production. Copyright can still attach to human contributions in a mixed human-AI work through selection, coordination, arrangement, revision, curation, testing, and integration choices made by human engineers.

Even though the bulk of Claude Code was generated by AI, copyright can still attach to human creative choices through the selection, arrangement, coordination, and organization of the code. If Anthropic engineers made creative decisions about how to structure, organize, or curate the AI-generated output, those human contributions could be protectable. Thus, the 90% AI-generated figure does not mean 0% copyright; it means the scope of protection may be narrower, limited to the human-authored elements.

Under the DMCA, copyright holders can demand that platforms remove infringing content by submitting a takedown notice. To issue a valid DMCA takedown, the copyright holder or its authorized agent must identify the copyrighted work claimed to be infringed, identify the infringing material and its location, provide a statement of good faith belief that the use is not authorized, and certify under penalty of perjury that the information is accurate and that the complainant is authorized to act on behalf of the copyright owner. Platforms like GitHub must expeditiously remove or disable access to the material to preserve their DMCA safe-harbor protection.[3]

With that in mind, Anthropic’s DMCA claims likely rest on these human contributions, not the AI-generated portions. Because the leaked code was unpublished proprietary source code that was never meant to be public, and because wholesale copying of the entire codebase occurred, Anthropic has a strong basis for asserting that the mirrors and copies infringe its copyright in the protectable human-authored elements. Accordingly, the DMCA provides a mechanism to remove such infringing copies from centralized platforms like GitHub, while cease-and-desist letters and, where available, copyright infringement suits may be used against identifiable operators, hosts, or developers behind mirrored copies on platforms that lack a comparable takedown process.

Does the Python Code Have Copyright Protection, Who Owns It, and Does It Qualify for DMCA Takedown by Anthropic?

Under copyright law, a derivative work is a new work based on a preexisting work, such as a translation, adaptation, or transformation. The copyright owner has the exclusive right to create or authorize derivative works. This means that translations of copyrighted code into another programming language may constitute derivative works if they incorporate protectable expression from the original.

The Python code may itself be protected by copyright if it includes creative human choices in its implementation, arrangement, or structure. However, whether the Python code has independent copyright protection does not determine whether Anthropic can issue a DMCA takedown. The relevant question is whether the Python rewrite copies protectable expression from the leaked Claude Code. If it copies protectable expression, whether literal source code or, in limited circumstances, nonliteral elements such as structure and organization, it could be considered an unauthorized derivative work, regardless of whether the Python code includes its own creative contributions.

If the Python code is an unauthorized derivative work, its author would own copyright only in their original contributions, not the underlying Anthropic expression. The derivative work would incorporate Anthropic’s protectable expression without authorization. Anthropic, as the owner of the underlying copyright in the leaked Claude Code’s protectable elements, would retain rights in those elements even as they appear in the derivative work. The question becomes whether the Python code’s author copied protectable human expression or only unprotectable ideas and functional elements. If the Python rewrite captures the same architecture and structure, Anthropic may argue it is an infringing derivative, even if no literal code was copied.

If the Python rewrite is an unauthorized derivative work that copies protectable expression from leaked Claude Code, Anthropic could issue a DMCA takedown request against it. However, this is more legally complex than taking down direct mirrors of the leaked source code. A clean-room rewrite that only replicates unprotectable ideas, systems, or functional concepts, without copying protectable expression, may fall outside copyright infringement altogether. That is because copyright does not protect ideas, only expression. The success of a DMCA takedown against the Python rewrite would depend on whether Anthropic can demonstrate that protectable expression was copied, not merely functional concepts or unprotectable ideas.

Does the Python Code Qualify Under Fair Use?

Fair use permits limited use of copyrighted material without permission under certain circumstances, based on four factors: (1) the purpose and character of the use, including whether it is commercial or transformative; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used; and (4) the effect of the use on the potential market for the copyrighted work.

From Anthropic’s perspective, there is a strong argument that fair use should not protect wholesale mirroring or a close Python reimplementation of the leaked code. Anthropic can argue that this use is commercial and market-substituting rather than transformative since the point is not criticism, commentary, scholarship, or internal analysis, but reproducing or closely reimplementing a competing product. The remaining factors also tend to favor Anthropic, as the leaked material is unpublished proprietary source code; what was copied was effectively the entire codebase; and widespread mirroring or close reimplementation threatens direct substitution by allowing rivals to bypass the time and expense of developing similar systems themselves.

The unauthorized nature of the leak further weakens any fair use defense. Even if courts ultimately conclude that some training on lawfully acquired copyrighted works may qualify as fair use, the Python rewrite arises from code that was never lawfully released to the public. That does not create a categorical rule that every downstream use automatically fails fair use. But it does strengthen Anthropic’s argument that this case is fundamentally different from the broader AI training debate. The question here is not whether copyrighted works can be copied for a distinct model-training purpose, but whether leaked, unpublished proprietary code can be copied and closely reimplemented in a way that substitutes for the original. On those facts, a fair use defense is much weaker and may well fail.

The collision of copyright and AI is straining a legal framework built for human creators, not systems that can generate millions of lines of code or absorb billions of copyrighted works. Courts are beginning to grapple with these issues, but the questions are multiplying faster than the answers. The Claude Code leak captures the central tension. AI companies invoke fair use when they ingest copyrighted works to train models, yet turn to copyright when their own AI-assisted outputs are copied. Those positions are not formally inconsistent, since fair use and copyrightability are distinct questions, and training a model is not the same as copying leaked source code to recreate a competing product. Still, as AI becomes more deeply embedded in creative and technical work, copyright law will have to adapt. The basic principle remains that copyright protects human expression. Where human judgment, selection, arrangement, and creative direction shape the work, copyright still has a claim.


[1] This has been reported as a human error, not a security breach.

[2] This “agentic harness” is the system that connects the AI model to real-world tasks, and the leak exposed several key features: a sophisticated memory system that allows the AI to retain and organize information across sessions, an autonomous background mode called KAIROS that enables the AI to work continuously without user input, internal codenames for unreleased AI models, and a stealth mode designed to allow the AI to contribute to public projects without revealing its involvement. For competitors, this is a blueprint showing how Anthropic built one of its most valuable products.

[3] The DMCA only works against centralized platforms; decentralized mirrors and code-sharing sites are harder to reach and are still hosting the leaked code.


Diego F. Freire is an associate in the firm’s Intellectual Property Group. He concentrates his practice on intellectual property law matters, including patent and trademark prosecution, due diligence, and clearance/opinion matters.

  • Posted in:
    Privacy & Data Security
  • Blog:
    The Firewall
  • Organization:
    Dykema

Copyright © 2026, LexBlog. All Rights Reserved.