We may now be living in a golden age of software copyright jurisprudence. Perhaps not in eventual outcomes – the Supreme Court’s decision in Google v. Oracle left many observers underwhelmed and many questions frustratingly unanswered – but in the sheer number of pending disputes that could help elucidate some of the trickier unanswered questions in the law.
To this end, the Doe v. GitHub litigation (aka the “Copilot case”) might be on the leading edge of teasing out some really interesting issues that either have long lingered under, or are brand new to, the application of copyright to the “art” of software programming. On 4 May 2023 the court held the first hearing in the case, which may show where the case is heading and where newer or unexplored issues might develop.
The hearing itself was a classic civil procedure dispute, in which the defendants (GitHub, Microsoft, and various OpenAI entities) asked the court to toss out on procedural grounds most, if not all, of the claims made by the Plaintiffs – four programmers who have posted their code on GitHub under various open source licenses.
Requesting that a court toss out claims – or an entire case – early in the proceedings is a fairly common tactic in federal litigation as the result of two late-2000s U.S. Supreme Court decisions (Bell Atlantic v. Twombly and Ashcroft v. Iqbal), now sometimes collectively referred to as “Twiqbal.” What has evolved in response to those cases – and to the prevalence of defendants requesting dismissal under them – is a bit of a do-si-do in which the judge finds some of the grounds for dismissal meritorious, but gives the plaintiffs some latitude to clean up or better outline some or all of their claims so that they may survive. It very much appears that this is going to happen in the Copilot case – although the plaintiffs may also be dropping some of the original claims, which cast about as wide a net as possible under federal and state law, with one notable exception: no claim of direct or indirect copyright infringement has been lodged.
Other issues the hearing touched on, which could be of substantial interest and which may get teased out in more detail if and when this litigation progresses, are the following:
· The Sony/Betamax case – the 1980s case that found fair use for “time shifting” using VCRs – might yet have some continued vitality. It appears that there may be a debate over whether, and to what extent, Generative AI tools like Copilot engage in substantial non-infringing use of the content they are trained upon, even if certain uses nevertheless infringe. During the hearing, certain numbers were thrown around (for example, that Copilot may have reproduced copyrightable content in 1% of the prompts made to it); the extent to which these numbers are borne out, and are sufficient to invoke Sony/Betamax fair use, will be interesting to watch.
· Section 1202(b) of the U.S. Copyright Act (17 USC § 1202(b)), part of the controversial Digital Millennium Copyright Act (the DMCA), prohibits the removal, without the copyright holder’s permission, of so-called “CMI” – copyright management information. Is it possible to violate this part of the statute even if you are not using any copyrightable material to which that CMI is attached, or are using it in a way that doesn’t require the copyright holder’s permission? Given that there is, as yet, no claim of copyright infringement, this may be a crucial question for sustaining the federal claims – and the federal remedies, which in the case of CMI can be more substantial than those for copyright infringement, because statutory damages are assessed “for each violation” of Section 1202 (17 USC § 1203(c)(3)(B)) rather than “with respect to any one work”.
· Is there a separate state law claim for breach of contract available even if a claim for copyright infringement (or, for that matter, a claim for removal of CMI) may also be available? The U.S. Supreme Court is currently being asked to review this issue under the doctrine of “copyright pre-emption” – which holds that you may not pursue via state law claims an action that is really no more than a federal copyright infringement action. The extent to which this issue applies in open source licensing is a long-standing debate, which I addressed in my recent book chapter on open source licensing and copyrights.
· Which of the activities that Copilot (or other Generative AI tools) performs can establish a license violation? Most of the press on the matter has focused on the output of the Copilot tool – the code suggestions that Copilot provides in response to a user prompt – but can a claim also be made that the mere act of training the tool is itself a license violation? That claim would seem to be inconsistent with Software Freedom One: “The freedom to study how the program works” – does that freedom come with conditions? At one point, a lawyer for the Plaintiffs – said to be an active member of the open source community – said that at least the Affero GPL might impose license compliance obligations merely by the act of training. If validated, this theory might substantially affect how companies deal with code under this license in the future.
And finally, although not argued at this stage of the lawsuit, there remains the underlying – and frustratingly unresolved – issue of how one separates copyrightable expression from uncopyrightable ideas or functionality in the area of software. This was the subject of an extensive and thoughtful analysis by the trial court in the Google v. Oracle dispute, but that decision was eventually overturned, and the litigation from that point forward revolved around the thornier (and substantially fact-dependent) question of fair use.
Will this case rise, and converge, to address some of the more interesting and thorny issues in software, open source and the law? That remains to be seen – but the case is one worth following.