The New York Times vs. OpenAI: a Historic Copyright Battle Begins
Cecilia Ziniti
CEO & Founder, GC AI | General Counsel and CLO | AI Investor | Board Member
Breaking: the New York Times filed suit against OpenAI and Microsoft this morning. The complaint lays out the single best copyright case I’ve seen on generative AI yet. The NYT alleges unauthorized use of its copyrighted content to train AI models like OpenAI's ChatGPT. I just finished the 69-page complaint. Here’s my analysis.
Elements of Infringement - Access & Substantial Similarity
The complaint demonstrates meticulously and clearly what’s needed to prove infringement: access & substantial similarity between the original and infringing works. The NYT's content is the biggest proprietary data source in the set used to train GPT and is weighted heavily. The complaint cites a stunning eight examples where ChatGPT output matches NYT articles verbatim. Visuals lay it bare side by side. Copied text appears in red, and the few new words GPT generated in black: a stark contrast that a jury would cling to.
I don’t see OpenAI defending this practice effectively unless they more clearly instruct GPT not to produce verbatim articles or make other big tech changes. It would be better here to settle and pay NYT than have the court get involved in the design of these restrictions.
The NYT is the Perfect Plaintiff
The legacy of the NYT's journalism, from an in-depth exposé on niche topics like taxi industry lending to comprehensive Wirecutter reviews, underscores the originality of and labor behind the works. These aren't just articles; they're the product of extensive research and creativity, challenging any future fair use claims from OpenAI.
But note a common misconception about copyright that shows up in this otherwise excellent complaint: copyright does not protect labor, only creativity. So while the complaint’s statistic of one article requiring 600 in-person interviews is amazing, I’d advise the NYT to focus on the thought and strategy behind the work rather than the labor. Also note the sharp contrast with the suit against GitHub Copilot, where the copied works cited were simple, open-source code. Plaintiffs in the GitHub case have had to amend their complaint a few times, and I don't see them succeeding. NYT likely will.
Failed negotiations paint a picture of damages to NYT
NYT reached out to OAI to negotiate in April 2023. Since then, OpenAI has reached successful licensing agreements with other media outlets like Politico to train on their content. My hypothesis? OAI is lowballing NYT or refusing an ongoing royalty.
In my view, this is a strategic error by OpenAI. The more money OpenAI makes and the more examples the NYT can cite over time, the higher the Times' damages and leverage. Also notable: the complaint explains that 50-100M people interact with NYT’s digital content each week, and NYT very carefully crafts its paid versus subscription content user journey.
Big Bad Tech & The Public Good
The complaint frames OpenAI as profit-driven, no longer open, and raking in $80M a month on its tech. It contrasts this with the public good served by journalism and NYT’s impact on the world. It's a compelling story likely to resonate in the courtroom, where the judge will, as part of the analysis, weigh the societal value of copyright against tech innovation - a juxtaposition that’s part of every historic copyright case.
Smart alternative claims of misinformation
The complaint cleverly alleges alternate claims of misinformation around Bing and ChatGPT hallucinations. My favorite line? “Had Bing Chat actually [copied verbatim], it would have committed copyright infringement. But … [instead, it] completely fabricated a paragraph, including specific quotes … that appear nowhere in The Times article [ ] or anywhere else on the internet.”
Even if these claims end up having no legal merit (I don't know the law in this area as well as I know copyright, so I can't say offhand), putting the scary world of hallucination in front of the judge (and the jury of public opinion) is a smart litigation move.
Good lawyers matter.
Susman Godfrey's record of successfully taking on tech giants from Google to AT&T to Spotify adds gravitas to the NYT's claim. This isn't opportunism like the lawsuits filed a week after ChatGPT was released.
This case is a potential watershed moment for AI technology and intellectual property rights. Stay tuned!
#AI #CopyrightLaw #TechNews #LegalAnalysis #ChatGPT
Sharing your message with my voice | Shining a light on your business with digital marketing tools | Accelerating Growth with Ads | Boosting my community with KBOO Community Radio
10 months ago: Thank you for the awesome article explaining the case. So helpful. As a daily user of AI tools and a paid customer of OpenAI, I'm interested in seeing how this plays out. I would also like to know where Google's Bard fits in. Recently, OpenAI used quotes with a link to the source, which was a new and welcome surprise. I asked about this many months back because Ubersuggest had a beta version of an AI writer that actually listed the sources so we could cite them. When they took it out of beta, they took that information away. I like the fact that the companies are being held accountable. Now, if they could implement an APA-style reference, that would be great.
First Party Property Litigation/ Business & Legal Affairs/ 40 Under 40 and Circle of Excellence recipient
10 months ago: This was such a cool summary to read on a very interesting topic. Thank you!
Head of AI Product Marketing | x-Google, Meta, Microsoft | Advisor
10 months ago: It's going to be very hard for OpenAI to wriggle out of this.