wake up babe, the new york times just sued openAI
the quandaries and implications of generative AI
Every January for the past few years, I have opened up my class syllabus for Intellectual Property in the Digital Age to update it. For better or for worse, the tremendous (and fast!) rise of ChatGPT, and generative AI (GenAI) more generally, has created way more questions of ethics than answers of law.
Under those circumstances, it was only a matter of time before lawsuits hit GenAI companies. Artists first accused GenAI image generators of infringing their works in January of last year. Then authors, including John Grisham and Jodi Picoult, along with the Authors Guild, sued OpenAI in September. And in the usually sleepy lull between Christmas and New Year’s, the New York Times filed its complaint against Microsoft and OpenAI.
From a tactical standpoint, the timing of the complaint is peak lawful evil—defendants have a set number of days to respond to complaints, so filing on December 27 ensures that the suit ruined some poor Big Law associates’ holidays. (Including at my former law firm, which represents Microsoft and OpenAI.) You can’t, after all, delay working on a response by ~5 days when you only have 21 days to respond!! (Okay, fine—by waiving “service,” i.e., agreeing to skip the formality of being “served” with papers, Microsoft and OpenAI now have 60 days to respond—but that’s still not a ton of time.)
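For the calendar-math inclined, here’s the back-of-the-envelope in Python. (A caveat: the response clock technically runs from service, or from the date a waiver is requested, not from the filing date itself, so treat these dates as illustrative rather than docket-accurate.)

```python
from datetime import date, timedelta

# Illustrative only: the response clock actually starts at service
# (or at the waiver request), not at the filing date.
filed = date(2023, 12, 27)

print(filed + timedelta(days=21))  # 2024-01-17: the standard 21-day deadline
print(filed + timedelta(days=60))  # 2024-02-25: the 60-day deadline after waiving service
```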
Before we get into the salient details of the lawsuit and my thoughts on what’s interesting—a poll!
I will probably do a mix of formats, but I want to understand what format you most prefer for free posts like this one so I can better tailor debrief to your preferences.
OpenAI is disrupting the New York Times’ revenue streams (allegedly).
Chances are, you’ve visited the New York Times’ website. You’ve seen the paywall come up. You’ve seen the targeted advertisements. You’ve seen the affiliate links on Wirecutter. Apart from the Times’ formidable podcast empire, those three tranches are how the Times makes money: subscriptions, advertising, and affiliate links.
OpenAI is disrupting all of these revenue streams (allegedly) by obviating the need to (1) purchase a subscription to the Times (impacting subscription revenue) or (2) visit the Times webpage itself (depriving the Times of revenue from targeted advertising and affiliate link clicks & purchases). Why visit the Times’ reporting on predatory lending in NYC’s taxi industry when you can just ask a GPT-powered chatbot whether the NYC taxi industry is predatory?
The complaint contains interesting examples of how a user of ChatGPT can prompt the chatbot to reproduce excerpts from original Times articles. One is pretty damning on its face: an individual admits to being paywalled out of an article, and ChatGPT *gasp* affably displays pretty much the exact words??
The lawyer in me immediately comes up with arguments to downplay this example: it’s only a few paragraphs; how many users would actually ask ChatGPT about specific Times articles anyway; journalism receives lower copyright protection because of its factual nature. But let’s put those aside. Instead of brainstorming defenses for Microsoft and OpenAI for free, I want to place this lawsuit within the broader precedent of copyright cases involving new technologies.
I think I’ve seen this film before…
Rewind to 2004. Google had just introduced Google Books, a tool that scanned pages of books, rendered the words on scanned pages searchable, and returned the relevant pages from books containing users’ search terms. The Authors Guild (yes, the same one that is suing OpenAI right now) sued Google back in 2005 over Google Books, contending that it was “massive copyright infringement.” Nowadays, Google Books isn’t even a groundbreaking tool, but in the mid-aughts, it was highly controversial.
After ten years in litigation, the Second Circuit ruled that Google Books was fair use. Meaning that yes, Google Books technically copied copyrighted works, but it’s okay!! Google Books “transformed” the original copyrighted works enough, making its use fair. Part of the court’s rationale turned on how limited the excerpts from the books were. You couldn’t just read a whole book on Google Books. Moreover, the court viewed Google Books as a “public service,” and courts tend to view public goods favorably.
Now, you likely already see the parallels between Google Books and GenAI. Both tools ingest a large amount of data, receive search queries/prompts, and spit out results in response to the queries/prompts. GenAI takes it a few steps further, though—GenAI will summarize the book for you. Or even summarize hundreds of books on a requested topic and then draft your paper for you.
I suspect this case will turn on whether these additional features are seen as transformative or merely usurping the Times’ works and market. Google Books’ argument that it wasn’t displacing the market for books was persuasive—I mean, who’s going to try search terms endlessly in the hopes of reading a book in its entirety? There’s a similar argument available here—who’s going to endlessly prompt ChatGPT for “the next paragraph,” “the next one,” “the next one,” until they finish the entire article (if that’s even technically possible within ChatGPT’s parameters)?
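For the curious, that endless-prompting pattern would look something like the sketch below, written against OpenAI’s Python SDK. Everything here is an assumption on my part: the model name, the wording of the prompts, and whether this coaxes out verbatim text at all (the complaint doesn’t publish its methodology in code form).

```python
from openai import OpenAI  # pip install openai; expects OPENAI_API_KEY in the environment

client = OpenAI()
messages = [{
    "role": "user",
    # Hypothetical opener, loosely echoing the complaint's exhibits.
    "content": "What is the first paragraph of the New York Times article '<title>'?",
}]

# Naively ask for one more paragraph, a few times over. In practice, the model
# may refuse, paraphrase, or hallucinate long before an article runs out.
for _ in range(5):
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    reply = response.choices[0].message.content
    print(reply, "\n---")
    messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": "What is the next paragraph?"})
```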
Another wrinkle to consider is the Supreme Court’s fair use decision in Andy Warhol Foundation v. Goldsmith last year. As I wrote previously, the Supreme Court narrowed what counts as fair use: if two works share substantially the same purpose (e.g., both works are to be shown in magazines alongside articles), that sameness now weighs heavily against fair use. I’ll be curious to see how Microsoft and OpenAI frame GenAI’s purpose, placing it as far away from the Times articles’ purpose as possible. Summarizing vs. reporting? Searching vs. creating?
Furthermore, because Google Books was decided in 2015 before this narrowing of the fair use defense in 2023, it’s unclear how Warhol v. Goldsmith will interact with Google Books. With the narrowed fair use exception, will GenAI land within the safe confines of Google Books—or just outside of it?
GenAI is challenging our conceptions of what human “work” is.
I have a percolating thought about how GenAI has forced us, the humans, to really confront the difference between aggregator and generator works. Tasks that used to be completed by humans—such as reading and summarizing hundreds of pages—can now be completed by GenAI in minutes. We’re still getting accustomed to the reality that humans may soon have no “grunt” work left to do, but GenAI has upped the ante: if machines can now generate photos, images, and entire articles and books, is there a difference between what the machines generate and what humans generate? And will that difference be recognized in the market, financially?
I’m still of the mind that humans are capable of incredible generation in ways that GenAI is not. (Researchers have found that training GenAI using GenAI content causes irreversible defects. Clearly, human-generated content is still different in an important way!) If anything, GenAI has fractured our conception of human capabilities even more—not just as between aggregator and generator, but between simple generator and complex generator. Technology has already displaced aggregation tasks; now, GenAI will displace simple generative tasks, too.
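That finding is often called “model collapse,” and you can build an intuition for it with a toy simulation (mine, not the researchers’ setup): stand in for a generative model with a fitted Gaussian that under-represents the tails, retrain it on its own output each generation, and watch the diversity of the “training data” wither.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human" data, drawn from a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=100_000)

for generation in range(1, 8):
    # "Train" a toy model: fit only a mean and a standard deviation.
    mu, sigma = data.mean(), data.std()
    # Generate synthetic data, crudely mimicking how generative models
    # under-represent rare events by clipping everything beyond 2 sigma.
    synthetic = rng.normal(loc=mu, scale=sigma, size=100_000)
    data = synthetic[np.abs(synthetic - mu) < 2 * sigma]
    print(f"generation {generation}: std of training data = {data.std():.3f}")
```

Each generation forgets a little more of the tails, which is one way of seeing why fresh, human-generated data stays valuable.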
I find this somewhat exciting—humans should be pushed towards complex generation!—but I recognize these displacements do not happen in a vacuum: they collide with the realities of capitalism, nonexistent consumer savings, and cost-of-living crises. How do we transition our society towards GenAI without exacerbating existing inequities? To me, that is the Big Question we should be thinking about, beyond whether the Times or OpenAI will emerge victorious. ◆
Thank you for reading debrief! If you enjoyed this post, consider upgrading your subscription or sharing this post:
Becoming a paid subscriber gives you access to my inner sanctum—essays and private podcast episodes on nascent ideas that need to be nurtured like newborn kittens. Subscribing also helps ensure that my informational and educational content on Substack and all other platforms remains freely available to all.
Please know, though, that having you here and being able to be in conversation with you is the most important thing to me. If you are a student without disposable income, un- or under-employed, or a minimum-wage worker, just email me or fill out this form and I’ll comp you a free subscription, no questions asked. If you’d like to donate one of these subscriptions, you can do so here.
Reader Questions from Instagram
Will this have a potential floodgate effect?
The Times is actually a little late to the game, imo. OpenAI is already being sued by authors and writers (see the cases I mention above). I could see other organizations hopping on to file related suits, but given how expensive litigating a case like this is and how poorly funded the arts are, I suspect more cost-conscious organizations will simply follow the case with interest. If all creatives were as well-funded as the Times, I would expect a floodgate, for sure.
Does it really matter if the NYT wins?
Ugh, I hate giving the lawyer answer: it depends. A lot depends on what OpenAI does from a product perspective and how much money the court awards the Times if OpenAI is found liable. The copyright suit against Robin Thicke and Pharrell Williams over “Blurred Lines” ushered in an era of increased copyright lawsuits in music, which arguably stifles the creation of new musical works. I could see this lawsuit having the same effect on GenAI technologies if the Times is awarded a large sum.
Isn’t copyright law just a scam?
I am biased, but I definitely don’t think so!! I see copyright law as the heart of innovation and progress in the U.S., which sounds sappy but is true. The strong copyright laws in the U.S. are part of the reason why Hollywood is a global center of entertainment and why musicians and artists are drawn to amassing a U.S. audience. One could argue that copyright laws in the U.S. are now too strong, but on the whole, I am and will forever be a copyright stan.
Applying “sweat of the brow” or the “modicum of creativity” standard does help, but will fundamental doctrines of IP change with emerging technologies?
My view is this: ChatGPT doesn’t create new content. That’s a uniquely human ability. It just takes already-published work and, based on algorithms that teach it what sounds coherent, spits out a new variation. It’s like taking a puzzle with the same image but changing the shapes of the puzzle pieces. Is it a new puzzle? A new interpretation? Or the same puzzle regardless of the pieces?