New York Times Hits Back at OpenAI’s Hacking Claims

“[I]n OpenAI’s telling, The Times engaged in wrongdoing by detecting OpenAI’s theft of The Times’s own copyrighted content.” – The Times’ opposition brief

In an opposition brief filed Monday, The New York Times Company (The Times) told a New York district court that OpenAI’s late February claim that The Times “paid someone to hack OpenAI’s products” in order to prove OpenAI infringed its copyrights amounts to little more than “grandstanding.”

In late December 2023, The Times became the latest of many complainants to accuse OpenAI’s large language model, ChatGPT, as well as Microsoft’s GPT-4-powered Bing Chat, of widespread copyright infringement. The Times alleged that Microsoft and OpenAI reproduce Times content verbatim and also often attribute false information to The Times.

The Times’ opposition brief filed yesterday responds to OpenAI’s recent motion to dismiss, which alleged that The Times paid someone to target and exploit “a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use.” The Times called this accusation “as irrelevant as it is false,” pointing the court to its Exhibit J to the complaint, which explains that The Times elicited the infringing content from OpenAI’s chatbot, ChatGPT, by prompting it with the first few words or sentences of Times articles. “That work was only necessary because OpenAI does not disclose the content it uses to train its models and power its user-facing products,” wrote The Times, adding: “Yet in OpenAI’s telling, The Times engaged in wrongdoing by detecting OpenAI’s theft of The Times’s own copyrighted content.”

As for the rest of OpenAI’s arguments to dismiss, The Times told the court they are chiefly factual arguments that cannot be decided at the motion to dismiss stage. For instance, OpenAI’s claim that users don’t generally use OpenAI to bypass paywalls would require the court to accept its statements at face value with no analysis of user behavior. And OpenAI’s bid to dismiss The Times’ Digital Millennium Copyright Act (DMCA) claim turns on specifics about the design of OpenAI’s model-training process that must be uncovered via discovery.

The Times brief also contrasts the two companies by labeling itself and its business model as being “built on world-class journalism” while OpenAI and its business model are “built on mass copyright infringement.” The Times is alleging that not only the training data but the ChatGPT and “Browse with Bing” products and the outputs they produce in response to queries infringe The Times’s copyrights.

The brief also dismisses OpenAI’s apparent theory that The Times must identify every third party that has infringed Times articles as a result of using ChatGPT and Browse with Bing in order to argue contributory infringement beyond the instances identified in the complaint. Under Arista Recs v. Usenet.com, “knowledge of specific infringements is not required to support a finding of contributory infringement” and “The Times need only allege that OpenAI ‘knew or should have known that its service would encourage infringement,’” said the brief. The Times also alleges that OpenAI was aware of the infringement because The Times informed it of the infringement in April 2023, and called OpenAI’s “failure to acknowledge The Times’s outreach…particularly striking” since OpenAI relied in its motion to dismiss on a case, Hartmann v. Popcornflix.com LLC, that says “‘cease-and-desist letters’ are ‘traditional indicia of actual or constructive knowledge’ of contributory infringement.”

OpenAI has been sued by numerous creators and authors over the last year for training its chatbots on content found online, including non-public or copyright-protected content. At IPWatchdog’s recent AI Masters program, panelists pointed to numerous problems with existing generative AI products, from chatbots that have encouraged suicide to others that have spit out confidential trade secrets when pressed. “We’re witnessing a big gold rush with these companies wanting to release these systems before they’re ready for prime time,” said one panelist, Martijn Rasser, CRO and Managing Director at Datenna. “Companies need to hit the brakes because once it’s out in the open, you can’t un-invent these models.”


Eileen McDermott is the Editor-in-Chief of IPWatchdog.com. Eileen is a veteran IP and legal journalist, and no stranger to the intellectual property world, having held editorial and managerial positions at [...]


Join the Discussion

7 comments so far.

Anon
March 18, 2024 10:20 am
No resurrection, so let me see if I can recall my post.
Pro Say, you began by exclaiming a long history of understanding of Fair Use, yet only want to pin your understanding to the NYT brief.

That brief is in legal error precisely because it does not show an understanding of Fair Use – under controlling precedents, with errors in both commission and omission.

I have explained this to you, and you have not responded to the substance of what I have provided.

As to:

“of your many years of hard work”

Immaterial – there is no ‘sweat of the brow’ doctrine in US law.

“All the briefs . . . all the applications . . .all the responses to all the office actions . . . all the everything you’ve ever written and produced was free for the taking; to be used, modified, (mis) quoted, etc. however the takers wanted . . . without payment or even attribution.”

Absolutely – given the nature of the AI engines at point here, as Fair Use is made of all the ingested matter – due to the technical transformations that are directly on point.

Again – this is entirely within the factual and legal points that I have presented.

Points that you have not addressed.
Anon
March 15, 2024 08:58 am
It appears that my response from over the weekend has been lost to the aether.
Pro Say
March 13, 2024 09:31 pm
Ugh. O.K. my friend.

Like it or not, I’ve pointed to what I believe to be fair use (and what is not) — I’m with the Times. Their understandings be my understandings. Their positions be my positions.

Now it’s your turn:

Which of the following do you consider to be the fair use of your many years of hard work, and which are not?

All the briefs . . . all the applications . . .all the responses to all the office actions . . . all the everything you’ve ever written and produced was free for the taking; to be used, modified, (mis) quoted, etc. however the takers wanted . . . without payment or even attribution.

Please be specific and explain your position(s).
Anon
March 13, 2024 12:25 pm
As I mentioned the first time around, your “banking” on the NYT position is simply poor.

This does not show ANY understanding of your own personal aspects, as your prior comment indicated a long and thorough study (which would have preceded the poor NYT briefs).

Do you even understand my position as to limits of rights under US copyright law? As far as I can tell, you have no understanding of the law, and merely want a certain desired ends.
Pro Say
March 13, 2024 11:26 am
Anon — “your personal understanding of that concept”

Since my understanding is in accord with that of the N.Y. Times, there be no need to repeat their briefs here in order to know my understanding.

“if in accord with limits of rights under US Copyright law”

A bit of a qualifier to my inquiry here.

Do you or do you not believe that what I say in my “Note to Anon” constitutes fair use . . . that all such use is (would be) in accord with limits of rights under US Copyright law?

Because — according to the Times — that’s exactly what OpenAI is doing.
Anon
March 13, 2024 08:57 am
Pro Say – if in accord with limits of rights under US Copyright law, who would I be to say, “boo?”

Fair Use – you never came back around to show your personal understanding of that concept. Shall we pick up the discussion there?
Pro Say
March 12, 2024 07:45 pm
This type of “everything’s up for grabs / information just wants to be free” AI is a malignant cancer; one whose copyright infringements spread out in every direction like the Big Bang through everyone’s hard-earned, original content.

Either pay for it, or stop doing it.

Pick one.

(Note to Anon: How would you like it if all the briefs . . . all the applications . . .all the responses to all the office actions . . . all the everything you’ve ever written and produced was free for the taking; to be used, modified, (mis) quoted, etc. however the takers wanted . . . without payment or even attribution?)