Reddit Dubs Perplexity AI and Data Scraping Companies ‘Would-Be Bank Robbers’

Reddit filed a lawsuit yesterday against artificial intelligence (AI) company Perplexity AI and three other defendants for their alleged illegal circumvention of Reddit security measures meant to prevent misuse of its content and data.

Reddit, which describes itself in the complaint as “one of the largest repositories of human conversation in existence,” likened the actions of Oxylabs UAB, AWMProxy, and SerpApi to those of “would-be bank robbers.” Through their development of tools that bypass both Google’s and Reddit’s anti-scraping measures, and their scraping of Reddit content from Google search results, these defendants, “knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead,” said the complaint.

Perplexity AI, meanwhile, refused to enter into an agreement with Reddit and is a customer of SerpApi. According to the complaint, Reddit caught Perplexity “red-handed by using the digital equivalent of marked bills…to track Reddit data and confirm that Perplexity was using Reddit data acquired through the scraping of Google SERPs.” Reddit said it sent Perplexity a cease-and-desist letter but that Perplexity subsequently increased its use of Reddit data “forty-fold,” the lawsuit added.

Reddit charges that each of the defendants is profiting by “evading technological control measures to access Reddit data it knows it does not have permission to access or use.”

Because Reddit has over 100 million active users per day, its data is “widely seen as invaluable to AI companies” and “is particularly well-suited to training” large language models (LLMs) because it is constantly growing. Reddit has entered into partnership agreements regarding use of its data with companies such as Google. According to the complaint, Perplexity has admitted that Reddit is one of its “top tier sources” for data, citing an August 2025 Perplexity blog post that said “Reddit has emerged as the most cited domain across AI models globally.”

Reddit is seeking injunctive relief enjoining the defendants from accessing its or Google’s “website, servers, systems, and any data contained therein” for the purpose of unlawfully scraping Reddit’s data, or from developing tools used to illegally scrape data, as well as damages and costs.

Perplexity, meanwhile, responded to the lawsuit in a Reddit post, calling the complaint “a sad example of what happens when public data becomes a big part of a public company’s business model.” It accused Reddit of desperately trying to monetize its data in the face of a languishing business model and hypothesized that “it’s about a show of force in Reddit’s training data negotiations with Google and OpenAI. (Perplexity doesn’t train foundation models!).”

According to Perplexity, it did not ignore Reddit’s request for a licensing deal but instead simply told the company that Perplexity does not train AI models on content and, therefore, “it is impossible for us to sign a license agreement to do so.”

Perplexity said it will not “bow to strong-arm tactics” and reiterated that it merely summarizes Reddit discussions, citing Reddit threads in its answers, “just like people share links to posts here all the time.”

In October 2024, Dow Jones, the publisher of The Wall Street Journal and The New York Post, also filed a lawsuit against Perplexity, accusing Perplexity of using and repackaging Dow Jones’ copyrighted content to train its model and generate AI content, as well as falsely attributing incorrect content to the news outlets (so-called “hallucinations”).

Eileen McDermott is the Editor-in-Chief of IPWatchdog.com. Eileen is a veteran IP and legal journalist, and no stranger to the intellectual property world.

Join the Discussion

One comment so far.

Anon
October 24, 2025 08:27 am
So a question for my more litigation-savvy friends: is this a well-pled court filing?

I ask as I have to dig more than halfway through the complaint to try to find the legal assertion supporting the filing (page 30 to arrive at a first count).

Scraping – per se – is not a violation of 17 U.S.C. §§ 1201(a)(1)(A), 1201(a)(2), 1201(b), nor does it support charges of unfair competition, unjust enrichment, or civil conspiracy.
