Reddit Dubs Perplexity AI and Data Scraping Companies ‘Would-Be Bank Robbers’

Reddit filed a lawsuit yesterday against artificial intelligence (AI) company Perplexity AI and three other defendants for their alleged illegal circumvention of Reddit security measures meant to protect its content and data from misuse.

Reddit, which describes itself in the complaint as “one of the largest repositories of human conversation in existence,” likened the actions of Oxylabs UAB, AWMProxy, and SerpApi to those of “would-be bank robbers.” Through their development of tools that bypass both Google’s and Reddit’s anti-scraping measures, and their scraping of Reddit content from Google search results, these defendants, “knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead,” said the complaint.

Perplexity AI, meanwhile, is a customer of SerpApi and refused to enter into an agreement with Reddit. According to the complaint, it was allegedly caught “red-handed” after Reddit used “the digital equivalent of marked bills…to track Reddit data and confirm that Perplexity was using Reddit data acquired through the scraping of Google SERPs.” Reddit said it sent Perplexity a cease-and-desist letter but that Perplexity subsequently only increased its use of Reddit data “forty-fold,” the lawsuit added.

Reddit charges that each of the defendants is profiting by “evading technological control measures to access Reddit data it knows it does not have permission to access or use.”

Because Reddit has over 100 million active users per day, its data is “widely seen as invaluable to AI companies” and “is particularly well-suited to training” large language models (LLMs) because it is constantly growing. Reddit has entered into partnership agreements regarding use of its data with companies such as Google. According to the complaint, Perplexity has admitted that Reddit is one of its “top tier sources” for data, citing an August 2025 Perplexity blog post that said “Reddit has emerged as the most cited domain across AI models globally.”

Reddit is seeking damages and costs, as well as injunctive relief enjoining the defendants from accessing its or Google’s “website, servers, systems, and any data contained therein” for the purpose of unlawfully scraping Reddit’s data or developing tools used to illegally scrape data.

Perplexity, meanwhile, responded to the lawsuit in a Reddit post, calling the complaint “a sad example of what happens when public data becomes a big part of a public company’s business model.” It accused Reddit of desperately trying to monetize its data in the face of a languishing business model and hypothesized that “it’s about a show of force in Reddit’s training data negotiations with Google and OpenAI. (Perplexity doesn’t train foundation models!).”

According to Perplexity, it did not ignore Reddit’s request for a licensing deal but instead simply told the company that Perplexity does not train AI models on content and, therefore, “it is impossible for us to sign a license agreement to do so.”

Perplexity said it will not “bow to strong-arm tactics” and reiterated that it merely summarizes Reddit discussions, citing Reddit threads in its answers, “just like people share links to posts here all the time.”

In October 2024, Dow Jones, the publisher of The Wall Street Journal and The New York Post, also filed a lawsuit against Perplexity, accusing Perplexity of using and repackaging Dow Jones’ copyrighted content to train its model and generate AI content, as well as falsely attributing incorrect content to the news outlets (so-called “hallucinations”).

Warning & Disclaimer: The pages, articles and comments on IPWatchdog.com do not constitute legal advice, nor do they create any attorney-client relationship. The articles published express the personal opinion and views of the author as of the time of publication and should not be attributed to the author’s employer, clients or the sponsors of IPWatchdog.com.

Join the Discussion

  • Anon
    October 24, 2025 08:27 am

    So, a question for my more litigator-savvy friends: is this a well-pled court filing?

    I ask as I have to dig more than halfway through the complaint to try to find the legal assertion supporting the filing (page 30 to arrive at a first count).

    Scraping – per se – is not a violation of 17 U.S.C. §§ 1201(a)(1)(A), 1201(a)(2), 1201(b), nor does it support charges of unfair competition, unjust enrichment, or civil conspiracy.
