Perplexity logo is seen in this illustration taken May 20, 2024. REUTERS/Dado Ruvic/Illustration
Reddit logo is seen in this illustration taken November 7, 2022. REUTERS/Dado Ruvic/Illustration

(Reuters) - Social media platform Reddit sued artificial intelligence startup Perplexity in New York federal court on Wednesday, accusing it and three other companies of unlawfully scraping its data to train Perplexity's AI-based search engine.

Reddit said in the complaint that the data-scraping companies circumvented its data protection measures in order to steal data that Perplexity "desperately needs" to power its "answer engine" system.

The case is one of many filed by content owners against tech companies over the alleged misuse of their copyrighted material to train AI systems. Reddit filed a similar lawsuit against AI startup Anthropic in June that is still ongoing.

"Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest," Perplexity said in a statement.

"AI companies are locked in an arms race for quality human content - and that pressure has fueled an industrial-scale 'data laundering' economy," Reddit chief legal officer Ben Lee said in a statement.

Reddit, which features thousands of interest-based "subreddit" web communities, said in the lawsuit that it is the most commonly cited source for AI-generated answers to user questions. It has licensed its content to Google, OpenAI and others for their AI training.

Reddit said that Lithuania-based Oxylabs, Russia-based AWMProxy and Texas-based SerpApi scraped Reddit data from billions of search results without permission and that Perplexity, which does not have a license to use Reddit content, worked with at least one of the data-scraping companies to obtain Reddit material.

"We strongly disagree with Reddit's allegations and intend to vigorously defend ourselves in court," a SerpApi spokesperson said. Oxylabs said in a statement that it was "shocked and disappointed by this news, as Reddit has made no attempt to speak with us directly," and that it would also defend itself against the allegations.

AWMProxy could not be reached for comment.

Reddit said it sent Perplexity a cease-and-desist letter last year, after which Perplexity "increased the volume of citations to Reddit forty-fold."

Reddit asked the court for unspecified monetary damages and an order blocking Perplexity from using its data.

(Reporting by Blake Brittain in Washington; Editing by Nick Zieminski and Stephen Coates)