A new lawsuit against OpenAI could decide whether the company’s use of training data scraped from the public internet may continue.
Credit: Andrey Popov / Getty Images
OpenAI, the Microsoft-backed company that developed the ChatGPT generative AI chatbot, is the defendant in a federal class action lawsuit filed this week in California, where it is accused of misappropriating personal information for training purposes.
The complaint, filed in the Northern District of California, lists 15 causes of action, including violations of the Computer Fraud and Abuse Act, the Electronic Communications Privacy Act, and several state consumer rights laws and common-law torts. The claims center on the idea that OpenAI essentially “stole” the plaintiffs’ private information and used it to create a highly valuable product without compensation.
“OpenAI used the stolen data to train and develop [ChatGPT] utilizing large language models … and deep language algorithms to analyze and generate human-like language that can be used for a wide range of applications,” the complaint said.
By taking data from the public internet that nevertheless contained personally identifiable information, the plaintiffs contend, OpenAI has violated their privacy. (The identities of the plaintiffs were not fully disclosed in the complaint, which asked the court for permission to keep them private, in the hope of avoiding “intrusive scrutiny.”)
In addition to monetary damages, the plaintiffs asked the court to order a number of corrective actions to address OpenAI’s alleged misdeeds, including the establishment of an independent AI council for governance and open access to all personal information collected by OpenAI.
The case is likely to test the assumption that the use of data from the public internet for AI training constitutes fair use under US copyright law, which would mean that AI creators like OpenAI couldn’t be held liable for violating copyright.
While the complaint does not discuss the fair use argument in detail, a second class action suit, this one initiated by two Massachusetts-based authors, more directly alleges copyright violations by OpenAI in regard to the authors’ material being used to help train AI. “Because the OpenAI Language Models cannot function without the expressive information extracted from Plaintiffs’ works (and others) and retained inside them, the OpenAI Language Models are themselves infringing derivative works,” according to the complaint in that case, also filed in the Northern District of California.
OpenAI did not immediately respond to requests for comment.