News Publishers Sue OpenAI, Microsoft Over AI Training Copyright Claims

A coalition of news publishers has sued OpenAI and Microsoft, alleging their AI models were trained on copyrighted news content without permission and seeking damages and the removal of protected works from AI datasets.


A coalition of news publishers has sued OpenAI and Microsoft in U.S. District Court for the Southern District of New York, alleging the companies used copyrighted news content without permission to train generative artificial intelligence models.

The case, Richner Communications, Inc. et al. v. Microsoft Corporation et al., alleges that the companies systematically scraped publishers’ content and stripped metadata associated with copyright to obscure the unauthorised use of news articles.

The publishers are seeking damages under the Copyright Act and a permanent injunction that could require OpenAI to remove copyrighted news content from its AI training data sets and GPT models.


DMCA Allegations:

Besides the copyright infringement claims, the suit relies heavily on provisions of the Digital Millennium Copyright Act (DMCA).

The complaint claims that when OpenAI scraped content from news websites or third-party datasets, it purposefully stripped out Copyright Management Information (CMI) such as the names of publishers and authors, titles, terms of use, and copyright notices.

Publishers said the removal of this metadata was meant to conceal how much copyrighted content had been used to train the AI while also enabling downstream copyright infringement.


Legal Battle:

The complaint also claims that ChatGPT can produce word-for-word excerpts or summaries of copyrighted news articles without retaining the original copyright information from the articles, a practice the publishers say violates Section 1202 of the DMCA.

The plaintiffs are demanding a jury trial and monetary damages from both OpenAI and Microsoft. Microsoft is named on claims of vicarious liability arising from its close partnership with OpenAI and its funding of the company’s AI infrastructure.

Notably, the lawsuit seeks a court order that, if granted, would compel OpenAI to delete all copies of the publishers’ registered copyrighted works from its current GPT models and historical training data.

The lawsuit is the latest legal move in the long-running battle between news publishers and artificial intelligence companies over the use of copyrighted material to train AI systems.