NVIDIA Seeks Dismissal of Expanded Copyright Lawsuit Over AI Training Data

NVIDIA asks a federal court to throw out an expanded class‑action lawsuit alleging it used pirated books to train its AI models, arguing authors lack proof of actual use and that no specific infringement occurred.

4 February 2026 by TechStora Editorial Board

Background of the lawsuit

Several authors filed an expanded class‑action suit claiming NVIDIA trained its AI models on millions of pirated books. The complaint was amended to target newer models and datasets, prompting fresh discovery requests from the plaintiffs.

NVIDIA's core arguments for dismissal

The chip maker contends that the plaintiffs have not demonstrated that their specific books were actually used in training. Key points include:

Contacting Anna's Archive does not equate to infringement.
Speculation that a large dataset “must have” contained the works is insufficient.
There is no evidence NVIDIA knowingly used infringing material.

Claims NVIDIA challenges

NVIDIA seeks to dismiss virtually every new claim in the amended complaint, including:

Contributory copyright infringement – no proof of knowledge or material contribution.
Vicarious copyright infringement – no evidence of specific pirated books.

The company emphasizes that its NeMo framework offers optional tools that customers can apply to any dataset, including licensed or public‑domain material.

The direct infringement claim

The only claim not covered by the current motion is the direct infringement allegation that NVIDIA used the Books3 database to train its NeMo model. NVIDIA plans to address this claim at trial or via summary judgment, likely relying on a fair‑use defense.

Implications for AI and copyright law

This case highlights the growing legal uncertainty surrounding AI training data. A dismissal could set a precedent that mere speculation about dataset composition is insufficient for copyright liability, while a successful direct‑infringement defense could reinforce fair‑use arguments for AI developers.