California Assembly Bill 2013 (2024)

California Assembly Bill 2013 (AB 2013) is a California law requiring developers of generative artificial intelligence systems to publicly disclose information about the data used to train their models. The law was authored by Assemblymember Jacqui Irwin (D-Thousand Oaks), signed by Governor Gavin Newsom on September 28, 2024, and took effect on January 1, 2026. It passed both chambers of the legislature unanimously (38–0 in the Senate, 75–0 in the Assembly).

AB 2013 was among 18 AI-related bills enacted by California in 2024, a period in which the state's regulation of artificial intelligence drew national attention, particularly around the vetoed SB 1047.

Provisions

Scope

The law applies to developers who make generative AI systems or services publicly available to Californians, whether for free or for compensation. The statute defines "developer" broadly to cover anyone who designs, codes, or produces an AI system, as well as anyone who creates a new version or update that materially changes its functionality, including through retraining or fine-tuning. Exemptions exist for AI systems used solely for security and integrity purposes, for aircraft operations, or for national security and defense purposes made available only to federal entities.

The law applies retroactively to any generative AI system released on or after January 1, 2022, and to any substantial modification made after that date.

Disclosure requirements

Developers must post documentation on their websites describing the data used to train their generative AI systems. The law requires this documentation to include a "high-level summary" of the datasets, covering twelve statutory categories of information, several of which are combined in the list below:

The sources or owners of the datasets and a description of how they further the system's intended purpose
The number of data points and a description of data types
Whether the datasets include data protected by copyright, trademark, or patent, or whether they are in the public domain, along with applicable licensing information
Whether the datasets include personal information as defined by the California Consumer Privacy Act or aggregate consumer information
Whether the datasets were purchased or licensed
Whether the developer used synthetic data in training
Any cleaning, processing, or other modification performed on the datasets
The time period during which data was collected and when datasets were first used in development

This documentation must be posted before the system is made publicly available, and it must be updated before each substantial modification is released.
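
As an illustration only, the categories listed above could be tracked as a simple completeness checklist. The sketch below is hypothetical: the class name, the field names, and the missing_items helper are assumptions made for this example, with one field per list item above. They are not terms taken from the statute, and the check says nothing about whether a given answer would be legally sufficient.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TrainingDataDisclosure:
    """Hypothetical record with one free-text field per list item above.

    Field names are illustrative assumptions, not statutory language.
    """
    sources_and_purpose: Optional[str] = None
    data_points_and_types: Optional[str] = None
    ip_status_and_licensing: Optional[str] = None
    personal_or_aggregate_consumer_info: Optional[str] = None
    purchased_or_licensed: Optional[str] = None
    synthetic_data_use: Optional[str] = None
    cleaning_and_processing: Optional[str] = None
    collection_period_and_first_use: Optional[str] = None


def missing_items(disclosure: TrainingDataDisclosure) -> list:
    """Return the names of fields that are unset or contain only whitespace."""
    return [
        f.name
        for f in fields(disclosure)
        if not (getattr(disclosure, f.name) or "").strip()
    ]


# Example: a draft that so far addresses only two of the listed items.
draft = TrainingDataDisclosure(
    sources_and_purpose="Publicly available web content and licensed material.",
    synthetic_data_use="AI-generated data was used in training.",
)
print(missing_items(draft))
```

A checklist of this kind only confirms that each item has been addressed in some form before publication.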

Enforcement

The law does not establish a specific penalty or enforcement mechanism for noncompliance, nor does it include a trade secret exemption for disclosures. The absence of a trade secret provision has been a point of concern among legal commentators, who have noted that forced disclosure could reduce the value of proprietary information about training datasets.

Compliance

When the law took effect on January 1, 2026, OpenAI and Anthropic were among the first companies to publish the required documentation. Both companies addressed each of the twelve statutory categories but did not name any specific datasets, instead characterizing their training data at a general level by referring to categories such as web content, licensed material, user contributions, and AI-generated data.

In its disclosure, Anthropic said that personal information appears in its training data as a byproduct of collecting publicly available web content, and described the use of technical measures to reduce the presence of such information in the model's responses. Both companies stated that their training data may include copyrighted material.

As of early 2026, several other major AI developers had not yet published disclosures.

xAI lawsuit

On December 29, 2025, two days before the law took effect, xAI, the developer of the Grok chatbot, filed a federal lawsuit in the United States District Court for the Central District of California against California Attorney General Rob Bonta, seeking to block enforcement of AB 2013. xAI is represented by the firm of Paul Clement and Erin Murphy.

The complaint raises four constitutional claims:

That AB 2013 effects an unconstitutional taking of xAI's trade secrets under the Fifth Amendment, because it forces public disclosure of proprietary information without compensation. xAI argues that the composition and curation of its training datasets are worth billions of dollars and derive their value from secrecy.
That the law constitutes an unconstitutional regulatory taking by destroying the economic value of xAI's trade secrets and interfering with its investment-backed expectations, particularly because the law applies retroactively to models released before it was enacted.
That the law violates the First Amendment by compelling speech, forcing xAI to publicly describe aspects of its products. xAI argues this is a content-based regulation that should be subject to strict scrutiny.
That the law is unconstitutionally vague under the Due Process Clause of the Fourteenth Amendment, because it does not define key terms such as "high-level," "datasets," or "data point."

Legal commentators at the Institute for Law & AI have observed that xAI's trade secret argument is weakened by the fact that OpenAI and Anthropic chose to comply, which suggests that the statute can be satisfied without disclosing competitively sensitive details. The California Department of Justice said it would defend the law.
