UkrVisti

YouTube's Role in Google's AI Training

This article discusses how Google utilizes YouTube content for training its AI models and the implications for copyright.

Google is using content from its vast YouTube library to train its artificial intelligence models, including Gemini and the new Veo 3 video and audio generator, CNBC reports, citing its sources.

According to one insider, the training draws on a selection from the platform's library of 20 billion videos. Google has confirmed the practice but clarified that it applies to only a portion of the content and is done in accordance with agreements with creators and media companies.

A YouTube representative explained that the company has always used its own content to enhance services, and the emergence of generative AI has not changed that. "We understand the importance of guarantees, which is why we developed robust protection mechanisms for creators," the company stated.

However, experts are concerned about the implications for copyright. They believe that using others' videos to train AI without creators' consent could lead to a crisis in intellectual property. Although YouTube claims it has communicated this before, most creators were unaware that their content was being used for training purposes.

Google does not disclose how many videos it uses to train its models. However, even if the figure is just 1% of the library, that amounts to over 2.3 billion minutes of content, which is 40 times more than its competitors use.

By uploading videos, creators grant YouTube broad rights to use their content. However, they have no way to opt out of having their videos used to train Google's models.

Representatives of digital rights organizations argue that creators' years of hard work are being used to develop AI without compensation or even notification. The company Vermillio, for its part, has created a service called Trace ID that measures the similarity between AI-generated videos and original content. In some cases, the match exceeded 90%.

Some creators are open to their content being used for training, viewing new tools as opportunities for experimentation. However, the majority feel that the situation lacks transparency and requires clearer regulations.

YouTube has even entered into an agreement with the Creative Artists Agency to develop a management system for AI content that mimics famous personalities. Yet, the mechanisms for removing or tracking similar content remain imperfect.

Meanwhile, calls are already being made in the US to provide authors with legal protections that would allow them to control the use of their creative works in the age of generative AI.

Additionally, Google recently revised its internal content moderation rules for YouTube: videos that partially violate the rules may now remain online if they are deemed socially important.