Stability AI has announced ‘Stable Video Diffusion’, its first foundation model for generative video, built on its image model, Stable Diffusion.
The company said this generative AI video model represents a significant step in its journey toward creating models for everyone, of every type. That makes Stable Video Diffusion one of the few video-generating models available in open source. However, not everyone can use the model yet: Stability AI emphasised that it is not intended for real-world or commercial applications at this stage.
Stability AI States That Stable Video Diffusion Is in Research Preview
According to Stability AI, the model is in research preview, and they have already made the code for Stable Video Diffusion available on their GitHub repository. However, everyone who uses the model needs to abide by specific rules.
“Those who wish to run the model must agree to certain terms of use, which outline the Stable Video Diffusion intended applications (Educational or creative tools, design and other artistic processes, etc.) and non-intended ones (Factual or true representations of people or events).”
2 Things to Know About Stability AI’s Stable Video Diffusion
SVD and SVD-XT Models
Based on the research paper provided by Stability AI, Stable Video Diffusion comes in two forms — SVD and SVD-XT. Both models can generate high-quality four-second clips at between three and 30 frames per second, but they differ in length: SVD transforms still images into 576×1024 videos of 14 frames, while SVD-XT ups the frame count to 24.
In addition, according to the whitepaper released by Stability AI, the studio trained the SVD and SVD-XT models on a dataset of millions of videos, then fine-tuned them on a much smaller set of hundreds of thousands to around a million clips. However, the paper does not specify where those videos came from, beyond implying public research datasets.
It is therefore quite possible that Stability AI could face copyright issues, along with legal and ethical challenges around usage rights. You may recall that Ed Newton-Rex — who was VP of audio at the startup for just over a year and played a pivotal role in the launch of Stability’s music-generating tool, Stable Audio — departed earlier this month over a disagreement with the company. In his open letter, he stated that training on data scraped from the internet and other databases without consent goes against his morals.
“I’ve always fought against this in my career. When everyone submitted the pieces, it became apparent that many companies were still relying on the fair use argument and trying to justify that to the U.S. Copyright Office. It’s just what I really disagree with. So I feel like I did resign from Stability AI, but I also feel like I stepped down from a group of large AI companies that will take the same approach,” he said.
Limitations
Stability AI has been transparent about Stable Video Diffusion’s limitations. For instance, the models cannot generate videos without motion or with only slow camera pans, cannot be controlled by text, cannot render text (at least not legibly), and do not consistently generate faces and people properly.
However, the models are still in early development, and Stability AI has stated its plans for their future. According to TechCrunch, Stability AI is planning a variety of models that build on and extend SVD and SVD-XT, including a text-to-video tool that will bring text prompting to the models on the web. As for its ultimate goal, Stability appears to be prioritising commercialisation, noting that Stable Video Diffusion has potential applications in advertising, education, entertainment and beyond.
Will Stable Video Diffusion Cause More Harm Than Good?
With the amount of recent news about non-consensual deepfakes circulating around the web, people are understandably worried about the ways Stable Video Diffusion could be abused, given that it doesn’t appear to have a built-in content filter. Their worries are well founded, as some have indeed used Stability AI’s services with malicious intent.

For example, when the company first released Stable Diffusion, it did not take long before people with questionable intentions used it to create non-consensual deepfake porn, AI-generated nudes of celebrities and worse on the infamous discussion board 4chan. In response, Emad Mostaque, the CEO of Stability AI, called it unfortunate that the model had leaked on 4chan and stressed that the company was working with leading ethicists and technologists on safety and other mechanisms around responsible release.
However, the safety mechanism, the Safety Classifier, can be disabled by users, allowing them to bypass the safeguards and commit abuses such as perpetuating harassment or implicating someone in a crime they didn’t commit. Stability AI therefore needs to ensure proper restrictions and safety monitoring for Stable Video Diffusion before releasing it commercially to the public.