Meta has introduced MovieGen, a series of media foundation models that can generate realistic videos with sound from text instructions. MovieGen is built around two foundation models: MovieGen Video and MovieGen Audio.
MovieGen Video is a 30-billion-parameter transformer model that generates high-quality, high-definition images and videos from a single text prompt. Generated videos can be up to 16 seconds long at a frame rate of 16 frames per second.
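Meta has not released a public API for MovieGen, but the published limits make the arithmetic concrete. The short sketch below is a minimal illustration assuming only the figures above; the constant and function names are invented for this example.

```python
# Illustrative only: MovieGen has no public API; these names are invented.
MAX_SECONDS = 16  # maximum clip length reported by Meta
FPS = 16          # frame rate reported by Meta

def total_frames(seconds: int, fps: int = FPS) -> int:
    """Frames the model must generate for a clip of the given length."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"clip length must be between 1 and {MAX_SECONDS} seconds")
    return seconds * fps

# A maximum-length clip is 16 s x 16 fps = 256 generated frames.
print(total_frames(16))  # 256
```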
MovieGen Audio is a 13-billion-parameter transformer model that takes a video, plus an optional text prompt, and produces high-fidelity audio of up to 45 seconds that stays in sync with the video. The model can generate ambient sound, instrumental background music, and Foley effects. Meta claims state-of-the-art results in audio quality, video-to-audio alignment, and text-to-audio alignment.
These models are not limited to generating new videos; they can also edit existing footage from simple text instructions. MovieGen supports localized edits, such as adding, removing, or replacing elements, as well as global changes such as swapping the backdrop or style. For example, given a video of someone tossing a ball, a short written prompt can change the video so the person throws a watermelon instead, while the rest of the original footage is preserved.
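To make the two edit types concrete, here is a minimal sketch of how such an editing request might be structured. MovieGen has no public interface, so every name below (EditRequest, its fields, the scope values) is hypothetical and serves only to illustrate the localized-versus-global distinction described above.

```python
# Hypothetical sketch only: no MovieGen API is publicly available.
# The class, field, and scope names are invented for illustration.
from dataclasses import dataclass

@dataclass
class EditRequest:
    video_path: str   # existing footage to modify
    instruction: str  # plain-text editing instruction
    scope: str        # "localized" (add/remove/replace an element)
                      # or "global" (backdrop or style change)

# Localized edit: swap one object, keep everything else intact.
swap = EditRequest(
    video_path="toss.mp4",
    instruction="replace the ball with a watermelon",
    scope="localized",
)

# Global edit: restyle the whole clip.
restyle = EditRequest(
    video_path="toss.mp4",
    instruction="change the backdrop to a beach at sunset",
    scope="global",
)

print(swap, restyle, sep="\n")
```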
MovieGen also enables personalized video generation. Given a person's photograph and a text prompt, the models can create customized videos that preserve that person's identity and motion. Meta claims state-of-the-art results in character preservation and natural movement.
Meta says these models outperform existing video generation models such as OpenAI's Sora and Runway's Gen-3. Meta is now working with creative professionals to refine the models before a public release.