1st SORA Music Video, How SORA is Evolving & Guessing Possible Pricing - fxguide
Abstract
The article discusses how selected artists and directors are using SORA, a diffusion model that generates longer and more cohesive video clips, during a limited alpha testing phase. It focuses on the experience of director Paul Trillo, who used SORA to create a music video for the artist Washed Out. The article covers SORA's current capabilities and limitations, the likely cost of using the tool, and the importance of prompt engineering in achieving the desired creative outcome.
Q&A
[01] The Hardest Part
1. What was the process of creating the music video "The Hardest Part" using SORA?
- Paul Trillo generated around 700 clips, most of which were around 20 seconds long, to create the nearly 4-minute video.
- He used about 55 of those clips in the final video, which was rendered at 720p resolution and then upscaled to 2K using Topaz.
- The transitions between scenes were created using long-form prompts rather than the multi-modal blending feature, which was not yet available at the time.
2. What are the estimated costs associated with using SORA for video generation?
- Based on industry estimates, the pure compute cost of inferring 230 minutes of SORA video could be around $644 (roughly $2.80 per minute), not including upload, download, and storage costs.
- The pricing for SORA usage would be considered cheap for professionals but expensive for non-professionals.
- The massive computing power required for SORA and similar AI models highlights the need for significant investment in silicon-chip manufacturing capacity, as mentioned by OpenAI's CEO Sam Altman.
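The figures quoted in this section can be tied together with some quick arithmetic. The sketch below is illustrative only: it assumes the 230 minutes corresponds roughly to the 700 generated clips at about 20 seconds each, and it treats the $644 estimate as scaling linearly with generated minutes. The variable names are hypothetical, and none of this reflects actual SORA pricing.

```python
# Figures quoted in the article (all approximate).
clips_generated = 700
clip_length_s = 20               # "around 20 seconds" per clip
clips_used = 55
estimated_cost_usd = 644.0       # industry estimate, compute only
estimated_minutes = 230

# Assumption: total footage is simply clips x length.
total_minutes = clips_generated * clip_length_s / 60

# Assumption: cost scales linearly with generated minutes.
cost_per_minute = estimated_cost_usd / estimated_minutes
cost_per_clip = cost_per_minute * clip_length_s / 60

# Share of generated clips that made the final cut, and the
# effective compute cost attributed to each clip that was used.
usage_ratio = clips_used / clips_generated
cost_per_used_clip = estimated_cost_usd / clips_used

print(f"Total generated footage: {total_minutes:.0f} min")
print(f"Cost per generated minute: ${cost_per_minute:.2f}")
print(f"Cost per 20 s clip: ${cost_per_clip:.2f}")
print(f"Clips used: {usage_ratio:.1%}")
print(f"Cost per clip in the final cut: ${cost_per_used_clip:.2f}")
```

Under these assumptions, only about 8% of the generated clips reach the final cut, so the effective cost per clip used is far higher than the cost per clip generated.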
[02] New SORA Features
1. What is the new MiniClips feature in SORA, and how does it help the creative process?
- MiniClips is a new SORA UI tool that allows directors to see the first four frames of 8 or 32 mini-inferred clips before committing to a full inference.
- This feature helps directors quickly judge whether a clip is worth continuing to a full inference, saving time and improving prompt engineering.
2. How does prompt engineering work in SORA, and what advice does Paul Trillo have for filmmakers?
- Prompt engineering is critical in SORA, as the tool currently relies solely on text prompts without multi-modal input.
- Paul Trillo used long, detailed prompts that allowed for multiple beats and scene changes within a single generation.
- He advises filmmakers to experiment, fail, and try again, using their mind's eye to envision what they want to see and breaking it down in a way that SORA can understand.
3. How does SORA's understanding of cinematic terms compare to traditional filmmaking language?
- SORA seems to understand specific terms like "zooming" and "FPV perspective" but may not fully grasp the nuances of cinematic language, such as the difference between a "panning shot," "tracking shot," and "dolly shot."
- Paul Trillo found that SORA responded well to terms like "motion blur" and "35mm film stock," which helped create the desired aesthetic for "The Hardest Part" video.
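The long-form prompting approach described above can be sketched as a small helper that joins several scene beats into one prompt and then appends camera and look descriptors. This is a hypothetical illustration, not Paul Trillo's actual workflow or any SORA API: the function `build_prompt`, the "Camera:" and "Look:" labels, and the example beats are all assumptions. Only the descriptor terms ("zooming", "FPV perspective", "motion blur", "35mm film stock") come from the article.

```python
def build_prompt(beats, camera_terms=(), look_terms=()):
    """Compose one long-form text prompt from several scene beats.

    Hypothetical helper: it only builds a string and does not call any
    video-generation service.
    """
    if not beats:
        raise ValueError("at least one beat is required")

    # Normalise each beat into a single sentence ending in a period.
    sentences = [beat.strip().rstrip(".") + "." for beat in beats]

    # Append camera and aesthetic descriptors as trailing sentences.
    if camera_terms:
        sentences.append("Camera: " + ", ".join(camera_terms) + ".")
    if look_terms:
        sentences.append("Look: " + ", ".join(look_terms) + ".")

    return " ".join(sentences)


# Example beats are invented for illustration only.
prompt = build_prompt(
    beats=[
        "We move quickly through a high school hallway",
        "The hallway opens onto a suburban street at dusk",
    ],
    camera_terms=["zooming", "FPV perspective"],
    look_terms=["motion blur", "35mm film stock"],
)
print(prompt)
```

Keeping beats, camera terms, and look terms separate makes it easier to change one element at a time between generations, which fits the experiment, fail, and try again advice above.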