OpenAI released Sora 2, a major upgrade to its video generation model that the company describes as jumping "straight to what we think may be the GPT-3.5 moment for video." The company also just launched a new social iOS app called Sora, powered by the Sora 2 model.

The most significant improvements focus on physics accuracy and controllability:

  • Better object permanence and physics modeling - The original Sora would "morph objects and deform reality" to satisfy a prompt (for example, teleporting a missed basketball shot into the hoop), while Sora 2 models the realistic outcome, such as the ball rebounding off the backboard

  • Cameos feature - Users can upload a short video and audio recording to inject themselves into any generated scene with "remarkable fidelity," maintaining accurate appearance and voice

  • Multi-shot consistency - The model can follow complex instructions across multiple shots while maintaining world state, handling realistic, cinematic, and anime styles

  • Synchronized dialogue and sound effects - As a versatile video-audio generation system, it can produce realistic background soundscapes, speech, and sound effects with impressive detail

Sora 2 launches through the new social iOS app Sora, with social features built around the cameos capability. The service is free to start with "generous limits" in the US and Canada; ChatGPT Pro users get access to a higher-quality Sora 2 Pro model. An API release is also planned.
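
OpenAI has not published the API yet, so as a rough illustration only, here is a hypothetical sketch of what submitting a Sora 2 generation job might look like if it follows the usual async job pattern of OpenAI's REST endpoints. The endpoint path, request fields, and status values below are assumptions, not documented behavior.

```python
# Hypothetical sketch only: the Sora 2 API surface has not been published.
# The endpoint path, request fields, and status values are illustrative guesses.
import os
import time

import requests

API_KEY = os.environ["OPENAI_API_KEY"]
BASE = "https://api.openai.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Submit a video generation job (hypothetical endpoint and parameters).
resp = requests.post(
    f"{BASE}/videos",  # assumed path, not confirmed
    headers=HEADERS,
    json={
        "model": "sora-2",
        "prompt": "A basketball player misses a shot; the ball rebounds off the backboard.",
        "seconds": 8,  # assumed duration parameter
    },
    timeout=30,
)
resp.raise_for_status()
job = resp.json()

# Video generation is slow, so an async API would likely require polling
# (assumed "status" and "id" fields).
while job.get("status") not in ("completed", "failed"):
    time.sleep(5)
    job = requests.get(f"{BASE}/videos/{job['id']}", headers=HEADERS, timeout=30).json()

print(job.get("status"), job.get("id"))
```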

In a separate blog post, CEO Sam Altman called this the "ChatGPT for creativity" moment, noting the team's focus on avoiding addictive social media patterns through features like natural language feed control and periodic wellbeing checks.

What matters: The physics improvements address real limitations that made previous AI video feel obviously artificial. The cameos feature could be genuinely useful for quick creative projects, though the social app approach feels like uncharted territory for a video generation tool.
