Kuaishou Technology has officially launched Kling O1, a new AI video model that unifies generation and editing in a single multimodal system. Kuaishou positions the model as a "one-stop solution" to the persistent problem of keeping characters and scenes consistent across different workflows.
Key Details
Unified Architecture: Built on a Multimodal Visual Language (MVL) framework, Kling O1 handles text-to-video, image-to-video, inpainting, and style transfer in a single engine, so users do not need a separate tool for each task.
Editing & Control: The model accepts natural-language editing commands such as "remove passersby" or "transition day to dusk" without manual masking, and lets users set the video duration anywhere between 3 and 10 seconds.
Consistency Focus: Kuaishou claims the model uses "director-like memory" to maintain character identity and prop stability across shots, addressing a common pain point in AI video production.


