Apple's Modest AI Moves, 4D Gaussian Splats Arrive, and Novel AI Storage Methods

In the latest episode of Denoised, hosts Joey Daoud and Addy Ghani dive into several significant developments in technology that matter to media and entertainment professionals. From Apple's somewhat conservative approach to AI at WWDC to the emergence of 4D Gaussian Splats for dynamic scene creation, this episode covers key advancements that could impact creative workflows. The hosts also explore innovative AI research, including a technique for storing AI memory in MP4 files that challenges conventional approaches to data storage. Let's break down these developments and understand why they matter to creators, technologists, and industry professionals.

Apple's WWDC: Liquid Glass UI Overhaul and Cautious AI Approach

Apple's Worldwide Developers Conference this year featured a substantial visual redesign across its operating systems, but showed a notably measured approach to AI integration compared to competitors. The company introduced Liquid Glass, a new design language featuring transparent, glassy interface elements that will appear across iOS, macOS, and other Apple platforms.

Joey and Addy discussed how this aesthetic update, while visually interesting, comes at a time when other tech giants are making much more substantial AI announcements. Apple did showcase Apple Intelligence, but the hosts noted that its functionality appears limited compared to offerings from Google and Microsoft.

"The timing couldn't be worse," Addy observed, referring to how Apple's design-focused updates contrast with the AI-centric announcements from competitors at recent events like Google I/O and Microsoft Build.

Key WWDC announcements included:

Liquid Glass UI: A transparent, glassy interface redesign rolling out across all Apple platforms
Apple Intelligence Framework: The Foundation Models framework, which lets developers call Apple's on-device model locally, though with seemingly limited capabilities
ChatGPT Integration: Users can opt in to give Apple Intelligence access to ChatGPT for more advanced capabilities
Version Number Unification: iOS, iPadOS, and other Apple operating systems will now use year-based version numbers (iOS 26, etc.)
Phone Communication Improvements: Call screening and hold assist features that can manage calls while you're busy
visionOS Updates: Persistent widgets that remain in physical space locations, plus improved "personas" for more realistic avatars
iPad Multitasking: More laptop-like windowing capabilities for iPads

The hosts expressed concern about Apple's seemingly cautious approach to AI, with Addy noting, "If you are one step behind today, you're ten steps behind tomorrow." They discussed how Apple might be opening itself to disruption if it doesn't innovate more aggressively in AI, particularly as wearable technology advances.

Joey pointed out that Apple has historically never been first to market with new technologies, preferring to refine existing concepts. However, both hosts agreed that in the rapidly evolving AI space, this strategy might be riskier than usual.

4D Gaussian Splats: Dynamic Scene Capture Takes a Major Step Forward

The conversation shifted to an exciting development in computational photography and scene capture: the arrival of 4D Gaussian Splats. Addy provided a clear explanation of this technology, which represents an important advance in how dynamic scenes can be captured and reproduced.

"Gaussian splats have been around for a couple of years. It's basically AI-powered photogrammetry," Addy explained. "It's somewhere in the middle between machine vision, traditional photogrammetry, and AI inference."

The hosts discussed a new paper from a Chinese university called "FreeTimeGS: Free Gaussian Primitives at Anytime and Anywhere for Dynamic Scene Reconstruction," which adds the crucial time dimension to Gaussian Splat technology.

Key points about 4D Gaussian Splats:

What They Are: Gaussian Splats are collections of 3D Gaussians, essentially point clouds whose points carry color, opacity, and a soft falloff, that can represent 3D scenes without traditional polygon meshes
The 4D Breakthrough: The addition of time as the fourth dimension allows for capturing dynamic scenes rather than just static environments
Technical Approach: The system references past and present frames to optimize the point cloud solution, similar to how video compression works
Current Requirements: Approximately 20 cameras are needed to capture enough data, making this still primarily a professional tool
Processing Demands: Content must be processed after capture rather than in real time, currently limiting some applications
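
To make the "time as the fourth dimension" idea concrete, here is a minimal Python sketch of what a single time-aware primitive might look like. It is an illustration only, not the paper's implementation: the class name `TimedGaussian`, its fields, the linear-motion model, and the bell-shaped fade in time are all assumptions chosen for readability.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class TimedGaussian:
    """Hypothetical 4D primitive: a 3D Gaussian that lives around one moment in time."""
    center: Vec3      # position at time t0
    velocity: Vec3    # assumed linear motion per unit of time
    t0: float         # moment at which the primitive is most visible
    duration: float   # temporal spread (standard deviation)
    opacity: float    # peak opacity in [0, 1]

    def position_at(self, t: float) -> Vec3:
        # Move the center along a straight line away from its position at t0.
        dt = t - self.t0
        return tuple(c + v * dt for c, v in zip(self.center, self.velocity))

    def opacity_at(self, t: float) -> float:
        # Fade the primitive in and out with a Gaussian falloff in time.
        z = (t - self.t0) / self.duration
        return self.opacity * math.exp(-0.5 * z * z)


def visible_primitives(primitives: List[TimedGaussian], t: float,
                       threshold: float = 0.01) -> List[Tuple[Vec3, float]]:
    """Return (position, opacity) for primitives that still contribute at time t."""
    return [(g.position_at(t), g.opacity_at(t))
            for g in primitives if g.opacity_at(t) >= threshold]
```

Rendering a frame at time `t` would then mean splatting only the primitives returned by `visible_primitives(primitives, t)`. A real system would also optimize these parameters against the multi-camera footage, which is the expensive post-capture step the hosts describe.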

The hosts emphasized the potential applications in filmmaking and virtual production, noting that this technology could eliminate the need to build complex digital twins of real-world environments. As Addy explained, "You don't have to build complex real-world things if you can just capture it well enough."

They clarified an important misconception: this technology does not extract 3D information from a single 2D video. Multiple camera angles are required to generate the data needed for the 4D Gaussian Splat reconstruction.

Innovative AI Research: Video-Based Memory and Faster Image Generation

The final segment covered two interesting AI research papers with potential implications for how AI systems store and process information.

Storing AI Memory in MP4 Files

The hosts discussed a novel approach called Memvid that uses video files to store AI memory and knowledge bases.

"Unlike traditional vector databases that consume massive amounts of RAM and storage, Memvid compresses your knowledge base into compact video files while maintaining instant access to any piece of information," Joey explained.

This approach represents creative thinking about data storage for AI systems:

The Concept: Storing text information within video file formats rather than traditional databases
Potential Benefits: More efficient memory usage and potentially expanded knowledge windows for large language models
Technical Hypothesis: Addy suggested this might create a "pseudo RAM with the video" that allows for more efficient memory management
Research Applications: Could help make large AI models more manageable for researchers without requiring massive computing resources
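
The core idea, text chunks stored as frames plus a small index that says which frame to open, can be sketched in a few lines of Python. This is a toy under stated assumptions, not Memvid's actual code or API: it packs compressed bytes into fixed-size "frames" instead of encoding real video, and it uses a plain keyword index where a real system would likely use embeddings. `FRAME_BYTES` and every function name here are hypothetical.

```python
import zlib

FRAME_BYTES = 256  # hypothetical fixed payload size per "frame"


def encode_frames(chunks):
    """Pack each text chunk into one fixed-size frame of bytes."""
    frames = []
    for chunk in chunks:
        payload = zlib.compress(chunk.encode("utf-8"))
        if len(payload) + 2 > FRAME_BYTES:
            raise ValueError("chunk too large for one frame")
        header = len(payload).to_bytes(2, "big")  # payload length prefix
        frames.append((header + payload).ljust(FRAME_BYTES, b"\x00"))
    return frames


def decode_frame(frame):
    """Recover the original text from a single frame."""
    size = int.from_bytes(frame[:2], "big")
    return zlib.decompress(frame[2:2 + size]).decode("utf-8")


def build_index(chunks):
    """Map each lowercase word to the frame numbers that contain it."""
    index = {}
    for frame_no, chunk in enumerate(chunks):
        for word in set(chunk.lower().split()):
            index.setdefault(word, []).append(frame_no)
    return index


def search(query, index, frames):
    """Look up a single word and decode only the matching frames."""
    return [decode_frame(frames[n]) for n in index.get(query.lower(), [])]
```

The design point is that only the small index has to stay in memory; the bulk of the knowledge base sits in the frame file and is decoded on demand, one frame at a time.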

"This is such outside-the-box thinking," Addy noted, suggesting this kind of innovative approach is what companies like Apple should be pursuing.

Contrastive Flow Matching for Image Generation

The final topic covered a technical advancement in image generation called Contrastive Flow Matching that could make training image generation models substantially more efficient.

The paper claims to achieve:

Up to 9x faster training
Up to 5x fewer denoising steps required
More efficient training process

Addy explained that traditional image generation involves many steps of adding noise and then removing it, with all processes funneling through one neural network. The contrastive approach keeps these processes separate, leading to more efficient training.
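
One way to picture the "contrastive" part is as an extra term in the training loss: the model is rewarded for predicting the flow of its own sample and penalized for resembling the flow of a different sample. The Python sketch below is a simplified reading of that idea, not the paper's code. The straight-line path between data and noise, the choice of the next batch item as the negative, and the weight `lam=0.05` are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def interpolate(x, noise, t):
    """Point on the straight path from data x (t=0) to noise (t=1)."""
    return [(1 - t) * xi + t * ni for xi, ni in zip(x, noise)]


def target_velocity(x, noise):
    """Velocity of the straight path above: noise minus data."""
    return [ni - xi for xi, ni in zip(x, noise)]


def contrastive_fm_loss(model, batch, lam=0.05):
    """batch: list of (x, noise, t, label) tuples with at least two items.

    model(x_t, t, label) is a hypothetical callable returning a velocity vector.
    """
    total = 0.0
    n = len(batch)
    for i, (x, noise, t, label) in enumerate(batch):
        pred = model(interpolate(x, noise, t), t, label)
        positive = target_velocity(x, noise)
        # Negative example: the flow of another sample in the batch (assumption).
        x_neg, noise_neg, _, _ = batch[(i + 1) % n]
        negative = target_velocity(x_neg, noise_neg)
        # Pull toward the sample's own flow, push away from the other one.
        total += sq_dist(pred, positive) - lam * sq_dist(pred, negative)
    return total / n
```

Setting `lam=0` reduces this to an ordinary flow matching loss, which makes the contrastive term easy to isolate when comparing the two.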

The hosts noted that while smaller, more efficient models are valuable, particularly for mobile devices, they typically require large models first. "You can't get to these more efficient routes or these smaller models without having the big models first," Joey pointed out.

Conclusion: Practical Impacts for Creators and Technologists

This episode of Denoised highlighted several developments that media and entertainment professionals should monitor. Apple's design-focused updates and cautious AI approach suggest the company is taking a measured path while competitors make bolder moves. Meanwhile, 4D Gaussian Splats offer a glimpse of how dynamic scene capture could transform production workflows, potentially eliminating the need for complex digital twins in many scenarios.

The AI research papers demonstrate how quickly the field is evolving, with novel approaches to data storage and model training that could make powerful AI tools more accessible and efficient. For creators and technologists working in media production, these developments represent both new capabilities and challenges as the industry continues to transform.

The views and opinions expressed in this podcast are the personal views of the hosts and do not necessarily reflect the views or positions of their respective employers or organizations. This show is independently produced by VP Land without the use of any outside company resources, confidential information, or affiliations.