FullCircle is a new open-source pipeline that produces clean 3D reconstructions from handheld 360-degree video shot on a consumer camera. Walk through a space with something like an Insta360 X4, and FullCircle turns that footage into a usable 3D asset — automatically handling the biggest obstacle that has always made casual 360 capture unreliable: the camera operator appearing in every frame.

For virtual production teams who rely on photogrammetry for set extensions, digital twins, and location scouting, the practical implications are significant. FullCircle collapses the gap between "quick reference capture" and "usable 3D asset" by eliminating the biggest friction points in 360-degree reconstruction.

The Problem With Casual 360 Capture

Standard photogrammetry workflows use either narrow-field perspective cameras with carefully planned overlap, or 360-degree cameras mounted on tripods with precise repositioning between shots. Both approaches produce good reconstructions but demand time and discipline on set.

Consumer 360 cameras like the Insta360 X4 solve the coverage problem — every frame captures the full environment. But they introduce a new one: the camera operator appears in every single frame. That person becomes a persistent, moving obstruction that corrupts feature matching, breaks pose estimation, and leaves ghostly artifacts in the final reconstruction. Previous 360-degree methods either required the operator to stay out of frame or needed manual annotation to mask them out.

FullCircle solves this automatically.

How It Works

The pipeline has three stages, all designed to work on raw dual-fisheye frames rather than stitched equirectangular panoramas.

Stage one identifies and masks the camera operator using off-the-shelf foundation models. The system generates virtual pinhole views from each omnidirectional frame, runs YOLOv8 person detection and SAMv2 segmentation to locate the operator, then re-centers the omnidirectional image so the operator sits in the low-distortion center of a synthetic fisheye view. This re-centering approach lets SAMv2 produce temporally consistent, complete masks without any fine-tuning or manual prompts.
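The geometry behind those virtual pinhole views is standard spherical remapping. The sketch below is a minimal illustration, not code from FullCircle: it assumes an equirectangular panorama as input (the actual pipeline works on raw dual-fisheye frames) and computes, for each pixel of a virtual pinhole view at a chosen yaw and pitch, where to sample in the panorama.

```python
import numpy as np

def pinhole_view_map(pano_h, pano_w, out_size, fov_deg, yaw_deg, pitch_deg):
    """Map each pixel of a virtual pinhole view to (row, col) sample
    coordinates in an equirectangular panorama. Hypothetical helper
    for illustration; FullCircle's own implementation may differ."""
    # Pinhole focal length from the desired field of view
    f = (out_size / 2) / np.tan(np.radians(fov_deg) / 2)
    # Pixel grid centered on the optical axis
    xs, ys = np.meshgrid(np.arange(out_size) - out_size / 2 + 0.5,
                         np.arange(out_size) - out_size / 2 + 0.5)
    # Ray directions in the camera frame (z forward, x right, y down)
    dirs = np.stack([xs, ys, np.full_like(xs, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    # Rotate the view by yaw (around y) and pitch (around x)
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    dirs = dirs @ (Ry @ Rx).T
    # Spherical angles, then equirectangular pixel coordinates
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])    # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))   # latitude in [-pi/2, pi/2]
    cols = (lon / (2 * np.pi) + 0.5) * pano_w
    rows = (lat / np.pi + 0.5) * pano_h
    return rows, cols

# A zero-yaw, zero-pitch view samples around the center of the panorama.
rows, cols = pinhole_view_map(1024, 2048, 256, 90, 0, 0)
```

The same mapping, run with a rotation that points the virtual camera at the detected operator, is one way to realize the re-centering trick described above: the person lands in the low-distortion center of the derived view before segmentation.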

Stage two feeds the masked fisheye frames into COLMAP for camera pose estimation. Masking the operator before this step is critical — without it, COLMAP tries to match features on a moving person across frames, which either produces bad poses or fails entirely.
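COLMAP supports this kind of pre-masking natively: its feature extractor accepts a directory of per-image masks and ignores features in zero-valued regions. The fragment below is an illustrative command sequence with hypothetical paths, not the project's actual scripts:

```shell
# masks/ holds one <image_name>.png per frame, with zero-valued pixels
# marking regions to ignore (here, the segmented camera operator).
colmap feature_extractor \
    --database_path scene.db \
    --image_path frames/ \
    --ImageReader.mask_path masks/ \
    --ImageReader.camera_model OPENCV_FISHEYE

colmap exhaustive_matcher --database_path scene.db
colmap mapper \
    --database_path scene.db \
    --image_path frames/ \
    --output_path sparse/
```

The matcher and mapper choices here are generic defaults; the point is that once the operator is masked out at extraction time, no features on the moving person ever enter the matching stage.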

Stage three trains a 3D Gaussian Ray Tracing (3DGRT) model on the masked raw fisheye frames for 30,000 iterations. 3DGRT, developed at NVIDIA, natively supports arbitrary camera models including fisheye optics, which means FullCircle avoids the information loss that comes from undistorting or resampling wide-angle imagery into perspective projections.
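To see why native fisheye support matters, it helps to look at the projection math. The equidistant model below (image radius r = f·θ) is one common fisheye model, shown here purely for illustration; the exact camera model FullCircle and 3DGRT use is not detailed in this article. Unlike a pinhole projection (r = f·tan θ), which diverges at 90 degrees off-axis, a fisheye model keeps the full hemisphere at finite image radius, so no resampling into perspective crops is needed.

```python
import numpy as np

def project_equidistant(points, f, cx, cy):
    """Equidistant fisheye projection (r = f * theta). Illustrative sketch
    of a generic fisheye camera model, not FullCircle's actual code."""
    theta = np.arccos(np.clip(points[:, 2] / np.linalg.norm(points, axis=1),
                              -1, 1))             # angle off the optical axis
    phi = np.arctan2(points[:, 1], points[:, 0])  # azimuth around the axis
    r = f * theta                                 # radius grows linearly
    return np.stack([cx + r * np.cos(phi), cy + r * np.sin(phi)], axis=1)

# A point 90 degrees off-axis still projects to a finite radius (f * pi/2),
# whereas a pinhole model (r = f * tan(theta)) diverges there.
pts = np.array([[0.0, 0.0, 1.0],   # on the optical axis
                [1.0, 0.0, 0.0]])  # 90 degrees off-axis
uv = project_equidistant(pts, f=400.0, cx=640.0, cy=640.0)
```

Because the renderer can evaluate a model like this directly, the training images stay in their native fisheye domain end to end.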

No component in the pipeline requires task-specific training. The entire operator removal system runs on pretrained YOLOv8 and SAMv2 weights.

Why Raw Fisheye Matters

Working directly on raw dual-fisheye frames rather than on undistorted perspective crops yields a meaningful quality advantage. In controlled comparisons on the same camera trajectories, FullCircle's fisheye-native approach reaches 29.6 dB PSNR, while the best perspective-domain baseline manages 26.3 dB. That 3.3 dB gap represents a visible difference in reconstruction fidelity.

The advantage comes from two sources. Each 360-degree frame provides multi-view constraints in all directions simultaneously, giving the optimizer stronger geometric signal. And skipping the undistortion step preserves pixel-level detail that resampling would degrade.
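For readers less familiar with the metric, PSNR is a logarithmic measure of per-pixel error. The function below is the standard textbook definition, not code from the FullCircle release:

```python
import numpy as np

def psnr(img_a, img_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.
    Standard definition; this is the metric behind the 29.6 vs 26.3 dB figures."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# Roughly every 3 dB of PSNR corresponds to halving the mean squared error,
# so a 3.3 dB gap means less than half the per-pixel error.
a = np.zeros((8, 8), dtype=np.uint8)
b = np.full((8, 8), 16, dtype=np.uint8)   # uniform error of 16 -> MSE = 256
```

Since 10·log10(2) ≈ 3.01 dB, the 3.3 dB gap reported above corresponds to the fisheye-native model having less than half the mean squared error of the perspective-domain baseline.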

What This Means for Production

The capture protocol for FullCircle is: hand someone an Insta360 X4, tell them to walk through the space, and press record. That is a fundamentally different proposition from deploying a photogrammetry team with structured capture plans.

For location scouting, this means a single scout can capture a usable 3D scan of a practical location in minutes. For virtual production, it lowers the barrier to building digital twins of spaces that would otherwise only get flat reference photography. For previz teams, it provides geometry that is good enough to block scenes in a 3D environment without waiting for a dedicated scan crew.

The system does have limits. It struggles with scenes containing multiple static people who are not the camera operator, and relies on COLMAP for pose estimation, which can fail in geometrically ambiguous environments.

Availability

FullCircle is an open research project from the University of British Columbia, Bilkent University, and Google DeepMind, led by Yalda Foroutan and Ipek Oztas, with code available on GitHub via the Theia Lab. The underlying 3DGRT renderer comes from NVIDIA's open-source 3dgrut repository. The paper is on arXiv (2603.22572), and the team has released a new benchmark dataset of nine scenes with 99 captures across multiple difficulty tiers.

The pipeline builds entirely on existing open-source components, meaning production-oriented developers can evaluate it against their own capture workflows without waiting for a commercial release.
