Reconstructing Retinal Images from Natural Outdoor Walks with 3D Gaussian Splatting

Poster Presentation 53.470: Tuesday, May 21, 2024, 8:30 am – 12:30 pm, Pavilion
Session: 3D Perception: Virtual and augmented reality

Daniel Panfili1, Nathaniel Powell2, Mary Hayhoe3; 1University of Texas at Austin

Understanding the visual signal incident on the retina is crucial for understanding what visual information is used to guide specific actions. Calculating a retinal image requires a gaze vector, the eye position in a world-centered coordinate frame, and a representation of the environment. Previously, we computed a 3D representation of the environment (together with luminance and chromaticity) using photogrammetry (Muller et al., 2023). This allowed more accurate calculation of retinal motion and investigation of the visual information used for path planning. However, photogrammetry had difficulty reconstructing reflective, transparent, or homogeneous surfaces, and struggled with high-spatial-frequency elements (e.g., fine-grained textures). Consequently, the resulting retinal images lacked several natural scene features, notably the sky, bodies of water, and fine tree limbs. This limitation makes the data unsuitable for contexts where image realism is important, such as perceptual straightening paradigms (Henaff et al., 2019). To enhance realism, we adopted a recent rasterization technique, 3D Gaussian splatting (Kerbl et al., 2023). 3D Gaussian splatting converts point clouds into a smooth, continuous representation by applying Gaussian-based kernel blending, emphasizing visual information from nearby points in the cloud while gradually decreasing influence with distance. This technique provided a more accurate portrayal of the environment, closely resembling the views captured by the scene camera used for reconstruction. As a result, it generated higher-fidelity retinal images than photogrammetry meshes. However, the technique is still at an early stage and lacks some critical features, such as collision calculation.
Overall, 3D Gaussian splatting emerges as a promising tool for stimulus presentation, though additional refinement is needed to reflect visual image properties more accurately.
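
The distance-weighted blending described above can be illustrated with a minimal sketch. It is not the pipeline used in this work, and it is deliberately simpler than the method of Kerbl et al. (2023), which fits anisotropic 3D Gaussians and alpha-composites them in screen space. Here each point is assumed to carry an isotropic Gaussian of fixed width, and colors are combined as a normalized weighted average. The names `gaussian_weight`, `blend_color`, and the `sigma` parameter are hypothetical and chosen only for illustration.

```python
import math


def gaussian_weight(distance, sigma):
    """Isotropic Gaussian falloff: 1.0 at zero distance, decaying smoothly."""
    return math.exp(-0.5 * (distance / sigma) ** 2)


def blend_color(query, points, colors, sigma=0.1):
    """Normalized Gaussian-weighted average of point colors at `query`.

    Assumption (not from the abstract): every point shares the same
    isotropic width `sigma`, and weights are normalized to sum to one.
    """
    total_weight = 0.0
    accumulated = [0.0, 0.0, 0.0]
    for point, color in zip(points, colors):
        weight = gaussian_weight(math.dist(query, point), sigma)
        total_weight += weight
        for channel in range(3):
            accumulated[channel] += weight * color[channel]
    if total_weight == 0.0:
        # Query is too far from every point for any weight to register.
        return (0.0, 0.0, 0.0)
    return tuple(value / total_weight for value in accumulated)


# Hypothetical two-point cloud: a red point and a blue point one unit apart.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
colors = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

midpoint_color = blend_color((0.5, 0.0, 0.0), points, colors, sigma=0.5)
near_red_color = blend_color((0.0, 0.0, 0.0), points, colors, sigma=0.5)
```

At the midpoint the two points contribute equally, while a query located at the red point is dominated by red with a small blue contribution. This is the sense in which nearby points are emphasized and influence decreases gradually with distance.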

Acknowledgements: NIH grant EY05729