Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset
University of Oxford
TLDR: A large-scale egocentric dataset with ground-truth 3D geometry and lighting variation for benchmarking novel view synthesis (NVS) and visual relocalisation. Easy to use with HLoc, NerfStudio, and 3DGS.
Dataset Video
Abstract
We introduce Oxford Day-and-Night, a large-scale egocentric dataset for novel view synthesis (NVS) and visual relocalisation under challenging lighting conditions. Existing datasets often lack crucial combinations of features such as ground-truth 3D geometry, wide-ranging lighting variation, and full 6DoF motion. Oxford Day-and-Night addresses these gaps by leveraging Meta ARIA glasses to capture egocentric video and applying multi-session SLAM to estimate camera poses, reconstruct 3D point clouds, and align sequences captured under varying lighting conditions, spanning day and night. The dataset covers over 30 km of recorded trajectories and an area of 40,000 m², offering a rich foundation for egocentric 3D vision research. It supports two core benchmarks, NVS and relocalisation, providing a unique platform for evaluating models in realistic and diverse environments.
Dataset Collection and Processing
We collect data using Meta ARIA glasses, which record raw sensor streams including IMU, RGB, and grayscale video. To capture varied lighting conditions (day, dusk, and night), sessions are recorded between 4 pm and 10 pm, covering the natural transition from light to dark. At each site, two individuals wear the glasses while walking casually. Recordings are grouped by location and processed with Meta's multi-session Machine Perception Service (MPS), which estimates per-frame camera poses and semi-dense point clouds unified in a common coordinate frame.
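As a concrete example of consuming these MPS outputs, here is a minimal Python sketch that loads a closed-loop trajectory and a semi-dense point cloud with the open-source projectaria_tools package. The file paths, pose/point field names, and confidence thresholds follow common Aria MPS conventions and are assumptions about this dataset's packaging, not a confirmed layout.

```python
import numpy as np
from projectaria_tools.core import mps

# Assumed MPS output layout; adjust to how the dataset is actually packaged.
traj_path = "mps/slam/closed_loop_trajectory.csv"
points_path = "mps/slam/semidense_points.csv.gz"

# Per-frame 6DoF device poses, expressed in the shared world frame.
trajectory = mps.read_closed_loop_trajectory(traj_path)
T_world_device = trajectory[0].transform_world_device.to_matrix()  # 4x4 SE(3)
print("first device pose:\n", T_world_device)

# Semi-dense world points with per-point uncertainty estimates.
# (Older projectaria_tools releases also take a StreamCompressionMode argument.)
points = mps.read_global_point_cloud(points_path)
xyz = np.array([
    p.position_world
    for p in points
    if p.inverse_distance_std < 0.001 and p.distance_std < 0.15  # assumed thresholds
])
print(f"kept {len(xyz)} of {len(points)} points after confidence filtering")
```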
Applications
We demonstrate the dataset with two applications: visual relocalisation and novel view synthesis.
Novel View Synthesis (NVS)
We create an NVS variant of the dataset by temporally subsampling each video by a factor of five. Camera poses and intrinsics are provided by the ARIA MPS service, and we release both fisheye and undistorted images. This variant is easy to use with NerfStudio and 3DGS, as sketched below.
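As an illustration of the intended workflow, the following sketch reads camera intrinsics and camera-to-world poses from a Nerfstudio-style transforms.json. The file name, field names, and the example scene path are assumptions based on the standard Nerfstudio data convention, not the dataset's confirmed layout.

```python
import json
from pathlib import Path
import numpy as np

def load_nerfstudio_cameras(scene_dir: str):
    """Read shared intrinsics and per-frame camera-to-world poses from a
    Nerfstudio-style transforms.json (assumed packaging of the NVS variant)."""
    meta = json.loads((Path(scene_dir) / "transforms.json").read_text())
    K = np.array([
        [meta["fl_x"], 0.0, meta["cx"]],
        [0.0, meta["fl_y"], meta["cy"]],
        [0.0, 0.0, 1.0],
    ])
    frames = [
        (f["file_path"], np.array(f["transform_matrix"]))  # 4x4 camera-to-world
        for f in meta["frames"]
    ]
    return K, frames

# Hypothetical scene path, for illustration only.
K, frames = load_nerfstudio_cameras("oxford-day-and-night/scene_01")
print(f"{len(frames)} frames; intrinsics:\n{K}")
```

With a scene packaged this way, Nerfstudio can train a 3DGS-style model on it directly, e.g. via its ns-train splatfacto --data command.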
Visual Relocalisation
We create a visual relocalisation variant by further spatially subsampling the NVS variant and splitting the images into a database set, daytime queries, and nighttime queries.
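For context on how such a split is typically scored, below is a minimal, self-contained sketch of standard relocalisation metrics: per-query translation and rotation error, and recall at error thresholds. The thresholds shown are common community choices, not necessarily this benchmark's official protocol.

```python
import numpy as np

def pose_error(R_est, t_est, R_gt, t_gt):
    """Translation error (metres) and rotation error (degrees) between an
    estimated and a ground-truth camera pose."""
    t_err = float(np.linalg.norm(t_est - t_gt))
    cos_angle = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    r_err = float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
    return t_err, r_err

def recall_at(errors, thresholds=((0.25, 2.0), (0.5, 5.0), (5.0, 10.0))):
    """Fraction of queries localised within each (metres, degrees) threshold.
    These thresholds are common in relocalisation benchmarks, not
    necessarily this benchmark's official protocol."""
    errors = np.asarray(errors)  # shape (N, 2): columns are [t_err, r_err]
    return {
        th: float(np.mean((errors[:, 0] <= th[0]) & (errors[:, 1] <= th[1])))
        for th in thresholds
    }
```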
Acknowledgement
This research is supported by multiple funding sources, including an ARIA research gift grant from Meta Reality Lab, a Royal Society University Research Fellowship (Fallon), the EPSRC C2C Grant EP/Z531212/1 (TRO), and a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) under grant number RS-2024-00461409.