The Aria Pilot Dataset is a collection of 159 sequences captured using Project Aria to advance the state of machine perception and AI.
The Aria Pilot Dataset and accompanying tools provide computer vision researchers with access to anonymized Aria sequences captured in a variety of scenarios, such as cooking, playing games, or exercising. In addition to ‘Everyday Activities’, the dataset also includes ‘Desktop Activities’ captured with a multi-view motion capture system, helping to accelerate research into human-object interactions.
We believe this dataset will provide a baseline for external researchers to build and foster reproducible research on egocentric Computer Vision and AI/ML algorithms for scene perception, reconstruction and understanding.
In addition to providing sensor data from Project Aria, the Pilot Dataset also contains derived results from machine perception services which provide additional context to the spatial-temporal reference frames.
Multi-user poses in shared reference frame
In addition to providing a per-frame trajectory for every recording, sequences captured within the same environment have been aligned to the same reference frame, allowing those sequences to be understood within the same context.
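To illustrate why a shared reference frame is useful, here is a minimal sketch of how the pose of one device could be expressed relative to another once both trajectories live in the same frame. It assumes each pose is a 4x4 rigid transform that maps device coordinates into the shared frame; the function names, the matrix layout, and the sample values are hypothetical and are not taken from the dataset tools.

```python
# Hypothetical sketch: poses are 4x4 rigid transforms (nested lists, row-major)
# mapping device coordinates into the shared reference frame.

def matmul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def relative_pose(t_world_a, t_world_b):
    """Pose of device B expressed in device A's frame."""
    return matmul(invert_rigid(t_world_a), t_world_b)

# Made-up values: device A sits at (1, 0, 0), rotated 90 degrees about Z;
# device B sits at (1, 2, 0) with no rotation.
t_world_a = [[0.0, -1.0, 0.0, 1.0],
             [1.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]
t_world_b = [[1.0, 0.0, 0.0, 1.0],
             [0.0, 1.0, 0.0, 2.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]

t_a_b = relative_pose(t_world_a, t_world_b)
position_of_b_in_a = [t_a_b[i][3] for i in range(3)]
print(position_of_b_in_a)
```

With these sample values, device B ends up 2 m along device A's own X axis, which is only computable because both poses are given in one frame.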
For a high-quality egocentric dataset, it is essential to understand how cameras perceive the world. The Project Aria Pilot Dataset provides full camera calibration parameters, including both intrinsics and extrinsics of every sensor.
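As a rough illustration of how intrinsics and extrinsics work together, the sketch below projects a 3D point into pixel coordinates. It assumes an ideal pinhole camera with no lens distortion, which is a simplification: wide-angle lenses generally need a distortion model as well. The parameter names (`fx`, `fy`, `cx`, `cy`, `t_camera_world`) and the sample values are hypothetical and do not reflect the dataset's calibration format.

```python
# Hypothetical sketch: ideal pinhole projection, no distortion model.

def project_point(point_world, t_camera_world, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates.

    t_camera_world is a 4x4 extrinsic transform (world -> camera);
    fx, fy, cx, cy are pinhole intrinsics in pixels.
    Returns None when the point is behind the camera.
    """
    # Extrinsics: move the point from the world frame into the camera frame.
    x, y, z = (
        sum(t_camera_world[i][k] * point_world[k] for k in range(3))
        + t_camera_world[i][3]
        for i in range(3)
    )
    if z <= 0:
        return None
    # Intrinsics: perspective divide, then scale and shift into pixels.
    return (fx * x / z + cx, fy * y / z + cy)

identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

pixel = project_point([0.5, -0.25, 2.0], identity,
                      fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(pixel)  # (470.0, 165.0)
```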
Multi-view motion capture
To facilitate research into human-object interactions, the Aria Pilot Dataset includes a subset of “Desktop Activities” captured using a multi-view motion capture system.
Multi-device time sync
In addition to aligning the trajectories of sequences captured within the same environment, the Project Aria Pilot Dataset also provides precise time-alignment between sequences captured simultaneously.
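One thing time alignment enables is pairing frames across recordings. The following is a minimal sketch that assumes each sequence exposes a sorted list of integer timestamps on a common clock; the function names, the nanosecond unit, and the tolerance value are assumptions for illustration, not part of the dataset tools.

```python
from bisect import bisect_left

# Hypothetical sketch: timestamps are sorted integers on a shared clock.

def nearest_index(sorted_ts, query):
    """Return the index of the timestamp closest to query."""
    if not sorted_ts:
        raise ValueError("sorted_ts must not be empty")
    i = bisect_left(sorted_ts, query)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    before, after = sorted_ts[i - 1], sorted_ts[i]
    return i if after - query < query - before else i - 1

def pair_frames(ts_a, ts_b, max_gap):
    """Pair each frame in A with its nearest frame in B, within max_gap."""
    pairs = []
    for index_a, t in enumerate(ts_a):
        index_b = nearest_index(ts_b, t)
        if abs(ts_b[index_b] - t) <= max_gap:
            pairs.append((index_a, index_b))
    return pairs

# Made-up timestamps for two simultaneously captured sequences.
ts_a = [0, 100, 200, 300]
ts_b = [5, 98, 260]

pairs = pair_frames(ts_a, ts_b, max_gap=10)
print(pairs)  # [(0, 0), (1, 1)]
```

Frames without a close enough counterpart are dropped rather than paired loosely, which keeps the matched pairs usable for multi-view analysis.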
For sequences where actors speak, we provide speech-to-text annotation. This supports egocentric communications research, such as predicting turn-taking in conversations and multi-speaker transcription.
Using data from Project Aria’s eye-tracking cameras, the Pilot Dataset includes an estimate of the wearer’s eye gaze. This can be used to accelerate research into user-object interactions.
How will the dataset be used?
Accelerating the state of Machine Perception and Artificial Intelligence
The Project Aria Pilot Dataset consists of 159 sequences, which can be used to advance several areas of research in machine perception and AI, including camera relocalization and scene reconstruction.
Studying these research areas is crucial for researchers to engage with the challenges associated with AR devices.
To demonstrate a few representative scenarios of all-day activities with always-on sensing, multiple sequences were recorded using actors in five locations across the USA.
If you are an AI or ML researcher, enter your email here to access the Project Aria Pilot Dataset and accompanying tools.
By submitting your email and accessing the Project Aria Pilot Dataset, you agree to abide by the dataset license agreement and to receive emails in relation to the dataset.