Welcome! The TBD pedestrian dataset was collected primarily in pedestrian-rich environments. It highlights three components: human-verified labels grounded in metric space, a combination of top-down and perspective views, and naturalistic human behavior in the presence of a socially appropriate “robot”.

Eight example images from overhead cameras. Traces of pedestrian motion are shown on each image.
Example scenes from the TBD pedestrian dataset (labeled at 2.5 Hz). a) a dynamic group. b) a static conversational group. c) a large tour group with 14 pedestrians. d) a pedestrian affecting other pedestrians’ navigation plans by asking them to come to the table. e) pedestrians stop and look at their phones. f) two pedestrians change their navigation goals and turn towards the table. g) a group of pedestrians changes its navigation goals multiple times. h) a crowded scene where pedestrians are heading in different directions.

Dataset Highlights

Human verified metric space labels:

Grounding the labels in metric space ensures that camera pose does not affect the scale of the labels. It also makes the dataset useful for robot navigation research, because robots plan in metric space rather than image space.
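As a rough illustration of why metric labels are convenient, the sketch below computes walking speed directly in meters per second from a trajectory sampled at the 2.5 Hz labeling rate. The `(N, 2)` array layout, the function name `speeds_m_per_s`, and the example track are assumptions made for illustration only; the actual label format is described in the dataset's readme file.

```python
import numpy as np

LABEL_RATE_HZ = 2.5          # labeling rate stated for the dataset
DT = 1.0 / LABEL_RATE_HZ     # 0.4 s between consecutive labels


def speeds_m_per_s(xy_m):
    """Per-step speed for one pedestrian track.

    xy_m: (N, 2) array of metric positions in meters (assumed layout,
    not the dataset's actual file format).
    """
    xy_m = np.asarray(xy_m, dtype=float)
    steps = np.diff(xy_m, axis=0)              # displacement per label step
    return np.linalg.norm(steps, axis=1) / DT  # meters per second


# Hypothetical track: the pedestrian moves 0.5 m along x per label step.
track = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [1.5, 0.0]])
speeds = speeds_m_per_s(track)
print(speeds)  # each step is 0.5 m / 0.4 s = 1.25 m/s
```

With image-space labels, the same computation would yield pixels per second, which varies with camera placement and would need a separate calibration step before a planner could use it.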

Top-down and perspective views:

A dataset that contains both top-down and perspective views is useful for research that relies on perspective views: models can take perspective inputs while still having access to ground truth knowledge of the entire scene.

Naturalistic human behavior with the presence of a “robot”:

The “robot” that provides perspective view data is a cart pushed by a human. This reduces novelty effects among the surrounding pedestrians. Having a human push the “robot” also ensures the safety of the pedestrians and makes its motion more natural and human-like. As a result, pedestrians react naturally around the robot, treating it as another human agent.


The button below will take you to our dataset's Google Drive folder. Descriptions of the data organization and label files are in the readme file.

This effort is IRB approved under Carnegie Mellon University protocols STUDY2019_00000116 and STUDY2021_00000199. Please contact Aaron Steinfeld for IRB related issues.


Wang, A., Biswas, A., Admoni, H., & Steinfeld, A. (2022). Towards Rich, Portable, and Large-Scale Pedestrian Data Collection. ArXiv, abs/2203.01974.


Left: overhead corner view of the atrium with highlights on cameras and other features. Right, top: static camera equipment. Right, bottom: mobile cart.
Hardware setup used while collecting the TBD pedestrian dataset. Blue circles indicate positions of RGB cameras. Green box shows our mobile cart with a 360 camera and stereo camera which imitate a mobile robot sensor suite. The cart is manually pushed by a researcher during recording. The white area is where trajectory labels are collected.

We positioned three FLIR Blackfly RGB cameras around the scene on the upper floors, overlooking the ground level and roughly 90 degrees apart from each other. Compared to a single overhead camera, multiple cameras ensure better pedestrian labeling accuracy. The RGB cameras are connected to portable computers powered by lead-acid batteries. We also positioned three more units on the ground floor but did not use them for pedestrian labeling.

In addition to the RGB cameras, we pushed a cart through the scene. The cart was equipped with a ZED stereo camera to collect both perspective RGB views and depth information of the scene. A GoPro Fusion 360 camera, mounted above the ZED, captured high definition 360 videos of nearby pedestrians. Data from the on-board cameras are useful for capturing pedestrian pose data and facial expressions. The ZED camera was powered by a laptop with a power bank.


Our dataset currently contains labels for 1412 pedestrians in 8 recording sessions. More data are currently being collected at other locations. Stay tuned!

Session # | Time Length | # of Pedestrians


This project was supported by grants (IIS-1734361 and IIS-1900821) from the National Science Foundation.


Allan Wang

Aaron Steinfeld