Have you ever wanted to see the world through a robot’s eyes? Boston Dynamics Inc. posted a video showcasing how its electric Atlas humanoid robot performs tasks in the lab, highlighting the robot’s own POV.

The World Through Atlas’ Eyes

According to CNET, the robot's view combines four sources of information: two-dimensional images from its onboard cameras; position data from Atlas' joints, which lets the robot orient itself within its surroundings; a 3D picture of its environment for depth perception; and CAD files of the objects it has been trained on, which let it predict in real time how an object is positioned in space.
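To make that fusion concrete, here is a minimal sketch, assuming a pinhole camera and 4x4 homogeneous transforms, of how joint readings and camera calibration could be chained to place what the cameras see into one world frame. The function names and conventions are illustrative, not Boston Dynamics' actual software.

```python
# A minimal sketch (assumed conventions, not Boston Dynamics' code) of
# fusing the pieces above: joint positions give the camera's pose via
# forward kinematics, and a pinhole model lifts pixels with depth into
# the shared world frame.
import numpy as np

def camera_to_world(joint_transforms, head_to_camera):
    """Chain 4x4 joint transforms (base -> head), then the fixed
    head-to-camera calibration, to get the camera pose in the world."""
    pose = np.eye(4)
    for t in joint_transforms:       # one transform per joint, from encoders
        pose = pose @ t
    return pose @ head_to_camera

def pixel_to_world(uv, depth, intrinsics, camera_pose):
    """Back-project a pixel plus a depth reading into world coordinates."""
    fx, fy, cx, cy = intrinsics
    x = (uv[0] - cx) * depth / fx    # pinhole back-projection
    y = (uv[1] - cy) * depth / fy
    point_cam = np.array([x, y, depth, 1.0])
    return (camera_pose @ point_cam)[:3]

# A point 1.5 m straight ahead of a camera sitting at the world origin:
print(pixel_to_world((320, 240), 1.5, (600.0, 600.0, 320.0, 240.0), np.eye(4)))
```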

The video shows off the robot's real-time perception, demonstrating how the humanoid registers its frame of reference for the engine covers and the pick-and-place locations. As Atlas picks up an object, it evaluates the part's shape to decide how to grasp it and where to place it. It also learns to handle parts more effectively over time as it continually updates its understanding of the world around it.

Another noteworthy aspect of the video is when an engineer drops an engine cover on the floor. Atlas looks around, finds the part, picks it up, and places it with precision into the engine cover area.

“In this particular clip, the search behavior is manually triggered,” Scott Kuindersma, senior director of robotics research at Boston Dynamics, told The Robot Report. “The robot isn’t using audio cues to detect an engine cover hitting the ground. The robot is autonomously ‘finding’ the object on the floor, so in practice we can run the same vision model passively and trigger the same behavior if an engine cover — or whatever part we’re working with — is detected out of the fixture during normal operation.”
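In code, the passive trigger Kuindersma describes might look roughly like the sketch below: the same vision model runs continuously, and a recovery behavior fires whenever a tracked part is detected outside its fixture. The Detection and Box types, the label, and the trigger callback are hypothetical stand-ins, not Boston Dynamics' API.

```python
# A hedged sketch of the passive-trigger pattern: watch detections and
# launch the same behavior the engineer triggered by hand whenever a
# known part turns up outside its fixture. All names are illustrative.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    position: tuple  # world-frame (x, y, z) in metres

@dataclass
class Box:
    lo: tuple
    hi: tuple
    def contains(self, p):
        return all(lo <= v <= hi for lo, v, hi in zip(self.lo, p, self.hi))

FIXTURE = Box(lo=(0.0, 0.0, 0.0), hi=(0.5, 0.5, 0.3))   # illustrative bounds

def on_detections(detections, trigger):
    """Fire the search behavior for any part detected out of the fixture."""
    for det in detections:
        if det.label == "engine_cover" and not FIXTURE.contains(det.position):
            trigger("find_and_pick", det)  # same behavior as the manual trigger

# Example: a cover spotted on the floor, well outside the fixture bounds
on_detections([Detection("engine_cover", (1.2, -0.4, 0.02))],
              lambda name, det: print(f"triggering {name} at {det.position}"))
```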

Boston Dynamics Atlas robot; Photo: Boston Dynamics

The video not only demonstrates how Atlas can perceive and adapt to its environment but also shows that it can handle chaos while sticking to its objective.

“When the object is in view of the cameras, Atlas uses an object pose estimation model that uses a render-and-compare approach to estimate pose from monocular images,” Boston Dynamics wrote in a blog post. “The model is trained with large-scale synthetic data and generalizes zero-shot to novel objects given a CAD model. When initialized with a 3D pose prior, the model iteratively refines it to minimize the discrepancy between the rendered CAD model and the captured camera image.”
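In outline, that render-and-compare refinement resembles the loop below. This is a minimal sketch: the renderer and image loss are placeholder callables, and a simple finite-difference update stands in for the company's trained refinement model.

```python
# A generic render-and-compare refinement loop (illustrative, not the
# trained model the blog post describes). Starting from a 3D pose prior,
# it nudges a 6-DoF pose (x, y, z, roll, pitch, yaw) to shrink the
# discrepancy between the rendered CAD model and the camera image.
import numpy as np

def refine_pose(pose, cad_model, image, render, loss,
                steps=20, eps=1e-3, lr=0.05):
    """Finite-difference descent on the render-vs-image discrepancy."""
    pose = np.asarray(pose, dtype=float)
    for _ in range(steps):
        base = loss(render(cad_model, pose), image)      # current discrepancy
        grad = np.zeros_like(pose)
        for i in range(len(pose)):                       # numeric gradient per DoF
            bumped = pose.copy()
            bumped[i] += eps
            grad[i] = (loss(render(cad_model, bumped), image) - base) / eps
        pose -= lr * grad                                # step toward a better fit
    return pose

# Toy check with a stand-in "renderer" (identity) and squared-error "loss":
target = np.array([0.3, 0.0, 0.0, 0.0, 0.0, 0.0])        # true pose
est = refine_pose(np.zeros(6), cad_model=None, image=target,
                  render=lambda model, p: p,
                  loss=lambda rendered, img: float(np.sum((rendered - img) ** 2)))
# est moves toward [0.3, 0, 0, 0, 0, 0] as the discrepancy shrinks
```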

“Alternatively, the pose estimator can be initialized from a 2D region-of-interest prior (such as an object mask),” said the company. “Atlas then generates a batch of pose hypotheses that are fed to a scoring model, and the best fit hypothesis is subsequently refined. Atlas’s pose estimator works reliably on hundreds of factory assets which we have previously modeled and textured in-house.”
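The second initialization path can be sketched in the same spirit: generate a batch of pose hypotheses near the region of interest, score them all, and refine only the winner. The sampling scheme and scoring function below are illustrative assumptions rather than the company's trained models.

```python
# A sketch of the 2D region-of-interest path (assumed shapes and models,
# not the company's trained scorer): sample pose hypotheses around a rough
# 3D location for the ROI, keep the best-scoring one, then refine it.
import numpy as np

rng = np.random.default_rng(0)

def estimate_from_roi(roi_center_xyz, cad_model, image, render, score, refine,
                      n_hypotheses=64, spread=0.1):
    """Batch of hypotheses -> scoring model -> refine the best fit."""
    hyps = np.zeros((n_hypotheses, 6))                     # (x, y, z, r, p, yaw)
    hyps[:, :3] = roi_center_xyz + rng.normal(0.0, spread, (n_hypotheses, 3))
    hyps[:, 5] = rng.uniform(-np.pi, np.pi, n_hypotheses)  # random yaw guesses
    scores = [score(render(cad_model, h), image) for h in hyps]  # higher = better
    best = hyps[int(np.argmax(scores))]                    # best-fit hypothesis
    return refine(best, cad_model, image)                  # hand off to refinement
```

In a real pipeline the refine argument would be the refinement model described above; in this sketch it can be any callable with the same shape, such as a wrapper around refine_pose from the previous example.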