A research team from Carnegie Mellon University and Facebook AI Research, which included CMU Machine Learning Department doctoral student Devendra S. Chaplot, has developed a common-sense robot.
A robot travelling from point A to point B is more efficient if it understands that point A is the living room couch and point B is a refrigerator, even when the robot is in an unfamiliar place, the CMU report said.
The team's navigation system, called SemExp, won the Habitat ObjectNav Challenge in June during the virtual Computer Vision and Pattern Recognition conference, edging out a team from Samsung Research China, CMU said. It was the CMU team's second consecutive first-place finish in the annual challenge, the university said.
SemExp, or Goal-Oriented Semantic Exploration, uses machine learning to train a robot to recognize objects — knowing the difference between a kitchen table and an end table, for instance — and to understand where in a home such objects are likely to be found, the CMU report said.
This enables the system to think strategically about how to search for something, Chaplot said in the report. “Common sense says that if you’re looking for a refrigerator, you’d better go to the kitchen,” the Indian American researcher said.
Classical robotic navigation systems, by contrast, explore a space by building a map showing obstacles. The robot eventually gets to where it needs to go, but the route can be circuitous, CMU said.
Previous attempts to use machine learning to train semantic navigation systems have been hampered because they tend to memorize objects and their locations in specific environments, according to the report. Not only are these environments complex, but the system often has difficulty generalizing what it has learned to different environments, it said.
Chaplot sidestepped that problem by making SemExp a modular system, according to the CMU report. He worked with FAIR’s Dhiraj Gandhi; Abhinav Gupta, associate professor in the Robotics Institute; and Ruslan Salakhutdinov, professor in the Machine Learning Department.
The system uses its semantic insights to determine the best places to look for a specific object, Chaplot said. “Once you decide where to go, you can just use classical planning to get you there.”
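The modular split Chaplot describes, a semantic step that decides where to look followed by a classical planner that works out how to get there, can be illustrated with a small toy sketch. This is not SemExp's code: the probability table, the grid, the room coordinates, and the function names `choose_goal_room`, `plan_path`, and `navigate` are all assumptions made up for illustration, and breadth-first search stands in here for whatever classical planner the real system uses.

```python
from collections import deque

# Hypothetical prior: how likely each object is to be found in each room type.
# These numbers are invented for illustration, not taken from SemExp.
OBJECT_ROOM_PRIOR = {
    "refrigerator": {"kitchen": 0.90, "living_room": 0.05, "bedroom": 0.05},
    "couch": {"kitchen": 0.05, "living_room": 0.85, "bedroom": 0.10},
}

# Toy occupancy grid (0 = free, 1 = obstacle) and assumed room locations.
GRID = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
ROOM_CELLS = {"living_room": (0, 0), "kitchen": (2, 0), "bedroom": (0, 3)}


def choose_goal_room(target, room_cells):
    """Semantic step: pick the known room most likely to contain the target."""
    prior = OBJECT_ROOM_PRIOR[target]
    return max(room_cells, key=lambda room: prior.get(room, 0.0))


def plan_path(grid, start, goal):
    """Classical step: breadth-first search for a shortest path on the grid."""
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parent links to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] == 0 and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None  # goal is unreachable


def navigate(target, start):
    """Decide where to go semantically, then plan the route classically."""
    room = choose_goal_room(target, ROOM_CELLS)
    return room, plan_path(GRID, start, ROOM_CELLS[room])
```

Calling `navigate("refrigerator", (0, 0))` in this toy setup selects the kitchen and returns a grid path to it. The point of the split is that only `choose_goal_room` depends on learned, common-sense knowledge, while `plan_path` works the same way in any environment.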