Meet the robot that learns by watching videos

University researchers, working under a DARPA program, develop a way for robots to turn visual cues into action.

Computer scientist Yiannis Aloimonos, center, with Baxter.

Kids (and grownups) can learn a lot from instructional videos. So can chimpanzees. Now you can add robots to the list.

A research team at the University of Maryland, funded by a Defense Advanced Research Projects Agency program, has developed a system that enables a robot to interpret visual cues and then perform the task it just witnessed. Robot see, robot do. And the robot will remember what to do the next time.

The university’s research, led by computer scientist Yiannis Aloimonos, is being conducted under DARPA’s Mathematics of Sensing, Exploitation and Execution, or MSEE, program, which aims to develop autonomous systems that use a minimalist grammar to respond to visual input.

“The MSEE program initially focused on sensing, which involves perception and understanding of what’s happening in a visual scene, not simply recognizing and identifying objects,” Reza Ghanadan, a program manager in DARPA’s Defense Sciences Office, said in a release. “We’ve now taken the next step to execution, where a robot processes visual cues through a manipulation action-grammar module and translates them into actions.”
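
As a rough illustration of what such an action-grammar module might look like, the Python sketch below maps observed motion cues to manipulation verbs and dispatches them to motor primitives. Everything in it, from the cue names to the grammar rules, is invented for this example; it is not the UMD system’s actual code.

```python
# Hypothetical sketch of a sense -> action-grammar -> execute pipeline.
# The cue names, grammar rules and robot methods are illustrative only.

from dataclasses import dataclass

@dataclass
class Action:
    verb: str   # manipulation primitive, e.g. "grasp" or "pour"
    obj: str    # the object acted on, e.g. "spatula" or "cup"

# A minimalist action grammar: observed motion patterns -> manipulation verbs.
ACTION_GRAMMAR = {
    "hand_closes_on": "grasp",
    "tilts_toward": "pour",
    "moves_in_circles": "stir",
}

def parse_video_cues(cues):
    """Map (motion_pattern, object) cues from a perception module to actions."""
    return [Action(ACTION_GRAMMAR[m], o) for m, o in cues if m in ACTION_GRAMMAR]

def execute(robot, plan):
    """Dispatch each parsed action to the robot's matching motor primitive."""
    for act in plan:
        getattr(robot, act.verb)(act.obj)   # e.g. robot.grasp("spatula")

# parse_video_cues([("hand_closes_on", "spatula"), ("tilts_toward", "cup")])
# -> [Action("grasp", "spatula"), Action("pour", "cup")]
```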

In this case, the action involved cooking. Several Baxter Research Robots watched a series of videos on how to cook, recognized utensils on screen, grabbed the appropriate one in front of them and manipulated it the right way, even neatly pouring liquid into a moving container.

The robots are also able to retain that knowledge and share it with other robots. DARPA called that an advance over typical sensor systems, which tend to see everything fresh from moment to moment.

“This system allows robots to continuously build on previous learning—such as types of objects and grasps associated with them—which could have a huge impact on teaching and training,” Ghanadan said. “Instead of the long and expensive process of programming code to teach robots to do tasks, this research opens the potential for robots to learn much faster, at much lower cost and, to the extent they are authorized to do so, share that knowledge with other robots.”

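One way to picture that retain-and-share idea is a small knowledge store of object-to-grasp associations that persists across sessions and can be copied between robots. The sketch below is a toy illustration of the concept, with invented names; it is not the project’s actual mechanism.

```python
# Toy illustration of retain-and-share: learned object-to-grasp
# associations persist to disk and can be loaded by another robot.
# Names and format are invented for this example.

import json

class KnowledgeBase:
    def __init__(self):
        self.grasps = {}          # object name -> grasp type

    def learn(self, obj, grasp):
        self.grasps[obj] = grasp  # build on previous learning

    def save(self, path):
        with open(path, "w") as f:
            json.dump(self.grasps, f)

    def load(self, path):
        with open(path) as f:
            self.grasps.update(json.load(f))  # merge another robot's knowledge

# Robot A learns and saves; robot B loads and starts with the same knowledge.
kb_a = KnowledgeBase()
kb_a.learn("spatula", "power_grasp")
kb_a.save("shared_knowledge.json")

kb_b = KnowledgeBase()
kb_b.load("shared_knowledge.json")   # robot B now knows the spatula grasp
```
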
Baxter Research Robots, made by Rethink Robotics, are used at research institutions around the world. The company’s flagship Baxter is also widely used in manufacturing as an inexpensive platform for repetitive tasks, but the research robots differ in several key ways.

As Philip Dasler, a Ph.D. student in Maryland’s computer science department, points out, a Baxter Research Robot’s ability to watch and learn eliminates the time-consuming and often difficult programming otherwise required for each specific task. Just plug the robot in and start showing it what to do: physically guide its arms through a task, for example, and the robot will be able to repeat it.
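
A minimal sketch of that teach-by-guiding loop might look like the following, assuming a placeholder `arm` object with `joint_angles()` and `set_joint_positions()` methods (the real Baxter SDK differs in detail): record the joint angles while a person moves the arm, then replay them at the same rate.

```python
# Sketch of kinesthetic teach-and-repeat with a hypothetical `arm` API.

import time

def record_demonstration(arm, duration_s=10.0, hz=50):
    """Sample joint angles while a human physically guides the arm."""
    trajectory = []
    for _ in range(int(duration_s * hz)):
        trajectory.append(arm.joint_angles())  # dict: joint name -> angle
        time.sleep(1.0 / hz)
    return trajectory

def replay(arm, trajectory, hz=50):
    """Command the recorded joint angles back at the recording rate."""
    for waypoint in trajectory:
        arm.set_joint_positions(waypoint)
        time.sleep(1.0 / hz)
```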

Baxter also differs from its industrial counterparts in its awareness of humans. Where it’s best to stay clear of industrial robots, which perform their jobs at high speed regardless of what’s around them, research Baxters have better manners, sensing their surroundings and yielding to people nearby.
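
Conceptually, that kind of compliance can be as simple as scaling arm speed by the distance to the nearest detected person. The function below is a toy illustration only; real collaborative robots rely on certified safety systems, not application code like this.

```python
# Toy human-awareness policy: slow the arm as a person approaches,
# stopping entirely inside a safety radius. Thresholds are invented.

def speed_scale(person_distance_m, stop_m=0.5, full_m=2.0):
    """Return a 0..1 velocity multiplier based on nearest-person distance."""
    if person_distance_m <= stop_m:
        return 0.0                  # person too close: hold still
    if person_distance_m >= full_m:
        return 1.0                  # clear workspace: full speed
    return (person_distance_m - stop_m) / (full_m - stop_m)  # linear ramp
```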

The MSEE research is taking Baxter to the next level, allowing it to learn through visual, rather than physical, instruction, which DARPA said could have an impact in military areas like repair and logistics.