Reinforcement Learning Key to Championship Soccer Robots, Robotics Pioneer Explains

by Brian Caulfield

Forget about robots stealing your job. Peter Stone is working on robots that can steal Ronaldo’s job. Let’s just say Stone won’t be out of work anytime soon.

Citing the mantra of “good problems make good science,” Stone and other computer scientists are building robots that they hope will compete with — and beat — a team of the world’s best soccer players by 2050.

While Stone’s robots can’t beat human players, they’re already among the best robot soccer players in the world, thanks to his team’s pioneering work putting reinforcement learning — a kind of machine learning — to work in robots.

Robust, Fully Autonomous, Real World Agents

The technology is key to the long-term goal in AI and robotics of creating “robust, fully autonomous agents in the real world,” Stone explains.

Stone, chair of the robotics portfolio program at the University of Texas at Austin, spoke Wednesday at NTECH, our annual internal engineering conference at our Silicon Valley campus, to a crowd of several hundred engineers, with hundreds more online.

Stone’s research interests in AI include machine learning (especially reinforcement learning), multiagent systems, robotics and e-commerce. He’s also the co-founder of Cogitai, a startup focused on continual learning.

But he’s best known for his passion for robot soccer, which has become a holy grail for AI and robotics researchers around the world.

Turning the Soccer Pitch into a Proving Ground

Stone and his team are among the best in the world, scoring championships in the annual RoboCup in 2011, 2012, 2014, 2015, 2016 and 2017. The key to Stone’s success: advancements in an area of machine learning known as reinforcement learning.

Most modern machine learning relies on something called supervised learning. With supervised learning, neural networks are trained with labeled examples of real-world images they’re supposed to understand — such as handwritten numbers — and given immediate feedback if they get it wrong.

The problem: That’s rarely how people learn, and it won’t help robots master complex tasks such as playing soccer.

The Difference Between Reinforcement Learning and Supervised Learning

Reinforcement learning algorithms, by contrast, must deal with delayed feedback — a system won’t know if it’s got it right until after it makes a long string of decisions. Think about winning a game of chess or navigating to a destination in your car, tasks that rely on getting a long string of decisions right, Stone explains.

Even more importantly, data isn’t given to the learning algorithm in advance. Rather, it has to generate its own experience based on what actions it selects as it strives to achieve a goal that requires a long string of decisions.
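To make those two ideas concrete, here is a minimal sketch of tabular Q-learning, a textbook reinforcement learning method, on a toy problem. It is not Stone’s code or anything his robots run. The six-cell corridor, the reward of 1 at the far end, and the values of `ALPHA`, `GAMMA` and `EPSILON` are all assumptions chosen for illustration. The agent starts with no data, gathers experience by acting, and gets a reward only after a string of correct moves.

```python
import random

# Hypothetical toy setup: a corridor of 6 cells, goal at the far right.
N_STATES = 6
ACTIONS = (-1, +1)                      # step left, step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative learning settings


def step(state, action):
    """Toy environment: the only reward comes on reaching the goal cell."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose(q, state, rng):
    """Epsilon-greedy choice: usually exploit, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    # Break ties at random so the untrained agent wanders in both directions.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=500, max_steps=200, seed=0):
    """Learn action values purely from experience the agent generates itself."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = choose(q, state, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            # Delayed feedback: credit flows backward through the estimate
            # of the next state's value, one step at a time.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known move in each non-goal cell."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda a: q[s][a])]
        for s in range(N_STATES - 1)
    ]
```

Calling `greedy_policy(train())` should return a move of `+1` (step right) for every cell, even though the agent was never told which way the goal lies. A soccer robot faces the same structure with vastly more states, actions and uncertainty.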

While that approach has yielded many breakthroughs over the past two decades, from computers that can beat humans at backgammon to Google’s victory last year at the ancient game of Go, the challenges become even more complex when teaching robots how to master soccer.

Not only do these robots have to learn to walk, dribble a ball and shoot on goal, for example, they also have to work together as a team while competing against other teams.

The key: Stone’s team is training machines to layer different skills — so they master clusters of skills at once, much the way people do — rather than learning one at a time in isolation.

Real-World Implications

Such technologies are essential if robots are to move out of factories and become useful in our homes and offices.

“The question that connects all of them is to what degree can autonomous intelligent agents learn in the presence of teammates and/or adversaries in real-time, dynamic domains,” Stone says.

Finding answers to tough questions like these means that, in the coming decades, the cluster of technologies known as AI will have a growing impact in industries as diverse as transportation and healthcare.

Some of those effects will be positive; others may require us to find creative ways to adapt. “We don’t think all jobs will go away, but the gap between rich and poor may widen,” Stone says.

Of course, there’s little to worry about, for now, particularly if you’re a world-class soccer player. Even a team of middle-aged computer scientists can easily beat the best robots in the world at real-world soccer.