
HFO-DQN

PLAYING 2D ROBOCUP WITH DEEP REINFORCEMENT LEARNING
What is this all about?

RoboCup is an ongoing global project that aims to push forward the state of the art in robotics, machine learning and artificial intelligence by posing a formidable challenge: by the middle of the twenty-first century, a team of fully autonomous humanoid robots shall win a game of soccer against the winning team of the most recent World Cup.


Since its inception, the RoboCup challenge has been broken down into multiple domains focusing on different areas of research, ranging from robotics, through computer vision and control theory, to strategy algorithms and machine learning. One such domain is RoboCup 2D, which simulates a full soccer game on a 2D screen, allowing participating teams to set aside the mechanics of physical robots and focus on strategy, gameplay and learning. This domain has been the subject of much research over the years, and is still considered challenging for learning systems.


The challenges of RoboCup 2D as a platform for machine learning are numerous, the major ones being (a) an extremely large, continuous state space, resulting from the variable nature of the game; (b) a highly dynamic environment, with parameters (such as the locations of the ball and players, player speeds and the game state) changing online and in quick succession; and (c) a multi-agent environment, which requires the learning algorithm to react and adapt to the actions of other agents on the field.


In recent years, an exciting method capable of tackling this type of problem has resurfaced: deep reinforcement learning. More specifically, Deep Q-Networks (DQN, see link to the right) equip reinforcement learning agents with a powerful non-linear function approximator, enabling them to handle large state spaces and complex environments. DQN has been shown to achieve high-level control in various game environments, in a mostly domain-independent fashion, using only the pixels of the screen image as input. This makes DQN a powerful tool for attacking RoboCup.
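As an illustration of the core idea, here is a minimal sketch of the DQN temporal-difference update. A simple linear function stands in for the deep network, and every size, constant and name here (N_FEATURES, GAMMA, the target-network copy W_target) is an illustrative assumption, not our actual setup:

```python
import numpy as np

# Minimal sketch of the DQN update rule (illustrative only).
# A linear approximator stands in for the deep network; the real
# agent uses convolutional layers over the screen image.

rng = np.random.default_rng(0)

N_FEATURES, N_ACTIONS = 8, 4   # hypothetical sizes
GAMMA, LR = 0.99, 0.01         # discount factor, learning rate

W = rng.normal(scale=0.1, size=(N_ACTIONS, N_FEATURES))  # online network
W_target = W.copy()                                      # frozen target network

def q_values(weights, state):
    # Q(s, a) for all actions at once.
    return weights @ state

def dqn_update(state, action, reward, next_state, done):
    """One gradient step on the DQN loss (y - Q(s, a))^2."""
    global W
    # Regression target uses the frozen target network, as in DQN.
    y = reward if done else reward + GAMMA * np.max(q_values(W_target, next_state))
    td_error = y - q_values(W, state)[action]
    # Gradient of the squared TD error w.r.t. the chosen action's weights.
    W[action] += LR * td_error * state
    return td_error

# One illustrative transition with random feature vectors.
s, s_next = rng.normal(size=N_FEATURES), rng.normal(size=N_FEATURES)
err_before = abs(dqn_update(s, action=2, reward=1.0, next_state=s_next, done=False))
err_after = abs(dqn_update(s, action=2, reward=1.0, next_state=s_next, done=False))
```

Repeating the update on the same transition shrinks the temporal-difference error: the frozen target network keeps the regression target y fixed while the online weights move toward it, which is exactly the stabilizing role the target network plays in DQN.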

 

This project takes the same general approach to the RoboCup 2D challenge, without relying heavily on prior knowledge of the domain or on the specific information available to the agent. We attempt to have an agent learn to play RoboCup using only the screen image of the game as input. This makes DQN the natural tool for the task, as it has proven successful in similar setups. We use a subdomain of RoboCup 2D, Half Field Offense (HFO; source here), which is simpler to control and understand than a full RoboCup 2D simulation.
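Since the agent sees only the screen, the raw capture has to be turned into a compact network input. The sketch below shows one common recipe for image-based agents (grayscale, downsample, stack recent frames so motion is visible); the frame sizes and helper names are hypothetical and not the ones our HFO setup actually uses:

```python
import numpy as np

# Hypothetical preprocessing for image-based input: convert the raw RGB
# screen capture to grayscale, downsample it, and stack consecutive
# frames so the network can perceive motion. All sizes are illustrative.

STACK = 4  # number of consecutive frames fed to the network

def preprocess(frame_rgb, out_size=(84, 84)):
    """Grayscale + naive strided downsample of one screen frame."""
    gray = frame_rgb.mean(axis=2)                        # (H, W), average channels
    h, w = gray.shape
    ys = np.linspace(0, h - 1, out_size[0]).astype(int)  # sampled row indices
    xs = np.linspace(0, w - 1, out_size[1]).astype(int)  # sampled column indices
    return gray[np.ix_(ys, xs)] / 255.0                  # normalize to [0, 1]

def stack_frames(frames):
    """Stack the last STACK preprocessed frames into one network input."""
    return np.stack(frames[-STACK:], axis=0)             # (STACK, 84, 84)

# Example: four blank 640x480 captures become one (4, 84, 84) state tensor.
raw = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(STACK)]
state = stack_frames([preprocess(f) for f in raw])
```

Stacking frames is what lets a feed-forward network infer velocities (of the ball and the players) from still images, which a single frame cannot convey.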

 

Previous attempts have been made at solving RoboCup 2D (and specifically HFO) with various methods, including reinforcement learning and even the DQN algorithm itself. However, our work is the first to use the screen image of the game as input, as opposed to hand-crafted features that rely on prior knowledge of the game. This choice is supported by multiple earlier successes of reinforcement learning agents that learn directly from image data. We present the first setup that allows an agent to learn in the RoboCup 2D environment from the image alone, and this image-based approach shows good results compared to other proposed solutions. Additionally, ours is one of the first solutions to apply the relatively new DQN outside the comparatively simple environment of Atari 2600 games, pushing it into the far more dynamic and complicated RoboCup domain.


This site will be updated periodically with our progress, as we strive to build on what we have learned and create more complex scenarios for our learning agents to tackle. Stay tuned!

More on DQN:

Read the Nature article here

Yona Cohen
Orr Krupnik

© 2017 by Yona Cohen & Orr Krupnik.  Proudly created with Wix.com

