Driven by Reinforcement Learning - a day at AICamp

Home · Insights · Driven by Reinforcement Learning - a day at AICamp

Date posted

21 October 2019

Reading time

25 Minutes

Marc Nevin

Driven by Reinforcement Learning - a day at AICamp

At this year's Kainos AI Camp in Belfast and Birmingham, the Applied Innovation team worked with some staff from AWS to deliver two DeepRacer Day Workshops at the camps!

AICamp is a two-week long camp full of theory, workshops and practical sessions to give students a strong introduction to the field of artificial intelligence and machine learning, and how it is being used to benefit our everyday lives.

This year, two days of each camp were dedicated to AWS; the first covering AWS SageMaker and deploying machine learning models. The second day was a workshop and racing day for the AWS DeepRacer ? teaching students about applications of AI models, reinforcement learning and getting some competitive spirit going before the hackathon that's at the end of AI Camp.

What is the AWS DeepRacer?

An AWS DeepRacer RC Car — A DeepRacer forward-facing

A fully autonomous 1/18th scale race car driven by reinforcement learning, 3D racing simulator, and global racing league. Featuring a Linux distribution with enough computing power and storage to run a fairly intense reinforcement learning model at some consistent speeds, the car is built on top of a popular model RC racing car chassis.

After developing and training, the model is then downloaded to a physical car to be raced around a massive 8m x 4m track. We followed the AWS competition rules used for their racing events;

You have four minutes on the track to complete as many laps as you can
The fastest lap is your submitted time
Come off the track and your racer is placed where it came off
Come off the track three times and your lap is marked as 'Did Not Finish'

Man walking around a large RC car track — The DeepRacer ReInvent 2018 Track setup we had at the Birmingham AI Camp

Using the AWS Console you can develop reinforcement learning models for the car and train them by setting the model to race around a simulated track for a few hours. The racers are aimed not only at just their racing competition, but they're also for education, mostly on how reinforcement learning works and its applications. For both of our events, we started with some refreshers on different aspects of machine learning before covering what exactly reinforcement learning is, with the model building acting as the exercise to reinforce what they've learnt.

If you want to build some models or are interested in how it works, here are the basics of Reinforcement Learning that we covered with the students;

Reinforcement Learning

If we look at Artifical Intelligence as an umbrella term, one of the biggest aspects under it is Machine Learning, which is often divided down into Supervised and Unsupervised Learning. Reinforcement learning is often forgotten in place of its more popular siblings, but still merits itself a category of its own for being somewhere between Supervised and Unsupervised.

The quote AWS gives us is;

Reinforcement learning, in the context of artificial intelligence, is a type of dynamic programming that trains algorithms using a system of reward and punishment
AWS DeepRacer Training

To break that down; a reinforcement learning algorithm, or agent, learns by interacting with its environment. The agent receives rewards by performing correctly and penalties for performing incorrectly.

The agent learns without intervention from a human by maximizing its reward and minimizing its penalty.

To explain that further, let's take the example of a robot jumping over some obstacles; first, you could give the robot the ability to jump at various distances while approaching an obstacle.

For the first test, let it jump either 2 metres away, 1 metre away, half a metre away, or even 0 metres away which causes it to crash into the obstacle before actually launching upwards.

After a lot of trial and error with when to jump, the robot will eventually learn that it will receive a point by jumping just before hitting an obstacle (say 0.5 metres) ? because jumping then leads to a successful clearing of the obstacle.