1. Origin
Amazon wanted to put Reinforcement Learning in the hands of developers through hands-on learning; it is also Amazon's venture into the autonomous-car segment.
DeepRacer is a 1/18th-scale autonomous car that runs Ubuntu on an Intel Atom processor, built on an RC car chassis and motor. The secret sauce is Amazon's DeepLens camera.
AWS DeepRacer includes a fully configured cloud environment that you can use to train your Reinforcement Learning models. It takes advantage of the new Reinforcement Learning feature in Amazon SageMaker and also includes a 3D simulation environment powered by AWS RoboMaker. You can train an autonomous driving model against a collection of predefined race tracks included with the simulator, then evaluate it virtually or download it to an AWS DeepRacer car and verify its performance in the real world.
2. RL — Reinforcement Learning
Types of machine learning
- Supervised learning
- Unsupervised learning
- Reinforcement learning
Reinforcement Learning terms
- Agent = the DeepRacer car
- Environment = the track
- State = what the agent observes at a given moment (here, the current camera image)
- Action (that the agent can take) = steer right or left
- Reward (given when the agent does a good thing)
- Episode = one run from start to finish, e.g. one lap around the track
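The terms above fit together as an agent-environment loop. Here is a toy sketch of that loop; `TrackEnv` and the action names are illustrative stand-ins, not the real DeepRacer interface:

```python
import random

ACTIONS = ["steer_left", "steer_right", "straight"]

class TrackEnv:
    """Toy environment: an episode ends after a fixed number of steps (one 'lap')."""
    def __init__(self, steps_per_lap=10):
        self.steps_per_lap = steps_per_lap
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return 0.0  # initial state (e.g. distance from the center line)

    def step(self, action):
        self.step_count += 1
        # Made-up reward: favor going straight, a stand-in for "staying on track".
        reward = 1.0 if action == "straight" else 0.1
        next_state = random.uniform(-1.0, 1.0)
        done = self.step_count >= self.steps_per_lap  # episode over: lap finished
        return next_state, reward, done

env = TrackEnv()
state = env.reset()
total_reward = 0.0
done = False
while not done:  # one episode
    action = random.choice(ACTIONS)  # a real agent picks actions via its policy
    state, reward, done = env.step(action)
    total_reward += reward
print("episode finished, cumulative reward:", total_reward)
```

A real agent would replace `random.choice` with a learned policy that maps the state to an action.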
REWARD FUNCTION
- It’s the core of RL.
Here it's better to have a reward function that gives the agent more reward for staying close to the center line.
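For instance, a center-line reward function in the style of the AWS sample functions could look like this (the console calls it with a `params` dictionary containing keys such as `track_width` and `distance_from_center`):

```python
def reward_function(params):
    """Give more reward the closer the car stays to the center line."""
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Three bands around the center line, as fractions of the track width
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0      # very close to center: full reward
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3     # likely close to going off track

    return float(reward)
```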
As the car runs around the track, it takes 15 pictures per second. Each picture is a state, and acting on it counts as one step.
R is the reward. It is given after the agent has taken an action from a state, and summed over an episode it becomes the cumulative reward.
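The cumulative reward can be sketched as a small helper. The discount factor `gamma` is a standard RL ingredient (it weights near-term reward above far-future reward), not something specific to DeepRacer:

```python
def cumulative_reward(rewards, gamma=0.99):
    """Discounted return G = r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    g = 0.0
    # Work backwards so each step folds in the discounted future return.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three steps of reward 1.0 each, undiscounted:
print(cumulative_reward([1.0, 1.0, 1.0], gamma=1.0))  # 3.0
```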
Now we use two different functions:
- VALUE FUNCTION: estimates the expected cumulative reward from a state
- POLICY FUNCTION: determines which action to take in a state
DeepRacer uses VANILLA POLICY GRADIENT and PPO (Proximal Policy Optimization, https://openai.com/blog/openai-baselines-ppo/).
These use gradient ascent, since we want to maximize the reward.
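As a minimal illustration of gradient ascent on a made-up one-parameter "reward" (real policy-gradient methods like VPG and PPO ascend an *estimated* gradient of expected reward; here the gradient is exact just to show the update rule):

```python
# Toy objective: R(theta) = -(theta - 3)^2, maximized at theta = 3.

def grad(theta):
    return -2.0 * (theta - 3.0)  # dR/dtheta

theta = 0.0
lr = 0.1
for _ in range(100):
    theta += lr * grad(theta)  # ascent: step *up* the gradient

print(round(theta, 3))  # converges toward 3.0
```

Note the `+=`: descent (minimizing a loss) would subtract the gradient instead.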
3. Virtual simulator
What AWS services are being used?
Simulation video is streamed to the console using Amazon Kinesis.
CloudWatch: to save logs.
while (training)
{
    ROBOMAKER: runs the simulation, takes photos, and passes them to SageMaker
    SAGEMAKER: does the training and saves the updated model
    The updated model is passed back to RoboMaker
}
To train and simulate, you set up:
- Track info
- Hyperparameters to play around with

Once you create your own model, there are parameters one can edit, in terms of:
ACTION INFO FUNCTION
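A discrete action space is essentially a grid of steering-angle/speed combinations. Here is a hypothetical illustration of building one; the specific angles and speeds below are made up, and the console generates a similar grid from the maximum steering angle, steering granularity, maximum speed, and speed granularity you choose:

```python
# Illustrative discrete action space (values are made up for the example).
steering_angles = [-30, -15, 0, 15, 30]  # degrees
speeds = [0.8, 1.6]                      # m/s

action_space = [
    {"index": i, "steering_angle": angle, "speed": speed}
    for i, (angle, speed) in enumerate(
        (a, s) for a in steering_angles for s in speeds
    )
]

print(len(action_space))  # 5 angles x 2 speeds = 10 discrete actions
```

A larger action space gives the agent finer control but makes training slower, since there are more actions to explore.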
REWARD FUNCTION
HOW TO TRAIN USING AWS DEEPRACER
TRAINING STARTED for DEFAULT MODEL
SOFTWARE ARCHITECTURE
I had a chance to learn about this amazing technology and participate in the DeepRacer league. Although I was a bit sad that, after reaching #1 a few times, I ended up at position #9, it was still super fun.
References:
1. Workshop
2. DeepRacer page
3. RoboMaker