Motivations, Opportunities and Challenges
Dr. N. Kemal Ure – Director of AI, Eatron Technologies

Part 2: Motivations and Opportunities

In the first part of this series of blog posts on Reinforcement Learning (RL), we presented an overview of RL and how it rose to fame as one of the most popular sub-fields of Artificial Intelligence (AI). In this part, we go through the motivations and opportunities in applying RL to autonomous driving.

1. Control/Decision Making in Noisy Environments

One of the main challenges in designing control systems is handling uncertainty in the model dynamics and external disturbances. Classical design techniques are well suited to uncertainties structured at the signal level, such as sensor noise or parametric uncertainty in the system dynamics. Dealing with unstructured uncertainty, such as the maneuvers of surrounding traffic, is out of scope for these methods. This limitation leads to conservative designs in which the surrounding traffic dynamics are grossly simplified, yielding safe but low-performance autonomous driving systems. RL, on the other hand, makes no assumptions about the structure of the underlying model, so it can learn to navigate traffic simply by observing more data. Being free of model assumptions enables us to tackle much more complicated traffic scenarios, such as high-density traffic jams, handling road accidents, and navigating urban environments with complex road layouts and rulesets.
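To make this concrete, here is a minimal sketch of the idea: a tabular Q-learning agent that keeps a vehicle near the lane centre while an unmodeled disturbance perturbs it at every step. Everything here (the toy state space, the noise distribution, the reward) is an illustrative assumption, not a real driving model; the point is that the agent never sees or models the disturbance structure, it only observes transitions.

```python
import random

# Hypothetical toy task: discrete lateral positions 0..4, with 2 as the
# lane centre. The disturbance below is "unstructured" noise that the
# agent never models explicitly -- it only observes its effects.
POSITIONS = range(5)
ACTIONS = (-1, 0, 1)          # steer left, keep, steer right

def step(pos, action, rng):
    """Apply the action plus a random disturbance the agent never models."""
    disturbance = rng.choice((-1, 0, 0, 0, 1))
    new_pos = min(max(pos + action + disturbance, 0), 4)
    reward = 1.0 if new_pos == 2 else -abs(new_pos - 2)
    return new_pos, reward

def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(p, a): 0.0 for p in POSITIONS for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.choice(list(POSITIONS))
        for _ in range(20):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(pos, x)])
            nxt, r = step(pos, a, rng)
            # standard Q-learning update from observed transitions only
            best_next = max(q[(nxt, x)] for x in ACTIONS)
            q[(pos, a)] += alpha * (r + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q

q = train()
# The learned policy steers back towards the lane centre despite the noise.
policy = {p: max(ACTIONS, key=lambda a: q[(p, a)]) for p in POSITIONS}
```

Note that nothing in `train` refers to the disturbance distribution: swapping in a different (even non-stationary) noise process changes the data the agent sees, not the algorithm.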

2. Going Beyond Heuristics, Rules of Thumb and Domain Knowledge

Almost any decision-making task can be approached with rules and heuristics. For instance, we can automate the lane-change decision of an autonomous car by dividing the problem into distinct traffic scenarios and then using expert knowledge to map each scenario to a decision. This approach is popular in industry because it is cheap: it involves minimal algorithm design or data processing. However, the performance of such rule-driven approaches is strictly limited by the quality and depth of the experience that went into designing the system. In addition, there will always be edge cases that are not well covered by the rules, as well as suboptimal decisions due to the limited resolution of the considered scenarios. The RL approach does not suffer from these limitations, since it makes no hard-coded assumptions about the problems it aims to tackle. At the expense of collecting data, either from the real world or from a simulator, RL explores most of the solution and scenario space, which usually results in systems that have a significant edge over rule-driven systems.
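The limitations of the rule-driven style are easy to see in code. The sketch below is a deliberately simple, hypothetical lane-change rule set (the gap thresholds, speed inputs, and decision labels are all illustrative assumptions): every threshold encodes a designer's experience, and any scenario that falls between the rules gets a hard-coded conservative default.

```python
# Illustrative rule-driven lane-change decision. Inputs are assumed to be
# gaps in metres and speeds in m/s; thresholds are hand-picked by a
# hypothetical designer, which is exactly the limitation discussed above.

def lane_change_decision(front_gap, left_gap, ego_speed, front_speed):
    """Map a hand-enumerated set of scenarios to a decision."""
    if front_gap > 50:               # open road: no reason to change lanes
        return "keep_lane"
    if front_speed >= ego_speed:     # lead vehicle is not slower than us
        return "keep_lane"
    if left_gap > 30:                # comfortable gap in the left lane
        return "change_left"
    # Every scenario not enumerated above (e.g. a left gap of 10-30 m,
    # merging traffic, an accident ahead) falls through to a conservative
    # default -- these are the edge cases the rules do not resolve.
    return "keep_lane"

print(lane_change_decision(front_gap=40, left_gap=35,
                           ego_speed=25, front_speed=20))  # change_left
```

Refining the behaviour in those fall-through regions means adding ever more rules and thresholds by hand, whereas a learned policy improves there simply by seeing more data from those scenarios.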

3. Adaptability to Changes in The Problem Dynamics

It is often the case that the dynamics or parameters of the problem change over time or across tasks. For instance, traffic rules and dynamics in a specific part of the world might not align with our previous experience. Adapting existing decision-making systems to these new tasks usually involves a considerable amount of engineering and manual work. Once again, RL's lack of assumptions about the problem it needs to tackle enables it to adapt to these changes. Training a pre-trained RL agent on a new task allows it to update its behavior to the new dynamics, and given enough training experience in the new environment, the agent can smoothly change its behavior to tackle the new challenges.
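This warm-start idea can be sketched in a few lines. In the hypothetical toy setup below (all names, dynamics and rewards are illustrative assumptions), the "old" task rewards keeping to the right side of a discrete lateral grid, while the "new" task, standing in for a region with different rules, rewards keeping to the left. The same training routine fine-tunes the pre-trained value table on the new task, with no re-engineering.

```python
import random

# Toy adaptation example: discrete lateral positions 0..4. The old task
# rewards position 4, the new one rewards position 0 (a stand-in for a
# change in local rules). Both tasks and all parameters are assumptions
# for illustration only.
STATES = range(5)
ACTIONS = (-1, 0, 1)

def make_env(target):
    """Build a deterministic environment rewarding the given position."""
    def step(pos, action):
        new_pos = min(max(pos + action, 0), 4)
        return new_pos, (1.0 if new_pos == target else 0.0)
    return step

def q_learn(step_fn, q=None, episodes=2000, alpha=0.2, gamma=0.9,
            eps=0.2, seed=0):
    """Tabular Q-learning; pass q to warm-start from a pre-trained table."""
    rng = random.Random(seed)
    q = dict(q) if q else {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.choice(list(STATES))
        for _ in range(10):
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(pos, x)])
            nxt, r = step_fn(pos, a)
            best_next = max(q[(nxt, x)] for x in ACTIONS)
            q[(pos, a)] += alpha * (r + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q

q_old = q_learn(make_env(target=4))            # pre-train on the old task
q_new = q_learn(make_env(target=0), q=q_old)   # fine-tune on the new task

# After fine-tuning, the greedy policy steers left instead of right.
policy_new = {s: max(ACTIONS, key=lambda a: q_new[(s, a)]) for s in STATES}
```

The fine-tuning call is identical to the pre-training call; only the experience changes. In practice this is done with deep RL policies and far richer environments, but the adaptation mechanism is the same: keep training the existing agent on data from the new dynamics.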

In short, RL offers an approach to automating critical decision-making tasks that scales with data, adapts to change, and has the potential to beat rule-based baseline solutions. In the third part of this series, we will take a look at the challenges of applying RL to real-world autonomous driving scenarios.