Flow: a deep reinforcement learning framework for mixed autonomy traffic

Flow is a traffic control benchmarking framework. It provides a suite of traffic control scenarios (benchmarks), tools for designing custom traffic scenarios, and integration with deep reinforcement learning and traffic microsimulation libraries.

Flow is developed at the University of California, Berkeley.

Flow process diagram
The various components of Flow.
Flow training process
The Flow training and evaluation process.


The following are successful controllers developed with Flow. For more details, visit our gallery.

Phantom shockwave dissipation on a ring

Inspired by the famous 2008 Sugiyama experiment, which demonstrated the spontaneous formation of traffic shockwaves (reproduced in the left video), and by a 2017 field study demonstrating that AVs can suppress shockwaves, we investigated whether reinforcement learning can train an optimal shockwave-dissipating controller.

In the right video, we learn a controller (policy) for one out of 22 vehicles. By training on ring roads of varying lengths and using a neural network policy with memory, we learned a controller that was both optimal (in terms of average system velocity) and able to generalize outside of the training distribution.
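To make the "average system velocity" objective concrete, the following is a minimal sketch of a ring road with 22 uncontrolled vehicles. It does not use Flow's API. The Intelligent Driver Model (IDM) for the human drivers, all parameter values, the 230 m ring length, and the function names are assumptions chosen for illustration, not details taken from the experiments described above.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.0, a=1.0, b=1.5, s0=2.0, delta=4):
    """Intelligent Driver Model acceleration in m/s^2 (assumed driver model).

    v:   own speed (m/s)
    gap: bumper-to-bumper distance to the leader (m)
    dv:  own speed minus leader speed (m/s)
    """
    # Desired gap grows with speed and with the closing rate on the leader.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a * b)))
    gap = max(gap, 0.1)  # guard against division by zero
    return a * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def average_velocity(speeds):
    """Mean speed over all vehicles: the system-level objective."""
    return sum(speeds) / len(speeds)


def simulate_ring(num_vehicles=22, ring_length=230.0, veh_length=5.0,
                  steps=3000, dt=0.1):
    """Euler-integrate vehicles on a ring; return the mean speed at each step."""
    spacing = ring_length / num_vehicles
    pos = [i * spacing for i in range(num_vehicles)]
    pos[0] += 0.5  # small perturbation so the flow is not perfectly uniform
    vel = [0.0] * num_vehicles
    history = []
    for _ in range(steps):
        # Compute every acceleration from the current state before updating.
        acc = []
        for i in range(num_vehicles):
            lead = (i + 1) % num_vehicles
            gap = (pos[lead] - pos[i]) % ring_length - veh_length
            acc.append(idm_acceleration(vel[i], gap, vel[i] - vel[lead]))
        for i in range(num_vehicles):
            vel[i] = max(0.0, vel[i] + acc[i] * dt)  # no reversing
            pos[i] = (pos[i] + vel[i] * dt) % ring_length
        history.append(average_velocity(vel))
    return history


if __name__ == "__main__":
    history = simulate_ring()
    print("final average velocity (m/s):", round(history[-1], 2))
```

In a learning setup, one vehicle's acceleration would come from the trained policy instead of the car-following model, and a quantity like `average_velocity` could serve as the reward signal.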

Intersection control

We also demonstrated the ability of a single autonomous vehicle to control the relative spacing of the vehicles following behind it, creating an optimal merge at an intersection.

As can be seen in the videos, without any AVs, the vehicles are stopped at the intersection by vehicles travelling in the other direction. We show that, even at low penetration rates, the autonomous vehicle "bunches" the other vehicles so that they avoid conflicts at the intersection, resulting in a large speed improvement.

Bottleneck control

Inspired by the rapid decrease in lanes on the San Francisco-Oakland Bay Bridge, we study a bottleneck that merges from four lanes down to two and then to one.

We demonstrate that the AVs are able to learn a strategy that increases the effective outflow at high inflows, and performs competitively with ramp metering.

Bottleneck control design
Control structure of the bottleneck. The scale of the segments is distorted for visualization.
Without control, congestion rapidly forms in the bottleneck.
Control structure of the bottleneck; at high inflows the outflow is improved by 25%.
Comparison of inflow-outflow curves for AV control vs. ramp metering. At high inflows, the two perform comparably.

On-ramp shockwave dissipation

We demonstrate that shockwave dissipation, previously demonstrated on a ring, scales to an actual highway situation. Vehicles travelling on a highway are perturbed by an aggressive on-ramp merge, and the autonomous vehicle must dissipate the resulting shockwave.

As can be seen in the videos, the autonomous vehicle learns to slow the vehicles following it, either to avoid the perturbation or to smooth its effects.