http://ai.stanford.edu/~asaxena/rccar/
High Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning
Jeff Michels, Ashutosh Saxena, Andrew Y. Ng
Stanford University
ICML 2005
Problem
- Drive a remote-control car at high speed
- Unstructured outdoor environments
- Off-the-shelf hardware: inexpensive cameras and little processing power
- Vision and driving control
Prior Work: Vision
Estimating depth from multiple images:
- Stereo vision (e.g., Scharstein & Szeliski, 2002)
- Depth from defocus (e.g., Klarquist et al., 1995)
- Optical flow / structure from motion (e.g., Barron et al., 1994)
Motivation #1: Monocular Vision
Stereo vision has limits:
- baseline distance between cameras
- vibration and blur
We would like to explore the use of monocular cues.
Prior Work: Driving Control
- Stereo vision for driving (LeCun, 2003)
- Highways with clear lane markings (Pomerleau, 1989)
- Single camera for an indoor robot, but with known color and texture of the ground (Gini & Marchi, 2002)
Motivation #2: Reinforcement Learning
Many past successes used model-based RL.
Does model-based RL still make sense even for tasks requiring complex perception?
(To simulate the vision input, we need to use computer graphics!)
Approach
Vision system:
- Estimate the distance to the nearest obstacle in each possible steering direction.
Driving control:
- Map the output of the vision system into steering commands for the car.
- Use reinforcement learning to learn the policy (sketched below).
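As a rough illustration of this two-stage design, here is a minimal sketch of the perceive-then-act loop. All names here (estimate_depths, policy, the stripe count, the thresholds) are hypothetical stand-ins, not the authors' code.

```python
import numpy as np

N_DIRECTIONS = 16  # one image column per candidate steering direction (assumed)

def estimate_depths(frame):
    """Stand-in for the learned vision system: one distance estimate
    (in meters) per candidate steering direction."""
    return np.full(N_DIRECTIONS, 10.0)

def policy(depths):
    """Stand-in for the learned controller: steer toward the direction
    with the largest estimated clearance, slow down when it is close."""
    best = int(np.argmax(depths))
    steering = 2.0 * best / (N_DIRECTIONS - 1) - 1.0  # map to [-1, 1]
    throttle = 1.0 if depths[best] > 5.0 else 0.3
    return steering, throttle

frame = np.zeros((240, 320, 3), dtype=np.uint8)  # stand-in camera frame
print(policy(estimate_depths(frame)))
```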
Vision System: Training Data
- The image is divided into vertical columns corresponding to possible steering directions.
- Each image is labeled with a depth for each vertical column.
- A laser range finder provides the ground-truth distances (a labeling sketch follows).
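A minimal sketch of how such training pairs could be assembled, assuming 16 stripes and one laser reading per steering direction; the names and shapes are illustrative, not from the paper.

```python
import numpy as np

def make_training_pairs(image, laser_depths, n_stripes=16):
    """Split the image into vertical stripes and pair each stripe with
    the laser-measured distance in that steering direction."""
    h, w = image.shape[:2]
    stripe_w = w // n_stripes
    return [(image[:, i * stripe_w:(i + 1) * stripe_w], float(laser_depths[i]))
            for i in range(n_stripes)]

image = np.zeros((240, 320, 3), dtype=np.uint8)  # stand-in camera image
laser = np.linspace(2.0, 20.0, 16)               # stand-in range readings (m)
pairs = make_training_pairs(image, laser)        # 16 (stripe, depth) examples
```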
Vision System: Monocular Cues
Monocular cues used by humans for depth perception (Loomis, Nature 2001):
- Texture variations (Laws' masks)
- Texture gradient / linear perspective (Radon, Harris)
- Haze (color)
- Occlusion
- Known object size
Feature Vector: Monocular Cues
- Texture variation
- Texture gradient
- Occlusion, object size, and global structure, captured by overlapping windows and by appending adjacent stripes' feature vectors
The resulting feature vector has 858 dimensions (a construction sketch follows).
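One plausible piece of this construction, sketched below: Laws'-mask texture energies per stripe, with each stripe's neighbors appended for context. The exact recipe that yields 858 dimensions is in the paper; the stripe count and window scheme here are assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

# 1-D Laws' kernels: Level, Edge, Spot, Ripple
L5 = np.array([1, 4, 6, 4, 1], float)
E5 = np.array([-1, -2, 0, 2, 1], float)
S5 = np.array([-1, 0, 2, 0, -1], float)
R5 = np.array([1, -4, 6, -4, 1], float)
MASKS = [np.outer(a, b) for a in (L5, E5, S5, R5) for b in (L5, E5, S5, R5)]

def stripe_features(gray, n_stripes=16):
    """Texture energy (mean absolute filter response) of every Laws'
    mask in every vertical stripe of a grayscale image."""
    w = gray.shape[1] // n_stripes
    feats = []
    for i in range(n_stripes):
        stripe = gray[:, i * w:(i + 1) * w]
        feats.append([np.abs(convolve2d(stripe, m, mode='valid')).mean()
                      for m in MASKS])
    return np.array(feats)                  # shape: (n_stripes, 16)

def with_neighbors(f):
    """Append each stripe's left/right neighbors (edge stripes padded
    by repetition) to give the regressor local context."""
    left = np.vstack([f[:1], f[:-1]])
    right = np.vstack([f[1:], f[-1:]])
    return np.hstack([left, f, right])      # shape: (n_stripes, 48)

gray = np.random.default_rng(0).random((240, 320))
F = with_neighbors(stripe_features(gray))   # one feature row per stripe
```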
Learning Algorithm
Supervised learning to estimate the distance d in each column of the image.
Learn the weights w via ordinary least squares (quadratic cost):

    arg min_w Σ_i (d_i − wᵀ x_i)²

where d_i is the ground-truth depth, x_i is the feature vector, and i ranges over all columns of all images.
Other regression methods (SVR, robust regression) gave similar results. A least-squares sketch follows.
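A minimal sketch of this fit with NumPy (not the authors' implementation); X stacks one 858-dimensional feature vector per (column, image) pair and d holds the matching ground-truth depths, both synthetic here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 858))                    # stand-in feature vectors
w_true = rng.normal(size=858)
d = X @ w_true + rng.normal(scale=0.1, size=5000)   # stand-in depth labels

# w = arg min_w Σ_i (d_i − wᵀ x_i)²
w, residuals, rank, sv = np.linalg.lstsq(X, d, rcond=None)
d_estimated = X @ w
```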
Results: Learning Depth Estimates
Errors are measured on a log scale:

    E = (1/N) Σ_i | log10(d_i) − log10(d_estimated,i) |

The system is able to predict depth with an average error of 0.26 orders of magnitude (metric sketched below).
[Bar chart (y-axis: error, 0.2-0.4): log-scale depth error for Radon (texture gradient), Harris (texture gradient), Laws (texture variations), and all features combined.]
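A sketch of this metric, read here as the mean absolute log10 error, which matches the "0.26 orders of magnitude" figure; not the authors' evaluation code.

```python
import numpy as np

def log_depth_error(d_true, d_est):
    """Mean |log10 d - log10 d_est|: an error of 0.26 means predictions
    are off by a factor of 10**0.26 ≈ 1.8 on average."""
    return float(np.mean(np.abs(np.log10(d_true) - np.log10(d_est))))

print(log_depth_error(np.array([5.0, 10.0]), np.array([4.0, 18.0])))
```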
Synthetic Graphics Data
- Graphics images for training the vision system
- Variable degree of graphical realism
- Can a system trained on synthetic images predict distances on real images?
Results: Combined Vision System
A chosen direction counts as a hazard when the distance to the nearest obstacle in that direction is less than 5 m (a sketch of this metric follows).
The hazard rate improves by combining the real- and synthetic-trained systems: a 24% hazard-rate reduction over using only real images.
[Bar chart (y-axis: hazard rate in %, 0-25) comparing Random, Graphics, Real, and Combined.]
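A small sketch of the hazard-rate metric as defined above; the variable names are hypothetical.

```python
import numpy as np

def hazard_rate(true_dist_in_chosen_dir, threshold_m=5.0):
    """Fraction of decisions where the true distance to the nearest
    obstacle in the chosen steering direction falls under the threshold."""
    return float(np.mean(np.asarray(true_dist_in_chosen_dir) < threshold_m))

print(hazard_rate([3.0, 8.0, 12.0, 4.5]))  # 2 of 4 under 5 m -> 0.5
```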
Control: Reinforcement Learning
- Model-based RL, despite the hard perception problem
- Randomly generated environments in a graphics simulator
- Pegasus (Ng & Jordan, 2000) used to learn the control policy
- Car initialized at (0,0) and run for a fixed time horizon
- The learning algorithm converged after 1674 iterations of policy search (see the sketch below)
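A toy sketch of Pegasus-style policy search (Ng & Jordan, 2000): fixing the random seeds makes each policy's estimated value a deterministic function of its parameters, which a simple local search can then optimize. The simulator and reward here are stand-ins, not the paper's driving simulator.

```python
import numpy as np

SEEDS = range(10)          # fixed sample of randomly generated environments

def simulate(theta, seed):
    """Stand-in for one deterministic rollout: total reward of the
    6-parameter policy theta in the environment drawn from `seed`."""
    env = np.random.default_rng(seed).normal(size=theta.size)
    return -float(np.sum((theta - env) ** 2))      # toy reward surface

def value(theta):
    """Pegasus value estimate: average reward over the *fixed* seeds,
    so the estimate is deterministic and can be searched directly."""
    return float(np.mean([simulate(theta, s) for s in SEEDS]))

rng = np.random.default_rng(0)
theta, best = np.zeros(6), value(np.zeros(6))
for _ in range(2000):                              # simple hill climbing
    cand = theta + rng.normal(scale=0.1, size=6)
    v = value(cand)
    if v > best:
        theta, best = cand, v
```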
Reinforcement Learning: Parameters
The learned policy has six parameters:
1: spatial smoothing of predicted distances
2: threshold distance for evasive action
3: steering angle parameter
4, 5: evasive action parameters
6: throttle parameter
A sketch of such a parameterized controller follows.
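A hypothetical sketch of a controller exposing these six parameter roles; the actual functional form is the paper's, and everything here (names, ranges, the evasive maneuver) is assumed.

```python
import numpy as np

def policy(depths, theta):
    """depths: predicted distance per steering direction (meters).
    theta = [smooth, evade_thresh, steer_gain, evade_angle, evade_throttle, cruise]."""
    smooth, evade_thresh, steer_gain, evade_angle, evade_throttle, cruise = theta
    k = max(1, int(round(smooth)))                   # 1: spatial smoothing width
    d = np.convolve(depths, np.ones(k) / k, mode='same')
    best = int(np.argmax(d))                         # clearest direction
    center = (len(d) - 1) / 2.0
    steering = steer_gain * (best - center) / center     # 3: steering angle
    throttle = cruise                                    # 6: throttle
    if d[best] < evade_thresh:                           # 2: evasive threshold
        steering = evade_angle if steering >= 0 else -evade_angle  # 4
        throttle = evade_throttle                                  # 5
    return float(np.clip(steering, -1.0, 1.0)), float(throttle)

depths = np.array([2.0, 3.5, 9.0, 7.0, 2.5, 2.0])    # toy vision output
print(policy(depths, [3, 5.0, 1.0, 1.0, 0.2, 0.8]))
```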
Results: Actual Driving Experiments
[Videos of the driving experiments]
Results: Driving Times
[Image: driving-time results]
Summary
- Monocular depth estimation is an interesting and important problem.
- Supervised learning for depth estimation.
- Model-based RL, using a computer graphics simulator, to learn the controller.
Extensions/Future Work
- Learn complete depth maps
- Markov Random Field (MRF) to estimate depths
Learning Depth from Single Monocular Images, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In NIPS 2005.
[Figure: input image, ground-truth depth map, and predicted depth map]
Contact:
Ashutosh Saxena, asaxena@cs.stanford.edu
http://ai.stanford.edu/~asaxena/rccar/
http://ai.stanford.edu/~asaxena/learningdepth/