DenseNet: Replacing HOG with Deep Convnet Pyramids for Object Detection
Forrest Iandola, Sergey Karayev, Ross Girshick, Matt Moskewicz, Yangqing Jia, Kurt Keutzer, and Trevor Darrell
forresti@eecs.berkeley.edu
University of California, Berkeley

Overview
Object Detection
• Selective Search + ConvNets
• Multiscale Pyramid Descriptors
• DenseNet: ConvNet Pyramids for improved efficiency
• DenseNet code is available: give it a try in your pipeline

Deep Convolutional Neural Networks
• 1989: high-quality digit recognition (Bell Labs, LeCun)
• 2012: best ImageNet Classification (Toronto)
• 2013: best PASCAL Detection (Berkeley)
• 2014: efficient detection + replace HOG with ConvNets

Regions with CNN Features
• "Selective Search" region proposals (Uijlings et al., IJCV 2013)
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. ArXiv 2013.

Regions with CNN Features
• Caffe: efficient ConvNet GPU implementation from Berkeley
  http://caffe.berkeleyvision.org

Regions with CNN Features
• Linear Classifier
• > 50% mAP on PASCAL 07 detection

Efficiency Issues with R-CNN
• 2000 windows = 100x the input image size

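The "100x" figure can be sanity-checked with a little arithmetic. The sketch below assumes each window is warped to 227x227 pixels before it enters the ConvNet and that the input image is about one megapixel; neither number appears on the slide, and the function name is made up for illustration.

```python
# Rough pixel count for per-window ConvNet evaluation.
# Assumptions (not from the slide): 227x227 warped windows, 1000x1000 input image.
NUM_WINDOWS = 2000          # from the slide
WINDOW_SIDE = 227           # assumed warped window side, in pixels
IMAGE_PIXELS = 1000 * 1000  # assumed input image size, in pixels


def window_pixel_ratio(num_windows, window_side, image_pixels):
    """Pixels pushed through the ConvNet, relative to the input image."""
    return num_windows * window_side ** 2 / image_pixels


ratio = window_pixel_ratio(NUM_WINDOWS, WINDOW_SIDE, IMAGE_PIXELS)
print(f"windows cover about {ratio:.0f}x the input image")
```

Under these assumptions the ratio comes out close to 100; a smaller input image would make it larger.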
Sliding-Window Detection on HOG Pyramids
pyra = featpyramid(image)

Sliding-Window Detection on HOG Pyramids
• Can add parts, if desired
• 33% mAP on PASCAL 07 detection

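The sliding-window step over a feature pyramid can be sketched as follows. This is a minimal illustration, not the detector behind the numbers above: `score_windows` and `detect` are hypothetical names, the pyramid is a plain list of levels, and each level is a nested list of per-cell channel values standing in for whatever `featpyramid` would return.

```python
def score_windows(feat, template):
    """Cross-correlate a linear template with one pyramid level.

    feat and template are lists of rows; each cell is a list of channel
    values. Returns a 2D list with one score per valid window position.
    """
    H, W = len(feat), len(feat[0])
    h, w = len(template), len(template[0])
    scores = []
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            s = 0.0
            for dy in range(h):
                for dx in range(w):
                    cell = feat[y + dy][x + dx]
                    weights = template[dy][dx]
                    s += sum(f * t for f, t in zip(cell, weights))
            row.append(s)
        scores.append(row)
    return scores


def detect(pyramid, template, threshold):
    """Return (level, y, x, score) for every window scoring above threshold."""
    hits = []
    for level, feat in enumerate(pyramid):
        if len(feat) < len(template) or len(feat[0]) < len(template[0]):
            continue  # this level is smaller than the template
        for y, row in enumerate(score_windows(feat, template)):
            for x, s in enumerate(row):
                if s > threshold:
                    hits.append((level, y, x, s))
    return hits


# Toy single-channel pyramid: a 3x3 level and a 1x1 level.
level0 = [[[0.0], [0.0], [0.0]],
          [[0.0], [1.0], [1.0]],
          [[0.0], [1.0], [1.0]]]
level1 = [[[1.0]]]
template = [[[1.0], [1.0]],
            [[1.0], [1.0]]]
hits = detect([level0, level1], template, threshold=3.0)
print(hits)
```

Because the scorer only sees a list of feature maps, it does not care which descriptor produced them.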
Efficiency of HOG Pyramids
• Pyramid = 8x the input image size
• Typical settings: 5 octaves, 10 scales per octave

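The "8x" figure follows from summing the areas of the pyramid levels. The sketch below assumes the largest level is at the input resolution and that each level shrinks the image side by a factor of 2^(-1/10), so that ten levels make one octave; the function names are illustrative only.

```python
def pyramid_scales(octaves=5, scales_per_octave=10):
    """Scale factor of every level; each octave halves the image side."""
    num_levels = octaves * scales_per_octave
    return [2.0 ** (-k / scales_per_octave) for k in range(num_levels)]


def pyramid_pixel_ratio(scales):
    """Total pyramid pixels relative to the input image (area goes as scale**2)."""
    return sum(s * s for s in scales)


scales = pyramid_scales()
pyramid_ratio = pyramid_pixel_ratio(scales)
print(f"{len(scales)} levels, about {pyramid_ratio:.1f}x the input image")
```

With these settings the 50 levels add up to a little under 8 times the pixels of the input image, in line with the slide.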
Sliding-Window Detection on ConvNet Pyramids
• Pyramid = 8x the input image size
• Efficiency of HOG + Accuracy of Deep Learning
• Easy to use:
pyra = convnet_featpyramid(image)

Implementing ConvNet Pyramids
State-of-the-art ConvNet implementations (e.g. Caffe):
• Can handle any input image size
• BUT, need batches of same-sized images to saturate GPU

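The slide text states the constraint but not how it is resolved. One possible way to turn differently sized pyramid levels into same-sized inputs, sketched here as an assumption and not as the DenseNet implementation, is to pack the levels onto fixed-size planes with some padding between them. The function names, the plane size, the padding and the minimum level size are all hypothetical.

```python
def level_sizes(height, width, octaves=5, scales_per_octave=10, min_side=16):
    """(height, width) of each pyramid level, largest first."""
    sizes = []
    for k in range(octaves * scales_per_octave):
        s = 2.0 ** (-k / scales_per_octave)
        h, w = round(height * s), round(width * s)
        if min(h, w) < min_side:
            break  # remaining levels would be even smaller
        sizes.append((h, w))
    return sizes


def pack_on_planes(sizes, plane_h, plane_w, pad=16):
    """Greedy shelf packing of levels onto fixed-size planes.

    Returns one (plane_index, y, x) placement per level. Levels go left to
    right on a shelf; a new shelf or plane starts when space runs out.
    """
    placements = []
    plane, shelf_y, shelf_h, x = 0, 0, 0, 0
    for h, w in sizes:
        if h > plane_h or w > plane_w:
            raise ValueError("level does not fit on a plane")
        if x + w > plane_w:        # shelf is full: start a new one below it
            shelf_y += shelf_h + pad
            shelf_h, x = 0, 0
        if shelf_y + h > plane_h:  # plane is full: start a new plane
            plane += 1
            shelf_y, shelf_h, x = 0, 0, 0
        placements.append((plane, shelf_y, x))
        x += w + pad
        shelf_h = max(shelf_h, h)
    return placements


sizes = level_sizes(500, 375)
placements = pack_on_planes(sizes, plane_h=1024, plane_w=1024)
num_planes = max(p for p, _, _ in placements) + 1
print(f"{len(sizes)} levels packed onto {num_planes} plane(s) of 1024x1024")
```

Every plane then has the same shape, so the planes can be batched; the recorded placements are what would be needed to cut each level's features back out of the ConvNet output.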
Implementing ConvNet Pyramids
• Easy to use:
pyra = convnet_featpyramid(image)

Computational Performance
Selective Search
• 2000 windows = 100x the input image size
• 1/10 fps
Pyramids
• Pyramid = 8x the input image size
• 1 fps

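As a quick consistency check, the reduction in pixels processed can be compared with the reported change in frame rate. The sketch uses only the four numbers from the slide; the variable names are illustrative.

```python
# Numbers taken from the slide.
selective_search = {"pixel_ratio": 100, "fps": 1 / 10}
pyramid = {"pixel_ratio": 8, "fps": 1.0}

# How much less input the pyramid approach pushes through the ConvNet.
work_reduction = selective_search["pixel_ratio"] / pyramid["pixel_ratio"]
# How much faster it runs, per the reported frame rates.
speedup = pyramid["fps"] / selective_search["fps"]

print(f"{work_reduction:.1f}x fewer pixels, {speedup:.0f}x higher frame rate")
```

The two ratios are of the same order, which fits a speedup that comes mainly from processing fewer pixels.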
Future Applications
for each of the 6000 papers citing HOG:
    pyra = featpyramid(image) #HOG Pyramid
    pyra = convnet_featpyramid(image)
• Exemplar-SVM (Alyosha Efros)
• RGB-D Recognition (Saurabh Gupta)
• Tracking Algorithms (TTI-Japan)