CSc 83020 Special Topics on Computer Vision

Advances in Video Target Detection & Tracking

Spring 2010

Professor Zhigang Zhu
Department of Computer Science
City College of New York and Graduate Center
The City University of New York (CUNY)

Time: Tuesday 11:45 am - 1:45 pm

Room: 3305
CRN Code: 10444
Credits: 3.0

Office Hours: Tuesday 10:00 - 11:30 am, Rm 4439

Course Update Information

Course Description

The course will discuss the state of the art in target (e.g., human, vehicle) detection, classification and tracking using vision and multimodal sensing. We will review the major algorithms and methods for detecting, classifying and tracking humans and other targets, drawing on the most recent conference and journal papers (CVPR, ICCV, PAMI, IJCV, etc.). The course will include several lectures by the instructor on the fundamentals, a few readings and presentations by the students, and a final project by each student.

Tentative Topics

1. Detection: Hypothesis Generation

1.0. Stationary Cameras or Mobile Vehicles?

1.1. Brute Force Approach: Sliding Window Technique

1.2. Motion: Background Subtraction or Optical Flow

1.3. Appearance/Color: Interest Point Detectors

1.4. Stereo: 3D Cues


2. Classification: Model Matching

2.1. Generative Models – a Bayesian Approach

    A. Shape, Texture and 3D Cues

    B. Exemplar-based models: distance transform

    C. GMMs and EM-based approaches

    D. Combined Shape and Texture Models

2.2. Discriminative Models – a Classification Approach

    A. Features (Wavelet, Codebook, HOG, Salient structures, Spatio-temporal features)

    B. Classifiers (a. SVM b. AdaBoost c. ANN …)

2.3. Integration of Generative and Discriminative Models

    A. A Mixed Generative-Discriminative Framework

    B. Pictorial Structures Approach

    C. Hybrid Body Representation


3. Tracking: Temporal Association

3.1. Kalman Filtering

3.2. Particle Filtering

3.3. Integration of Classification and Tracking


4. Use of 3D, Motion and Multiple Cues

4.1. Monocular or Stereo?

4.2. Ground Plane Assumption

4.3. Building and Background Removal

4.4. 3D in Matching and Tracking

4.5. More on Multiple Cues

References and Readings (to be updated)

Tutorials on Basic Techniques

3D Computer Vision: Motion and Stereo
Support Vector Machines (SVMs)
Support Vector Machines: Hype or Hallelujah?, Kristin P. Bennett and Colin Campbell, SIGKDD Explorations, 2(2), 2000, pp. 1-13
AdaBoost Tutorial / Yet another AdaBoost Talk
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Yoav Freund, Robert E. Schapire, AT&T Labs
Gaussian Mixture Models (GMMs)

Survey Papers

Monocular Model-Based 3D Tracking of Rigid Objects. V. Lepetit and P. Fua, Foundations and Trends® in Computer Graphics and Vision, 2005
Monocular Pedestrian Detection: Survey and Experiments. M. Enzweiler and D. M. Gavrila. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), available online: IEEE Computer Society Digital Library, 17 Oct. 2008.

Multi-cue Pedestrian Detection and Tracking from a Moving Vehicle, Gavrila D., Munder S., IJCV(73), No. 1, June 2007, pp. 41-59.


Pedestrian Detection: A Benchmark, Piotr Dollár, Christian Wojek, Bernt Schiele, Pietro Perona, CVPR 09

Towards Practical Evaluation of Pedestrian Detectors, Mohamed Hussein, Fatih Porikli, Larry Davis, TR2008-088, MITSUBISHI ELECTRIC RESEARCH LABORATORIES April 2009


Detection Using Motion, 3D, Color and Multiple Cues

A Sensor for Urban Driving Assistance Systems Based on Dense Stereovision. Nedevschi, S.; Danescu, R.; Marita, T.; Oniga, F.; Pocol, C.; Sobol, S.; Tomiuc, C.

Results from a Real-time Stereo-based Pedestrian Detection System on a Moving Vehicle, Max Bajracharya, Baback Moghaddam, Andrew Howard, Shane Brennan, Larry H. Matthies

Detecting Pedestrians Using Patterns of Motion and Appearance. Paul Viola, Michael J. Jones, Daniel Snow, ICCV 2003

Real-Time Human Detection in Uncontrolled Camera Motion Environments. Mohamed Hussein, Wael Abd-Almageed, Yang Ran, Larry Davis, ICVS 2006

Hierarchical Part-Template Matching for Human Detection and Segmentation. Zhe Lin, Larry S. Davis, David S. Doermann, Daniel DeMenthon, ICCV 2007

Stereo-Based Pedestrian Detection for Collision-Avoidance Applications, Nedevschi, S.; Bota, S.; Tomiuc, C., IEEE Transactions on Intelligent Transportation Systems, Vol. 10, No. 3, Sept. 2009, pp. 380-391

Feature based person detection beyond the visible spectrum. Kai Juengling (FGAN-FOM), Michael Arens (FGAN-FOM). CVPR 2009 Workshop

Pedestrian Detection for Driving Assistance Systems: Single-frame Classification and System Level Performance. Amnon Shashua, Yoram Gdalyahu, Gaby Hayun, IV 2004

Monocular Pedestrian Recognition Using Motion Parallax, M. Enzweiler, P. Kanter and D. M. Gavrila, IV 2008

Dynamic 3D Scene Analysis from a Moving Vehicle, B. Leibe, N. Cornelis, K. Cornelis, and L. Van Gool, IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07 Best Paper Award)

Background Subtraction for Freely Moving Cameras, Yaser Sheikh, Omar Javed, Takeo Kanade, ICCV 09


Histograms of Oriented Gradients for Human Detection. Navneet Dalal and Bill Triggs, CVPR 2005

Human Detection Using Oriented Histograms of Flow and Appearance, Navneet Dalal, Bill Triggs, and Cordelia Schmid, ECCV 2006

Fast Human Detection in Crowded Scenes by Contour Integration and Local Shape Estimation, Csaba Beleznai, Horst Bischof, CVPR 09

Multi-Cue Onboard Pedestrian Detection, Christian Wojek, Stefan Walk, Bernt Schiele, CVPR 09

Human Detection Using Partial Least Squares Analysis. William Robson Schwartz, Aniruddha Kembhavi, David Harwood, Larry S. Davis, ICCV 2009

Pictorial Structures Revisited: People Detection and Articulated Pose Estimation, Mykhaylo Andriluka, Stefan Roth, Bernt Schiele, CVPR 09

Hybrid Body Representation for Integrated Pose Recognition, Localization and Segmentation. Cheng Chen and Guoliang Fan, CVPR 2008

Tracking / Events / Behaviors

Multi-sensor Detection and Tracking of Humans for Safe Operations with Unmanned Ground Vehicles. Susan M. Thornton, Mike Hoffelder and Daniel D. Morris, Workshop on Human Detection from Mobile Platforms, ICRA 2008

Tracking in Unstructured Crowded Scenes. Mikel Rodriguez, Saad Ali, Takeo Kanade, ICCV 2009

Correlated Probabilistic Trajectories for Pedestrian Motion Detection. Frank Perbet, Atsuto Maki, Björn Stenger, ICCV 2009

Detection Driven Adaptive Multi-cue Integration for Multiple Human Tracking. Ming Yang, Fengjun Lv, Wei Xu, Yihong Gong, ICCV 2009

People-Tracking-by-Detection and People-Detection-by-Tracking, M. Andriluka, S. Roth and B. Schiele, CVPR 2008

Abnormal Crowd Behavior Detection using Social Force Model, Ramin Mehran, Alexis Oyama, Mubarak Shah, CVPR 09

Copyright © Zhigang Zhu ( zhu at ), Spring 2010.