CSc Senior Capstone Sequence Spring 2007-Fall 2007
Computer Science - The City College of New York

Multimodal Imaging, Modeling and Presentation

Instructor: Professor Zhigang Zhu
Computer Science and Computer Engineering
The City College of New York and Graduate Center
The City University of New York (CUNY)

Spring 2007 Class Meets: NAC 6/213, M & W 12:30 pm - 1:45 pm
Office Hours: Wednesday 10:30 am - 12:00 pm

Fall 2007 Class Meets: Wednesday 1:00 pm - 2:00 pm at NAC 8/210
Office Hours: Wednesday 11:00 am - 1:00 pm, Room: NAC 8/210

Course Update Information

January 29, 2007. First day of class. Please send me an email (zhu at cs dot ccny dot cuny dot edu) with your full name and the last 4 digits of your ID, and include "Capstone 2007" (exactly, please) in your Subject line. Otherwise I may not be able to receive it.

January 29, 2007. Some of the lecture slides can be found under Class Schedule.

Feb 27, 2007. You will find the lecture notes and the source code for the example from the class of 02/21.
Project 1: Please install Qt if you have not already done so. The source code provided will compile, but two functions (toGrayscale(), toLuminance()) have been deleted from the file 'ops.cpp'. Please fill these in, and feel free to modify and experiment with the source code. If you have any questions, feel free to email Mr. Edgardo Molina (molina at cs dot ccny dot cuny dot edu). Send your program to him by March 05, 2007 (midnight).
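For orientation, here is a minimal sketch of two common ways to reduce an RGB pixel to a single gray value: a plain channel average and a weighted luminance. It is not the course solution. The signatures of toGrayscale() and toLuminance() in 'ops.cpp' are not shown on this page, so the Rgb struct and the function names averageGray, weightedLuminance and convertInPlace are hypothetical, and the use of the ITU-R BT.601 weights for luminance is an assumption about what is intended.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical pixel type for illustration; the course code presumably
// works with Qt image types instead.
struct Rgb {
    std::uint8_t r, g, b;
};

// Plain average of the three channels (one common meaning of "grayscale").
std::uint8_t averageGray(const Rgb& p) {
    return static_cast<std::uint8_t>((p.r + p.g + p.b) / 3);
}

// Weighted sum with the ITU-R BT.601 luma weights (0.299, 0.587, 0.114),
// done in integer arithmetic: weights scaled by 1000, rounded to nearest.
std::uint8_t weightedLuminance(const Rgb& p) {
    return static_cast<std::uint8_t>(
        (299 * p.r + 587 * p.g + 114 * p.b + 500) / 1000);
}

// Replace every pixel with its gray value, using the supplied
// per-pixel conversion function.
template <typename F>
void convertInPlace(std::vector<Rgb>& pixels, F toGray) {
    for (Rgb& p : pixels) {
        const std::uint8_t v = toGray(p);
        p = Rgb{v, v, v};
    }
}
```

The weighted version gives green the largest weight because the eye is most sensitive to it, so a pure green pixel comes out brighter than a pure red or blue one, while the plain average treats all three equally.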

March 02, 2007. You will find the lecture notes in the class of 02/21.
Project 2: For this project you will continue using the source code from Project 1. We want to be able to move our image around the image panel when the user holds the left-mouse button and moves the mouse. If the user presses the right-mouse button on the image you will draw a 5x5 square (color of your choice) at the mouse's position on the image. You will have to add the appropriate mouse-event functions to the program to allow users to do this. Send your program to Mr. Edgardo Molina by March 12, 2007 (midnight).
Extra-Credit: Add a QLabel or another widget that displays the RGB values of the pixel where the user presses the right-mouse button.
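The geometry behind Project 2 can be sketched independently of Qt. The code below is an illustration under assumptions, not the posted answer: Point, Image, dragOffset, widgetToImage and drawSquare are hypothetical names, and the image is modeled as a plain pixel buffer. In the real program this logic would be called from the widget's mouse-event handlers (for example, overrides of QWidget::mousePressEvent and QWidget::mouseMoveEvent).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2D integer point (widget or image coordinates).
struct Point {
    int x, y;
};

// Hypothetical image buffer: row-major pixels stored as 0xRRGGBB.
struct Image {
    int width, height;
    std::vector<std::uint32_t> pixels;
    Image(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0) {}
};

// Left-button drag: the image offset moves by the same amount the mouse
// moved since the last mouse event.
Point dragOffset(Point offset, Point lastMouse, Point currentMouse) {
    return Point{offset.x + (currentMouse.x - lastMouse.x),
                 offset.y + (currentMouse.y - lastMouse.y)};
}

// Convert a mouse position in the panel to a pixel position in the image,
// given where the image's top-left corner is currently drawn.
Point widgetToImage(Point widgetPos, Point offset) {
    return Point{widgetPos.x - offset.x, widgetPos.y - offset.y};
}

// Right-button press: paint a 5x5 square centered on the clicked pixel,
// clipped to the image bounds so clicks near an edge stay in range.
void drawSquare(Image& img, Point center, std::uint32_t color) {
    constexpr int kHalf = 2;  // 2 pixels on each side of the center -> 5x5
    const int x0 = std::max(0, center.x - kHalf);
    const int x1 = std::min(img.width - 1, center.x + kHalf);
    const int y0 = std::max(0, center.y - kHalf);
    const int y1 = std::min(img.height - 1, center.y + kHalf);
    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            img.pixels[static_cast<std::size_t>(y) * img.width + x] = color;
        }
    }
}
```

Keeping the offset separate from the pixel data means that dragging never modifies the image itself. The same widget-to-image conversion also gives the pixel whose RGB values the extra-credit label would display.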

March 21, 2007. Multimodal Interface slides and code in digitalInterfacePresentAndCode.zip. Here is a simple ReadMe file.

April 13, 2007. Answers for Projects 1 and 2.

May 16, 2007. Final Grading.

Lecture Topics

This Capstone course will last for two semesters. The first semester will cover a number of basic principles of the related technologies in Sensing, Imaging, Computer Vision, and Video Computing. In the second semester, the project teams formed in the first semester will mainly focus on implementing their projects. Students will work in teams of about 3 to 5, with each team working on one project. You may select any programming language you like for your project, but be aware that you must be familiar with the programming languages appropriate for the project you select.

The topics will include:

Multimodal Sensing and Imaging (video, thermal, Doppler vibrometry, acoustic, etc.)
Color, Image Enhancement and Feature Extraction
Computer Vision and Recognition Basics
Video Processing and Computing

The lectures will be combined with the discussions of Capstone projects for real world applications.

Class Schedule

Project Topics (the second semester):

The following topics integrate advanced sensors, computer vision, video processing and virtual reality applications, and are quite open for further research and/or development in robotics, inspection, surveillance, mapping, and entertainment. The projects are subject to change based on the progress of the first-semester course. Students will work closely with PhD students working on the related research projects.

1. Multimodal Sensor Integration for Surveillance and Inspection
Students will have opportunities to use state-of-the-art sensors in the Visual Computing Lab for some important applications such as human detection and tracking, building inspection, medical diagnosis, and other robotic applications. The sensors and platforms include various video cameras (omnidirectional and PTZ), microphones (omnidirectional and unidirectional), a thermal camera, a laser vibrometer, and so on. Some of the basic tasks include:
(1) Graphical User Interface (GUI) for multimodal sensor integration.
(2) Video/audio data acquisition via FireWire or USB ports.
(3) Image loading, display and manipulation.
(4) Video stream reading, playing and analysis.
(5) Some simple functions such as motion detection and face detection from color and/or thermal images.
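
As an illustration of task (5), one of the simplest forms of motion detection is frame differencing: compare each pixel of the current frame with the same pixel of the previous frame and flag it if the intensity changed by more than a threshold. The sketch below assumes grayscale frames of equal size stored as flat arrays. The names GrayFrame, motionMask and motionDetected, as well as both thresholds, are hypothetical, and this is only one of several possible approaches.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical frame type: 8-bit grayscale pixels in row-major order.
using GrayFrame = std::vector<std::uint8_t>;

// Per-pixel frame differencing: mask[i] is 1 where the absolute intensity
// change between the two frames exceeds the threshold, 0 elsewhere.
// Frames of different sizes yield an all-zero mask.
std::vector<std::uint8_t> motionMask(const GrayFrame& prev,
                                     const GrayFrame& curr,
                                     int threshold) {
    std::vector<std::uint8_t> mask(curr.size(), 0);
    if (prev.size() != curr.size()) {
        return mask;
    }
    for (std::size_t i = 0; i < curr.size(); ++i) {
        const int diff =
            std::abs(static_cast<int>(curr[i]) - static_cast<int>(prev[i]));
        mask[i] = diff > threshold ? 1 : 0;
    }
    return mask;
}

// Report motion when at least minFraction of the pixels changed.
bool motionDetected(const std::vector<std::uint8_t>& mask, double minFraction) {
    if (mask.empty()) {
        return false;
    }
    std::size_t moving = 0;
    for (std::uint8_t m : mask) {
        moving += m;
    }
    return static_cast<double>(moving) / static_cast<double>(mask.size()) >=
           minFraction;
}
```

The pixel threshold suppresses sensor noise, and the fraction threshold keeps a handful of flickering pixels from triggering a detection. Both would need tuning for a particular camera, and the same idea applies to thermal frames.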

2. Stereo Mosaics and Image-Based Rendering
Many people know of, and may have used, tools that stitch images together to generate panoramic mosaics. But how about panoramic mosaics with 3D displays? In this project, students will learn and practice the imaging methods and algorithm development for generating and displaying stereoscopic panoramic images using a single video camera. We will also use very cost-effective devices to view the 3D panoramic images that you generate! Some of the basic tasks include:
(1) PTZ camera control and video capture
(2) Mosaic generation algorithms
(3) Web interface and mosaic browsing

Here are some references:

OmniStereo: Panoramic Stereo Imaging, S. Peleg, M. Ben-Ezra, and Y. Pritch, IEEE Trans. on PAMI, March 2001, pp. 279-290.
Mosaic-Based 3D Scene Representation and Rendering, Z. Zhu, A. R. Hanson, Special Session on Interactive Representation of Still and Dynamic Scenes, the Eleventh International Conference on Image Processing, Genova, Italy, September 11-14, 2005, pp. I-633-636.
See a stereo mosaic pair of the City College of New York

3. Multimodal Registration and Colorization
Today it is a trivial task to take a color picture with a digital camera, but it took quite an effort in the past. See how people generated color images without having a color camera. We will use this approach to learn color representation, rigid image transformation (for static scenes), and non-rigid image transformation (for water etc.), since the pictures with three different color filters were captured at three slightly different locations and times. Note that the three bands of a color image are in three different modalities (spectra), so we will also learn how to perform multimodal registration. This will also be useful for registering, for example, a color image and an IR image taken from the same or different locations.
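
To make the rigid case concrete, the sketch below aligns one color band to another by exhaustively trying small integer translations and keeping the one with the lowest mean squared difference over the overlapping region. This is a deliberately simple stand-in, not the method of the references: it handles translation only, and raw intensity differences can mislead when the bands differ strongly in brightness. Channel, Shift, meanSquaredDiff and bestShift are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical single-band image: 8-bit pixels in row-major order.
struct Channel {
    int width, height;
    std::vector<std::uint8_t> data;
    std::uint8_t at(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Integer translation (in pixels) applied to the moving band.
struct Shift {
    int dx, dy;
};

// Mean squared difference between ref(x, y) and moving(x + dx, y + dy),
// averaged over the pixels where both are inside their images.
double meanSquaredDiff(const Channel& ref, const Channel& moving,
                       int dx, int dy) {
    double sum = 0.0;
    long count = 0;
    for (int y = 0; y < ref.height; ++y) {
        for (int x = 0; x < ref.width; ++x) {
            const int mx = x + dx;
            const int my = y + dy;
            if (mx < 0 || my < 0 || mx >= moving.width || my >= moving.height) {
                continue;  // outside the overlap
            }
            const double d = static_cast<double>(ref.at(x, y)) -
                             static_cast<double>(moving.at(mx, my));
            sum += d * d;
            ++count;
        }
    }
    return count > 0 ? sum / count : std::numeric_limits<double>::max();
}

// Exhaustive search over all shifts in [-maxShift, maxShift]^2.
Shift bestShift(const Channel& ref, const Channel& moving, int maxShift) {
    Shift best{0, 0};
    double bestScore = std::numeric_limits<double>::max();
    for (int dy = -maxShift; dy <= maxShift; ++dy) {
        for (int dx = -maxShift; dx <= maxShift; ++dx) {
            const double score = meanSquaredDiff(ref, moving, dx, dy);
            if (score < bestScore) {
                bestScore = score;
                best = Shift{dx, dy};
            }
        }
    }
    return best;
}
```

Dividing by the number of overlapping pixels keeps large shifts, which have smaller overlaps, from winning simply because they sum over fewer pixels. For huge scans the same search is usually run coarse-to-fine on downsampled copies, and non-rigid regions such as moving water would need a local rather than a global transformation.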

Here are some references:

Photographer to the Tsar: Sergei Mikhailovich Prokudin-Gorskii. You may find an image in which the river was not well registered. Want to give it a try?

Image Colorization Project, Alexei (Alyosha) Efros, CMU. People have used this for course projects, which involve issues of image representation, memory management (for huge images), efficient algorithm design, and image transformation. So far, however, only rigid transformations have been tried.

Textbook and References

Textbook:

References:

“Introductory Techniques for 3-D Computer Vision”, Trucco and Verri, 1998.

More references will be added.