CSc Senior Capstone Sequence Spring 2007-Fall 2007
Computer Science - The City College of New York
Multimodal Imaging, Modeling and Presentation
Instructor: Professor Zhigang Zhu
Computer Science and Computer Engineering
The City College of New York and Graduate Center
The City University of New York (CUNY)
Spring 2007 Class Meets: M & W 12:30-01:45 PM at NAC 6/213
Office Hours: Wednesday 10:30 AM - 12:00 PM
Fall 2007 Class Meets: Wednesday 1:00 PM - 2:00 PM at NAC 8/210
Office Hours: Wednesday 11:00 AM - 1:00 PM, Room: NAC 8/210
Course Update Information
January 29, 2007. First day of class. Please send me an email
(zhu at cs dot ccny dot cuny dot edu) with your full name and the
last 4 digits of your ID, and include "Capstone 2007" (exactly as
written) in the Subject line. Otherwise I may not be able to receive it.
January 29, 2007. Some of the slides can be found at Class Schedule.
Feb 27, 2007. You will find the
lecture notes and the source code for the example in
the class of 02/21.
Project 1: Please install Qt
if you have not already done so. The source code provided will compile,
but two functions (toGrayscale(), toLuminance()) have been deleted from
the file 'ops.cpp'. Please fill these in, and feel free to modify
and experiment with the source code. If you have any questions, feel free to
email Mr. Edgardo Molina (molina at cs dot ccny dot cuny dot
edu). Send your program to him by
March 05, 2007 (midnight).
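For orientation, here is a minimal sketch of the two conversions that the function names suggest. It is not the posted solution and it does not use the Qt classes from the provided source: the `Rgb` and `Image` types and the helper names are hypothetical stand-ins, and the sketch assumes that "grayscale" means a plain channel average and that "luminance" means the commonly used Rec. 601 weighting (0.299 R + 0.587 G + 0.114 B). Check the lecture notes for the definitions actually expected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel and image types; the real project uses Qt classes instead.
struct Rgb {
    std::uint8_t r, g, b;
};

struct Image {
    int width;
    int height;
    std::vector<Rgb> pixels;  // row-major, width * height entries
};

// Assumed definition of "grayscale": the plain average of the three channels.
inline std::uint8_t grayValue(Rgb p) {
    return static_cast<std::uint8_t>((p.r + p.g + p.b) / 3);
}

// Assumed definition of "luminance": Rec. 601 weights in integer arithmetic,
// with +500 so that the division by 1000 rounds to the nearest value.
inline std::uint8_t luminanceValue(Rgb p) {
    return static_cast<std::uint8_t>((299 * p.r + 587 * p.g + 114 * p.b + 500) / 1000);
}

// Apply a per-pixel conversion and write the result into all three channels.
template <typename Convert>
Image mapToGray(const Image& in, Convert convert) {
    assert(in.pixels.size() ==
           static_cast<std::size_t>(in.width) * static_cast<std::size_t>(in.height));
    Image out{in.width, in.height, {}};
    out.pixels.reserve(in.pixels.size());
    for (Rgb p : in.pixels) {
        const std::uint8_t v = convert(p);
        out.pixels.push_back(Rgb{v, v, v});
    }
    return out;
}
```

Calling `mapToGray(img, grayValue)` or `mapToGray(img, luminanceValue)` returns a new image. The weighted version usually looks more natural because it gives green more weight than red or blue.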
March 02, 2007. You will find the lecture notes in
the class of 02/21.
Project 2: For this
project you will continue using the source code from Project 1. We want to be able to
move our image around the image panel when the user holds the
left-mouse button and moves the mouse. If the user presses the
right-mouse button on the image you will draw a 5x5 square (color of
your choice) at the mouse's position on the image. You will have to add
the appropriate mouse-event functions to the program to allow users to
do this. Send your program to Mr.
Edgardo Molina by March 12, 2007 (midnight).
Extra-Credit: Add a QLabel or
another widget that displays the RGB values of the pixel where the user
presses the right-mouse button.
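The logic behind Project 2 can be sketched independently of Qt. The sketch below is an illustration only: `Canvas`, `drawSquare` and `PanState` are hypothetical names, not part of the provided source. In a Qt program, the `press`, `move` and `release` calls would typically be made from your widget's overrides of mousePressEvent(), mouseMoveEvent() and mouseReleaseEvent(), after checking which button was pressed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image buffer with packed 0xRRGGBB pixels, row-major.
struct Canvas {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;

    Canvas(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0u) {
        assert(w > 0 && h > 0);
    }

    std::uint32_t& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Draw a size x size square centered on (cx, cy), clipped to the canvas
// so that clicks near the border do not write out of bounds.
void drawSquare(Canvas& c, int cx, int cy, std::uint32_t color, int size = 5) {
    const int half = size / 2;
    const int x0 = std::max(cx - half, 0);
    const int y0 = std::max(cy - half, 0);
    const int x1 = std::min(cx + half, c.width - 1);
    const int y1 = std::min(cy + half, c.height - 1);
    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            c.at(x, y) = color;
        }
    }
}

// Tracks where the image's top-left corner is drawn while the user drags.
struct PanState {
    int offsetX = 0;
    int offsetY = 0;
    int lastX = 0;
    int lastY = 0;
    bool dragging = false;

    void press(int x, int y) { dragging = true; lastX = x; lastY = y; }

    void move(int x, int y) {
        if (!dragging) return;
        offsetX += x - lastX;  // accumulate the mouse displacement
        offsetY += y - lastY;
        lastX = x;
        lastY = y;
    }

    void release() { dragging = false; }

    // Convert a position in the panel to image coordinates.
    int toImageX(int panelX) const { return panelX - offsetX; }
    int toImageY(int panelY) const { return panelY - offsetY; }
};
```

Note that the right-click position arrives in panel coordinates, so it has to be converted with `toImageX`/`toImageY` before drawing the square or reading the pixel's RGB values for the extra credit.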
March 21, 2007. Multimodal
Interface slides and code in digitalInterfacePresentAndCode.zip.
Here is a simple ReadMe file.
April 13, 2007. Answers for Projects 1 and 2.
May 16, 2007. Final
Grading.
Lecture Topics
This Capstone course will last for two
semesters. The first semester will be a study of a number of basic
principles of related technologies in Sensing, Imaging, Computer
Vision, and Video Computing. In the second semester, the project
teams formed in the first semester will mainly focus on the
implementation of the projects. Students will work in teams of
about 3 to 5 students on the same project. You may select any
programming language you would like to use for your project, but you
need to be aware that you must be familiar with the programming
languages appropriate for the selected project.
The topics will include:
- Multimodal Sensing and Imaging (video, thermal, Doppler
vibrometry, acoustic, etc.)
- Color, Image Enhancement and Feature Extraction
- Computer Vision and Recognition Basics
- Video Processing and Computing
The lectures will be combined with the discussions of Capstone projects
for real world applications.
Project Topics (the second semester):
The following topics
integrate advanced sensors, computer vision, video processing and
virtual reality applications, and are quite open for further research
and/or development in robotics, inspection, surveillance, mapping, and
entertainment. The projects are subject to change based on the
first semester course progress. Students will be working closely
with PhD students working on the related research projects.
1. Multimodal Sensor Integration for
Surveillance and Inspection
Students will have opportunities to use state-of-the-art
sensors in the Visual Computing Lab for some important applications
such as human detection and tracking, building inspection, medical
diagnosis, and other robotic applications. The sensors and platforms
include various video cameras (omnidirectional and PTZ),
omnidirectional and unidirectional microphones, a thermal camera,
a laser vibrometer, etc. Some of the basic tasks include
(1) Graphical User Interface (GUI) for multimodal sensor integration.
(2) Video/Audio data acquisition via Firewire or USB ports.
(3) Image loading, display and manipulation.
(4) Video stream reading, playing and analysis.
(5) Some simple functions such as motion detection and face detection
from color and/or thermal images.
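As an illustration of task (5), one of the simplest approaches to motion detection is frame differencing: compare two consecutive grayscale frames pixel by pixel and report motion when enough pixels change. The sketch below assumes this technique; the function names and the threshold parameters are hypothetical, and the project may well use a different method (for example, background modeling).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// A grayscale frame stored as a flat, row-major array of 8-bit intensities.
using Frame = std::vector<std::uint8_t>;

// Mark each pixel whose absolute intensity change exceeds `threshold`.
std::vector<bool> motionMask(const Frame& prev, const Frame& curr, int threshold) {
    assert(prev.size() == curr.size());
    std::vector<bool> mask(curr.size(), false);
    for (std::size_t i = 0; i < curr.size(); ++i) {
        const int diff = std::abs(static_cast<int>(curr[i]) - static_cast<int>(prev[i]));
        mask[i] = diff > threshold;
    }
    return mask;
}

// Report motion when at least `minFraction` of the pixels changed.
bool motionDetected(const Frame& prev, const Frame& curr,
                    int threshold, double minFraction) {
    const std::vector<bool> mask = motionMask(prev, curr, threshold);
    if (mask.empty()) return false;
    std::size_t changed = 0;
    for (bool m : mask) {
        if (m) ++changed;
    }
    return static_cast<double>(changed) / static_cast<double>(mask.size()) >= minFraction;
}
```

The per-pixel threshold suppresses sensor noise, while the fraction threshold keeps a few isolated noisy pixels from triggering a detection. Both values have to be tuned for the camera and the scene; the same idea applies to thermal frames.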
2. Stereo Mosaics and Image-Based Rendering
Many people know of, and may have used, tools that perform image
stitching to generate panoramic mosaics. But how about panoramic
mosaics with 3D displays? In
this project, students will learn and practice the imaging methods and
algorithm development for generating and displaying stereoscopic
panoramic images using a single video camera. We will also use very
cost-effective devices to view the
3D panoramic images that you generate! Some of the basic tasks include
(1) PTZ camera control and video capture
(2) Mosaic generation algorithms
(3) Web interface and mosaic browsing
Here are some references:
- OmniStereo:
Panoramic Stereo Imaging, S. Peleg, M. Ben-Ezra, and Y. Pritch, IEEE Trans. on
PAMI, March 2001, pp. 279-290.
- Mosaic-Based
3D Scene Representation and Rendering, Z. Zhu, A. R. Hanson,
Special Session on Interactive Representation of Still and Dynamic
Scenes, the Eleventh International Conference on Image
Processing, Genova, Italy, September 11-14, 2005, pp. I-633-636.
- See a
stereo mosaic pair of the City College of New York
3. Multimodal
Registration and Colorization
Today it is a trivial task to take a color picture by using a
digital camera, but it was quite an effort in the past. See how people
generated color images without having a color camera. We will use this
approach to learn color representation, rigid image transformation (for
static scenes), and non-rigid image transformation (for water etc.),
since pictures with three different color filters were captured at
three slightly different locations and times. Note that the three bands
of a color image are in three different modalities (spectra) so we will
also learn how to perform multimodal registration. This will also be
useful to register, for example, a color image and an IR image taken
from the same or different locations.
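For the rigid case, a common starting point is an exhaustive search over small integer translations: slide one band over the reference band and keep the shift with the lowest mean squared difference on the overlapping region. The sketch below illustrates that idea only. The types and function names are hypothetical, it handles translation but not rotation or non-rigid motion, and the squared-difference score assumes the bands have similar brightness, which is often not true across modalities (normalized cross-correlation or mutual information are commonly used instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical single-band image, row-major.
struct Gray {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

struct Shift {
    int dx;
    int dy;
};

// Mean squared difference between `ref` and `moving` translated by (dx, dy),
// computed only where the two images overlap.
double meanSquaredDiff(const Gray& ref, const Gray& moving, int dx, int dy) {
    assert(ref.width == moving.width && ref.height == moving.height);
    double sum = 0.0;
    long count = 0;
    for (int y = 0; y < ref.height; ++y) {
        for (int x = 0; x < ref.width; ++x) {
            const int mx = x - dx;  // pixel of `moving` that lands on (x, y)
            const int my = y - dy;
            if (mx < 0 || my < 0 || mx >= moving.width || my >= moving.height) continue;
            const double d = static_cast<double>(ref.at(x, y)) - moving.at(mx, my);
            sum += d * d;
            ++count;
        }
    }
    return count > 0 ? sum / count : std::numeric_limits<double>::infinity();
}

// Try every integer shift in [-maxShift, maxShift]^2 and keep the best one.
Shift bestShift(const Gray& ref, const Gray& moving, int maxShift) {
    assert(maxShift >= 0);
    Shift best{0, 0};
    double bestErr = std::numeric_limits<double>::infinity();
    for (int dy = -maxShift; dy <= maxShift; ++dy) {
        for (int dx = -maxShift; dx <= maxShift; ++dx) {
            const double err = meanSquaredDiff(ref, moving, dx, dy);
            if (err < bestErr) {
                bestErr = err;
                best = Shift{dx, dy};
            }
        }
    }
    return best;
}
```

To build a color image, one band (say, green) would serve as the reference, and `bestShift` would be run once for each of the other two bands. For large scans the exhaustive search becomes slow, which is where a coarse-to-fine image pyramid typically comes in.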
Here are some references:
- Photographer to
the Tsar:
Sergei Mikhailovich Prokudin-Gorskii. You may find an image in which
the river was not well registered. Want to give it a try?
- Image
Colorization Project, Alexei (Alyosha) Efros, CMU. People have used this for
course projects, which involve issues of image representation, memory
management (for huge images), efficient algorithm design, and image
transformation. So far, only rigid transformations have been tried.
Textbook and References
Textbook:
"Digital Image Processing", 2nd Edition, by Gonzalez and Woods, Prentice Hall, © 2002
References:
“Introductory Techniques for 3-D Computer
Vision”, Trucco and Verri, 1998.
More references will be added.
Copyright @ Zhigang Zhu (email zhu@cs.ccny.cuny.edu), City College of New York, 2006.