It is exciting to do truly interdisciplinary research, part of the revolution that brings advances in computation and automated observation (data) to bear on problems in science and engineering. Even when I worked in pure mathematics, I valued exploiting automatic computation on concrete examples to gain insight. In moving from mathematics to computer science, I was able to apply my background first to networking in Yemini's lab at Columbia and then to computer vision in Nayar's lab, also at Columbia. There I worked on general models of camera geometry, modeling radiometric response functions, and developing systems that combine projectors and cameras.
At CCNY, my collaboration with Gladkova introduced me to the NOAA-CREST center. Initially we focused on compressing sounder (infrared spectrometer) data. As a baseline, we comprehensively evaluated current compression algorithms on many samples of sounder data, distributed globally and over time. We developed a method to estimate the optimal compression rate by estimating the entropy of the data. We then developed and patented a novel compression algorithm for the sounder planned for the upcoming GOES-R NOAA satellite. Our algorithm showed superior performance in an international compression competition, but the instrument was cancelled, ending the project.
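The entropy baseline can be sketched in a few lines. Below is a minimal, hypothetical illustration assuming a memoryless (order-0) source model; our actual estimator conditioned on spatial and spectral context, since sounder channels are strongly correlated, which yields a much tighter bound:

```python
import numpy as np

def empirical_entropy_bits(samples):
    """Order-0 empirical entropy in bits per sample: a lower bound on
    lossless code length only for a memoryless source (sketch only)."""
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Illustrative stand-in for one quantized sounder channel.
rng = np.random.default_rng(0)
channel = np.round(rng.normal(0, 4, size=100_000)).astype(np.int16)

print(f"entropy estimate: {empirical_entropy_bits(channel):.2f} bits/sample; "
      f"raw storage: {8 * channel.itemsize} bits/sample")
```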
While designing an implementation of an operational compression algorithm, we addressed reconstructing data from lost scan-lines. This led us to study NASA's MODIS 1.6-micron band, which is critical for separating snow and ice from cloud. The MODIS instrument on the Aqua satellite is missing 75% of the scan-lines in this band due to damage. Over two years we developed and evaluated an algorithm that accurately estimates the missing data through statistical regression. This estimation improved both NASA's snow and cloud products, and our work is currently in operational use at NASA for the Collection 6 snow product.
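The core idea can be sketched as follows. This hypothetical version fits a single global least-squares model on the intact rows of a stand-in granule, whereas the operational algorithm adapts the regression locally:

```python
import numpy as np

def fill_missing_lines(damaged, predictors, missing_rows):
    """Estimate missing scan-lines of one band from co-registered bands
    via a global least-squares fit on the intact rows (sketch only)."""
    ok = np.ones(damaged.shape[0], dtype=bool)
    ok[missing_rows] = False
    # Design matrix: predictor bands plus an intercept column.
    X = np.column_stack([b.ravel() for b in predictors] +
                        [np.ones(damaged.size)])
    train = np.repeat(ok, damaged.shape[1])  # row mask -> pixel mask
    coef, *_ = np.linalg.lstsq(X[train], damaged.ravel()[train], rcond=None)
    filled = damaged.copy()
    filled[missing_rows, :] = (X @ coef).reshape(damaged.shape)[missing_rows, :]
    return filled

# Stand-in data: a damaged 1.6-micron band estimated from an intact,
# physically correlated 2.1-micron band.
rng = np.random.default_rng(0)
band21 = rng.uniform(0, 1, (200, 300))
band16 = 0.8 * band21 + 0.05 + rng.normal(0, 0.01, band21.shape)
restored = fill_missing_lines(band16, [band21], np.arange(0, 200, 4))
```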
I have also worked on other projects that restore and enhance data from satellite remote sensors using statistical regression. One project created virtual sensors with learning-based super-resolution. Gladkova and I developed algorithms to reconstruct a virtual green band, and thus true-color images, for the upcoming GOES-R mission. We also developed an algorithm to reconstruct the 13.3-micron band for VIIRS, which is needed to estimate cloud-top pressure; it is being evaluated for use in the official NOAA product.
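To illustrate the virtual-band idea, the sketch below fits a linear map to a green band using hypothetical training reflectances from a sensor that carries one, then applies it to the blue, red, and near-IR bands that the GOES-R imager does carry; our published algorithm is considerably more elaborate and locally trained:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data from a sensor that has a green band.
rng = np.random.default_rng(2)
blue, red, nir = (rng.uniform(0, 1, 10_000) for _ in range(3))
green = 0.45 * blue + 0.45 * red + 0.10 * nir  # illustrative ground truth

X = np.column_stack([blue, red, nir])
model = LinearRegression().fit(X, green)

# Applied to the imager's bands, the model synthesizes the missing
# green reflectance needed for a true-color composite.
virtual_green = model.predict(X)
```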
With Gladkova and Ahmed, I applied machine-learning classification to detect harmful algal blooms from satellite imagery. More recently, Gladkova and I developed a sea-ice classifier for the National Ice Center. Our ice product fuses both microwave and visible imagery. We also built a web app for monitoring and visualizing all the major ice products side by side; its design is based on a prototype I built for NOAA, now operational, that monitors regional sea surface temperature.
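A minimal sketch of such a fused classifier appears below; the features, stand-in labels, and choice of a random forest are illustrative assumptions rather than the deployed product:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-pixel features: passive-microwave brightness
# temperatures fused with visible/near-IR reflectances.
rng = np.random.default_rng(1)
n = 5_000
tb37, tb89 = rng.normal(240, 15, n), rng.normal(230, 20, n)
vis, nir = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
X = np.column_stack([tb37, tb89, vis, nir])
# Stand-in labels (0 = open water, 1 = ice), e.g. from analyst charts.
y = (tb37 + 100 * vis > 280).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```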
In addition to remote sensing, I have worked with Krakauer and my PhD student Aizenman on climate research, applying information-theoretic measures to evaluate long-term forecasting models. We generalized evaluations of point forecasts to information-gain metrics for probabilistic predictions. Our most recent work, accepted subject to revision, shows that current long-term physics-based forecasts improve when fused with statistical models. A second climate-related project with Aizenman, together with Vörösmarty's Crossroads Initiative, addresses the risk climate change poses to the world's river deltas. Unsupervised machine learning led us to identify deltas heavily dependent on engineering, which will be at risk as sea levels rise and construction costs increase. This work appears in the journal Science.
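Concretely, the information gain scores each forecast by how much more probability it assigned to the observed outcome than climatology did; positive values mean the forecast adds information. A minimal sketch with illustrative numbers:

```python
import numpy as np

def information_gain(p_forecast, p_climatology):
    """Mean log2 likelihood ratio, in bits per forecast, of the observed
    outcomes under the forecast versus under climatology."""
    p_f = np.asarray(p_forecast, dtype=float)
    p_c = np.asarray(p_climatology, dtype=float)
    return float(np.mean(np.log2(p_f / p_c)))

# Tercile seasonal forecasts: climatology gives the observed category
# probability 1/3; these are the probabilities the forecast gave it.
print(information_gain([0.45, 0.50, 0.30, 0.60], [1/3] * 4))  # ~0.43 bits
```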
I also collaborate with Mageras and my PhD student Hu at Memorial Sloan Kettering Cancer Center (MSKCC) on segmentation of volumetric medical images. Current clinical practice involves manual contouring of each structure by a medical professional. Fully automatic methods work in some cases (e.g., healthy organs), but they often produce errors requiring manual intervention, and legal and ethical considerations necessitate human oversight regardless. A semi-automatic method we developed dramatically accelerates manual segmentation, automating where possible and integrating correction with on-line statistical learning. The interface replaces labor-intensive contour delineation with rough brush strokes that indicate examples of different tissue types. The examples are input to a statistical model from which a proposed segmentation is built. The user can correct and then accept the segmentation, which updates the statistical model, so subsequent segmentations progressively automate much of the work. Over the years we extended the method from a Markov random field to a conditional random field, which makes fewer assumptions. Starting with greyscale CT images, we eventually tackled multi-modal images (e.g., MRI), and while we initially produced bi-class discrimination for normal tissues, our recent work gives multi-class segmentation of tumors. Our algorithm has been incorporated into MSKCC's next-generation system.
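The correct-and-accept loop can be sketched as below. For clarity, a per-voxel on-line classifier stands in for the statistical model; our actual system enforces spatial coherence with a conditional random field, and the data and labels here are random stand-ins:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
volume = rng.normal(size=(4, 64, 64))   # stand-in for a CT/MR volume
feats = volume.reshape(-1, 1)           # per-voxel features (intensity only)
classes = np.array([0, 1, 2])           # e.g. background, organ, tumor

model = SGDClassifier(loss="log_loss")

# 1) The user paints rough brush strokes: voxel indices plus tissue labels.
stroke_idx = rng.choice(volume.size, 300, replace=False)
stroke_lab = rng.choice(classes, 300)

# 2) The strokes train the model and a segmentation is proposed.
model.partial_fit(feats[stroke_idx], stroke_lab, classes=classes)
proposed = model.predict(feats).reshape(volume.shape)

# 3) Corrections are folded back in on-line, so later slices and studies
#    need progressively fewer strokes.
corr_idx = rng.choice(volume.size, 50, replace=False)
corr_lab = rng.choice(classes, 50)
model.partial_fit(feats[corr_idx], corr_lab)
proposed = model.predict(feats).reshape(volume.shape)
```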
Over the years, my work has been supported by individual grants from NIH (P20), NOAA, and NASA. I have also collaborated on ONR-supported research and participated in several successful interdisciplinary and center grant proposals.