Qi Shan was remarkably calm as he demonstrated his advanced photo-manipulation research Wednesday at the University of Washington.
Shan’s “photo uncrop” technology was among dozens of demos that students presented during the Computer Science & Engineering Department’s annual showcase for industry partners. The audience included recruiters, investors and researchers, including representatives of big tech companies such as Microsoft, Intel and Samsung. Some flew across the country to see what and who are emerging from the labs.
Demos were just part of the event. The agenda included several recruiting sessions and activities highlighting research that seems ready for commercialization.
It was also an opportunity for department heads to tout progress and drop hints about the need for donations to support the department's growth.
Chairman Hank Levy noted that 17 professors were added over the past three years, including six who joined this year, bringing expertise in fields such as natural-language processing, programming languages and security.
“People realize we have jumped over a lot of departments and are among the top groups in the country,” Levy said during the opening presentation.
The department also is growing because a larger percentage of undergraduates are taking introductory computer-science classes. That’s happening because computer-science understanding is increasingly needed in all sorts of fields, noted Ed Lazowska, the Bill & Melinda Gates Chair in Computer Science. This fall’s introductory course has a record 1,000 students being taught with the help of 65 teaching assistants.
An increase in graduate students over the past few years, from about 160 in past years to roughly 260 today, is also putting pressure on the department's capacity.
Lazowska said the overall program has outgrown the swanky, six-story Paul G. Allen Center for Computer Science & Engineering, which opened in 2003. The department is gearing up to construct a $110 million building nearby, assuming it can line up donors and an initial $40 million in state funding it’s seeking.
Shan (pictured) may be in a position to chip in one of these days, perhaps when the department starts raising money for a third structure.
As part of Ph.D. work he expects to complete in early 2015, Shan developed technology that can assemble a panoramic image, pixel by pixel, with “computational photography.” He worked on the project with UW faculty members Brian Curless and Steven Seitz, along with Google employee Carlos Hernandez and Yasutaka Furukawa, a former UW researcher who worked on Google Maps and is now an assistant professor at Washington University in St. Louis.
With the technology, an image of a person standing in the doorway of a cathedral could be expanded to include the rest of the building and the block on which it sits. The expanded picture would be generated through an analysis of imagery found online, including aerial images and photo collections such as Flickr.
It’s more than just stitching together photos. To “uncrop” the images and add details of the surrounding area, the software analyzes a set of images, then builds a model and generates a photorealistic image of the area. The photo above is an example provided by Shan. Here’s another:
You’d think this could be done by matching up GPS information attached to photos, particularly those taken with smartphones. But it turns out only a fraction of available images have GPS data attached, and it’s not precise enough anyway, Shan said. So his system figures out where cameras are positioned by analyzing photos of the same area. It uses the scene’s geometry to reconstruct a larger image that captures more of the landscape.
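The core idea, stripped to its simplest form, is that overlapping images can be aligned purely by their content: find the shift where the shared pixels agree, then merge. Here's a toy one-dimensional sketch of that alignment-and-merge step (an illustration only; the function names and the 1-D setup are my own, and the real system performs full 3-D structure-from-motion, not a simple offset search):

```python
# Toy 1-D illustration of aligning two "images" by content instead of GPS.
# This is a simplified sketch, not the UW/Google system, which reconstructs
# full 3-D scene geometry from many photos.

def best_offset(left, right, min_overlap=3):
    """Find the shift of `right` relative to `left` that best matches
    the overlapping pixels (normalized sum of squared differences)."""
    best, best_err = None, float("inf")
    for off in range(1, len(left) - min_overlap + 1):
        overlap = min(len(left) - off, len(right))
        err = sum((left[off + i] - right[i]) ** 2 for i in range(overlap))
        err /= overlap  # normalize so longer overlaps aren't penalized
        if err < best_err:
            best, best_err = off, err
    return best

def stitch(left, right):
    """Merge two overlapping 1-D strips into one wider 'uncropped' strip."""
    off = best_offset(left, right)
    overlap = len(left) - off
    merged = list(left[:off])
    # Average the shared region, then append the pixels only `right` saw.
    merged += [(left[off + i] + right[i]) / 2 for i in range(overlap)]
    merged += list(right[overlap:])
    return merged

left = [10, 20, 30, 40, 50, 60]
right = [40, 50, 60, 70, 80, 90]  # overlaps the last three pixels of `left`
print(stitch(left, right))  # a wider strip covering both views
```

The same principle, applied in 2-D with feature matching across thousands of photos, lets the system recover where each camera stood and what lay beyond each frame's edges.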
So far it works best in places with enough images available to analyze. Working with photos of 59 landmarks in Rome, the project had a 70 percent success rate combining aerial imagery and photos taken on the ground.
The technology could be used to generate photos of an area larger than a camera can capture in a single shot, where obstructions and limited viewing angles get in the way.
It could also be used to remove “photobombs” or occluders, such as the person who walks past when you’re taking a picture of your family in front of the Space Needle. Or it could potentially relight photos — turning a cloudy day into a sunny one — or generate stereo images.
There could also be applications for developers of content for virtual-reality systems, which use headsets with widescreen displays to immerse wearers into virtual environments.
Among those throwing out questions during the presentation was Rico Malvar, a distinguished engineer and chief scientist at Microsoft Research. Malvar said it was “one of the best I’ve seen” at the UW event.
“It’s impressive work … really good stuff,” Malvar said.
When I asked Shan whether the technology could end up with a company such as Adobe, Google or Microsoft, he said Google is a possibility — especially since he was heading there for a job interview a few hours after finishing the demo.
Here’s a video showing the “uncropping” with several travel photos:
[do action=”custom_iframe” url=”http://www.youtube.com/embed/k9v48TYJt_g” width=”630″ height=”500″ scrolling=””/]