It was hard to resist Google’s blatant pitch for publicity.
The search giant was asking if I wanted to roam around downtown Seattle, starting at Pike Place Market, for a mobile demonstration of the Goggles application for Google’s Android platform. (Here’s a video of the demo.)
Goggles, which has steadily added new capabilities since it debuted last December, conducts searches based on images taken by a phone’s camera.
Point it at a building or a sign, and Goggles will call up information about what you’re seeing. It worked well when aimed at the sign for Pike Place Fish, but that was a relatively easy one. Google has lots of images of that place for reference, and the location is marked on Google Maps, which Google checks against the phone’s location.
Rather than straining the phone’s processor, Goggles takes advantage of today’s faster cell networks to offload the heavy image processing to Google’s datacenters.
The app appears to be slowly scanning things, with a wavy blue line moving across the screen like a copy machine, but that’s just a visual cue that something’s happening somewhere on Google’s network.
“It’s not really an app on the phone so much as this is a conduit into a massive, mobile supercomputer,” said Jason Freidenfelds, manager of global communications and public affairs.
If you’re in a foreign country, you can use Goggles to translate signs or menus. Point the camera at a block of text and in a few seconds Google returns a translation.
It worked well on German and French magazines at the First and Pike News stand, but stumbled over a Danish magazine with the crown princess (a former Microsoft employee) on the cover.
Over at Left Bank Books, Goggles scanned the cover of a book and offered to display a scanned preview of the book, in case we didn’t want to put down the phone and leaf through it in person. It also listed other places we could buy the book for a lower price.
Before we could fool around much more, an employee asked us to leave, even though Goggles runs on the open-source Android platform.
Goggles is still an experimental application — released as a Google Labs project. When it works, it’s really cool, but it can give wildly incorrect results.
Freidenfelds said several times that Goggles makes your phone like a device from “Star Trek.” But I wouldn’t trust Goggles for mission-critical searches, at least until the system has been refined.
Maybe the inaccuracy of the fledgling technology is why Microsoft hasn’t released any applications as ambitious as Goggles, even though it began working on the same sort of technology before Google.
In 2007 Microsoft researchers revealed a project called Lincoln, which conducted searches based on images by comparing them against a database of reference photos. Two years ago Craig Mundie, the company’s research boss, demonstrated a prototype handheld device that could scan around an area and provide details about businesses in the vicinity.
The Lincoln project contributed to a Bing search app released for the iPhone in December and Android phones last week. Apparently there won’t be anything similar to Goggles available when the Windows Phone 7 platform launches this holiday season.
Image recognition has become more useful now that so many people are taking and sharing digital photos, explained Larry Zitnick, a researcher in Microsoft Research’s Interactive Visual Media Group who worked on Lincoln.
“The end result that everybody’s trying to do is link up the real world to the Internet through imagery,” he said.
Goggles is part of Google’s broader effort to build a universal image index, a massive collection of imagery that it can use for reference when people conduct visual searches in the future.
The bigger the index, the better the results will be. A challenge right now is figuring out how to scale up the index. Google’s largest index has 50 million objects and now needs to grow to 100 million or 200 million.
To build a truly universal index, it will need to recognize perhaps a billion objects, according to Hartwig Adam, who is developing the system at Google’s Santa Monica, Calif., office.
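The scaling challenge Adam describes is easy to see in miniature. In a toy version of a visual index (vectors and names invented for illustration), each known object is stored as a feature vector and a query image is matched to its nearest neighbor; a linear scan like the one below is fine for three objects but hopeless for a billion, which is why real systems need approximate lookup methods:

```python
import math

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "universal index": object name -> feature vector.
INDEX = {
    "Pike Place Fish": [0.9, 0.1, 0.3],
    "Space Needle":    [0.1, 0.8, 0.5],
    "Left Bank Books": [0.4, 0.4, 0.9],
}

def best_match(query):
    # Brute-force scan of every entry; cost grows linearly
    # with the size of the index.
    return max(INDEX, key=lambda name: cosine(query, INDEX[name]))

print(best_match([0.85, 0.15, 0.25]))  # -> Pike Place Fish
```

Growing from 50 million to a billion objects is less about storage than about keeping this lookup fast, and the bigger the index, the more similar-looking objects it must tell apart.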
Adam was previously employed by Neven Vision, a company that grew out of military-funded research at the University of Southern California into facial recognition technology. Google bought Neven in 2006 to improve its Picasa photo service.
Now the group is working on a much broader application.
Within three years, the technology could become a tool that people use regularly in their daily lives, pointing phones at a real-estate sign to call the agent, for instance.
“Basically I hope that we can in a few years give you answers about anything that’s in front of you that you would like to know more about,” Adam said.
It may take longer for Goggles to work with faces, not because of technical challenges but because of privacy issues. Adam is especially sensitive to the topic, having come from Germany.
“There seems to be an interest, but it has to be done in a way that people can feel comfortable, that their privacy can be preserved,” he said.
“If there’s a way to make it happen, then we would look into it,” he said. “At the moment we treat it with the utmost care.”
Here are a few screenshots of Goggles – first taking an image, then returning a search based on what it saw: