Brier Dudley's blog

Brier Dudley offers a critical look at technology and business issues affecting the Northwest.

May 3, 2010 at 2:48 PM

“Looming data tsunami” coming, UW prof warns

At the Grand Challenges Summit in Seattle today, the University of Washington’s Ed Lazowska channeled Bill Gates.

Lazowska resurrected Gates’ “digital decade” line to describe the advances that computer science will bring to scientific research.

“You’re going to see a revolution in discovery in the next 10 years,” said Lazowska, the Bill & Melinda Gates Chair in Computer Science.

He opened a discussion of “eScience” and new systems for doing research with massive amounts of data, a “looming data tsunami” that’s pushing scientists to develop shared computing clusters at schools and ultrafast dedicated Internet lines between research centers.

Dedicated supernetworks are needed to handle data generated by systems such as gene sequencers that produce a terabyte of data per day (the UW has 25 of them) or the Large Hadron Collider, which produces 20 petabytes of data per year.

Among the speakers in the session, part of a National Academy of Engineering event hosted by the UW, was Facebook data architect Jonathan Chang, who said the social network’s vast data could be a boon to social scientists.

Chang said the site has the “richest social data set in the world,” with more than a petabyte of data about its users and more than 1 terabyte generated every day.

This has the potential to “really get to and answer a lot of the longstanding questions in things like social science, which we’ve been unable to answer before.” It’s also easier to slice Facebook data than it is to conduct surveys.

Among the tidbits revealed by Chang were terms used in discussions of vodka. Younger males tend to use the word “drunk” and older females mention “cranberry” when discussing vodka, for instance.

Plotting the terms “party” and “hangover,” the data show people mention “party” regularly on Saturdays, and on Sundays people mention “hangovers” “with incredible regularity,” he said.

More seriously, the site has also done work to correlate negative and positive sentiment in status updates with surveys of users’ self-reported happiness. “Survey responses were predictive of status updates,” he said.

Catharine van Ingen, an architect in Microsoft Research’s eScience group, said there’s an amazing flow of data from satellites, sensors, computers and Web services.

“While we’re at the center of this perfect storm,” she said, “there’s a lot of challenges left turning those ones and zeroes into actual science.”

Topics: Education, escience, grand challenges

The Seattle Times
