Vista’s much-improved ability to follow spoken commands is the source of an early security brouhaha for the software, which Microsoft bills as its most secure operating system ever.
ZDNet blogger George Ou, following up on an idea raised by Sebastian Krahmer, tested whether speech recognition could be a way to remotely execute commands:
I recorded a sound file that would engage speech command on Vista, then engaged the Start button, and then I asked for the command prompt. When I played back the sound file with the speakers turned up loud, it actually engaged the speech command system and fired up the Start menu. I had to try a few more times to get the audio recording quality high enough to get the exact commands I wanted but the shocking thing is that it worked! Anyone that’s ever visited MySpace knows how many annoying webpages out there that will start blasting loud MP3 music as soon as they enter the page.
Microsoft, on its Security Response Center blog, acknowledged that the attack is “technically possible,” but said several specific conditions have to be in place for it to work.
Ou concluded this is a “serious exploit” because it can be used to delete files.
“The fact that a website can play a moderate level sound file to interact in a way with the desktop by activating an idle speech command system and be able to delete user documents with zero user interaction is serious by any stretch of the imagination,” Ou wrote.
But the technique can’t be used to get around User Account Control (UAC), the feature built into Vista as a defense against outside attacks carried out without user consent.
“It is not possible through the use of voice commands to get the system to perform privileged functions such as creating a user without being prompted by UAC for Administrator credentials. The UAC prompt cannot be manipulated by voice commands by default,” wrote a Microsoft employee, identified on the company’s security blog only as Adrian. “… While we are taking the reports seriously and investigating them accordingly I am confident in saying that there is little if any need to worry about the effects of this issue on your new Windows Vista installation.”
He listed conditions necessary for this exploit to be carried out:
[T]he system would need to have speakers and a microphone installed and turned on. The exploit scenario would involve the speech recognition feature picking up commands through the microphone such as “copy”, “delete”, “shutdown”, etc. and acting on them. These commands would be coming from an audio file that is being played through the speakers. Of course this would be heard and the actions taken would be visible to the user if they were in front of the PC during the attempted exploitation. … There are also additional barriers that would make an attack difficult including speaker and microphone placement, microphone feedback, and the clarity of the dictation.
Adrian wrote that the exploit is possible in Vista as a result of improvements to the speech-recognition functions meant to help users with disabilities. We took a close look at the features and how they were developed in this story.