Victor Zue, an expert in speech recognition, was born in Sichuan, China, and raised in Taiwan and Hong Kong. He barely spoke English when he came to the U.S. at 18 to study at the University of Florida.
“It was tough,” he says. “It was important for me to fit in because I was living in a dormitory in the South. But no matter how hard I tried to pronounce words, I just didn’t sound like an American.
“Americans,” he says, “don’t say, ‘Did you?’ They say, ‘Didju?’ And they don’t say, ‘I miss you,’ but rather, ‘I mishu.’”
Zue soon realized speech has less to do with how you pronounce words than with how words are put together. He grew fascinated with how humans produce and perceive speech. Soon he was interested in the idea of human-machine interaction and in building systems that can understand and talk with us. Later, at MIT, he formed the Spoken Language Systems Group at what was then the Laboratory for Computer Science (now part of the Computer Science and Artificial Intelligence Laboratory). Zue is now teaching the computer to talk, and to listen, so that one day it will understand us the way we understand each other.
Co-director of MIT’s Computer Science and Artificial Intelligence Lab (CSAIL), Zue also chairs the steering committee of Project Oxygen, which aims to radically change the way people deal with information-related activities. The project involves 30 faculty members and is supported by the federal government and computer companies on three continents.
The goal of Project Oxygen is to make computing human-centric and available everywhere, just like oxygen in the air we breathe. Now, Zue says, we keep computers in air-conditioned rooms or carry them around, interacting with them using their artificial interfaces and languages. One day, he adds, we’ll interact with the computer in ways that are comfortable for us: using speech, vision, and gestures. We’ll just tell the computer, “Send this to Harry” or “Book the next flight to New York.”
One day our houses and offices will be equipped with cameras, sensors, and microphones all connected to a central computer. The computer will recognize our face, our voice, our gait. We won’t need to key in commands. We’ll just tell the computer to switch on the air conditioner or lock the doors.
The goal of Project Oxygen is for people to do more by doing less. Imagine, Zue says, one day we’ll walk into a shopping mall and Oxygen will whisper, “I notice you’re looking at that pair of shoes. You know, if you walk down two stores, the same pair is selling for 20 percent off.”
TALKING WITH COMPUTERS
Zue is an authority on spoken-language processing. Along with his MIT colleagues, he built a system that can tell telephone callers the weather in 500 cities around the world. The system works in Japanese, Chinese, Spanish, and English, and is available via a toll-free number.
Ask the computer if it will rain tomorrow in New York City, or ask for the temperature in Hong Kong, and it can reply verbally. It knows the difference between Fahrenheit and Celsius, but sometimes it gets mixed up. On a recent day, the system mistook ‘tomorrow’ for ‘Idaho.’
Teaching a computer to understand is especially challenging, says Zue, because it is hard to teach it the differences between “Boston” and “Austin”; “let us pray” and “lettuce spray”; “Meet her at the end of Main Street” and “Meter at the end of Main Street.”
“These are tough problems,” he says. “The machine doesn’t understand these subtle things. You often need to know the context.”
Zue says it’s much tougher to teach the computer to listen than to talk. “If (the computer) talks, it only needs to sound like one person. If it is to listen, it needs to understand everybody’s speech, including large vocabularies, continuous speech, and various accents.”
One future application of Project Oxygen is health maintenance. Zue says that because of medical improvements, people are living longer. In 30 years, the number of people living past 65 will nearly double. “We want to help people live independently.”
One day, he says, the system will remind you, say, at 4 o’clock to take your medicine. It will remind you where you put your eyeglasses. Sensors in the bathroom will make sure the bathtub does not overflow and that the water temperature is not too hot. The system will track your favorite TV shows and alert you when similar programs are on. When you visit the doctor, you can bring along the system’s records to help doctors identify new medical problems. And you can use the system to call a doctor if you fall while you’re alone.
Is Zue ready for the world of Oxygen?
“Yes and no,” he says. “We engineers think of wonderful ideas, but often they have unanticipated consequences. One of our biggest concerns right now is security and privacy. Do we really want to track everybody all the time?
“We are not spending enough time studying the social implications of this work. The technology has the potential to be great, as long as we make sure it doesn’t become intrusive.”