In a new series of articles, we’d like to find out what the ideal mobile phone might look like if it were designed by giffgaff members. What kind of specifications would it have? Would it be voice-operated or touchscreen-operated? Would you use a personal assistant feature like the iPhone 4S’ Siri or is it a bit of a gimmick?
In this article, we look at voice recognition in mobile phones. A rapidly emerging technology, voice recognition is being integrated into all kinds of mobile applications from text messaging to translation. It’s taken us on a path towards the voice-operated computers envisioned by Gene Roddenberry in Star Trek where verbal commands are used to control the USS Enterprise and where all spoken languages are automatically translated through the Universal Translator.
Today’s smartphones are beginning to give us a glimpse of this voice-operated future of mobile computing. The iPhone 4S’ Siri is a voice-activated “intelligent personal assistant” which can do most of the things you’d normally do on your iPhone – including being able to send text messages, schedule meetings and find local businesses. Meanwhile, Google’s Translate application hopes to bring down language barriers with on-the-fly speech-to-speech translation between 14 languages.
In this article, we explore the cutting-edge of voice-operated mobile phones and voice-powered applications.
Computer-based voice recognition technologies were being developed as early as the 1950s. An early public demonstration came from IBM in 1962: the IBM Shoebox was a very basic calculator which could recognise just 16 spoken words, including the digits zero to nine and arithmetic commands such as ‘plus’, ‘minus’ and ‘equals’.
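To get a feel for how little vocabulary a spoken calculator needs, here is a minimal Python sketch. It is not a reconstruction of how the Shoebox worked: it assumes the words have already been recognised and arrive as text, it only covers the words named above, and the `evaluate` function and its behaviour are made up for illustration.

```python
# Illustrative only: a tiny calculator driven by already-recognised words.
# The vocabulary is limited to the words mentioned in the article.
DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}
OPERATIONS = {"plus": 1, "minus": -1}


def evaluate(words):
    """Work out a spoken sum such as ['three', 'plus', 'four', 'equals']."""
    total = 0
    sign = 1          # sign applied to the number currently being spoken
    current = None    # number built up digit by digit, e.g. 'one' 'two' -> 12
    for word in words:
        if word in DIGITS:
            current = (current or 0) * 10 + DIGITS[word]
        elif word in OPERATIONS or word == "equals":
            if current is None:
                raise ValueError(f"expected a number before {word!r}")
            total += sign * current
            current = None
            if word == "equals":
                return total
            sign = OPERATIONS[word]
        else:
            raise ValueError(f"word outside the vocabulary: {word!r}")
    raise ValueError("expected 'equals' at the end")


print(evaluate(["three", "plus", "four", "equals"]))
print(evaluate(["one", "two", "minus", "five", "equals"]))
```

Any word outside the small vocabulary is rejected outright, which is roughly the limitation that modern voice recognition has had to overcome.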
Fast forward nearly 50 years and there’s voice recognition technology in almost every pocket. Many of our current-day smartphones include voice technology such as voice-operated dialling and texting. These features are even available on relatively inexpensive smartphones such as the Orange San Francisco and HTC Wildfire S.
Voice Recognition on Android
Android 2.1+ has speech recognition built directly into the keyboard, allowing you to speak your text messages, e-mails or Facebook status updates. For long e-mails and text messages, dictating can save you a lot of time compared with typing on a slow on-screen keyboard. Note that Android speech recognition uses data connectivity so it’s worth having a look at one of our goodybags with unlimited data.
With free applications such as Vlingo or Google Voice Actions, Android users can take things one step further and command their entire phone by voice. These applications allow you to make phone calls to friends simply by speaking their name or allow you to obtain turn-by-turn GPS-powered directions by speaking your destination.
Siri: Personal Assistant for iPhone 4S
The iPhone has supported a feature called “Voice Control” since the iPhone 3G S. iPhone 3G S and iPhone 4 users can activate “Voice Control” by holding down the home button for a couple of seconds. This mode allows you to control your iPhone by dictating commands such as “Play music”, “Pause music” or “Call John Smith”.
With the Siri feature of the iPhone 4S, Apple have combined voice recognition with natural language processing techniques and artificial intelligence technologies. The goal is to design a “personal assistant” which understands natural language: it should be able to answer your questions and fulfil requests regardless of how they are phrased. This approach is in contrast to Google’s “Voice Actions” and the iPhone’s earlier “Voice Control” feature – both of these implementations require you to learn specific commands.
Siri can do a wealth of things such as managing your appointments, sending messages and finding local businesses (although the latter feature is not currently available in the UK).
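The difference between learning specific commands and phrasing a request however you like can be shown with a toy Python sketch. This is not how Siri, Voice Control or Voice Actions are implemented: the command table, the keyword lists and both function names are invented for illustration, and real assistants use far more sophisticated language processing than keyword spotting.

```python
# Illustrative only: fixed commands versus a looser, keyword-based approach.

# Approach 1: the phone only understands exact phrases you have to learn.
FIXED_COMMANDS = {"play music": "play", "pause music": "pause"}


def fixed_command(utterance):
    """Return an action only if the phrase matches a known command exactly."""
    return FIXED_COMMANDS.get(utterance.lower().strip())


# Approach 2: look for words that hint at what the speaker wants.
INTENT_KEYWORDS = {
    "play": {"play", "listen", "hear"},
    "pause": {"pause", "stop"},
}


def keyword_intent(utterance):
    """Return the first action whose keywords appear anywhere in the request."""
    words = set(utterance.lower().split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


request = "I'd like to hear some songs"
print(fixed_command(request))   # no exact match for this phrasing
print(keyword_intent(request))  # matched on the word 'hear'
```

The fixed-command version only responds to the exact phrases in its table, while the keyword version copes with a request it has never seen before, which is the kind of flexibility a personal assistant aims for.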
Another fascinating application of voice recognition technology comes in the form of the Google Translate application for Android. The “Conversation Mode” in Google Translate provides instant speech-to-speech translation between 14 languages. You’d say something in English (or your preferred language) which would be picked up using voice recognition. That’d then be translated into another language before being read back out using a text-to-speech engine. The process works both ways making it possible to have a conversation with someone even if you don’t share a common language – the phone bridges the gap.
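The three steps described above (recognise the speech, translate the text, read the result aloud) can be sketched in Python as a simple chain. This is only an outline under stated assumptions: it does not call Google Translate or any real speech service, and `conversation_turn`, the `fake_*` stand-ins and the two-word phrasebook are all hypothetical placeholders.

```python
# Illustrative only: the three stages of a speech-to-speech translation turn.
# The stand-in functions below fake each stage so the chain can be run as-is.

PHRASEBOOK = {
    ("en", "fr"): {"hello": "bonjour"},
    ("fr", "en"): {"bonjour": "hello"},
}


def fake_recognise(audio, language):
    """Stand-in for voice recognition: pretend the audio bytes are the words."""
    return audio.decode("utf-8")


def fake_translate(text, source, target):
    """Stand-in for translation: look the phrase up in a tiny phrasebook."""
    return PHRASEBOOK[(source, target)][text]


def fake_speak(text, language):
    """Stand-in for text-to-speech: pretend the text bytes are spoken audio."""
    return text.encode("utf-8")


def conversation_turn(audio, source, target, recognise, translate, speak):
    """Turn speech in one language into speech in another, step by step."""
    text = recognise(audio, source)                # 1. speech -> text
    translated = translate(text, source, target)   # 2. text -> translated text
    return speak(translated, target)               # 3. translated text -> speech


# The same chain works in both directions, as in a two-way conversation.
print(conversation_turn(b"hello", "en", "fr",
                        fake_recognise, fake_translate, fake_speak))
print(conversation_turn(b"bonjour", "fr", "en",
                        fake_recognise, fake_translate, fake_speak))
```

Because each stage feeds the next, a mistake at the recognition stage is carried through the translation and into the spoken result.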
Unfortunately, Google’s version of the “Universal Translator” still isn’t perfect. There are often errors in the voice recognition process and errors in the translation process – so it’s not quite seamless. Yet “Conversation Mode” gives us an exciting glimpse of a future where the mobile phone helps us to communicate across language barriers. The mobile phone has already brought down the barriers of distance and allowed us to communicate with people regardless of where they are in the world. The next logical step is to break down the language barriers too, allowing us to communicate with people regardless of which languages they speak.
Is voice the future of mobile?
For Gene Roddenberry, voice was the way that we were going to interact with technology in the future.
Some people argue that voice is the most natural way of interacting with other people: it’s why we love having all those inclusive minutes in our giffgaff goodybags and it’s why a text still can’t make up for a good old-fashioned phone call. It follows naturally that voice should be the most natural way of interacting with technology: technology should understand our intentions regardless of how we express them rather than forcing us to express our intentions in a certain way. Most of us also speak much faster than we can type so voice-controlled mobile technology could save a lot of time too.
Others will argue that voice interaction is impractical on a mobile phone. Voice interaction looks great on TV: it’s much easier to tell a story when you can tell the computer to do something – never mind the fact it’s not particularly exciting to watch someone tapping away on a keyboard. But is it really practical to use a voice-controlled mobile phone in everyday life given the background noise in busy places and the loss of privacy that comes with speaking everything aloud? Would we really want to dictate every e-mail we write on our phone with the risk of people overhearing? Does everybody in your train carriage really want to know what you’re doing on your phone? Would you use voice dictation for your password or credit card number?
Voice recognition in mobile phones is still in its infancy and we'd like to know what you think. Is it something you’ve tried on your mobile phone? Would you like to be able to control your mobile phone with your voice? What would it mean to you if your mobile phone could seamlessly translate between languages? How might it change the world if your phone allowed you to converse with anyone in the world regardless of which language they speak? Who would you speak to first?
giffgaff member trivia: How many different languages are spoken in the UK?