Mobile Voice Recognition: iPhone 4S Siri & Google Translate

grand master

In a new series of articles, we’d like to find out what the ideal mobile phone might look like if it were designed by giffgaff members. What kind of specifications would it have? Would it be voice-operated or touchscreen-operated? Would you use a personal assistant feature like the iPhone 4S’ Siri or is it a bit of a gimmick?

 

In this article, we look at voice recognition in mobile phones. A rapidly emerging technology, voice recognition is being integrated into all kinds of mobile applications from text messaging to translation. It’s taken us on a path towards the voice-operated computers envisioned by Gene Roddenberry in Star Trek where verbal commands are used to control the USS Enterprise and where all spoken languages are automatically translated through the Universal Translator.

 

Today’s smartphones are beginning to give us a glimpse of this voice-operated future of mobile computing. The iPhone 4S’ Siri is a voice-activated “intelligent personal assistant” which can do most of the things you’d normally do on your iPhone – including being able to send text messages, schedule meetings and find local businesses. Meanwhile, Google’s Translate application hopes to bring down language barriers with on-the-fly speech-to-speech translation between 14 languages.

 

Below, we explore the cutting edge of voice-operated mobile phones and voice-powered applications.

 

Voice Recognition

 

Computer-based voice recognition technologies were being developed as early as the 1950s. An early public demonstration came from IBM in 1962: the IBM Shoebox was a very basic calculator which could recognise just 16 spoken words, made up of the digits zero to nine and command words such as ‘plus’, ‘minus’ and ‘total’.

 

Fast forward nearly 50 years and there’s voice recognition technology in almost every pocket. Many of our current-day smartphones include voice technology such as voice-operated dialling and texting. These features are even available on relatively inexpensive smartphones such as the Orange San Francisco and HTC Wildfire S.

 

Voice Recognition on Android

 

Android 2.1+ has speech recognition built directly into the keyboard, allowing you to speak your text messages, e-mails or Facebook status updates. For long e-mails and long text messages, dictating can certainly save you a lot of time compared with typing on a slow on-screen keyboard. Note that Android speech recognition uses data connectivity so it’s worth having a look at one of our goodybags with unlimited data.

 

With free applications such as Vlingo or Google Voice Actions, Android users can take things one step further and command their entire phone by voice. These applications allow you to make phone calls to friends simply by speaking their name or allow you to obtain turn-by-turn GPS-powered directions by speaking your destination.

 

Siri: Personal Assistant for iPhone 4S

 

The iPhone has supported a feature called “Voice Control” since the iPhone 3G S. iPhone 3G S and iPhone 4 users can activate “Voice Control” by holding down the home button for a couple of seconds. This mode allows you to control your iPhone by dictating commands such as “Play music”, “Pause music” or “Call John Smith”.

 

With the Siri feature of the iPhone 4S, Apple have combined voice recognition with natural language processing techniques and artificial intelligence technologies. The goal is to design a “personal assistant” which understands natural language: it should be able to answer your questions and fulfil requests regardless of how they are phrased. This approach is in contrast to Google’s “Voice Actions” and the iPhone’s earlier “Voice Control” feature, both of which require you to learn specific commands.
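For the technically curious, the difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration only and not how Apple or Google implement their features: the command table, the keyword lists and both function names are invented for the example, and real natural language processing is far more sophisticated than keyword spotting.

```python
from typing import Optional

# Hypothetical command table: the phone only understands these exact phrases.
FIXED_COMMANDS = {
    "play music": "PLAY",
    "pause music": "PAUSE",
}

# Hypothetical keyword lists: a crude stand-in for natural language understanding.
INTENT_KEYWORDS = {
    "PLAY": ("play", "listen", "put on"),
    "PAUSE": ("pause", "stop"),
}


def fixed_command(utterance: str) -> Optional[str]:
    """Succeed only if the user says exactly the phrase the phone expects."""
    return FIXED_COMMANDS.get(utterance.strip().lower())


def keyword_intent(utterance: str) -> Optional[str]:
    """Guess the intent from keywords, however the request is phrased."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


if __name__ == "__main__":
    request = "I'd like to listen to some tunes"
    print(fixed_command(request))   # the exact-phrase approach does not recognise this
    print(keyword_intent(request))  # the looser approach still finds an intent
```

Even this crude keyword approach copes with phrasings that the fixed command table rejects, which is the gap that a natural language assistant aims to close.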

 

Siri can do a wealth of things such as managing your appointments, sending messages and finding local businesses (although the last of these is not currently available in the UK).

 

Universal Translator

 

Another fascinating application of voice recognition technology comes in the form of the Google Translate application for Android. The “Conversation Mode” in Google Translate provides instant speech-to-speech translation between 14 languages. You’d say something in English (or your preferred language) which would be picked up using voice recognition. That’d then be translated into another language before being read back out using a text-to-speech engine. The process works both ways making it possible to have a conversation with someone even if you don’t share a common language – the phone bridges the gap.
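The three stages described above (voice recognition, translation, then text-to-speech) can be pictured as a simple pipeline. The Python sketch below is a toy illustration only, not how Google Translate is implemented: the function names and the tiny phrasebook are invented for the example, and the "recognise" and "speak" steps are stand-ins that pass text straight through rather than handling real audio.

```python
# Toy sketch of a speech-to-speech translation pipeline.
# All names are hypothetical; a real system would not use a lookup table.

PHRASEBOOK = {
    ("en", "es"): {"hello": "hola", "thank you": "gracias"},
    ("es", "en"): {"hola": "hello", "gracias": "thank you"},
}


def recognise(audio: str) -> str:
    """Stand-in for voice recognition: pretend the audio is already text."""
    return audio.strip().lower()


def translate(text: str, source: str, target: str) -> str:
    """Stand-in for translation: look the phrase up, or flag it as unknown."""
    phrases = PHRASEBOOK.get((source, target), {})
    return phrases.get(text, f"[no translation for '{text}']")


def speak(text: str) -> str:
    """Stand-in for text-to-speech: return what would be read aloud."""
    return text


def converse(audio: str, source: str, target: str) -> str:
    """Chain the three stages: recognise, translate, read back."""
    return speak(translate(recognise(audio), source, target))


if __name__ == "__main__":
    print(converse("Hello", "en", "es"))    # English speaker to Spanish listener
    print(converse("Gracias", "es", "en"))  # and back the other way
```

Because each stage feeds the next, a mistake at the recognition stage is carried straight into the translation, which is one reason the end result is not always seamless.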

 

Unfortunately, Google’s version of the “Universal Translator” still isn’t perfect. There are often errors in the voice recognition process and errors in the translation process – so it’s not quite seamless. Even so, “Conversation Mode” gives us an exciting glimpse of a future where the mobile phone helps us to communicate across language barriers. The mobile phone has already brought down the barriers of distance and allowed us to communicate with people regardless of where they are in the world. The next logical step is to break down the language barriers and to allow us to communicate with people regardless of where they are in the world and which languages they speak.

 

Is voice the future of mobile?

 

For Gene Roddenberry, voice was the way that we were going to interact with technology in the future.

 

Some people argue that voice is the most natural way of interacting with other people: it’s why we love having all those inclusive minutes in our giffgaff goodybags and it’s why a text still can’t make up for a good old-fashioned phone call. It follows naturally that voice should be the most natural way of interacting with technology: technology should understand our intentions regardless of how we express them rather than forcing us to express our intentions in a certain way. Most of us also speak much faster than we can type so voice-controlled mobile technology could save a lot of time too.

 

Others will argue that voice interaction is impractical on a mobile phone. Voice interaction looks great on TV: it’s much easier to tell a story when you can tell the computer to do something – never mind the fact it’s not particularly exciting to watch someone tapping away on a keyboard. But is it really practical to use a voice-controlled mobile phone in everyday life given the background noise in busy environments and the loss of privacy associated with speaking everything? Would we really want to dictate every e-mail we write on our phone with the risk of people overhearing? Does everybody in your train carriage really want to know what you’re doing on your phone? Would you use voice dictation for your password or credit card number?

 

Voice recognition in mobile phones is still in its infancy and we'd like to know what you think. Is it something you’ve tried on your mobile phone? Would you like to be able to control your mobile phone with your voice? What would it mean to you if your mobile phone could seamlessly translate between languages? How might it change the world if your phone allowed you to converse with anyone in the world regardless of which language they speak? Who would you speak to first?

 

giffgaff member trivia: How many different languages are spoken in the UK?

30 Comments
prophet
Maybe in the upcoming article, mention Iris, made by an Android developer in 8 hours and being worked on to be like Siri. What annoys me about Apple is that they bought Siri; they didn't make it.
grand master

Hi jamesd2010. Good point! I tried out Iris and Speaktoit (another voice-operated virtual assistant programme) but personally I wasn't that impressed at either one! Do you have any recommendations/thoughts on Android virtual assistant apps?

prophet

@kenlo

 

The best one I've used is Vlingo - that's impressive.

 

However, I don't think we should be judging Iris yet - 8 hours of development time, whereas Apple have had a year. It's amazing what the Android community have done here, and it will create competition between the two. Considering Iris is in alpha stage... I'm impressed!

enigma

I'm not really up to speed with the latest voice recognition software, but the importance of it to BLIND people cannot be overstated... more research needed there I think. Imagine a spoken internet???

Good article, I'll have a nose about Iris...

beginner

Very interesting, definitely a hot topic due to the 4S.

Great Blog.

newcomer
Background noise isn't that much of a problem but privacy can be. There is always going to be a need to use our fingers sometimes, at least until mind control comes along. I've been using voice recognition for ages. 10 years ago I could use my bluetooth headset on an old basic Nokia to call people while driving. The tricky thing has always been remembering the commands! Siri has really upped the game in terms of understanding human language. Microsoft phones do a good job of voice control, dictating messages, even identifying music playing on the radio but I was really impressed when I saw their next-generation service in this TellMe video http://www.youtube.com/watch?v=x-3XWl4srng. It does annoy me that a) people will think Apple thought of this first and b) it's only available on the new iPhone 4S at the moment but at least it gives the competition a kick up the backside!
consultant

quality article..

expert

Just realised Iris is Siri backwards.

Competitors have the same name backwards??

novice

I think this is a very fascinating feature.

consultant
woops nvm (delete)