Microsoft is bridging the communication gap with advanced voice recognition and translation services

Microsoft is combining quick, reliable voice recognition with instant language translation and embedding both into more and more of its products - bridging the communication gap across countries and eroding the barriers of disability.

In the future of computing, your voice will be heard

Talking to your computer, smartphone or an ever-helpful home assistant such as Microsoft and Harman Kardon's upcoming Invoke speaker with Cortana (pictured below), the Amazon Echo or Google Home is now so normal that it’s become second nature for many of us. Of course, they’re still far from perfect, but anyone who has experienced the uncanny ability of these assistants to correctly answer questions shouted from across a busy room is left with a feeling of wonder and a strong desire to ask another dozen or so questions to experience the magic some more.

Cortana and Harman Kardon Invoke speaker

For people with disabilities the magic is taken to a whole new level and there is little doubt that natural language smart assistants are going to play a major part in the future of computing.

The ability of these devices to help us with our requests is partly due to the AI smarts behind the scenes, but a large part is their increasingly accurate ability to understand exactly what we’re saying regardless of our accent, environment and language.


There’s some way to go with regard to extreme levels of background noise (although machine learning is doing its best to learn how to filter out extraneous sounds) and recognising people with a speech impairment or disability, but the colossal number of people now using these devices is feeding vast amounts of valuable data into the central AI systems that drive them. Without us buying a new gadget, or even having to update any software, these assistants will become better and better and more and more useful.

Microsoft - bringing language smarts to many platforms

While almost every operating system and device now has dictation capabilities built in, Microsoft seems to be in the vanguard when it comes to embedding voice recognition and translation services across its most popular platforms and apps.

Speech recognition has been built into Windows for many years now and is getting better all the time. You can dictate into any application at up to 300 words a minute – faster than any touch-typist, and every recognised word is spelled correctly too.

A more recent development is the ability to show live subtitles to audience members while you present using PowerPoint. Check out this excellent article on Microsoft's new Presentation Translator capability, which not only provides live subtitles but can also instantly translate them into several dozen languages. Individual attendees can even view synchronised subtitles in their own preferred language on their phone.

Skype Translator

Similar smarts have been available in Skype for some time. Skype Translator allows you to have voice conversations instantly translated into other languages and spoken out using clear synthetic speech – bridging the communication gap between millions of people worldwide. Not only that, but it’s built into all recent versions of Skype on computers running Windows 7 and above.

Synchronised speech and text - assisting communication across impairments

Now consider someone with a hearing impairment. With Skype Translator the recognised text is not only spoken out but also comes up on-screen. This means that they can read the other half of the conversation in real time and are now able to participate in a typical Skype voice call.

Not a multi-lingual conversation? No problem – simply set the language to be translated into to be the same as your own - i.e. set the language of both speakers to ‘English’, for example.

If you also have a communication impairment (as many people who are born hard of hearing do), that's no problem either: you type your responses, the other speaker talks as normal and you get their recognised speech as text on-screen.

This seamless integration of text and speech is a truly powerful combination of technologies, bridging the gap in communication across countries and disabilities.

Kudos to Microsoft for providing such power in these hugely popular applications. When you combine this with the excellent image and object recognition services they’re developing (which similarly disproportionately assist those with a disability or impairment) and the all-round excellent accessibility across all their platforms, Microsoft are really on fire at the moment when it comes to inclusive design. Guys, keep up the good work.

Here’s to a future in which the voice of every user can be heard and understood.