An appealing outlook on the future of language and speech technology

Text: Dr. Ir. Arjan van Hessen

Jan was desperately searching through the more than a hundred audiovisual messages on his system. A couple of weeks ago Brigitte had left him a message about a dinner reservation in Utrecht, including the date and location. But now he couldn’t remember the exact details. Apart from Utrecht, Jan didn’t remember anything. He turned on the Home-Info system and started looking: “a message from Brigitte about a restaurant in Utrecht or something…”

The next few years will bring a change in the way we communicate with organizations and friends. The current modes (text, illustrations and speech) will still exist, but they will blend more and more, so that it is no longer clear whether a spoken message is a telephone conversation, an instant message or a Skype conversation. The basic situation will stay the same (a customer has a question or remark), but the way it is expressed will change a lot. This has consequences for the receiving side, because it won’t be immediately clear where a multimodal message should be handled. Think, for instance, of a spoken message accompanied by a document containing pictures of a broken appliance. The most likely division will therefore not be by mode but by online versus offline.

Artificial Intelligence (AI)

AI will definitely find its way into the modern customer contact centre (CCC). In the field of customer contact, AI is the science of creating a human-machine dialogue that shows a “form of intelligence”. Taken to the extreme, we can say that the artificially intelligent entity is the ultimate model of the customer contact employee. It is not easy to say precisely what that intelligence is, but it is clear that human linguistic ability forms a large part of it. Questions need to be interpreted and answered. Smart follow-up questions need to resolve ambiguity, and references to previous statements (“like I just said…”) need to be interpreted correctly.

The fruitful combination of speech recognition, text interpretation (interpreting and classifying the text) and information retrieval (finding the correct information based on the text) makes “intelligent” dialogues possible at the CCC. A good example is a dialogue based on open speech recognition, in which callers say in their own words why they are calling the company. The next steps towards even more intelligent dialogues are the use of emotion and the visualization of the artificial employee.


Right now we are satisfied when the computer gives the correct answer to a question. We realize that the “other side” is a computer and don’t expect empathy (“how annoying that the mechanic didn’t show up again”). Telecats is working on technology that adjusts the “tone of voice” of the dialogue to the situation.

Visualization: the future, also for contact centres

Another equally important development is visualization. Systems that enable us to communicate will get a “face”, as shown by the rise of virtual assistants. For now these assistants are just little figures on the screen, but this will definitely change in the years to come.

A good example of the integration of speech, text, vision and emotion can be seen in Microsoft’s Project Natal. The project combines a completely virtual world, inhabited by a virtual employee, with a communication layer that enables a real human to communicate with this virtual employee on an auditory and visual level. This will have a big impact on the service industry. What’s remarkable, but not surprising, is that in this project 90% of the communication takes place via speech: speech is the favourite mode for us humans. The communication portrayed unites the benefits of the online and offline worlds; you can talk in the way and at the moment you prefer. The technology of Telecats is of course prepared for this.

Dr. Ir. Arjan van Hessen is Head of Imagination at Telecats/University of Twente

