Gesture and Voice...
Beyond the Keyboard

by Francesco Levantini

New assistive technology: futurology only for those who really don't believe in it.


“Wolfe was still not quite convinced.
Archie, so he was the one who hid the business card? Shrugging my shoulders, I turned and looked at the chest of drawers next to the red couch. It was always there, in Miss Lucy's agenda, attached on the last page.”
It is a simple example of communication between two people and, like 99% of communication, it is not “multimedia” but “monomedia”.
In fact, the information travels on two different channels, each carrying only one part of it: Archie Goodwin's gesture, looking at the chest of drawers, and his words, which draw attention to the agenda.
What would Nero Wolfe have understood if he had been deaf?
The gesture of the person he was talking to would have led him to the chest of drawers, but how much time would have passed before he realized that the chest of drawers alone was of no help?
And what if he had been blind?
Archie's words would have made him think of a book (something with pages), but how much fumbling around the room would it have taken to find it?
Communication is more effective the more multimedia it is, and more efficient the more essential and monomedia it is. What should Goodwin have done to be multimedia?
He would have had to go against the principles of communication by saying: “The business card is on the last page of the agenda, which is in the chest of drawers near the red couch in front of the desk.” And, at the same time, he would have had to stand up, walk to the chest of drawers, take the agenda in his hands and open it at the right page.
In cognitive psychology, this phenomenon is called the “usability paradox”.
The more accessible information is in one particular context, the less transferable it is to others.
Nothing new so far: it is the problem that experts in accessible technology face every day when working on Web pages, operating systems or interfaces used by disabled users, and which they often solve thanks to their professionalism.
There is, however, another trap that multimedia is setting for people living with vision loss: our old and friendly keyboard is being replaced by two emerging technologies for the input of information, gesture and voice. They are so efficient and effective in everyday life that it makes no sense to question whether they should be adopted.
We will have to bridge the gaps: it remains the only way to recreate the model of participation in social life that is called “normality”. Touch screens, body motion, gaze direction and voice commands are today top-of-the-range, high-technology products that offer a multimodal alternative to T9 or the QWERTY keyboard.
How long will it be before they acquire a monomedia value that cannot be converted to other channels? Just take Nintendo's Wii. What joystick or keyboard will ever be able to emulate the efficiency of the movement of a hand holding a tennis racket or a golf club?

Photo - Latest-generation mobile phone

And when will gesture and voice move from video games and technological status symbols into our professional life?
When will a company representative walk into a client's office and make the pages of a contract appear simply by placing his iPhone on the desk?
The client will be able to move them with his fingers like real sheets of paper and sign with a fingerprint in a virtual box, or with a real signature made with a fountain pen if one is within reach, though a fingernail used as a pen would be enough.
It is one of the many examples of monomedia in input. Trying to export its efficiency to the keyboard would be like trying to write a piece of music using the written word.
Two factors make the problem easier to tackle today than it seems: first, we are at the beginning of a road where everything is still possible and, second, whatever form the solution takes, it will be built with the flexibility of the bits of the Web, of the zeros and ones of the computer.
That is the real starting point of the research on new assistive technologies carried out by the Wireless RERC network, in which Italy plays a very important role, thanks above all to the Osservatorio Mobile & Wireless Business of the Politecnico di Milano's School of Management.
There are two key words for this new ICT model: delegation and intelligence. In practice, this means that the user's continuous control, typical of classic transactional and interactive ICT, is replaced by a declaration of intent: the user tells the system what he wants and delegates to it both the delivery of the service and the monitoring and searching that this entails.
“I would like to drink my coffee tomorrow morning when I wake up” would probably be the voice request made to the clock radio. It is then up to the clock radio to know if and when I have got up, by receiving that information from the light switch in the bathroom, and to tell the coffeemaker that it is time to prepare the coffee.
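The delegation model described above can be pictured with a very small sketch. Everything in it is an assumption made for illustration: the `Hub` and `CoffeeMaker` classes and the `bathroom_light_on` event name are hypothetical and do not come from any real home-automation product. The point is only that the user declares an intention once, and the devices then coordinate among themselves.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Hub:
    """Hypothetical home hub that routes sensor events to declared intentions."""
    listeners: Dict[str, List[Callable[[], None]]] = field(default_factory=dict)

    def on(self, event: str, action: Callable[[], None]) -> None:
        # Register an action to run the next time `event` is reported.
        self.listeners.setdefault(event, []).append(action)

    def emit(self, event: str) -> None:
        # Run and discard the pending actions: each intention fires once.
        for action in self.listeners.pop(event, []):
            action()


@dataclass
class CoffeeMaker:
    """Hypothetical appliance with a single state flag."""
    brewing: bool = False

    def brew(self) -> None:
        self.brewing = True


hub = Hub()
coffee_maker = CoffeeMaker()

# The evening before: the user states the intention, nothing more.
hub.on("bathroom_light_on", coffee_maker.brew)
before = coffee_maker.brewing

# The next morning: the light switch reports, the hub does the rest.
hub.emit("bathroom_light_on")
after = coffee_maker.brewing
```

In this sketch the user never operates the coffeemaker directly; the only interaction is the declaration made the evening before.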
In February this year, at the 3GSM World Congress in Barcelona, we were able to observe that these examples are futurology only for those who do not believe in them and persist in thinking that computer science means one's own laptop on the desk. It was in fact at 3GSM that we discovered, among other things, that the laptop can stay in its carrying bag, from where it can connect via an FM radio signal to the car radio or to the speakers of the home theatre, and receive voice commands from the microphone of the telephone we have in our pocket.

And the touch screen? Is it really an insurmountable obstacle for a blind person? Once again it is the Wireless RERC that answers this question, through environmental gesture analyzers (search for “gesture toolkit” on Google if you want to find out more), and it was Apple who first applied this approach, on the iPhone, demonstrating that touching the screen is no longer the simple pressing of an area on a sensitive surface. Today we work with two or three fingers, and we touch the screen with our fingernails or fingertips, tapping it or brushing it lightly. This new ergonomics opens up great prospects for assistive technologies. Take a screen reader, for example, in which resting a fingertip on the display makes the speech synthesizer read what is under it, while a tap of the fingernail activates it. Or one in which tapping the screen with six fingers makes it possible to write Braille. There is, however, a price to pay for all this. For the blind user, abandoning the keyboard means having to give up the faithful and familiar screen reader inside the computer and open up to gestures and interactions that are completely new to him. It is much the same cultural trauma that sighted users faced when, in the 90s, they first saw a mouse on their desk. But it will be the revenge of the young over the old!
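The six-finger Braille input mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the idea that each finger corresponds to one of the six Braille dots, and the function names `chord_to_cell` and `chord_to_letter`, are hypothetical and do not describe any existing product. The letter table is the standard Braille alphabet, and the cell is built from the Unicode Braille Patterns block, where dot n sets bit n-1 above U+2800.

```python
# Standard Braille alphabet, keyed by the raised dots (1-6) of each letter.
BRAILLE_LETTERS = {
    "1": "a", "12": "b", "14": "c", "145": "d", "15": "e",
    "124": "f", "1245": "g", "125": "h", "24": "i", "245": "j",
    "13": "k", "123": "l", "134": "m", "1345": "n", "135": "o",
    "1234": "p", "12345": "q", "1235": "r", "234": "s", "2345": "t",
    "136": "u", "1236": "v", "2456": "w", "1346": "x", "13456": "y",
    "1356": "z",
}


def chord_to_cell(dots):
    """Return the Unicode Braille cell for a set of dot numbers (1-6)."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError(f"dot out of range: {dot}")
        mask |= 1 << (dot - 1)
    return chr(0x2800 + mask)


def chord_to_letter(dots):
    """Return the letter for a chord, or None if the chord is not a letter."""
    key = "".join(str(dot) for dot in sorted(dots))
    return BRAILLE_LETTERS.get(key)


# Fingers assumed to be resting on dots 1, 2 and 3 at the same time.
letter = chord_to_letter({1, 2, 3})
cell = chord_to_cell({1, 2, 3})
```

A real implementation would also have to decide which finger is which from the touch positions, which is the hard part and is left out here.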
Not long ago, a colleague asked me:
“Can you help me? My dad is losing his sight. He loves to read but he can't do it anymore. Can we try showing him how to use a computer so he can read with the speech synthesizer? It saddens me so much to see him like this; he sits all day in front of the television, which he can now only listen to.”
Teaching an 80-year-old how to use a screen reader means subjecting him to a torture he does not deserve.
“Why the computer?” I said to him. “Let's prepare some audio books on CD that he can play directly on his television's DVD player, using the remote control.”
Some time has passed since then; today my colleague's dad is 82 and, the other day, to ask me for a new book, he called me via voice over IP! No, not with the computer, but with an ordinary telephone connected via WiFi to a desktop computer he has at home, though he probably does not remember where it actually is. He dialed my nickname on the telephone's small keypad. On Skype I am Levantini, but for him I am 538268464, my name in T9. But it doesn't matter... it was I who answered and, old computer user that I am, I did it from my PC. And here I am, working for the new generations of home-automated ICT, preparing him a book on CD.
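For the curious, the digit string in the anecdote follows directly from the standard letter layout of the telephone keypad. The sketch below is only an illustration, and the function name `to_keypad_digits` is hypothetical.

```python
# Standard letter assignment of the telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Invert the table: one digit for every letter.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_keypad_digits(name):
    """Spell a name as keypad digits, ignoring anything that is not a-z."""
    return "".join(
        LETTER_TO_DIGIT[ch] for ch in name.lower() if ch in LETTER_TO_DIGIT
    )


digits = to_keypad_digits("Levantini")  # "538268464", as in the text
```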