Cost 219ter

Towards an inclusive future: Impact and wider potential of information and communication technologies

2. Current examples of existing products and services for people with disabilities


2.1 Introduction

By Julio Abascal and Patrick Roe

This chapter brings together a number of examples of good practice that have been chosen with the objective of providing some insight on the possible evolution from current telecommunication technologies to future “intelligent environment” services. The main aim is to give a snapshot of current trends in services that are accessible to people with disabilities and to discuss the possible impact on people with physical, sensory or cognitive restrictions (that may be due to a disability, ageing or to the special conditions or equipment they use). The emphasis is on presenting current services and how they are likely to evolve in the future to show what the potential impact could be on people with disabilities and elderly users. This will also serve as a baseline of what is the current situation in comparison to the possible future scenarios discussed in chapter 4.

The chapter is structured into four main sections (apart from this introduction): 2.2 New technologies to help people with disabilities and elderly people; 2.3 New remote services; 2.4 Evolution of text telephony; and 2.5 User participation in technology. A summary of the contents is given below.


2.2 New technologies to help people with disabilities and elderly people

Technological advancement in the field of robotics has provided devices and techniques for sensing, positioning, mapping, navigating, etc. These techniques have made it possible to develop devices that help people with physical, sensory or cognitive restrictions to navigate both outdoors and indoors. The section "Safe navigation with wireless technology" describes in detail the current technology to support human navigation and discusses the possibilities for the near future.

Speech is the main means of communication between people. Nevertheless, a number of users with disabilities experience restrictions in their speech capacity that limit their communication skills. Current speech technology is able to translate text-to-voice and voice-to-text (the latter still without enough quality and reliability), enabling the design of diverse mediation devices and services. These include, for instance, reading texts aloud for people with sight restrictions, and controlling devices in a more natural way through the voice. The section entitled "Speech processing for people with disabilities" reviews current and, more particularly, future applications of speech technologies that can enhance the communication of elderly people and people with disabilities.

2.3 New remote services

Broadband communication technologies are already available. They can sustain advanced services to support people with disabilities and elderly people. Relay services, virtual communities, enhanced communication, etc., are being successfully tested in a number of countries. The section entitled "Novel broadband-based services: new opportunities for people with disabilities" describes seven trials of advanced broadband-based support services, conducted by the National Post and Telecom Agency in Sweden (Post- och telestyrelsen, PTS), to test the validity of current and future broadband telecommunications services in providing remote support that is tuned to the needs of specific groups of people with disabilities.

Relay services usually act as communication mediators between users, at least one of whom has a disability that prevents them from using standard communication devices or services. These services are able to translate from signs-to-voice (and vice versa), from text-to-voice (and vice versa), etc. They can also provide other services such as the description of a received image to a blind person. Some pre-existing relay services may be enhanced, universalised and made less expensive by means of the currently available advanced telecommunication technologies. The section entitled "Access to video relay services through the pocket Interpreter (3G) and Internet (IP)" presents two projects developed by the Swedish National Post and Telecom Agency: the IP access project, a video telephony relay service based on IP, and the pocket interpreter for mobile video communication, both for signing deaf people.

Efficient use of relay services requires that a number of steps be closely followed in order to speed up the service. The section entitled "Convenient invocation of relay services" describes the best way to invoke various relay services currently existing in Sweden. These experiences may be taken as examples of good practice that help optimise the design of the access to future relay services.

The rise of Short Message Services (SMS), tied to the expansion of mobile telephony, is frequently associated in our minds with young people. Short messages are cheaper than voice calls and don’t require both interlocutors to be engaged simultaneously. Nevertheless, SMS can also be useful for other groups of the population. A remarkable application of SMS is shown in the section "Ways of using mobile telephones by people with dementia", revealing that elderly people with cognitive restrictions can take advantage of this technology for verbal, text or symbol communication and support.

SMS technology is also used in the "Implementation of an SMS-based emergency service in Finland" to allow not only deaf people, but any other user, to contact the universal 112 emergency service by sending emergency text messages. After making contact, the user receives an acknowledgement message and can be located for assistance.

2.4 Evolution of text telephony

Text telephony is currently the basic means of communication for many people with disabilities, such as deaf people. The technology supporting mobile telephony does not allow the extension of traditional text telephony. For this reason, many users replace mobile text telephony with SMS messages, but these do not allow full interactive communication, hence the need to develop novel mobile text telephony services.

Since the next generation of text telephony in Europe is under development, it is necessary to establish basic design guidelines that guarantee the quality of the service. "The recommendations of the Nordic countries regarding functionality for text telephony" section compiles criteria that include mobility, interoperability, continuity, accessibility from the internet, and availability of relay services.

Diverse experiences have been developed to provide mobile text telephony through the access to internet servers. The section entitled "Mobile & IP-based text telephony" shows the deployment of such a service in Sweden, while "Mobile text telephony based on GPRS communications" explains the results obtained by a Spanish project.

2.5 User participation in technology

With the attraction of a growing market, there is a greater likelihood that more and more companies will be marketing devices in the near future that can be accessed by elderly people and/or people with disabilities. Since these concepts can be interpreted in diverse ways, consumers may find that devices advertised as fully accessible, straightforward and easy to use, do not really fulfil their needs. It is within this context that the availability of functional specifications of terminals becomes essential, so that products can be checked and certified in order to give to the customer a guarantee of the appropriateness of a given product or service in relation to his or her needs. The section entitled "Functional specification for terminal procurement" presents an example of good practice from Sweden in what will become an important area for the future.

 

2.2 New technologies to help people with disabilities and elderly people


2.2.1 Safe navigation with wireless technology

By Jan-Ingvar Lindström


Background

How can I be sure to find my way? Can I walk safely here? What happens if I get lost? Do I dare to try a new route? What if I suddenly fall ill and need help? The lack of good answers to these and similar questions has prevented a number of vulnerable people from moving around in outdoor as well as indoor environments with which they are not familiar.

And who is not vulnerable? Basically, all of us are sometimes in need of help because we have lost our way, feel unsafe or have made a mistake in our way-finding effort. Among us, however, are people who feel more at risk than others, not least people with various kinds of disabilities. And among these, people with visual disabilities and those who suffer from cognitive impairments have expressed strong interest in finding solutions to overcome their problems.

Historically, blindness and partial sight have inspired engineers and psychologists to find solutions to way-finding problems for these groups, both in terms of personal navigation aids and landmarks in the environment. Early on, the long cane became a well-known attribute of the blind pedestrian’s navigation, and later efforts have been made to improve the cane by adding remote sensors. Examples are laser emitting diodes and sensors, magnetic field probes and – most recently – RFID detecting devices. Other ideas have been to simulate the navigation technique of bats, i.e. the development of various kinds of ultrasonic devices to scan the environment and get some idea of what it looks like.

The common denominator for all these examples has been the individual characteristics of the solutions. Also, they only provide information about the very near environment.

Over the last few decades, the navigation problems of other groups have also been acknowledged. An example is the large group of people with cognitive impairments, including e.g. those with dyslexia, mental disabilities, dementia and stroke, but also people with mobility problems, including wheelchair users. The problems here are wide ranging, from being able to read and understand a map or remember information to learning in advance about obstacles, on-going road works and similar matters. Even people who are deaf or hard-of-hearing have experienced great problems in moving from their home to e.g. a school or working site by public transport, as so much information about changes in timetables, alternative means of transport, etc., has been given orally. Slow improvements have come about in society as more information has successively been given both as voice information and on visual displays. These solutions, however, have been generic, and have not been of much help to people who suffer from dementia, mental disabilities and other cognitive disorders.

A breakthrough came about with the installation of the American Global Positioning System – GPS – that has been used since the late 1980s for positioning purposes, mainly as a way-finding tool for car drivers and for boat and aircraft navigation. As will be discussed later, the GPS system per se does of course not solve the problems described above, but it forms a basis for further development that can lead to powerful tools for all groups with significant navigation problems.

Positioning, orientation, navigation, communication and localization

Mobility outdoors

Knowing one’s position is important, but not enough for moving around safely in an unknown environment. A system should also make it possible for users to orientate themselves, i.e. to know in which direction they are standing in relation to, for example, the points of the compass; to navigate independently, i.e. to be able to move from one given position to another; and also, if necessary, to raise an alarm or communicate with an information or alarm centre for personal support and assistance. It should also be possible, for those who so wish, to be found without having to consciously trigger a localization function themselves.

Positioning

Satellite systems

The most widely used and available system – the GPS system – is based on the use of radio signals transmitted from satellites orbiting the Earth and with whose assistance it is possible, with the use of special receivers, to get a position on the Earth's surface in the form of coordinates. This kind of reference can be transformed into, for example, an indication on an electronic map on a GPS receiver. This can be linked to a mobile telephone, handheld computer or the like.

At present there are two systems in use: the American GPS (Global Positioning System) and the Russian GLONASS (Global Navigation Satellite System). The latter is not marketed in Europe and is currently being extensively updated. For many years, a European system has been planned under the working name Galileo. This system is designed to be particularly well adapted to European environments. However, it is still in the development phase and will not be fully accessible until 2008 at the earliest.

GPS is designed to provide the best possible coverage some hundred miles north and south of the equator. This means that the further north and south one goes, the worse coverage one gets with GPS owing to the satellites all appearing to lie rather close to the horizon.

In its simplest form, GPS provides a positioning accuracy of some tens of meters. However, there is an extensive system of terrestrial stations that can process signals before they are received in the individual GPS receiver. This is known as Differential GPS or DGPS. With such support, it is possible to get down to an accuracy of just a few meters. In principle, it is possible to achieve even greater accuracy in this way (to within centimetres) but, for various reasons, it is not practically feasible for the navigation application in question. One reason is that access to the terrestrial stations required for processing the signal is not available everywhere. Another reason is that it may take an unacceptably long time to process the signal – sometimes several seconds, which is too long in a real orientation situation.
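The idea behind differential correction can be illustrated with a minimal sketch. It assumes a strongly simplified, position-domain model: a reference station at a surveyed position observes its own GPS error, and a nearby receiver subtracts that same error from its own fix. Real DGPS services typically correct the individual satellite range measurements rather than the final position, and the names `Fix` and `differential_correction` are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    lat: float  # latitude in decimal degrees
    lon: float  # longitude in decimal degrees


def differential_correction(station_true: Fix, station_measured: Fix,
                            rover_measured: Fix) -> Fix:
    """Correct a rover fix using the error seen at a reference station.

    Simplifying assumption: the rover is close enough to the station
    that both experience the same position error at the same moment.
    """
    # Error observed at the station, whose true position is surveyed.
    error_lat = station_measured.lat - station_true.lat
    error_lon = station_measured.lon - station_true.lon
    # Remove the same error from the rover's raw fix.
    return Fix(rover_measured.lat - error_lat,
               rover_measured.lon - error_lon)
```

In this model the quality of the correction degrades with the distance between rover and station and with the age of the correction, which is consistent with the two practical limitations mentioned above.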

Another possibility is Assisted GPS – AGPS – which can be used in situations where the signals from the satellites are too weak. This may be appropriate indoors, but also outdoors under less favourable circumstances. Examples of such circumstances are when only a small number of satellites can be reached or when moving around on narrow streets surrounded by high buildings or other similar environments – the so called canyon-effect.

It should be pointed out in this context that GPS receivers with much greater sensitivity than before – iGPS – are now starting to come onto the market, which may allow navigation with sufficiently good precision even in environments that are currently problematic from a radio perspective.
(see www.gpsworld.com and www.esa.int/esa)
 
Mobile telephone cells

A less precise, but not uninteresting, method is what is called 'Cell Global Identity', CGI. This is based on the possibility to register and identify the communication between a telephone and its activated base stations. There is consequently a technical possibility to determine the approximate position of a particular mobile telephone at any given moment. However, the technology is far too imprecise and is not yet adequately established to be of interest in the present context.

The utilization of GPS and CGI results in some form of coordinate references. These are only meaningful if they can be related to reality in the form of an appropriate map reference. Accordingly, access to maps and an appropriate user interface is necessary. This must be available in several alternative designs in order to adapt to the user's special capacities, for example people with visual impairments, people with reading and writing difficulties, people with cognitive problems and people with intellectual disabilities.

Landmarks

A landmark here means some kind of identifiable point in the surroundings that one can relate to in order to determine one's position. Such points are virtually everywhere for people who have sight and full control of their surroundings – it may be a familiar sign, a church tower or a distinctive large tree.

For people with visual impairments, different kinds of acoustic landmarks (sound beacons) have been tested for position determination. Examples are the ticking devices at pedestrian crossings that both confirm a position and to some extent guide the user to the post. Among the more exotic ones is recorded birdsong, used in Japan!

Today, there are various technical possibilities to provide this kind of guidance, among them WLAN, Bluetooth and RFID.

All these systems have pros and cons for the user. WLAN and Bluetooth technology are already commercially available and have been implemented in various contexts, while the most common usage of RFID applications is in logistics. All have the advantage of functioning both indoors and outdoors. The disadvantage is that they require varying degrees of attention and maintenance.

Where there is a risk of radio black spot, the possibility of using landmarks like RFID, Bluetooth and WLAN for secure navigation indoors and outdoors should be considered.

Orientation

Some kind of compass is required for orientation. A traditional type of magnetic compass, i.e. a needle compass, can of course be used, but this is not particularly practical, especially for people with visual impairments. In this context, it would probably be more practical to have a magnetic field sensor and presentation in a visual or acoustic form. However, all magnetic compasses are affected by fields of magnetic disturbance – a strong deviation may be directly misleading and thereby be dangerous for the user. A more secure way is to make use of 'inertial navigation' in some form, but accessible systems are voluminous, expensive and require a lot of power. A further possibility is to utilize the compass function offered by the GPS system. The principle for this is that the system registers two consecutive points and calculates the angle between the points on the basis of these measurements, which in general is the same as the angle of travel. However the disadvantage is that this only functions when one is moving. It is consequently not possible to start from a given point and at that point determine which direction one is facing.
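The GPS compass principle described above, deriving a direction of travel from two consecutive position fixes, can be sketched as follows. This is an illustrative calculation using the standard initial great-circle bearing formula, not the method of any particular receiver; the function name `course_over_ground` is an assumption made for this example. Returning `None` for two identical fixes mirrors the limitation noted in the text: no direction can be derived while the user is standing still.

```python
import math


def course_over_ground(lat1, lon1, lat2, lon2):
    """Bearing from fix 1 to fix 2 in degrees clockwise from north.

    Returns None when the two fixes are identical, because a
    stationary user provides no direction of travel.
    """
    if (lat1, lon1) == (lat2, lon2):
        return None
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Standard initial-bearing formula on a sphere.
    east = math.sin(dlon) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0
```

In practice a receiver would also ignore very small movements, since position noise of a few meters between two fixes can produce an arbitrary bearing.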

At present, the GPS system offers the best opportunities available for direction orientation while moving and an integral digital compass function in a handheld unit when stationary.

Navigation

The GPS system constitutes the basis for navigation, i.e. support to move from point A to point B. The simplest form of navigation means that one receives almost continuous backup support – visually or acoustically – in the form of appropriate road descriptions. However, this can also mean information about what is available on the route during the journey, in the form of ancillary information, for example the shops that are available in the vicinity and the range of products that they offer. These facilities will probably use local transmitters based on, for example, Bluetooth technology, RFID or WLAN.

How the system is used can vary according to need. In general one knows where one is and where one wants to go, but needs feedback along the route. It should also be possible to tell the system where one wants to go and let the system find the best route. An extreme case is when one has got lost and just wants to get back to the starting point – the 'back to base function'.
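One simple way to picture a 'back to base' function is a breadcrumb trail: the device records positions as the user walks and, on request, replays them in reverse order. The sketch below is only an assumption about how such a function could work; the class `BreadcrumbTrail` and its methods are hypothetical names and do not describe any product mentioned in this chapter.

```python
class BreadcrumbTrail:
    """Records visited positions so the route can be retraced."""

    def __init__(self):
        self._points = []

    def record(self, lat, lon):
        """Store a position fix, skipping immediate duplicates."""
        point = (lat, lon)
        if not self._points or self._points[-1] != point:
            self._points.append(point)

    def route_back_to_base(self):
        """Return the recorded positions from the current one back to the start."""
        return list(reversed(self._points))
```

Retracing the recorded route has the advantage that the user only walks where he or she has already walked, which may matter more for safety than finding the shortest way back.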

Maps

Maps are of great importance for navigation for most people. This applies not least to people with different disabilities. For people using wheelchairs, for example, it is important to have an overview of the route to be taken and, if possible, to assess any slopes, the nature of the route, etc. For people with visual impairments, this is perhaps even more important. Here, it is necessary to assimilate a mental map of the route to take. This can basically be done in two ways: with the help of a map or with a verbal description of the route.

There are digitally-stored maps for satellite-based systems that can be entered into the navigator. One of the many advantages of these is that they can be kept up-to-date and, in certain systems, are almost continuously fed into the navigator.

Both digital and analogue maps are required. It should be possible to download digital maps onto the user's handheld unit and onto a computer at a service centre. Analogue maps in visual and tactile form – raised-line maps – are provided primarily when planning a travel route.

Most digital land-maps of today are intended for car drivers. They are of very limited use for pedestrians, especially those who are visually impaired. Therefore, maps must be developed that show safe ways for pedestrians, i.e. sidewalks, pathways, stairs etc.
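To illustrate why a pedestrian map needs to record sidewalks, pathways and stairs as distinct features, the following sketch plans a route over a small, invented footpath network and lets the user exclude segment types that are not passable for them. It uses Dijkstra's shortest-path algorithm as a stand-in for whatever route planner a real system would use; the function `pedestrian_route`, the segment kinds and the example network are all assumptions made for this illustration.

```python
import heapq


def pedestrian_route(segments, start, goal, avoid=frozenset()):
    """Shortest walkable route, skipping segment kinds listed in `avoid`.

    `segments` is a list of (place_a, place_b, length_m, kind) tuples,
    each walkable in both directions.
    Returns (total_length_m, [places]) or None if no route exists.
    """
    graph = {}
    for a, b, length_m, kind in segments:
        if kind in avoid:
            continue  # e.g. stairs for a wheelchair user
        graph.setdefault(a, []).append((b, length_m))
        graph.setdefault(b, []).append((a, length_m))

    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, place, path = heapq.heappop(queue)
        if place == goal:
            return dist, path
        if place in visited:
            continue
        visited.add(place)
        for neighbour, length_m in graph.get(place, []):
            if neighbour not in visited:
                heapq.heappush(queue, (dist + length_m, neighbour, path + [neighbour]))
    return None


# Invented example network, for illustration only.
SEGMENTS = [
    ("home", "corner", 80.0, "sidewalk"),
    ("corner", "square", 40.0, "stairs"),
    ("corner", "park", 70.0, "pathway"),
    ("park", "square", 60.0, "sidewalk"),
]
```

With this network, `pedestrian_route(SEGMENTS, "home", "square")` takes the stairs, while passing `avoid={"stairs"}` yields the longer route through the park. A map built for car drivers contains neither option, which is the limitation described above.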

Another method for people with visual impairments is a verbal description where the route is explained in sequence of the type: "Go along Main Street towards Main Square. Go past two street crossings. Take a right at the third. Walk for approximately 100 meters. You are then close to a pedestrian crossing with a ticking acoustic signal. Cross at this pedestrian crossing." This kind of information can, for example, be recorded on a pocket memory and be retrieved subsequently as the user is moving along, which however requires that someone assumes the role of recording the information. One disadvantage is that there is no help if something goes wrong on the way – there is nothing to put the user back 'on track'. Nor is there, of course, anything that gives a warning of impediments in the form of road works and the like.

Figure 2.1 Digital map showing the importance of including sidewalks and stairs for pedestrian safety.


Communication

For communication – everything from a call to an alarm – it is necessary to have a manned centre with which users can communicate. In its simplest form, this comprises a person who can answer the telephone and by talking to the user can assist with orientation. In a more advanced system, a 3G telephone can be used, where users can send pictures or video clips from their surroundings to a support person, who can then assist them more easily. In its more advanced form, the support person has access to an electronic map on a screen, where the user’s position is automatically entered as a point of reference.

Localization

The ‘localization’ function allows users to be found if they are lost and unable to call for help. In principle, there are two ways of achieving this.

One is to use a combination of GPS and mobile communication in such a way that the user’s own mobile telephone automatically transmits information to a service or an alarm centre, where the position is shown on a map on a screen terminal.

The other way is radio direction finding, which means that the position of a transmitter is located with the help of one or more antennae for radio direction finding. In this case, the user has to wear a special transmitter designed much like a wristwatch. These transmitters can be activated via Minicall (an RF-based technology used for distributing, e.g., text messages on 169.800 MHz), after which the transmitted radio signals can be picked up by a special radio direction finding receiver.

Users have stressed the importance of it being possible to locate them when they have lost the capacity to orientate themselves during a journey. Methods for position determination on a map on a computer screen through, for example, a service centre have been developed and implemented by, among others, the Swedish police.

Indoor navigation issues

One condition for the use of a GPS receiver is that it can be reached by signals from at least three satellites. Basically, a clear line of sight to the satellites is required from the receiver as the signal strength is very weak. This means that reception indoors cannot be deemed reliable. AGPS can to some extent be used for indoor orientation. More reliable, however, is an inertial navigation system – gyrocompass and accelerometer in combination with a system for 'dead reckoning' – to keep track of where someone is located. However, the situation may rapidly change. Technology is developing towards increasingly sensitive receivers and, as mentioned earlier, the European Galileo system will allow reception where the current GPS system is too weak. For the moment, however, it is wise to rely in practice on other methods for indoor navigation.
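The principle of dead reckoning mentioned above can be shown with a short sketch: starting from a known position, each measured heading and travelled distance is added up to estimate the current position. The example assumes a flat local grid in meters and perfect measurements; the function name `dead_reckon` is hypothetical. In a real inertial system the headings would come from the gyrocompass and the distances from the accelerometer, and small errors in both accumulate over time.

```python
import math


def dead_reckon(start_xy, legs):
    """Estimate position on a local grid from heading and distance legs.

    `start_xy` is (east_m, north_m); `legs` is a sequence of
    (heading_deg, distance_m) with headings clockwise from north.
    """
    east, north = start_xy
    for heading_deg, distance_m in legs:
        heading = math.radians(heading_deg)
        east += distance_m * math.sin(heading)
        north += distance_m * math.cos(heading)
    return east, north
```

Because the estimate drifts, such a system is usually reset whenever an absolute position becomes available, for example from one of the local transmitters discussed in the Landmarks section.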

The most obvious is to rely on transmissions of radio signals from locally placed transmitters, for example, in shopping centres and arcades. The disadvantage of this method is that it requires the placement of transmitters at many sites. This requires a great deal of organization, standardization, maintenance, etc., something which has not been established completely today.

Many of the technologies and assistive devices that have been described in the Landmarks section are of course also applicable for indoor environments. This applies not least to maps, which can be essential to enable users to find their way around shopping centres and arcades.

The user’s device

User devices utilizing the GPS system have been on the market for several years, and today stationary as well as hand held navigators are available for private use in boats and cars. Most of them are dedicated to the purpose and integrated in a device with a full-colour screen. There are also some separate units to be connected to PDAs (Personal Digital Assistants) or mobile telephones.


Figure 2.3 An example of a hand held GPS navigator.

There are ergonomic advantages with the integrated solutions. The drawback is that it might be difficult to find an optimal position for the GPS receiver at the same time as the device should be manipulated or the screen read. An interesting compromise has been developed in the Canadian Trekker, where the GPS receiver is mounted on a belt to be hung on the shoulder and the processing device – in this case a PDA – is positioned at hand level. The output device – in this case a loudspeaker – is also mounted on the belt close to the ear of the user.


Figure 2.4 Victor Trekker, designed and manufactured by Canada-based company VisuAid, was launched in March 2003.


The Trekker solution allows for independent navigation but does not supply any service or alarm function. For this a separate mobile phone has to be used.

Swedish activities

In Sweden, a study aimed at initiating a few trials was made in 2005. In the study, possible technologies were investigated, and planned and on-going activities as well as available technical equipment were identified. Also, representatives from Swedish handicap organizations were interviewed and given the possibility to put forward requirements and wishes regarding equipment and systems.

Among other things it became clear that functionality, reliability and easy-to-handle matters were priorities. Also, all interviewed persons wanted a kind of “life-line”, i.e. the possibility to get help if they lost track of the route, some unexpected obstacle appeared or an emergency situation arose. Therefore, there was a demand for a kind of service centre which could be reached via a mobile telephone, preferably with video transmission facilities.

The persons interviewed also pointed out that they did not want another technical gadget to take care of, but preferably a mobile telephone with built-in facilities for GPS navigation and access to RFID- and Bluetooth-based information. (A report, “Navigation, alarming and positioning – A preliminary study conducted in Sweden by the Royal Institute of Technology (KTH), Department of Speech, Music and Hearing in the assignment of the National Post and Telecom Agency (PTS) 2005”, is available at www.pts.se/Dokument/dokument.asp?Sectionid=&Itemid=5678&Languageid=EN).
 
Swedish trials

The study revealed three on-going and planned pre-studies. These are localized in the three biggest cities of Sweden – Stockholm, Gothenburg and Malmoe.

In Stockholm, the focus is upon people with visual disabilities. A digital pedestrian map has been developed for an area in the city named Sodermalm. The intention is to start a study towards the end of 2006 with a small group of people with visual impairments. The technology that will be used in the first phase of the study includes server-based map and obstacle data, route planning functionality, a mobile phone and positioning technologies. Later on, additional functionality such as individualization of required information, alarm functions and points of interest is intended to be added.

In Gothenburg the primary target group is people with cognitive impairments. Here, too, the study is intended to start in late 2006 with a small group of people. The project will be linked to the intentions of the local public transport authorities to facilitate the use of public transport by elderly people and people with disabilities.

The study in Malmoe will aim at people with visual impairments as well as those with physical disabilities.

A service centre that can handle alarms and be contacted via the user’s mobile telephone facility will also be included.

A schematic overview of the functions is shown in figure 2.5.


Figure 2.5 A schematic overview of the planned navigation systems in Sweden.


The user – the focal point of the system – is assumed to have impaired vision, hearing, motor or cognitive functions. Software for implementing speech synthesis, speech control and Braille presentation (on a separate display) and the possibility of an individual design and adaptation of the visual presentation on the screen (for example, zooming in and pictograms) are required.

It is assumed that the user has a mobile telephone or handheld computer with mobile communications facilities and the above-mentioned adaptations. There are many different mobile telephones on the market that are appropriate for this purpose, for example Nokia Series 60 phones (e.g. 6630, N70), but also Sony Ericsson UIQ phones or more powerful Java phones.

A handheld computer (PDA) is interesting from many perspectives, but must be supplied with a telephony attachment. Only a few have integral telephone functionality. Regardless of what one chooses, a terminal with a digital compass, camera and Bluetooth function is recommended.

The telephones must have open operating systems. Symbian and Windows Mobile can both be used. The latter is more powerful and quicker, but requires more power. Symbian is considered to be preferable, not least because there are many telephone models to choose from with this system.

The telephone is linked to a GPS receiver. This can either be integral or separate. The latter is preferable, first because reception is generally better if the GPS receiver can be placed independently of the handheld communication unit and second for power supply reasons (the batteries last longer).

There is a large variety of software to choose from for navigation, for example various Garmin products, GPS Pilot Tracker, Mapmate, Navicore, Route 66, TomTom, Trekker and Wayfinder.

All systems have their pros and cons. Trekker is specially developed for people with visual impairments. Wayfinder is a system that can offer streamed downloading of route information, if this is required.

There are several digital road databases available in many countries, e.g. Navteq and TeleAtlas. They offer limited detail, are not always up to date and are basically intended for vehicular traffic.

In Sweden an effort is being made to collect and store more qualified data in a National Road Data Base – NVDB. Currently it is limited to road information for car drivers. From the beginning of 2007 it will be possible to store road information for cyclists in the NVDB.

The local municipalities are building up Local Road Data Bases (LV:s). They have capacity for more sophisticated information, such as accurate pedestrian routes, and the information can be updated frequently. This work has started in Stockholm with the development of a Digital Pedestrian Footpath Network (DG).
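How a pedestrian network of this kind could serve route guidance can be illustrated with a minimal sketch. It assumes that each footpath segment is stored with its length and a flag for steps; the network, the place names and the function `shortest_route` are hypothetical and do not describe the actual structure of the Stockholm database. The search itself is a standard Dijkstra shortest-path search.

```python
import heapq

# Hypothetical footpath network: node -> list of (neighbour, metres, has_steps).
FOOTPATHS = {
    "station": [("square", 120, False), ("underpass", 60, True)],
    "underpass": [("library", 50, True)],
    "square": [("library", 150, False)],
    "library": [],
}


def shortest_route(graph, start, goal, avoid_steps=False):
    """Dijkstra search; returns (total_metres, [nodes]) or None if unreachable."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, metres, has_steps in graph.get(node, []):
            if avoid_steps and has_steps:
                continue  # skip segments unsuitable for, e.g., wheelchair users
            if neighbour not in visited:
                heapq.heappush(queue, (dist + metres, neighbour, path + [neighbour]))
    return None
```

With `avoid_steps=True` the search returns the longer but step-free route via the square, which is the kind of user-dependent choice that detailed pedestrian data makes possible.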

It is important to point out that the system is designed with open interfaces so that other implementations can hook on.

Besides resources for autonomous navigation, it is expected that the user will need to communicate with a manned alarm or service centre via a mobile telephone. The centre should be able to take care of both 'soft' calls (including calls from camera mobiles) with oral and visual support information, and 'sharp' calls that require, for example, the support of rescue units. The alarm centre and the service centre can be one and the same unit or they can be stationed at different locations. It is expected that the alarm centre will have rapid access to the rescue services. The service centre can be anything from a county alarm centre to a relative's home. In any event, it should be possible for all personal details to be extracted from, for instance, a database. It is also important that the alarm/service centre can locate the user.
 
Owing to the wide range of both hardware and software, a final decision on the choice of products must represent a balance between the various pros and cons. The most important thing, besides satisfying the needs of the user as far as possible and comprehensively, is to stick to non-proprietary solutions and, where this is not possible, to conclude contracts with those suppliers who will provide the greatest possible freedom for different component choices.

A system like this is generic and is the basis for all three trials in Sweden. It will be usable anywhere in Sweden and will be designed so as to be easily adaptable to local transport information systems.

Conclusion

Many groups of people with disabilities experience problems when moving around in an unknown environment. It has been anticipated that modern satellite navigation systems could form a basis for overcoming most of the problems.

A study has been carried out in Sweden on this issue. It concludes that there are significant possibilities to improve the situation for the groups in question with the aid of GPS-based navigation, combined with the use of mobile telephony and databases for storing maps, personal information etc. The study also suggests that the National Post and Telecom Agency support three pilot studies to explore the pros and cons for a few groups of people with disabilities in the three largest cities in Sweden.

2.2.2 Speech processing

By Klaus Fellbaum and Diamantino Freitas


2.2.2.1 Introduction and state of the art

Communication is an essential part of human life. If communication is disturbed or impossible, the consequences are loneliness and isolation.

It is well known that speech plays a key role in communication, which explains why humans also want to have speech as a means of communication and interaction with computers. Although human-like speech dialogue with computers is still far off, even with current state-of-the-art technology, the benefits and potential of speech processing are obvious. As will be seen in the next sections, this is especially true in applications for persons with disabilities. Well-known examples are reading machines for blind people, voice control for wheelchairs or speech-based dictation systems for physically impaired computer users.

This chapter presents some new applications for speech-based systems that are (partly) still at the research or prototype stage. Since some of our readers may not be familiar with the principles of electronic speech processing and the state of the art, our presentation will start with some relevant basic definitions.

Speech recognition or equivalently voice recognition is the automatic recognition of spoken words or sentences by a machine. In many cases the result of the recognition is a displayed text and then the terms voice-to-text or dictation system are used. Other important areas for speech recognition are systems for the recognition of spoken commands and the control of basic functions of a personal computer.

There are three main modes for speech recognition.

a) Isolated word recognition with vocabularies in the order of 50 000 words and more is on the market. Most systems have to be trained before they reach a good level of reliability (up to 98 to 99% correct recognition in controlled environments), or they are speaker-adaptive; that is, the recognition accuracy is initially moderate but improves with intensive use and can also reach 98 to 99%.

b) Word spotting or key word recognition is another form of recognition with the aim of recognizing key words in continuous speech. Consider, for example, a flight information dialogue system where a user wants to know when the next flight to Brussels leaves. The user might ask in different ways: ‘next flight to Brussels’, ‘when will be the next flight to Brussels?’ or ‘please give me the next flight to Brussels’. In all of these cases the key words are obviously ‘next’ and ‘Brussels’ and the rest of the words are not relevant. The advantage of word spotting is that the request can be formulated as desired, which makes the dialogue much more user friendly.

c) Continuous speech recognition has also reached market maturity but the recognition accuracy still leaves something to be desired as regards robustness. The main applications for continuous speech recognition are dictation systems which can recognize more than 1 million word forms. The term ‘word forms’ is not equivalent to words. Most words may appear in different forms (basic form, inflections, different tenses etc.) and each word form has to be considered as another word (pattern). That is why such a high number of word forms is needed for ordinary office vocabulary.
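The word-spotting mode described under b) can be sketched in a few lines. This is a simplified illustration that works on an already transcribed utterance, whereas a real word spotter searches the acoustic signal; the key-word lists and the function `spot_keywords` are hypothetical.

```python
import re

# Hypothetical key-word lists for a flight-information dialogue.
TIME_WORDS = {"next", "first", "last"}
DESTINATIONS = {"brussels", "lisbon", "stockholm"}


def spot_keywords(utterance):
    """Return the key words found in an utterance, ignoring all other words."""
    words = re.findall(r"[a-z]+", utterance.lower())
    return {
        "time": next((w for w in words if w in TIME_WORDS), None),
        "destination": next((w for w in words if w in DESTINATIONS), None),
    }
```

All three phrasings from the flight example yield the same pair of key words, so the dialogue system can react identically regardless of how the request is worded.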

A serious problem of all speech recognizers is their sensitivity to noise. For certain applications in noisy environments (factory floor, aeroplane cockpit, cars in heavy traffic) very robust recognizers have been developed, but their vocabulary is of moderate size (in the order of some hundred words, isolated mode). This is not very restricting, however, because the vocabulary used in such situations is rather limited anyway.

Speaker recognition tries to identify and/or verify the identity of the speaking person and is applied in many security-sensitive situations such as access control to secured areas or bank transactions. State of the art systems have an accuracy (correct recognition) of up to 98%.

Speech replay is the reproduction of speech by a technical system (computer etc.). The speech being used was spoken in advance by a person and then stored in a fixed memory or on disk. Typical applications are announcement systems (e.g. in public transportation) or system messages. A significant characteristic of a replay system is its limited vocabulary. The speech quality is usually good and can in principle be raised to a high-quality level; this is only a question of the amount invested in the recording equipment and the storage capacity. It is important to mention that the adequate quality level strongly depends on the application [Jekosch, 2005]. For example, a user accepts a lower quality in a telephone conversation than in a radio announcement.

Speech synthesis has, in contrast to speech replay, an unlimited vocabulary. The speech is concatenated artificially from more or less short speech elements such as phonemes, diphones or even longer segments. Although speech synthesis has reached an advanced level of maturity, it still suffers from an audible ‘machine accent’. However, since the intelligibility (not necessarily the naturalness!) of synthesized speech is comparable to that of natural speech, it is usable in many practical applications. A well-known example is the screen reader for blind people.

A very important parameter which strongly influences overall speech quality (in both speech replay and speech synthesis) is intonation or, more generally, prosody. It is composed of several speech features such as intonation, speed and rhythm, pauses and intensity, and is connected to other features such as voice quality (breathy, modal, creaky, etc.). All these features, taken together as a multidimensional set, add so-called supra-segmental information to the utterance that enriches the meaning and can make speech human-like and intelligent. Prosody is the underlying speech layer that conveys pragmatic information. It can also provide para-linguistic and non-linguistic information like intentional and emotional information, respectively [Botinis, 1997].

For more details about the principles of electronic speech processing, the interested reader is referred to the literature; recommended are for example [Furui, 2001], [Gardner-Bonneau, 1999], [Vary, 2006] and, for an extended description of the mathematical principles of speech processing, [Deller, 2000].


2.2.2.2 Speech-based applications for persons with disabilities

Advances in synthetic speech

Multilingual speech synthesis

We are living in a multilingual world. Especially in Europe, different languages are closely related and usually we are trained from school to speak different languages. The same situation exists with written documents or websites. It is thus obvious that most speech synthesis applications (for example enquiry systems or reading machines for blind persons) have to be multilingual.

There are several multilingual systems on the market. One of these was produced by the Bell Laboratories (AT&T, Murray Hill, New Jersey). It functions as a synthesizer for English, French, Spanish, Italian, German, Russian, Romanian, Chinese and Japanese. Interestingly, the underlying software for both linguistic analysis and speech generation is identical for all languages, with the exception of English. However, it is clear that the acoustic elements used for the concatenation into continuous synthetic speech must be spoken by a native speaker, and thus this part of the synthesis is language-dependent. The same holds for the base data of the linguistic analysis. These components are stored externally in tables and parameter files and can be loaded in real time when needed. A detailed description of the AT&T synthesis can be found in the book by Sproat [Sproat, 1998]; the synthesis of different languages is demonstrated at [Sythesis testsite AT&T].

Another system which became very popular in the speech synthesis community is the MBROLA system. “The aim of the MBROLA project, initiated by the TCTS Lab of the Faculté Polytechnique de Mons (Belgium), is to obtain a set of speech synthesizers for as many languages as possible, and provide them free for non-commercial applications. The ultimate goal is to boost academic research on speech synthesis, and particularly on prosody generation, known as one of the biggest challenges taken up by Text-To-Speech synthesizers for the years to come.” More details and demos are presented on the home page of MBROLA [MBROLA].

Emotional speech

Emotional speech can remarkably extend the content and expression of spoken information. Moreover, the way something is expressed is sometimes more important than what is expressed. The key parameter which determines the emotional content is prosody, as discussed before.

A great deal of work has been done in the recognition and production of emotional speech; among others, there was the EU FP6-IST project HUMAINE (Human-Machine Interaction Network on Emotion). For more information visit the home page, which is given under [HUMAINE].

In man-machine communication, for example in a speech-based dialogue system, emotional speech can be used in two directions:

a) The user speaks with emotions and the system has to recognize these emotions in addition to the ‘pure’ speech recognition. As an example, a situation might occur where the system does not sufficiently recognize the user and reacts in an unsatisfying manner. This is very often annoying and leads to an angry voice. If this anger is recognized by the system, then it might be wise for it to react with an apology and/or an explanation of why the recognition failed [Lee, 2002].
 
b) If the system produces speech (be it stored or synthetic speech), it can in principle be used to express emotions. Everyone has a need to transmit emotions. But deaf persons, persons with severe speech disorders, and people suffering from muscular dystrophies or cerebral diseases, who often have aphasia along with body paralysis, are unable to express their emotions through speech although they have a strong desire to do so.

Several research groups have investigated emotional speech. Concerning the speech quality and, above all, the naturalness and recognizability of the emotions, the results are encouraging; see for example [Burghardt, 2006].

Iida, Campbell and Yasumura [Lida, 1998] describe an application concept of an affective communication system for people with disabilities who cannot express their emotions by themselves. They get help from some buttons for the selection of emotions. These ‘emoticons’ are very helpful and they can be easily added to an ordinary text-to-speech synthesis (figure 2.6).

Emoticons - emotion keys

Figure 2.6 EMOTICONS (Emotion keys).


In many cases, a user who cannot speak as well as a person without a speech impairment needs a synthesizer to produce a specific voice from a selected person or with specific features. The underlying concept which fulfills this requirement is called voice personalisation. This facility is very interesting when there is the need to transmit synthetic speech from a text given by a specific person. Voice personalisation is nowadays available at a constantly decreasing cost with the advent of statistical speech-model-based speech synthesizers [Barros, 2005].

Support of a speech conversation for hard of hearing or deaf persons

In this application two persons have, for example, a telephone conversation. One person has normal hearing, the other has a severe hearing impairment. The idea is to support the hard of hearing person with additional visual information presented on a screen, either as an animated face, as text, or in both forms (figure 2.7).

The technical implementation works as follows. The speech of the normal hearing person is automatically recognized by a high-level speech recognition system. The result is a text which can be displayed. In the next processing step the text is converted into control parameters for a talking head. In the end, the person with hearing problems can receive the message in three versions: as original speech, as text and as an animated face. It is assumed here that the hard of hearing person speaks normally, which is quite common.


A telephone conversation where a hard of hearing person uses additional visual information

Figure 2.7 Telephone conversation, the partner on the right is hard of hearing.


If the person is deaf, he or she will not have serious problems in understanding the message by reading the text and watching the animated face. But problems arise when the deaf person wants to respond to the normal hearing person. This problem will be discussed in the next section.

There are several research projects dealing with speech to text or speech to animated faces. One of them is SYNFACE, which was developed at the KTH in Stockholm until 2004 [SYNFACE, 2005]. In the meantime it has become a commercial product. The speech recognition is based on phoneme recognition and a speech synthesizer activates the talking head, mainly the lips. The movements of the talking head are synchronized with the telephone speech and thus the listener can directly complete the part of the information which he or she does not hear.

A similar system that is on the market is iCommunicator [icommunicator]. The system aims mainly at the group of deaf persons, but also at those who are hard of hearing. The kernel of the system is the Dragon NaturallySpeaking Professional engine [DRAGON, 2006], at the moment one of the best and most powerful speech-to-text systems on the market. iCommunicator runs on a higher-end laptop computer. Among other features, iCommunicator converts, in real time, speech to text, speech to video sign language, speech to computer-generated voice, and text to computer-generated voice or to video sign language.

A third system, which can be mentioned here, was developed in a project called MUSSLAP at the University of West Bohemia in Pilsen, Czech Republic. One of the outcomes was a real-time recognizer which presents its results as text on the screen. As a very impressive example, an ice hockey match is shown on a TV screen and the system automatically recognizes the comments of the reporter and displays the result as text in real time [MUSSLAP].

Speech processing for the communication of a deaf person

When deaf persons communicate over a distance (telecommunication), text telephony and fax have long been very common methods, with the additional advantage that communication between deaf and normal hearing persons is possible without any problem. For several years, SMS has also served as a cheap and widespread communication tool. Above all, the Internet with its many services (for example Web and email) has dramatically widened communication in general and especially between deaf and normal hearing persons.

On the other hand, text communication has some drawbacks: text information is rather impersonal, typing is laborious and time-consuming, and not all deaf people have a sufficiently high level of understanding of written language to be able to access text.

For these reasons most deaf persons prefer sign language communication. This form of communication has remarkable advantages:

The adequate tool for sign language communication is obviously video telephony, mostly using a standard like H.320, which is also compatible with ISDN. With the advent of UMTS (3G) and WLAN, mobile video communication became a reality. In both cases relay services are usually applied to connect deaf users, but deaf and normal hearing subscribers can also communicate (indirectly) with the aid of a sign language interpreter.

Several projects exist which work on sign language transmission. One is the European IST project WISDOM (Wireless Information Services for Deaf people On the Move, lifetime from 2000 to 2004), in which several advanced wireless services for the Deaf were developed and evaluated [WISDOM].

The situation is different when a direct (face-to-face) situation between a normal hearing and a deaf person is considered. A first observation is that the communication is obviously much easier from the side of the deaf person because he or she has learned to understand a speaking person by lip reading and watching facial expressions and gestures. Although this special form of ‘human speech recognition’ is never perfect (among other reasons because some sounds are invisibly produced inside the mouth), fragmented utterances can often be completed from the context. It is interesting to note that the recognition of emotions works rather well by watching the face movements and gestures.

As additional support for the deaf person in the communication process, a speech to text and/or a speech to sign language transformation, as described in the previous section, might be useful. The result of such a transformation can be presented on a display or, in a more advanced setup, it could be beamed to little mirrors in the spectacles of the deaf person.

But looking at the other direction: what about the normal hearing person who does not understand sign language?

If we imagine this situation, we can state that, even without any knowledge of sign language, valuable information about the intention of the deaf person and his/her emotions is transferred when we watch gestures, facial expressions, body movements and other kinds of visual information. In this respect, the situation is similar to that of the other communication direction (from the speaking to the deaf person). The key problem is the recognition of the objective, content-carrying part of the message. For this we can come back to the relay service solution. The deaf person has a camera (maybe as a part of a mobile phone) which transmits the gestures to the interpreter, who translates them into speech, which is then audible for the hearing person. This procedure works well, as several projects (including the WISDOM project) have shown, but the problem here is the availability of the interpreter and the fact that a face-to-face situation often arises unforeseen.

Obviously a better solution would involve automatic gesture recognition which transforms gestures into synthetic speech. In this case the normal hearing person receives the information of the deaf person twice, as gestures and as voice, and both forms of information complement each other. There is no need for emotional synthetic speech because emotions are optimally expressed by gestures and the face, as mentioned before.

It is important to state that automatic gesture recognition or, more extended, automatic sign language recognition, is probably one of the most difficult research tasks in the area of communication aids. Difficulties are:

The first systems for sign language recognition were based on the data glove(s). These gloves are well-known tools, mostly used in Artificial Intelligence research and in entertainment applications. The advantage of such a glove is the precision with which hand positions are recognized. But for many situations in daily life, the use of gloves might be too uncomfortable.

A better (but much more complicated) alternative is a video-based system. The deaf person uses sign language, a video camera records gestures and facial movements (above all lip movements) and, as a result of the video processing, the sign language is transformed into text which can be displayed somewhere and/or transformed into synthetic speech. A very detailed description of problems and solutions in that area is presented in a recently published book on human interaction with machines [Kraiss, 2006].

We will now briefly mention some research projects.

In the framework of the European IST research program ARTHUR, the Lab. of Computer Vision and Media Technology, Aalborg University Denmark investigated the automatic recognition of hand gestures used in wearable Human Computer Interfaces [Moeslund, 2003]. Different gesture detection devices are described, among others the ‘classical’ data glove and reduced versions of it (index finger tracker with a wired or wireless connection to the receiver), a ‘Gesture Wrist’, a ‘Gesture Pendant’ and, of course, camera solutions.

A well-known researcher, Christian Vogler, who is deaf himself, completed his PhD on automatic recognition of American Sign Language (ASL). He describes the problem of simultaneous events in sign language (for example, the handshape can change at the same time as the hand moves from one location to another, or hand(s) and face express signs simultaneously). Another important aspect is the segmentation of the continuous stream of movements into discrete signs and the breaking-down of signs into their constituent phonemes. If this works satisfactorily, the next steps, namely transformation of signs into text and then into synthetic speech, are relatively easy to manage. For more information see [Vogler, 2000].

Thad Starner and his group from the Georgia Institute of Technology, Atlanta, USA, are working on several projects in American Sign Language recognition. They use multiple sensors for the recognition, among others a hat-mounted video camera and accelerometers with three degrees of freedom mounted on the wrist and torso to supplement the information from the video camera. For control purposes, the deaf user has a head-mounted display which shows what the camera captures [Brasher, 2003]. The aim of the activities is a flexible mobile system for the output of text or speech, depending on the application. Figure 2.8 shows the head-mounted camera and a recorded gesture.


Figure 2.8 Baseball-cap-mounted camera and a recorded gesture
(with kind permission of Thad Starner, Media Lab, MIT).


Visual and audio-visual speech recognition based on face or lip reading

A methodology which is quite similar to gesture recognition, mentioned before, is automatic facial reading or lip reading. The result is a text sequence which represents the content of the utterance.

The automatic recognition of facial images has been used for a number of years to improve (spoken) speech recognition under noisy conditions, and it has proved to be very successful [Kraiss, 2006], [Moura, 2006], although the accuracy obtained with purely visual speech recognition is not as high as in audio speech recognition. There are a number of reasons for this; one is that visual speech is partially phonetically ambiguous.

Nevertheless, for the communication between deaf and normal hearing persons, facial or lip reading is a very valuable help and, as previously mentioned, the human face can optimally express emotions, and this information is detectable by the visual recognizer.

Preliminary small-vocabulary trials have been reported [Moura, 2006] to obtain word recognition rates of about 65% for a one-speaker lip-reading task with grammar correction. Interestingly, the performance of a professional observer was in the range of 70%-80% for the same corpus. Figure 2.10 shows the situation under considerable noise and demonstrates the advantage (in terms of word error rate, WER) of a simple combination in a multi-stream recognition approach [Moura, 2006].


Chart showing the variation of the total word error rate as a function of the signal-to-noise ratio

Figure 2.10 Variation of the total word error rate as a function of the signal-to-noise ratio.
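For readers unfamiliar with the measure, the word error rate is commonly defined as the minimum number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference text, divided by the number of words in the reference. The following minimal sketch computes it by dynamic programming; the function name is illustrative and the code is not taken from [Moura, 2006].

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the WER can exceed 100% when the recognizer outputs many spurious words.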


Correction of speech defects, unintelligible speech

If a person is unable to speak ‘normally’ resulting in unsatisfactory intelligibility, a speech recognition and synthesis system can be a valuable aid. The impaired speech is the input for the recognizer, which converts it into text and the text is then converted into clean synthetic speech.

It is very important to state that even totally unintelligible speech, or indeed any acoustic utterance, can be recognized. The only prerequisite is that the ‘speaker’ is able to reproduce utterances with sufficient similarity and to train the recognizer with this kind of ‘vocabulary’. As a matter of fact, even emotions can be expressed, using emotional speech synthesis. Finally, visual speech recognition, as mentioned before, can significantly contribute to better speech recognition.
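The principle of training a recognizer with a personal ‘vocabulary’ can be illustrated with template matching. The sketch below is an assumption for illustration, not the method of any system cited here: each trained utterance is stored as a sequence of feature values, and a new utterance is assigned to the closest template by dynamic time warping (DTW), a classical technique that tolerates differences in speaking rate. Real systems use multidimensional spectral features; plain numbers are used here to keep the example short, and the templates are invented.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]


def recognize(utterance, templates):
    """Return the label whose trained templates lie closest to the utterance."""
    return min(
        templates,
        key=lambda label: min(dtw_distance(utterance, t) for t in templates[label]),
    )


# Hypothetical training data: two recordings per 'word' of one speaker.
TEMPLATES = {
    "yes": [[1, 3, 5, 3, 1], [1, 3, 3, 5, 3, 1]],
    "no": [[5, 4, 2, 1], [5, 5, 4, 2, 1, 1]],
}
```

The labels can stand for any sound the user is able to repeat consistently; the recognizer only needs the utterances to be similar to their own templates, not to be intelligible speech.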

A system for speech therapy

It is well known that many deaf persons have fully functioning speech organs but the problem is that they cannot control articulation because they do not have acoustic feedback through the ears.

When deafness occurs after language and speech acquisition is complete, the deaf person can maintain (with restrictions) his/her speaking ability with the help of a speech therapist. However, this requires continuous training with a therapist, which is obviously not always possible.

Many attempts have been made to develop systems which perform a visual control of a spoken utterance. The time signal or the spectrum of the speech are not very suitable because the relation between the sound production and the resulting signal is rather complicated and abstract.

A better solution is obviously a face animation showing two speaking faces: the ‘reference’ face and the (deaf) speaker’s face. Thus the deaf person can directly see deviations between the two faces and he or she can try to adapt. Since some sounds are produced invisibly inside the mouth, as mentioned earlier, a useful help is a transparent mouth region (figure 2.11).


Figure 2.11 Face animation with a transparent area of the mouth region [Pritsch, 2005].


Screen readers for blind or partially sighted persons

The usual computer desktop metaphor practically leaves blind persons out because it is a Graphical User Interface (GUI), based on a more or less rich graphic display of icons, windows, pointers and text. Since blind persons require non-visual media, the alternative is, alongside tactile information (Braille), primarily an aural interface which can be called, by analogy with GUI, an Aural User Interface (AUI), following the terminology supported by many authors including T.V. Raman [Raman, 1997].

Since the early 1980s, after some trials with special versions of self-voicing software capable of driving a speech synthesizer and so providing access for blind persons, a more general concept has appeared. A family of applications called screen readers was initiated with the purpose of creating a vocal rendering of the contents of the screen, under user control through the keyboard, using a text-to-speech converter [Wikipedia]. Properly installed screen reader software stays active in the operating system and operates in the background, analysing the actual contents of the screen. From the initial command-line interface (CLI) to the now ubiquitous graphical user interface (GUI), screen reader software has evolved considerably in two and a half decades.

Screen readers can also analyse many visual constructs like menus and alert or dialogue boxes and transform them into speech to allow interaction with a blind user.

Navigation in the screen is possible as well, to allow a non-linear or even random exploration and acquisition of the depicted information. Control of the produced speech is normally given to the user so that quite fast navigation becomes possible when the user works with shortcuts. A simulation of a screen reader is available at the WebAIM website [WebAIM].

Although many screen reader applications exist, there are many limitations that current screen readers cannot overcome per se, for instance those related to images and structured text (tables etc.). Screen readers cannot describe images; they can only read out a textual description of them, and the user has difficulty grasping how the page is organized.

The basic requirement in terms of speech processing for screen reader applications is a robust text-to-speech converter with the possibilities of spelling and reading random individual characters and all kinds of text elements that may appear like numeric expressions, abbreviations, acronyms and other coded elements. Punctuation is also spoken in general, besides being determinant in introducing some prosodic manipulation in the synthetic voice.
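The kind of text preparation this requires can be sketched as a small normalization step placed in front of the synthesizer. The lexicon entries and the function `normalize` are hypothetical and cover only a handful of cases; numbers are simply read digit by digit, and acronyms are detected by the crude rule that they are written in capitals.

```python
import re

# Hypothetical lexicon entries, chosen for illustration only.
ABBREVIATIONS = {"e.g.": "for example", "etc.": "et cetera"}
PUNCTUATION = {",": "comma", ".": "full stop", "?": "question mark", ":": "colon"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def normalize(text, speak_punctuation=False):
    """Turn raw text into a word string that a synthesizer can pronounce."""
    spoken = []
    for token in text.split():
        if token.lower() in ABBREVIATIONS:
            spoken.append(ABBREVIATIONS[token.lower()])
            continue
        # Split the token into the word itself and trailing punctuation marks.
        word, marks = re.fullmatch(r"(.*?)([,.?:]*)", token).groups()
        if re.fullmatch(r"[0-9]+", word):
            spoken.append(" ".join(DIGITS[int(d)] for d in word))  # digit by digit
        elif len(word) > 1 and word.isupper():
            spoken.append(" ".join(word))  # spell acronyms letter by letter
        elif word:
            spoken.append(word)
        if speak_punctuation:
            spoken.extend(PUNCTUATION[mark] for mark in marks)
    return " ".join(spoken)
```

For instance, `normalize("Press OK, e.g. 42 times.")` yields "Press O K for example four two times", and with `speak_punctuation=True` the comma and the full stop are spoken as well.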
 
Following this idea, the World Wide Web Consortium (W3C) introduced the Aural Cascading Style Sheet (ACSS) in 1998 with the Cascading Style Sheets level 2 (CSS2) recommendation; a chapter on the acoustic rendering of a web page is presented in [WDAC].

Auditory icons, sometimes also called earcons, are made audible to the user by means of a loudspeaker or earphone system that should have advanced acoustic features (high quality, stereo etc.). The acoustic elements contain voice properties like speech-rate, voice-family, pitch, pitch-range, stress, and others that are used as command parameters to the speech synthesizer.
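As an illustration of how such voice properties are attached to page elements, the following sketch generates a small aural style sheet. The property names (`voice-family`, `pitch`, `pitch-range`, `stress`, `speech-rate`) and the `@media aural` block are those of the CSS2 recommendation; the element-to-voice mapping and the helper `to_aural_css` are hypothetical.

```python
# Hypothetical mapping from HTML elements to CSS2 aural properties.
AURAL_STYLES = {
    "h1": {"voice-family": "male", "pitch": "low", "stress": "60", "speech-rate": "slow"},
    "em": {"pitch": "high", "pitch-range": "70"},
    "p": {"speech-rate": "medium"},
}


def to_aural_css(styles):
    """Render a selector -> properties mapping as a CSS2 aural style sheet."""
    rules = []
    for selector, properties in styles.items():
        body = " ".join(f"{name}: {value};" for name, value in properties.items())
        rules.append(f"  {selector} {{ {body} }}")
    return "@media aural {\n" + "\n".join(rules) + "\n}"
```

A speech-enabled browser that supports aural style sheets could then render headings in a slow, low voice and emphasised text with a higher and more varied pitch.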

An extended investigation of spatial acoustic features as a component of a screen reader was performed in the GUIB (Graphical User Interfaces for the Blind) project in the framework of the European TIDE initiative [Crispien, 1995]. The idea was to generate an acoustic screen in front of the user on which windows, icons and other graphic elements are audible on different places, and the mouse position is also audible when the mouse is moving.
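The idea of making a screen position audible can be illustrated with simple stereo panning. The sketch below does not reproduce the GUIB implementation, which is not described here; it uses constant-power panning, a common audio technique, to derive left and right loudspeaker gains from the horizontal position of an object. The function name and parameters are illustrative.

```python
import math


def pan_gains(x, screen_width):
    """Constant-power stereo gains (left, right) for a horizontal screen position."""
    if screen_width <= 0:
        raise ValueError("screen_width must be positive")
    position = min(max(x / screen_width, 0.0), 1.0)  # 0 = left edge, 1 = right edge
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)
```

An object at the left edge is heard only from the left channel, an object in the centre equally from both, and the summed power of the two channels stays constant as the mouse pointer moves across the screen.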

In an earlier project (AudioBrowser, 2003-2005, see [Repositorium]), developed for Portuguese but applicable to most other languages, the structure or outline of a web page is discovered and used as a table of contents; this was implemented successfully. In this application the user can freely navigate inside the contents of each window or jump between windows, from contents to table of contents or vice versa, in order to scan or navigate through the page in a more structured and friendly way. The blind or low-vision user is constantly helped by the text-to-speech device, which follows the navigation accurately.
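How a page outline can serve as a table of contents may be sketched as follows. The source does not describe how AudioBrowser derives the outline; this example simply assumes that the HTML heading elements h1 to h6 define it, and the names `OutlineParser` and `table_of_contents` are hypothetical.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, title) pairs from the heading elements of a page."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            # Collapse whitespace so the title reads cleanly when spoken.
            title = " ".join("".join(self._text).split())
            self.outline.append((self._level, title))
            self._level = None


def table_of_contents(html):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


page = "<h1>News</h1><p>Intro</p><h2>Sport <em>today</em></h2>"
print(table_of_contents(page))
# [(1, 'News'), (2, 'Sport today')]
```

A reader application could speak this list first and let the user jump to the chosen entry, instead of reading the page linearly.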

The W3C, through its Web Accessibility Initiative (WAI), has been issuing the Web Content Accessibility Guidelines (WCAG), now in version 2. These guidelines are of great help in orienting web page design towards accessibility [WAI]. The Authoring Tool Accessibility Guidelines (ATAG), currently in version 2.0, are also important for developers of authoring tools.

Reproduction of complex documents for blind persons

Complex documents such as mathematical and other scientific, technical or even didactic documents usually contain graphical representations. Above all, equations and other mathematical expressions have posed a substantial barrier to access by visually impaired persons. Most graphical representations and charts may also be included in this group.
 
Representation in special Braille codes of complex mathematical elements can almost totally solve the problem for blind persons. The LAMBDA project [LAMBDA, 2005] has produced a mathematical rendering package using such a system.

For lengthier mathematical objects, more refined solutions might be preferable, using audio rendering of the mathematical expressions through synthetic speech. When the expression is coded in MathML, a browsable textual description can be derived automatically from the MathML code by means of a special lexicon and a grammar. Both must be specially designed for the purpose, following mathematical conventions and ensuring that the textual description is unambiguous. This work was carried out in the AUDIOMATH project [Ferreira, 2005] at the Faculdade de Engenharia da Universidade do Porto. A demonstration page is available at [Ferreira].
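The derivation of a textual description from MathML can be sketched in a few lines. This is not the AUDIOMATH lexicon or grammar, which the source does not reproduce: the operator table, the spoken phrases and the function names are assumptions, and only a handful of presentation MathML elements are covered. The explicit "end fraction" and "end root" markers show one possible way of keeping the description unambiguous.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-lexicon for operator symbols.
OPERATORS = {"+": "plus", "-": "minus", "=": "equals", "*": "times"}


def describe(node):
    """Recursively turn a presentation MathML element into spoken text."""
    tag = node.tag.split("}")[-1]          # drop an XML namespace, if present
    children = [describe(child) for child in node]
    if tag in ("mi", "mn"):                # identifiers and numbers
        return (node.text or "").strip()
    if tag == "mo":                        # operators
        symbol = (node.text or "").strip()
        return OPERATORS.get(symbol, symbol)
    if tag == "mfrac" and len(children) == 2:
        return (f"fraction, numerator {children[0]}, "
                f"denominator {children[1]}, end fraction")
    if tag == "msup" and len(children) == 2:
        return f"{children[0]} to the power of {children[1]}"
    if tag == "msqrt":
        return f"square root of {' '.join(children)}, end root"
    return " ".join(children)              # math, mrow and other containers


def mathml_to_text(source):
    return describe(ET.fromstring(source))


expr = ("<math><mfrac><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>"
        "<mn>2</mn></mfrac></math>")
print(mathml_to_text(expr))
# fraction, numerator a plus b, denominator 2, end fraction
```

Because the description is built per element, the same tree can also support navigation inside the formula, for example by reading only the numerator on request.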

Acoustical cues, which contribute to the clarity of the speech rendering, are also important. Previous authors have used, for instance, prosodic modifications such as raising or lowering the pitch of the synthetic voice to signal upper or lower parts of the expression, respectively. In AUDIOMATH the influence of pitch movements and of pauses during the description of expressions was studied and rules were extracted. An intra-formula navigation mechanism was designed to allow users to explore the formula at will, so that longer formulas do not put too much stress on audio memory.


2.2.2.3 Conclusions and future developments

The aim of this chapter was to show how electronic speech processing works and how persons with disabilities can benefit from it.

Since speech is the most important form of human communication, every effort must be made to make speech communication possible, and if the speech channel is disturbed, technical solutions have to be found to overcome the obstacles.

The accuracy and quality of modern speech recognition and synthesis systems have reached a state of maturity that allows the development of very powerful support systems for persons with disabilities and helps to bridge the gap between these persons and those without disabilities, as was shown, for example, between deaf persons and the rest of the world.

Looking into the future of speech technology, some important research areas can be identified as follows:

It should be mentioned here that the enumeration given in this chapter is far from complete. Further examples will be given in other chapters, showing that speech technology and speech applications will play a dominant role whenever communication is discussed.


2.2.2.4 References

BARROS, M.J., MAIA, R., TOKUDA, K., RESENDE, F.G., FREITAS, D., (2005). HMM-based European Portuguese TTS System. Paper presented and published in the proceedings of Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon.

BOTINIS (ed.) et al., (1997). Intonation: Theory, Models and Applications. Proceedings of the ESCA Workshop, Sept. 18-20, Athens, Greece.

BRASHEAR, H., STARNER, T. et al., (2003). Using Multiple Sensors for Mobile Sign Language Recognition. ISWC, White Plains, NY. Also: www-static.cc.gatech.edu/~thad/031_research.htm

BURGHARDT, F. et al., (2006). Examples of synthesized emotional speech. http://emosamples.syntheticspeech.de/

CRISPIEN, K., FELLBAUM, K., (1995). Use of Acoustic Information in Screen Reader Programs for Blind Computer Users: Results from the TIDE Project GUIB. In: Placencia Porrero, I., & de la Bellacasa, R.P., (Eds.): The European Context for Assistive Technology - Proceedings of the 2nd TIDE Congress, Paris, IOS Press, Amsterdam.

DELLER, J.R., (2000). Discrete-time processing of speech signals. New York: Institute of Electrical and Electronics Engineers.

DRAGON Naturally Speaking Professional Engine, (2006). NUANCE communications www.nuance.com/naturallyspeaking/.

FERREIRA, H., FREITAS, D., (2005). AudioMath: Towards Automatic Readings of Mathematical Expressions. 11th International Conference on Human Computer Interaction, Las Vegas, USA.

FERREIRA. http://lpf-esi.fe.up.pt/~audiomath

FURUI, S., (2001). Digital speech processing, synthesis, and recognition, 2nd ed., rev. and expanded. New York: Marcel Dekker.
 
GARDNER-BONNEAU, D., (1999). Human Factors and Voice Interactive Systems. Kluwer Academic Publishers, Boston.

HUMAINE, Network of Excellence. http://emotion-research.net/aboutHUMAINE.

iCommunicator homepage. www.myicommunicator.com/.

IIDA, A., CAMPBELL, N., YASUMURA, M., (1998). Emotional Speech as an Effective Interface for People with Special Needs. Third Asian Pacific Computer and Human Interaction (APCHI), p. 266.

JEKOSCH, U., (2005). Voice and Speech Quality Perception. Springer-Verlag Berlin, Heidelberg.

KRAISS, K.F., (ed.), (2006). Advanced Man-Machine Interaction. Springer Berlin Heidelberg, New York.

SYNFACE project research page www.speech.kth.se/synface/.

LEE, C.M., PIERACCINI, R., (2002). Combining Acoustic and Language Information for Emotion Recognition. Proc. of the International Conference on Speech and Language Processing (ICSLP 2002), Denver, CO.

LAMBDA (2005). www.lambdaproject.org/.

MBROLA website http://tcts.fpms.ac.be/synthesis/.

MOESLUND, T., NORGAARD, L., (2003). A Brief Overview of Hand Gestures used in Wearable Human Computer Interfaces. Technical Report CVMT 03-02, Computer Vision and Media Technology Lab., Aalborg University, DK.

MOURA, A., PÊRA, V., FREITAS, D., (2006). (in Portuguese) Um Sistema de Reconhecimento Automático de Fala para Pessoas Portadoras de Deficiência. Paper published in the proceedings of the IBERDISCAP’06 conference, held in Vitória, ES, Brazil.

MUSSLAP. University of West Bohemia, MUSSLAP website www.musslap.zcu.cz/en/audio-visual-speech-recognition/.

PRITSCH, M., (2005). Visual speech training system for deaf persons. Proceedings of the 16th Conference Joined with the 15th Czech-German Workshop “Speech Processing”, Prague, Sept. 26-28, 2005. TUD press, Dresden, Germany.

RAMAN, T.V., (1997). Auditory User Interfaces, Kluwer Academic Publishers, August.
 
RAMAN, T.V., (1998). Conversational gestures for direct manipulation on the audio desktop, Proceedings of the third international ACM SIGACCESS Conference on Assistive Technologies, Marina del Rey, California, United States, pgs 51 – 58. ISBN:1-58113-020-1.

REPOSITORIUM. https://repositorium.sdum.uminho.pt/bitstream/1822/761/4/iceis04.pdf

SPROAT, R. (ed.), (1998). Multilingual Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht, Boston, London.

SYNFACE - Synthesised talking face derived from speech for hard of hearing users of voice channels
www.speech.kth.se/synface/ and http://www.synface.net/.

SYNTHESIS TESTSITE, AT&T. www.research.att.com/~ttsweb/tts/demo.php.

VARY, P., MARTIN, R., (2006). Digital Speech Transmission: Enhancement, Coding and Error Concealment. J. Wiley & Sons.

VOGLER, C. et al. A Framework for Motor Recognition with Applications to American Sign Language and Gait Recognition. www.cis.upenn.edu/~hms/2000/humo00.pdf
See also Vogler’s homepage http://gri.gallaudet.edu/~cvogler/research/.

WAI. Web accessibility homepage. www.w3.org/WAI/

WDAC (1999). Aural Cascading Style Sheets (ACSS), W3C Working Draft www.w3.org/TR/WD-acss.

WebAIM Screen Reader Simulation. www.webaim.org/simulations/screenreader.php

Wikipedia about screenreader http://en.wikipedia.org/wiki/Screen_reader

WISDOM project page. www.bris.ac.uk/news/2001/wisdom.htm.


2.3. New remote services

2.3.1 Novel broadband-based services: new opportunities for people with disabilities

Broadband trials by the National Post and Telecom Agency (Post- och telestyrelsen, PTS) in Sweden

By Patrik Bystedt


PTS seven broadband trials

Broadband technology has become accessible to a steadily increasing proportion of the population in Sweden. With the aid of more rapid data transmission it has become possible to send and receive large quantities of information via computer networks. The opportunities for communication have broadened with e-mail, chat and video communications in real time. It has become easier to choose the means of communication that best suits each individual. For people whose opportunities for communication are limited owing to a disability, information technology in general and broadband in particular can often make things much easier.

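In December 2001, the National Post and Telecom Agency (Post- och telestyrelsen, PTS) in Sweden was commissioned by the Government to conduct a number of trial operations where broadband technology was utilised to create new services for people with a disability. One important issue was how the new technology could be used and adapted for these target groups. The following seven trials have been conducted:

Service centre for people who are deafblind
Distance education for people with mild aphasia
Digital distribution of talking books to university students
Broadband for people with intellectual impairment
Distance education in sign language
Winning communication – distance guidance
Mobile video communications for people who are deaf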

A common feature of these trials is that standardised technology has been used to the greatest extent possible. By using, whenever possible, existing aids and standard equipment, such as web cameras and ordinary personal computers, these solutions prove more cost effective for society and users.

A socioeconomic evaluation of these trial services has been undertaken with the assistance of the Center for Medical Technology Evaluation at Linköping University. The so-called ‘ICF model’ (International Classification of Functioning, Disability and Health) has been used as an evaluation model.

Service centre for people who are deafblind

Being deafblind involves special problems that affect everyday life, for example, reading food packaging, trying to find something you have lost that is actually lying right under your nose, or quite simply checking whether you are neatly and properly dressed.

Communication with others is a common problem for people who are deafblind. This can sometimes be resolved with the help of a person with normal vision, a personal assistant, a relative or someone who can be around to help and who can communicate with a deafblind person.

The trial ‘Service centre for people who are deafblind’ aims to supplement this by enabling a person who is deafblind to use technology to get help in those cases where it suits them. Many situations can be solved rapidly and easily with the remote service. This means that people who are deafblind will not be so dependent upon help from people in their immediate surroundings.

In the trial conducted by the Association of the Swedish Deafblind (FSDB), a service was developed whereby a person who is deafblind can communicate with a manned service centre. With the aid of a computer-based terminal with cameras, the person who is deafblind can contact the service centre via broadband. The conversation is conducted through pictures, text and speech using the combination that is most suitable for the deafblind person.

In the most common kind of conversation, the person who is deafblind uses sign language to speak with the service centre, which responds with text. The user reads the text with the aid of a Braille display which is connected to the computer. If the user has residual vision or hearing, the service centre can also sign or speak. The controlling information can also be provided in another form, for example, by a vibrating device that the user can feel on their body.
 
With the aid of cameras, the service centre can see the user or an object that the user displays at home. A moving, zoomable camera is used to show various parts of the room, for example, to find a particular object or to read the text on a jar.

During the trial period, a service centre was established and manned Monday to Friday, 08.00 to 17.00, by personnel with backgrounds as sign language interpreters. The trial has been received positively by the four deafblind persons who used the service. The assessment was made that this method of communication provides people who are deafblind with new opportunities to communicate with everybody and that the services provided by the service centre are important and valuable. It is also considered that people with visual impairment could benefit from a similar service centre.
 
Project Facts

Technology
Computer connected to broadband with software for total communication, screen reader program and speech synthesis. Braille display, a stationary analogue video camera for communication and a moving, zoomable and remotely controlled camera connected to a video server for presentations.

Target group
FSDB has 400 deafblind members, but there are also deafblind people who are not members. Another potential target group is people with visual impairment.

Number of users
4

Project period
February 2003 - March 2004

Distance education for people with mild aphasia

It is becoming increasingly difficult to justify why students should live away from home in order to participate in residential courses at a folk high school. Distance courses allow students to continue living at home while they are studying, which is economical and often socially advantageous. If a prospective student has a disability, the need to be able to remain in the home environment increases while the need for contact with the outside world must also be met. People with aphasia are often affected by a combination of disabilities, primarily difficulties with communication, both spoken and written, together with some impaired motor functions.

The aim of the distance education trial was primarily to consider adapted forms for distance instruction using the best possible broadband technology available. The vision was to develop and expand the work within distance education so that people with mild aphasia would be given new opportunities for education and personal development. This provides participants with the opportunity to attain an enhanced quality of life and in some cases to return to working life.
 
For the group ‘people with mild aphasia’, the possibility of combining speech and pictures is important to be able to communicate as effectively as possible. Sharing documents, using a ‘whiteboard’ and making presentations using a computer were common components of the training.

Karlskoga Folk High School, which was responsible for the implementation of the project, has long experience of teaching people with aphasia. As part of the trial, the school conducted distance education at scheduled times in, among other subjects, Swedish and presentation techniques. Eight participants from various parts of Sweden took part in the trial. Regular tuition was provided three times a week, with positive results, and the participants also made use of the opportunity for sound and video contact for their own discussions with each other. The social aspects of being able to use video conferencing to communicate with other people with aphasia outside the teaching have also been very much appreciated by participants in the trial project.

The trial enhances the availability of effective adult education and means that it is also easy to reuse the courses that have been prepared.
 
Project Facts

Technology
Computer connected to broadband with web camera and headset, video conference software (Click-to-Meet).

Target group
People with aphasia.

Number of users
8

Project period
July 2002 - December 2003

Digital distribution of talking books to university students

University students with a reading disability – people with visual impairment, dyslexia and restricted mobility – are now able to get their course literature as talking books. Students can order the talking books, which are then sent by post from the Swedish Library of Talking Books and Braille (TPB) in Stockholm. Besides the time that the dispatch actually takes, the borrower is dependent upon the book being in stock, that is to say that no-one else has borrowed it. However, as talking books are now digitally produced in the international standard DAISY, it is possible to handle them in a different way.

The trial in question is a broadband service that provides access to talking books via digital distribution to students with a reading disability. A central digital talking book archive, which has approximately 13 000 titles, is being built up by TPB, where all recorded university literature is made available for downloading. The aim is to provide access to literature through broadband technology to students with a reading disability, on equal terms with other students.

The project has been conducted by TPB, which is the authority responsible for satisfying the needs of people with visual impairment and other people with reading disability for literature in the form of talking books, Braille books and electronic media.

Through SUNET, the university computer network, the four university libraries that have participated in the trial have access to broadband with high transmission capacity. The average talking book is 225 MB in size and university books are often twice as big, which imposes demands for rapid connections and an acceptable download time. Equipment for downloading and burning CD-ROMs has been set up within the library. When a student comes to the university library to borrow course literature, the librarian simply downloads the talking book from the talking-book archive and transfers it onto a CD-ROM for the student to borrow. The system is simple to use and the talking books in the archive are always available, which means that there is no waiting list.
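The demand on connection speed can be made concrete with a short calculation. Only the file sizes (225 MB, and twice that for a typical university book) come from the text above; the link speeds are assumed values for illustration, one megabyte is taken as 8 megabits, and protocol overhead is ignored.

```python
def download_seconds(size_megabytes, link_megabits_per_second):
    """Ideal transfer time, ignoring protocol overhead and congestion."""
    size_megabits = size_megabytes * 8
    return size_megabits / link_megabits_per_second


# Average talking book (225 MB) over an assumed 10 Mbit/s connection.
print(download_seconds(225, 10))    # 180.0 seconds, i.e. 3 minutes

# The same book over an assumed 0.5 Mbit/s connection.
print(download_seconds(225, 0.5))   # 3600.0 seconds, i.e. 1 hour

# A university book of twice the size over an assumed 100 Mbit/s link.
print(download_seconds(450, 100))   # 36.0 seconds
```

Under these assumptions, a high-capacity library connection makes a download while the student waits practical, whereas a slow connection would not.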

The project also has two sub-projects where sub-deliveries and so-called ‘streaming reading’ are being tested. If the book is in the process of being recorded, it can be downloaded to the student bit by bit in pace with the progress of the recording. This can sometimes be decisive for keeping up with course studies. Streaming reading of talking books over the Internet means that students can themselves connect from home and read the relevant books, without needing to go to the local library to download the talking books.

The trial subjects who participated in the project have made frequent use of the opportunity to download talking books and university libraries have demonstrated great interest, even those who have not participated in the trial. In addition to the trial subjects and the university libraries that participated in the project, approximately 100 other students have made use of the service and 13 new university libraries have gained access to the archive. The broadband service has now been established as a regular service.

Project Facts

Technology
University library: Computers with a rapid broadband connection and CD burners. The reading program EaseReaderOnline was used for streaming reading.

Target group
People with reading disability (for example, people with visual impairment, dyslexia, impairment to mobility)

Number of users
49

Project period
June 2002 - May 2004

Broadband for people with intellectual impairment

Intellectual impairment involves, among other things, difficulties in dealing with abstract concepts and contexts, for example, time, quality, quantity, cause and spatial relations. One consequence of this is that people with such impairment are limited in their capacity to communicate with each other at a distance, for instance, by telephone. If two people can see each other, and in this way perceive body language, pronunciation and tone on the part of the person they are talking to, communication is made significantly easier.

Being in control of your everyday life, for example, by gathering and understanding public information, news, participating in leisure interests with others, shopping, attending to your finances, writing to authorities and friends, are important activities that allow full participation in society and are essential for an independent life. For many people with intellectual impairment, participation and the opportunity to live independently is severely restricted. A computer with a broadband connection provides opportunities for enhanced participation and independence.

For the trial ‘Broadband for people with intellectual impairment’, two-way video communications with high audio and picture quality are essential for communication at a distance to function. The trial was implemented in collaboration with the Municipality of Bollnäs, the Grunden Association in Gothenburg and Höghammar School in Bollnäs.

The objective of the trial was to determine the benefits of broadband for people with intellectual impairment. One important aim was to be able to communicate and cooperate individually or in groups with the aid of two-way video communications via the Internet. Another aim was to test the possibilities of people with intellectual impairment to use the services on the Internet, for example banking, e-commerce and other services. In addition to these activities, the aim was also to provide participants with an opportunity to discover their own uses and benefits from broadband.
 
Members of the Grunden Association in Gothenburg and pupils at Höghammar School in Bollnäs participated in the trial. For example, they tested various services on the Internet and pooled their experiences in order to learn more about the Internet as an everyday tool. Video communication was often conducted from several people to several people, that is to say, between groups. The working groups also collaborated on a joint web newspaper, publishing results and experiences from the trial.

Experiences from the trial have shown that users have rapidly assimilated this new method of communication and felt both the benefit and joy of using it. The Internet and video-supported communications can facilitate distance communication. This particularly applies to the opportunity to establish new contacts beyond the individual’s own circle of other people with disability. This is very important as many people with intellectual impairment encounter impediments that limit their opportunities to meet other people.

Project Facts

Technology
Studios (Gothenburg and Bollnäs): Computer connected to broadband, large screen, video-conference program (Click-to-Meet), web camera, digital video camera for better quality and documentation.

Home environment
Computer connected to broadband, video-conference program (Click-to-Meet), web camera, and headset.

Target group
People with intellectual impairment

Number of users
6 people at Grunden Media in Gothenburg and 5 pupils at Höghammar School in Bollnäs.

Project period
July 2002 – May 2004
 
Distance education in sign language
 
Sign language is the first language of deaf people and it is consequently important for people who are deaf to gain access to education in sign language. The adult education courses in sign language that are currently on offer for people who are deaf are often arranged at boarding schools far from home. There is consequently a great need for and interest in distance education.

The aim of the project was to use the framework of a flexible course to create an opportunity for sign language interaction between course leaders and participants. Communication takes place by video over the Internet, either as direct communication or through the participants downloading video files or sending video messages. The project is being conducted by the Swedish National Association of the Deaf (SDR) in collaboration with Västanvik Folk High School in Leksand.
 
The project basically employs three methods of communication for interaction between the teacher and the participants: direct video communication, downloadable video files and video messages.

The course should be easily accessible, through a user-friendly interface, and at a reasonable cost for the individual user, without lowering educational standards. The communication of high quality video over the Internet requires high bandwidth, but the equipment that participants require is a standard computer, web camera and software. The course methods may also benefit people who need lip-reading or signs to support their understanding.

Experience from the trial is that the technology functioned beyond all expectation and the pupils were very positive towards the opportunity of communicating in their first language at a distance. “Wonderful contact from 200 kilometres away!”, according to one participant.

Project Facts

Technology
Computer connected to broadband, web camera, videoconferencing software (Click-to-Meet).

Target group
People who are deaf or people with hearing impairment with sign language as their first language.

Number of users
13

Project period
July 2002 – April 2004
 
Winning communication – distance guidance
 
Considerable resources are required to provide people with disability with effective guidance, for example, about labour market issues. Limiting factors include access to specialists in labour market guidance for deaf people and sign language interpreters. Resources are also unevenly distributed across Sweden, which means that it can take a long time for people with disability to get to meet these guidance specialists.

The aim of the trial project known as ‘Winning Communication’ was to develop ways of using video communication in regular work at the employment offices. The primary goals for the group comprising jobseekers with disability were to facilitate more rapid contact with specialists at the employment offices and thereby enhance opportunities of finding work. At the same time, this would reduce both travel costs and travel time.

Guidance can be provided individually and for groups. When the counsellor, together with the applicant, identified the service need and decided on the most appropriate method, the applicant was offered the opportunity of meeting an expert at a distance. Together with his/her caseworker, the applicant met up with, for example, a psychologist, teacher of the deaf/hard of hearing, teacher of the deaf, vision consultant, occupational therapist or other expert via video communication. Communication is conducted using video, text and voice and the counsellor is able to display documents and websites. The technical equipment is installed at ten employment offices in the counties of Uppsala and Västmanland, which means that people with disability can visit their nearest employment office.

Experiences from the trials have been very positive. Ten employment offices took part and guidance was provided both individually and with groups of up to five people. During the trial period, distance guidance was integrated into the regular operations. Meetings were considered to be effective and there were few limitations for the scope of use of the concept.

The project also provides the preconditions for greater cooperation and a more efficient transfer of competence between the staff at the employment offices who work with people with disability.

This trial was based at the County Labour Board in the County of Uppsala and has been co-financed by the National Labour Market Board (AMS). Methods for providing distance guidance for deaf people have been applied in Uppsala for the past couple of years with very good results. As distance guidance is now being integrated into the regular operation, it will also continue in the future after the conclusion of the trial project.

Project Facts

Technology
Computer connected to broadband with video camera, microphone and video communication software.

Target group
People with impaired mobility, deaf people, people with hearing impairments, people with visual impairments, people with intellectual occupational disability, applicants suffering from asthma/allergies, people with dyslexia, people with heart and/or lung diseases and other somatic-related occupational disabilities.

Number of users
37

Project period
July 2002 – June 2004
 
Mobile video communications for people who are deaf

Video calls via mobile telephones brought about a revolution in the communication opportunities for people who are deaf. Text messages (SMS) soon became an important means of communication for deaf people, although it uses the second language of deaf people, Swedish. Video calls make it possible for deaf people to use their first language – sign language – for mobile communications.

The third generation mobile telephony, 3G, has high capacity and is capable of transmitting moving pictures, essential for allowing sign language use with a mobile. The trial project ‘Mobile video communications for people who are deaf’ aimed to investigate how deaf people can use 3G telephones in order to communicate with sign language. The project period was May 2004 to February 2005.

The trial group conducted video calls in real time, and also sent video messages to each other. The possibilities that 3G technology offers were investigated and evaluated by testing the various terminals and 3G networks. In the course of the trial period, video calls became increasingly common in Sweden among people who are deaf.

The positive experiences from the first trial resulted in a new development project being initiated during the spring of 2005. This was to test a communications and interpreter service for 3G calls. These services mean that a deaf person can contact a sign language interpreter who interprets between sign language and speech. In this way, a deaf person can communicate directly with a hearing person. This may, for instance, involve distance interpretation, for example, when visiting the bank or during a spontaneous meeting. It may also involve a communicated call, for example, when one deaf person wishes to make a call to a hearing person or vice versa. In this way the deaf person becomes less dependent upon physical access to interpreter resources and the need to book such services well in advance, which creates opportunities for more spontaneous communication.

The new project aimed to develop both technology and methodology for receiving and dealing with 3G calls at an interpreter centre. The project has now been finalised, and since September 2006 the functionality has been integrated in the relay service for video telephony operated by Tolkcentralen in Orebro Läns Landsting (the Orebro County Council Interpreter Centre).

Project Facts

Technology
3G technology and telephones with video functionality.

Target group
People who are deaf and hearing-impaired persons who use sign language as their first language.

Number of users
Approximately 100

Project period
April 2005 – February 2006.

Conclusion

Broadband communications have demonstrated that they can play an important role in providing vital services to people with special communication needs. The technology is mature enough to provide advanced services and it is reasonable to expect that such services will become more widely available, provided that the organizational and economic issues can be solved.

References

PTS (2005). Mobile video communications for people who are deaf. www.pts.se/Dokument/dokument.asp?Sectionid=&Itemid=3745&Language id=EN.

PTS (2004). Broad band for people with disability. www.pts.se/Dokument/dokument.asp?Sectionid=&Itemid=4615&Language id=EN.




2.3.2 Access to video relay services through the Pocket Interpreter (3G) and Internet (IP)

By Patrik Bystedt

Background

The National Post and Telecom Agency (Post- och telestyrelsen, PTS) is the authority that monitors the electronic communications and postal sectors in Sweden. Under governmental regulations and decisions, PTS is assigned to ensure, through procurement, that the special needs of people with disabilities are met. The Government grants PTS an allowance for this purpose every year.

PTS procures a number of electronic communications and postal services. In addition, PTS continuously initiates projects to test new technologies and new functionality that could support different groups of people with disabilities.

Technological development and the rapid growth of fixed and mobile broadband networks create new possibilities for people with disabilities. PTS has recognized two significant trends. The first is that more and more services are based on the IP protocol. The second is an increasing demand for mobile services. A recent survey shows that practically everyone in Sweden has a mobile telephone; senior citizens are the only group in which access to mobile telephones is below 90%.

The relay service for video telephony has been available in Sweden since 1997. It is primarily used for relaying telephone calls between a deaf person using sign language and a hearing person who uses speech. During the first years, the service was offered to sign language users only via the ISDN network. Because the ISDN network is not widespread and communication over it is quite expensive, the number of users was limited. ISDN is no longer a promoted service in Sweden and the number of subscriptions will decline. Today, video calls via IP-based video telephones and 3G telephones are more widely used. The relay service for video telephony therefore needs to be developed to meet these new communication needs and trends. Since 2003 PTS has initiated two development projects: the IP access project and the Pocket interpreter.
 
The IP access project

PTS started the IP access project in 2003. The overall aim of the project was to develop the relay service for video telephony and build a new IP platform that could handle calls to and from different types of video telephones. The IP access project was concluded in August 2006.

Previously, incoming calls to the service were handled in different studios at the interpreter centre, depending on the caller's video telephone. If the user called from an ISDN telephone, the interpreter would go to the ISDN studio; if the user called from an IP-based video telephone, the interpreter would go to the IP studio instead. This was not an ideal situation and certainly not a scalable solution.

In February 2006 the new IP platform was put into use. Today all incoming calls are handled on the same platform, with the same service quality measures. The studios connected to the service are no longer dedicated to a certain type of video telephone. Another aim of the project was to allow access to the service through a web client. A user with a computer, a web camera and a broadband connection can download software for video telephony. This means that the user becomes less dependent on a specific video telephone.

With the new IP platform, a call centre solution has been initiated. Collaborating interpreter centres or companies can now connect their studios to the service and supply interpreter services. The incoming calls are distributed through an automatic call distribution (ACD) mechanism. This gives the relay service flexibility and ability to grow. The dependency on certain interpreter centres and geography is also minimized.
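The idea of distributing calls across collaborating studios can be illustrated with a small sketch. It is not a description of the PTS platform: the class names, the studio and centre labels, and the "longest idle studio first" policy are all assumptions made for illustration, since the text does not state which distribution policy the ACD uses.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Studio:
    name: str    # hypothetical studio label
    centre: str  # hypothetical interpreter centre or company


@dataclass(frozen=True)
class Call:
    caller: str
    terminal: str  # e.g. "3G", "ISDN", "IP", "web client"


class CallDistributor:
    """Toy ACD: one shared queue of calls, served by any connected studio."""

    def __init__(self) -> None:
        self.idle = deque()     # free studios, longest idle first (assumed policy)
        self.waiting = deque()  # calls waiting for an interpreter
        self.active = {}        # caller -> studio handling the call

    def connect_studio(self, studio: Studio) -> None:
        # Any collaborating centre can add a studio to the shared pool.
        self.idle.append(studio)
        self._dispatch()

    def incoming_call(self, caller: str, terminal: str) -> Optional[Studio]:
        # The terminal type is recorded but does not affect routing.
        self.waiting.append(Call(caller, terminal))
        self._dispatch()
        return self.active.get(caller)  # None means the call is queued

    def end_call(self, caller: str) -> None:
        self.idle.append(self.active.pop(caller))
        self._dispatch()

    def _dispatch(self) -> None:
        # Pair waiting calls with idle studios until one side runs out.
        while self.idle and self.waiting:
            call = self.waiting.popleft()
            self.active[call.caller] = self.idle.popleft()


acd = CallDistributor()
acd.connect_studio(Studio("studio-1", "Centre A"))
acd.connect_studio(Studio("studio-2", "Centre B"))
first = acd.incoming_call("user-1", "3G")
second = acd.incoming_call("user-2", "ISDN")
third = acd.incoming_call("user-3", "web client")  # queued: no studio is free
acd.end_call("user-1")                             # frees a studio for user-3
```

In this sketch the terminal type plays no part in routing, which mirrors the point that studios are no longer dedicated to one kind of video telephone, and adding a studio from another centre increases capacity without any change for callers.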

Figure 2.19 shows what the service looks like today, with greater accessibility for the user (left-hand side) and a more flexible and scalable solution with collaborating suppliers of interpreter services (right-hand side).

Figure 2.19 Overview of the Relay service for video telephony in Sweden.


The pocket interpreter

In 2004 PTS initiated a trial project called Mobile video communication for people who are deaf. One of the services tested in the trial was distance interpretation and relay of mobile video calls. Among users, this mobile application of interpretation services has come to be known as “the pocket interpreter”. The use of 3G is rather extensive among people who are deaf. According to Sveriges Dövas Riksförbund (the Swedish National Association for the deaf), an estimated 4 000 to 6 000 people who are deaf use a 3G telephone, which would represent approximately half of the people in Sweden who were born deaf. The trial concluded that there is great demand for this service and that it has many potential users.

In order to meet this demand PTS started the development project, the Pocket interpreter, in April 2005. The main objective of the project was to develop methodology and technology for distance interpreting and mediation of mobile video calls (3G) to the new IP platform.
 
One of the major efforts in the project was to improve the working situation of the interpreters. When the project started, the interpreter used the same equipment as in the trial Mobile video communication for people who are deaf: an ordinary 3G telephone, with the interpreter seated in a dedicated 3G studio at the interpreter centre. Since February 2006, when the new IP platform was put into operation, mobile video calls have been handled on the same platform and in the same manner as any other call to the service. The dedicated solution and studio are no longer used.

The number of 3G users in the project was initially limited to 100. In May 2006 new functionality was implemented in the platform so that incoming 3G calls could be distributed through the ACD. This means that more studios can handle incoming 3G calls and that the number of 3G users can increase.

The project has attracted a great deal of interest, and the Pocket interpreter has been demonstrated frequently, for example at the World Summit on the Information Society (WSIS) in Tunis in November 2005. The project has also competed in the Stockholm Challenge Award.

The project was concluded in August 2006, but the Pocket interpreter, the mobile access to the relay service for video telephony, lives on.

Conclusion

Both development projects were concluded by the end of August 2006. As a result of these projects, users can now call the video relay service using a 3G telephone, an IP-based video telephone or a web client, as well as a traditional ISDN video telephone. In future, both the service and the user will be less dependent on a specific video telephone, and the service will become much more flexible. The number of video calls from mobile telephones and from computers connected to the Internet is expected to grow rapidly, which will place greater demands on the service's resources. The new service platform allows interpreter companies to collaborate as subcontractors, which means that more interpreters can handle incoming calls regardless of geographic location. The future will probably see more of these joint ventures to create national services.



2.3.3 Convenient invocation of relay services

By Robert Hecht

Introduction

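The term relay service refers here to a service that, through the use of an operator, allows people with a disability to use the telephone when they otherwise could not. Today three types of relay service are in common use in Sweden and other countries: the relay service for text telephony, the relay service for video telephony and the speech-to-speech relay service (Teletal in Sweden).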

These relay services play an important role in translating between different means of communication. They thereby help to ensure equal opportunities in telecommunications for people with disabilities. Further variations of relay services can be created through new combinations of media and language in the calls.

Calling through a relay service is currently a two-step process. First the caller calls the service and states whom they want to reach. The relay service then connects the call and performs the relaying.

The methods for invoking the relay service for a call can be improved so that the relay service contributes more effectively to equal opportunities for communication. Methods for invoking relay services more conveniently are listed below.

Needs and functional description of connection cases

The illustrations usually show a text telephone making a call through the Relay Service for Text Telephony, but the cases also apply to video telephony through the Relay Service for Video Telephony and to voice telephony through the speech-to-speech relay service (Teletal in Sweden).

The descriptions explain why each connection case would simplify the relay services for the user.

Direct dialling to relay service users

Of all calls to the relay services, 85% are initiated by relay service users. Voice telephone users rarely call relay service users. One reason for this is that it is too complicated for a voice telephone user to call a relay service user. It is also too complicated to describe how to do it.

One solution is to be able to use a voice telepho