
Introduction

Spoken language systems are appropriate combinations of several modules, including recognition of speech input, recognition of speaker identity (verification or identification), speech output generation and synthesis (including speech coding), and/or Man-Machine interaction management.

A simple use of a spoken language system consists of recognising the speaker's utterances, interpreting them with respect to the application, deriving a meaning (or a command), and providing consequent feedback to the user (which may be a speech prompt or a system action). This is illustrated below for a speech input/output dialogue system:

Figure 1
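The processing chain described above (recognise, interpret, derive a command, give feedback) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the "audio" is already text, and the names `recognise`, `interpret`, `respond`, `handle_turn` and `Feedback` are hypothetical and do not come from this chapter or from any real system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Feedback:
    kind: str     # "prompt" (speech output) or "action" (system action)
    content: str


def recognise(utterance: str) -> str:
    """Stand-in recogniser (assumption): the input is already text, so we only normalise it."""
    return utterance.strip().lower()


def interpret(words: str, commands: Dict[str, str]) -> Optional[str]:
    """Map recognised words to an application command, or None if no keyword matches."""
    tokens = words.split()
    for keyword, command in commands.items():
        if keyword in tokens:
            return command
    return None


def respond(command: Optional[str]) -> Feedback:
    """Turn the derived command into feedback: a system action, or a re-prompt on failure."""
    if command is None:
        return Feedback("prompt", "Sorry, please repeat.")
    return Feedback("action", command)


def handle_turn(utterance: str, commands: Dict[str, str]) -> Feedback:
    """One dialogue turn: recognition, interpretation, then feedback."""
    return respond(interpret(recognise(utterance), commands))
```

For example, `handle_turn("Please CALL home", {"call": "DIAL"})` yields an `"action"` feedback carrying the hypothetical command `"DIAL"`, while an utterance with no known keyword yields a `"prompt"` asking the user to repeat.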

In order to generalise the use of such systems in different Man-Machine interaction contexts, a predictive model of performance (Choukri 1988) needs to be obtained as a function of the relevant factors that have been identified. The definition of those factors has to lead to a set of parameters that can describe a speech processing system. This description has to express two opposed points of view: that of the technology provider (designer) and that of the application developer (buyer). The two points of view have to be distinguished.

Designers should give evidence of the performance of their recognisers, together with a measure of the impact of any change. The first contribution of this chapter therefore takes the technology supplier's point of view: it aims to provide detailed guidelines for the specification of speech processing systems, so as to make explicit the operational capabilities offered by the technology. This will allow technology providers to describe system performance comprehensively to application developers.

Buyers need comprehensive information about how each system or device will perform in the specific conditions of their practical application. The second contribution of this chapter therefore takes the application developer's point of view: it aims to provide detailed guidelines for expressing the requirements of applications that incorporate speech processing systems, so as to make explicit the requirements that the operational capabilities of the technology should meet. This will allow application developers to express their needs comprehensively to technology providers.

The technology specification is complex, and has to go beyond the single figure of 99% recognition accuracy usually announced by equipment suppliers. This rate depends on numerous parameters, some of which cannot be easily quantified (Lea 1982, Pallet 1985, Choukri 1987, Choukri 1988, Moore 1988). In order to focus on the most relevant parameters, one needs to adopt a multi-dimensional characterisation of the speech processing system. This characterisation will be called ``The system capability profile'' (an expression first introduced by R. Moore).

The application requirement is also a complex phenomenon, too complex to be reflected by a transaction success rate alone, and should likewise be described by a multi-dimensional characterisation [Ref. WP3/WP8 of sundial]. This will be referred to as ``The application requirement profile''.

The objective of this chapter is to list the major factors that allow the above-mentioned multi-dimensional spaces to be defined, and moreover to propose a way of expressing a matching process between the two spaces. It will consist of 3 tables with keyword entries that concern the different dimensions, seen from the point of view of the technology provider as capabilities or system features, and from the point of view of the application developer as requirements.
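As a rough illustration of what matching a capability profile against a requirement profile could look like, the sketch below treats each profile as a mapping from dimension name to a number. This is an assumption made purely for the example: the chapter's tables use keyword entries, and the dimension names, the values, and the rule "higher means more capable or more demanding" are all hypothetical.

```python
from typing import Dict, List

# Hypothetical profiles (assumption): one number per dimension, higher = more
# capable on the supplier side and more demanding on the application side.
capability: Dict[str, float] = {
    "vocabulary_size": 1000,
    "speakers_supported": 50,
    "noise_tolerance_db": 15,
}
requirement: Dict[str, float] = {
    "vocabulary_size": 200,
    "speakers_supported": 100,
    "noise_tolerance_db": 10,
}


def unmet_requirements(capability: Dict[str, float],
                       requirement: Dict[str, float]) -> List[str]:
    """Return the sorted names of required dimensions the capability profile does not cover.

    A dimension that is absent from the capability profile counts as unmet.
    """
    return sorted(
        name
        for name, needed in requirement.items()
        if capability.get(name, float("-inf")) < needed
    )


def profiles_match(capability: Dict[str, float],
                   requirement: Dict[str, float]) -> bool:
    """The profiles match when no required dimension is left unmet."""
    return not unmet_requirements(capability, requirement)
```

With the values above, the only unmet dimension is `speakers_supported`, so the profiles do not match. The point of comparing dimension by dimension, rather than collapsing everything into one score, is that the mismatch itself tells the buyer and the supplier where the discussion should focus.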






WWW Administrator
Fri May 19 11:53:36 MET DST 1995