Next: References Up: Assessment of speaker Previous: Forensic applications

Conclusions

 

In this chapter, we have identified the main features that characterise the functions of a speaker recognition system. This characterisation leads to distinctions far more precise than the usual oppositions between speaker identification and verification on the one hand, and between text-dependent and text-independent systems on the other.

We have distinguished five tasks covered by the common term speaker recognition:

The tasks grouped under the general term speaker recognition are commonly distinguished by two oppositions: speaker identification versus speaker verification on the one hand, and text-dependent versus text-independent systems on the other. While the first opposition is relatively appropriate, the second certainly does not hold up, given the number of distinct protocols covered by each category. A major point of section XXX is that at least 6 different levels of text-dependence must be distinguished. A primary goal when dealing with assessment methodology is to define an unambiguous terminology.

We have stressed the existence of several levels of text-dependence. As defined above, a distinction must be made between:

Especially once a sufficiently large database becomes available, a large amount of basic research will still be required to study various aspects of the new material and to confirm or refine previous findings. Knowledge of the exact nature of intra- and inter-speaker variability is still very incomplete, and new material tailored to the study of these phenomena will certainly give a new impetus to fundamental research in this field. A by-product of this research will certainly be new results in speaker-independent speech recognition, and also in speech synthesis and coding.

Among all the lines of thought we have presented here, some may deserve higher priority than others, but such an ordering cannot be decided without in-depth discussion with many other specialists in the field. We believe, however, that all of these topics should be taken into account: they represent different aspects of the same problem and should not be addressed independently.

We hope that we have laid significant foundations for further discussion and investigation in the field of speaker recognition assessment methodology. We strongly recommend that the need for such investigations be clearly acknowledged through concrete support from various institutions, so that this work can be carried on.

Section XXX, concerned with algorithmic approaches, illustrates the large amount of work dedicated to speaker recognition in many laboratories across the world, and the wide variety of successful algorithmic approaches to the problem. It also shows that, in spite of this abundance, it is impossible to identify clear advantages for any given family of methods, because of the heterogeneity of databases and test protocols. Conclusive assessment of algorithmic methods relies on the use of common databases and the definition of agreed test protocols.

In section YYY, we list and classify some possible applications of speaker recognition techniques, together with a few existing prototypes and commercial products. Three main fields are considered: telecommunication, on-site and forensic applications. While speaker verification systems are involved in all three, applications of speaker identification are mainly linked to the forensic domain. We stress that speaker verification can certainly offer increased security for user authentication as an additional protection layer, without discouraging authorised users; this requires, however, careful engineering of the application and a detailed study of ergonomic aspects. In particular, the notion of equal error rate may not be appropriate in a practical application, and a larger rate of false acceptance (combined with a minimal rate of false rejection) may be a good compromise between fraud reduction and user satisfaction. In any case, speaker recognition will certainly have a role to play in the future within multi-modal man-machine communication systems. As regards forensic applications, we underline the need for actions and contacts between scientists, magistrates and police officials, aimed at clarifying the limits on the use of speaker recognition techniques in this field.
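
The remark on equal error rate can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from this chapter: the verification scores are invented toy values, the function names are hypothetical, and the weights given to the two error types are arbitrary. It compares the threshold at which false acceptance and false rejection rates are equal with the threshold that minimises a weighted cost in which a false rejection is counted as five times more harmful than a false acceptance.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false acceptance rate, false rejection rate).

    A claim is accepted when its score is >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_threshold(genuine, impostor):
    """Observed score at which the two error rates are closest to each other."""
    candidates = sorted(set(genuine) | set(impostor))

    def gap(t):
        far, frr = error_rates(genuine, impostor, t)
        return abs(far - frr)

    return min(candidates, key=gap)


def min_cost_threshold(genuine, impostor, cost_fa, cost_fr):
    """Observed score minimising cost_fa * FAR + cost_fr * FRR."""
    candidates = sorted(set(genuine) | set(impostor))

    def cost(t):
        far, frr = error_rates(genuine, impostor, t)
        return cost_fa * far + cost_fr * frr

    return min(candidates, key=cost)


# Invented toy scores (higher means "more likely the claimed speaker").
genuine_scores = [0.9, 0.8, 0.7, 0.55, 0.35]
impostor_scores = [0.6, 0.45, 0.3, 0.2, 0.1]

t_eer = equal_error_threshold(genuine_scores, impostor_scores)
# Assumed weights: a false rejection costs five times a false acceptance.
t_app = min_cost_threshold(genuine_scores, impostor_scores, 1.0, 5.0)

print("equal error point:", t_eer, error_rates(genuine_scores, impostor_scores, t_eer))
print("weighted-cost point:", t_app, error_rates(genuine_scores, impostor_scores, t_app))
```

On these toy scores the weighted-cost threshold is lower than the equal error threshold: it accepts more impostors but rejects no genuine speaker, which is the kind of compromise described above. The appropriate weights depend entirely on the application.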

Among the directions mentioned in this chapter toward the standardisation of assessment methodologies in speaker verification, we suggest in particular defining standard scoring procedures, common databases (if possible calibrated by human listening tests) and a reference system. The choice of scoring procedures must result from a consensus between research laboratories, and cannot be made before a wider consultation. Some actions are currently taking place for multi-lingual database collection, and it seems crucial to us that each European language be represented in this database; this involves a fairly large strategic and material effort and requires national and subsidiary European support. The desirable properties of a reference system are ease of implementation and reproducibility. However, we do not yet have sufficient grounds to recommend any particular one.






WWW Administrator
Fri May 19 11:53:36 MET DST 1995