Next: Interactive Dialogue systems Up: Assessment methodologies and Previous: Assessing recognisers

Assessing speaker verification and recognition systems

Assessing speaker verification draws on many of the statistical and experimental procedures outlined in previous sections. The principal procedures required for assessing speaker verification and recognition systems are those applied to speech recognition in the preceding section. The dependent variable is whether or not the speaker is verified or recognised, which is a discrete variable. Discrete variables are somewhat simpler measures than those considered in connection with acoustic measures of speech recognition (note, I am not saying that the problems involved are necessarily simpler). The statistical tests called for will therefore be non-parametric (see the earlier section on these tests).

As mentioned before, assessing speech recognisers is integrally related to speech corpora (it is for this reason that an example was worked through from scratch, covering both setting up a corpus and assessing recognition). The same applies here, so a lot of what needs to be said has already been covered. Only the additional considerations are described here: those to do with sampling speakers for recognition and verification, and those to do with assessment.

Sampling rare events in speaker verification and recognition systems

An earlier section covered the estimation of sample size for equally frequent events. In that example, p and q were both 0.5. If p is quite small, as it may be for impostors attempting to enter speaker verification systems (p = 0.05, 0.01 or 0.001), very large samples would be needed (running into tens of thousands). In these cases the Poisson distribution is frequently used to model the counts of such rare events.
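The role of the Poisson distribution for rare events can be illustrated with a minimal sketch. It assumes that impostor attempts are independent trials with a fixed probability p, so that the count of impostor events in n trials is binomial and, for small p, is closely approximated by a Poisson distribution with mean n * p. The function names and the values of n and p below are illustrative assumptions, not taken from the text.

```python
import math

def binomial_pmf(k, n, p):
    # Exact probability of k rare events in n independent trials.
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    # Poisson probability of k events when the expected count is lam.
    return math.exp(-lam) * lam**k / math.factorial(k)

def prob_at_least_one(lam):
    # Poisson probability of observing one or more events.
    return 1.0 - math.exp(-lam)

if __name__ == "__main__":
    n, p = 10000, 0.001      # illustrative values: a large sample, rare impostors
    lam = n * p              # expected number of impostor events
    for k in (5, 10, 15):
        print(k, binomial_pmf(k, n, p), poisson_pmf(k, lam))
    print("P(at least one impostor event):", prob_at_least_one(lam))
```

With p = 0.001, a sample of 10,000 trials is expected to contain only about ten impostor events, which is why samples in the tens of thousands are needed before such events can be studied at all.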

Employing expert judgments to augment the assessment of speaker verification and recognition in forensic applications

One important area of speaker verification concerns forensic applications. The forensic data that might be used can be obtained in a number of ways. Experts might listen to the material and offer a subjective judgment. This judgment is basically whether the speaker is or is not a particular speaker (i.e. a discrete measure). There are many ways in which expert judgments about speakers could be standardised and so enhanced, in the way that, for example, doctors' knowledge in formulating diagnoses has been standardised and incorporated into automatic disease diagnosis expert systems. It is beyond the scope of the current chapter to present this information, nor is it clear that presenting it here would in fact reach the target audience. Another approach that has been taken to speaker recognition and verification has been through spectrographic measures (a continuous measure). To use all these measurements together calls for statistical techniques which can deal with mixes of continuous (parametric) and discrete (non-parametric) measures.

Generalised Linear Modelling Techniques (GLIM) allow models to be constructed that predict dependent variables like those required here from a mixture of binary, discrete and continuous measures (Aitkin, Anderson, Francis and Hinde, 1989). These have not been applied to forensic applications of speaker verification/recognition but seem appropriate for the task. The technique as it would be applied here enables the experimenter to establish which acoustic and subjective factors differentiate one speaker from another. In contrast, Analysis of Variance requires a continuous dependent measure.
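As a rough illustration of the kind of model meant, the sketch below fits a logistic regression, one member of the generalised linear model family (binomial response, logit link), to a binary same-speaker outcome using one binary predictor and one continuous predictor. Everything in it is an assumption made for illustration: the data are synthetic, the predictor names (expert_says_same, spectral_distance) are hypothetical, and the fit uses plain gradient ascent rather than the GLIM package itself.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(weights, x):
    # Probability of "same speaker"; weights[0] is the intercept.
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def fit_logistic(rows, labels, steps=2000, rate=0.5):
    # Gradient ascent on the mean binomial log-likelihood (logit link).
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - predict(weights, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + rate * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def make_synthetic_data(n=200, seed=0):
    # Synthetic, purely illustrative data: NOT real forensic measurements.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        expert_says_same = 1 if rng.random() < 0.5 else 0   # binary judgment
        spectral_distance = rng.gauss(0.0, 1.0)             # standardised, continuous
        p_same = sigmoid(-0.5 + 2.0 * expert_says_same - 1.5 * spectral_distance)
        rows.append([expert_says_same, spectral_distance])
        labels.append(1 if rng.random() < p_same else 0)
    return rows, labels

rows, labels = make_synthetic_data()
weights = fit_logistic(rows, labels)
print("intercept, expert weight, distance weight:", weights)
```

The sign and size of each fitted weight indicate how strongly that factor, whether subjective or acoustic, separates same-speaker from different-speaker cases in the synthetic data. In practice a statistical package would also report standard errors and deviance, which this sketch omits.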






WWW Administrator
Fri May 19 11:53:36 MET DST 1995