Speaker verification draws on many of the statistical and experimental
procedures outlined in previous sections. The principal procedures
required for assessing speaker verification and recognition systems are
those applied to speech recognition processes in the preceding section.
The dependent variable is whether or not the speaker is verified or
recognised, which is a discrete variable. Discrete variables are
somewhat simpler measures than those considered in connection with
acoustic measures of speech recognition (note that I am not saying that
the problems involved are necessarily simpler). The particular types of
statistical tests called for will be non-parametric (see section ).
As mentioned before, assessing speech recognisers is integrally related to speech corpora. (It is for this reason that an example covering both setting up a corpus and assessing recognition was worked through from scratch.) The same applies here, so much of what needs to be said has already been covered. Only the additional considerations are described here: those to do with sampling speakers for recognition and verification, and those to do with assessment.
In section the estimation of sample size for equally frequent events
was considered. In that example, p and q were both 0.5. If p is quite
small, as it may be for impostors attempting to enter speaker
verification systems (p = 0.05, 0.01 or 0.001), very large samples
(running into tens of thousands) would be needed. In these cases the
Poisson distribution is frequently used to describe the number of rare
events observed.
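The effect of a small p on sample size can be illustrated with a short sketch. It is not the procedure of the earlier section: it assumes the usual normal-approximation interval for a proportion and, as a further assumption, asks for a fixed relative precision (a half-width of 20% of p at z = 1.96). The function names and default values are illustrative only. The second part compares binomial and Poisson probabilities for a rare event, which is the sense in which the Poisson distribution can stand in for the binomial when p is small and n is large.

```python
import math

def sample_size_relative(p, rel_error=0.2, z=1.96):
    """Trials needed so that the normal-approximation confidence interval
    for a proportion p has half-width rel_error * p (assumed criterion)."""
    q = 1.0 - p
    return math.ceil(z ** 2 * q / (rel_error ** 2 * p))

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Required sample size as the event becomes rarer.
for p in (0.5, 0.05, 0.01, 0.001):
    print(f"p = {p:<6} n = {sample_size_relative(p)}")

# Binomial versus Poisson for a rare event (lambda = n * p).
n, p = 10000, 0.001
for k in range(6):
    print(k, binomial_pmf(k, n, p), poisson_pmf(k, n * p))
```

Under this criterion the required sample grows roughly in proportion to 1/p, which is why impostor rates of the order of 0.001 call for very large test sets.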
One important area of speaker verification concerns forensic applications. The forensic data that might be used can be obtained in a number of ways. Experts might listen to the material and offer a subjective judgement. This judgement is basically whether or not the speaker is a particular person (i.e. a discrete measure). There are many ways in which expert judgements about speakers could be standardised, which would enhance them in the way that, for example, doctors' knowledge in formulating diagnoses has been standardised and incorporated into automatic disease-diagnosis expert systems. It is beyond the scope of the current chapter to present this information, nor is it clear whether presenting it here would in fact reach the target audience. Another approach that has been taken to speaker recognition and verification is through spectrographic measures (continuous measures). To use all these measurements together calls for statistical techniques that can deal with mixes of continuous (parametric) and discrete (non-parametric) measures.
Generalised Linear Modelling techniques (GLIM) allow models to be constructed that predict dependent variables like those required here from a mixture of binary, discrete and continuous measures (Aitkin, Anderson, Francis and Hinde, 1989). These have not been applied to forensic applications of speaker verification/recognition but seem appropriate for the task. The technique as it would be applied here enables the experimenter to establish which acoustic and subjective factors differentiate one speaker from another. In contrast, Analysis of Variance can only deal with continuous measures.
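As a rough illustration of the kind of model meant here, the sketch below fits a binomial generalised linear model with a logit link (logistic regression) to a binary outcome using one binary and one continuous predictor. It does not use the GLIM package and is not taken from the cited text. The data are synthetic, and the predictor names (an expert's same-speaker judgement and a spectral distance), the coefficient values and the sample size are all hypothetical choices made for the example.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_logistic(rows, y, iterations=15):
    """Newton-Raphson fit of a binomial GLM with logit link."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            eta = sum(b * xi for b, xi in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
            w = mu * (1.0 - mu)                 # binomial variance weight
            for i in range(k):
                grad[i] += (yi - mu) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Synthetic data (hypothetical): intercept, binary expert judgement,
# continuous spectral distance; outcome is 1 for "same speaker".
random.seed(0)
true_beta = [-0.5, 1.5, -1.0]
rows, y = [], []
for _ in range(2000):
    expert_says_same = float(random.randint(0, 1))
    spectral_distance = random.gauss(0.0, 1.0)
    x = [1.0, expert_says_same, spectral_distance]
    eta = sum(b * xi for b, xi in zip(true_beta, x))
    p_same = 1.0 / (1.0 + math.exp(-eta))
    rows.append(x)
    y.append(1 if random.random() < p_same else 0)

beta = fit_logistic(rows, y)
print("intercept, expert judgement, spectral distance:", beta)
```

The sign and size of each fitted coefficient indicate how strongly the corresponding factor, subjective or acoustic, is associated with a same-speaker outcome. In practice an established statistical package would be used in place of this hand-written fitting routine.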