
Introduction

In this chapter two important stages in the representation of spoken language corpora are discussed:

Transcription means that a symbolic representation of the speech is made. The transcription makes it easy to find one's way around a speech corpus.

Design of the data base concerns the way in which speech, speaker information, information on data collection, and transcriptions are stored in a structured way.

No firm standards exist at present for either transcription or data base structure. This is partly because the representation of a corpus depends heavily on its intended use. However, general remarks and recommendations can be made. The need for standards is expected to grow in the near future as more corpora are collected and the exchange of corpora increases. Furthermore, the foundation of the European Language Resources Association (ELRA) in 1995 will stimulate the standardisation of spoken language corpora.

The transcription of spoken language corpora is discussed in the following section; a subsequent section deals with the storage and the data base structure of corpora.



WWW Administrator
Fri May 19 11:53:36 MET DST 1995