This chapter discusses two important stages in the representation of spoken language corpora: transcription and design of the data base.
Transcription is the production of a symbolic representation of the speech. A transcription makes it easy to find one's way around a speech corpus. Design of the data base concerns the structured storage of the speech, speaker information, information on data collection, and transcriptions. No firm standards currently exist for either transcription or data base structure, partly because the representation of a corpus depends very much on its intended use. Nevertheless, general remarks and recommendations can be made. The need for standards is expected to grow in the near future as more corpora are collected and exchanged. Furthermore, the foundation of the European Language Resources Association (ELRA) in 1995 will stimulate the standardisation of spoken language corpora.