=====================================================================
DOCUMENTATION for the SPEEDURCONT-CORPUS
=====================================================================
(C) Friedrich Neubarth, Hannes Pirker
    Austrian Research Institute for AI (OFAI)
    Last Update: 12 Feb 2007

====================================================================
Lexical Content and Naming Conventions:
====================================================================

Each sentence is stored in its own files. Files are named
'sNNN.XXX.wav' and 'sNNN.XXX.xml', where NNN is the sentence number
and XXX represents the type/lexical content of the file.
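
The naming convention above can be checked mechanically. The following
is a minimal Python sketch, not part of the corpus distribution. It
assumes that NNN is always exactly three digits and that XXX consists
of lowercase letters optionally followed by digits (e.g. 'no', 'zg12').
The names CorpusFile and parse_name are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: 's' + three-digit sentence number + '.' +
# content code (letters, optional digits) + '.wav' or '.xml'.
_NAME_RE = re.compile(r"^s(\d{3})\.([a-z]+\d*)\.(wav|xml)$")


class CorpusFile(NamedTuple):
    sentence: int    # NNN as an integer
    content: str     # XXX, e.g. 'no', 'bu', 'zg3', 'sa1', 'fa2'
    extension: str   # 'wav' or 'xml'


def parse_name(filename: str) -> Optional[CorpusFile]:
    """Split a corpus file name into its parts; None if it does not match."""
    m = _NAME_RE.match(filename)
    if m is None:
        return None
    return CorpusFile(int(m.group(1)), m.group(2), m.group(3))
```

For example, parse_name("s001.sa1.wav") yields sentence 1 with content
code 'sa1', while a name that does not follow the convention yields
None.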

Values for XXX and their meanings are:

--
no 
--
"Nordwind und Sonne": A short story containing phonetically balanced text.

--
bu 
--
"Buttergschichte":    A short story containing phonetically balanced text.

--------------------
zg1, zg2, ...,  zg22
--------------------

"Zeitungsartikel" (newspaper articles): 22 short articles selected
more or less at random from Austrian newspapers.

-------------------
sa1, sa2, ... , sa5
-------------------

"Einzelsaetze" (isolated sentences): 298 isolated 'standard' sentences
taken from various sources (e.g. Phondat, Marburg-Saetze, ...). They
are grouped according to their original sources:

sa1: s001 - s100
sa2: s101 - s120
sa3: s121 - s183
sa4: s184 - s253
sa5: s254 - s298
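
The grouping above can be expressed as a small lookup table. This is
an illustrative Python sketch only. The names SA_RANGES and sa_group
are hypothetical, and the ranges are copied from the list above.

```python
# Inclusive sentence-number ranges for the 'sa' groups, as listed above.
SA_RANGES = {
    "sa1": (1, 100),
    "sa2": (101, 120),
    "sa3": (121, 183),
    "sa4": (184, 253),
    "sa5": (254, 298),
}


def sa_group(sentence: int) -> str:
    """Return the 'sa' group that a sentence number belongs to."""
    for group, (first, last) in SA_RANGES.items():
        if first <= sentence <= last:
            return group
    raise ValueError(f"sentence number {sentence} is outside s001 - s298")
```

The five ranges are contiguous and together cover exactly the 298
sentences mentioned above.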

--------------------
fa1, fa2, ... , fa5
--------------------

"Frage-Antwort Paare" (question-answer pairs): Highly uniform pairs of
questions and answers (strictly speaking, only the *answers*), used
for controlled induction of different focus conditions (broad vs.
narrow focus).

They are grouped in bundles of 50 sentences each:
fa1: s001 - s050
fa2: s051 - s100
...
fa5: s201 - s250
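
Because the bundles have a fixed size, the bundle of a sentence can be
computed rather than looked up. The Python sketch below is
illustrative. The function name fa_bundle is hypothetical, and the
code assumes that the bundles not listed explicitly above follow the
same 50-sentence pattern as fa1, fa2 and fa5.

```python
BUNDLE_SIZE = 50     # stated above: bundles of 50 sentences each
LAST_SENTENCE = 250  # fa5 ends at s250


def fa_bundle(sentence: int) -> str:
    """Return the 'fa' bundle for a sentence number in s001 - s250.

    Assumption: every bundle covers 50 consecutive sentence numbers.
    """
    if not 1 <= sentence <= LAST_SENTENCE:
        raise ValueError(f"sentence number {sentence} is outside s001 - s250")
    return f"fa{(sentence - 1) // BUNDLE_SIZE + 1}"
```

With this rule, s050 falls into fa1, s051 into fa2, and s201 into fa5,
which matches the ranges listed above.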

====================================================================