2004 – 2006

Creative Histories

The Josefsplatz Experience

The goal of the project is to reconstruct a complex 3D model of an urban environment, specifically the Viennese Josefsplatz, from historical pictures and paintings, and to present this 4D information space (3D geometry over time) on PCs and mobile devices. In addition, a user-adaptive meta-information system enables the visualisation of complex, interlinked historic events. Special attention is paid to the different qualities of historical information. Within the project, OFAI concentrates on the realisation of the meta-information system.

2003 – 2006

Architecture and Effective Development of a High-Quality Part-of-Speech Tagger

The general aim of the project is to improve the quality of Part-of-Speech tagging by developing a tagger that combines the statistical approach with the Constraint Grammar based approach in such a way that (i) the strengths of each approach are emphasised, and (ii) their weaknesses are mutually compensated for. Apart from these theoretical aims, the project also includes a practical validation of the developed methodology, together with an evaluation of the results achieved. This yields three main objectives, which are at the same time the three innovations the project contributes to the field of PoS tagging: (1) proposing and advocating a novel tagger architecture that combines the statistical and the Constraint Grammar based tagging schemes into a tagging system with higher accuracy than either of its components taken alone; (2) developing a systematic method for writing the rules of a Constraint Grammar tagger, together with a novel and more powerful method of applying them; (3) implementing and evaluating a combined tagger for German, employing the TnT tagger by T. Brants as the statistical component and the newly developed Constraint Grammar tagger for German, and using the NEGRA corpus as the evaluation standard.
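As a rough illustration of the combination idea, the sketch below lets a statistical component propose scored candidate tags and lets hand-written constraints discard readings that the context rules out. It is a minimal sketch under stated assumptions, not the project's architecture: the function names, the toy rule, and the scores are all hypothetical, and neither TnT nor the project's Constraint Grammar tagger is called. Only the tag names (ART, NN, VVFIN) follow the STTS tagset used in the NEGRA corpus.

```python
from typing import Callable, Dict, List, Set

# Hypothetical data shape: for each token, candidate tags with statistical scores.
Candidates = List[Dict[str, float]]
Rule = Callable[[List[str], Candidates, int], Set[str]]


def no_finite_verb_after_article(tokens: List[str], cands: Candidates, i: int) -> Set[str]:
    """Toy constraint (invented for illustration): directly after a token
    whose only reading is ART, discard the finite-verb reading VVFIN."""
    if i > 0 and set(cands[i - 1]) == {"ART"}:
        return {"VVFIN"}
    return set()


def combine(tokens: List[str], cands: Candidates, rules: List[Rule]) -> List[str]:
    """Pick, for each token, the best-scoring tag that no rule has discarded."""
    result = []
    for i in range(len(tokens)):
        banned: Set[str] = set()
        for rule in rules:
            banned |= rule(tokens, cands, i)
        remaining = {t: s for t, s in cands[i].items() if t not in banned}
        if not remaining:
            # Constraints never remove the last reading of a token.
            remaining = cands[i]
        result.append(max(remaining, key=remaining.get))
    return result


tokens = ["die", "Suche"]
# Invented scores standing in for the output of a statistical tagger.
cands = [{"ART": 1.0}, {"VVFIN": 0.6, "NN": 0.4}]

statistical_only = combine(tokens, cands, [])
combined = combine(tokens, cands, [no_finite_verb_after_article])
print(statistical_only, combined)
```

With these invented scores the statistical ranking alone prefers the verb reading for the second token, while the constraint removes it and the noun reading wins. Keeping the last remaining reading mirrors the usual Constraint Grammar convention that rules never leave a token without a tag.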

2005 – 2006


Semantic Phonetic Automatic Reconstruction of Dictations

The SPARC project aims at integrating semantic knowledge bases into automatic speech recognition systems for dictation applications. Speech recognition systems, which take spoken text as input and convert it into written text, have long since reached a point where they can be employed commercially. An important application for speech recognition is automating document creation in institutions with a large dictation volume. This type of application poses a challenge for text processing due to its potentially large vocabulary. While in dialog or command-and-control systems 'semantics' is represented by the underlying databases or the set of possible system actions, dictation systems have to handle texts with a much broader content, even if the domain is usually limited. In order to create documents from spoken texts, speech recognition systems usually rely only on an acoustic model and a language model, which represents co-occurrence statistics of words. Based on this knowledge, a transcription of the spoken text is produced. To fully exploit the potential of language technology for automated dictation, systems must move away from simple transcriptions of the spoken utterances towards document creation that conforms to the formal and informal requirements of specific types of texts. By making use of explicit semantic information, our project will contribute to this new dimension in automatic speech recognition technology for dictation systems. Improvements gained from the integration of semantic knowledge will concern document quality, word error rate and usability.
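To make "co-occurrence statistics of words" concrete, the following minimal sketch estimates bigram probabilities from a tiny invented corpus. It is a generic textbook illustration and an assumption on our part: the description above does not state which kind of language model the SPARC project uses, and the function names and example sentences are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def train_bigram(sentences: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count how often each word occurs as a context and how often
    each pair of adjacent words occurs."""
    contexts: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # <s> marks the sentence start
        contexts.update(words[:-1])          # every word that has a successor
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams


def prob(contexts: Counter, bigrams: Counter, prev: str, word: str) -> float:
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


# Invented example sentences, not project data.
corpus = [
    "the patient has fever",
    "the patient has pain",
    "the patients have pain",
]
contexts, bigrams = train_bigram(corpus)
p_fever = prob(contexts, bigrams, "has", "fever")
print(p_fever)  # 0.5
```

A model of this kind only knows which words tend to follow each other. It carries no explicit information about what a document means or how it should be structured, which is the gap the paragraph above describes.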

2003 – 2005


Biological Text Mining

The main aim of this project is the development of a generic text mining tool for the biological domain. The BioMint tool will search the literature and automatically extract information from abstracts and papers in order to provide two essential research support services: (1) Curator's assistant: accelerate, by partially automating, the annotation and update of bio-databases; and (2) Researcher's assistant: generate readable reports in response to queries from biological researchers and practitioners.

2002 – 2005

Knowledge Exploration in Science and Technology

The primary objective of the Action is to develop and implement computerised systems for extracting previously unknown, non-trivial, and potentially useful knowledge from structurally complex, high-volume, distributed, and fast-changing scientific and R&D databases within the context of current and newly developed global computing and data infrastructures such as the GRID.

1998 – 2005

Computer-Based Music Research

Artificial Intelligence Models of Musical Expression

The goal of this project is to use Artificial Intelligence methods to study the phenomenon of expressive music performance. The focus of the project is on developing and using machine learning and data mining methods for the analysis of expressive performance data. The goal is to gain a deeper understanding of this complex domain of human competence and to contribute new methods to the (relatively new) branch of musicology that tries to develop quantitative models and theories of musical expression.

2004 – 2005

An Automaton for the Moderation of Internet-based Discussion Fora

The project aimed at the development of an automaton for the moderation of internet-based discussion fora. Until now, fora that have to fulfil certain legal and qualitative standards had to be moderated by hand. In the case of fora with intensive participation, this required a large amount of human work and often led to delays in publication. The goal of the project was a partially automated checking of contributions according to given criteria in order to greatly reduce the human workload. A prototype was built and successfully tested with actual material from an Austrian online newspaper.

2004 – 2005


Machine learning to qualify postings

The research and development project Foromat was OFAI's first endeavour to bring the benefits of machine learning to news media. Media owners and publishers have a legal (and moral) obligation to monitor the content published on their pages. Foromat helps forum moderators to identify postings that must not be published because of their infringing, abusive or offensive content. The system has been running successfully at Der Standard since 2005.

2003 – 2004


Artificial Intelligence Methods for eBusiness

This project aims at obtaining an overview of the potential of AI for eBusiness, studied in four sub-projects, and at developing small-scale, prototypical applications in each of these areas.

2001 – 2004


A Net Environment for Embodied Emotional Conversational Agents

The objective of the NECA project was to develop a new generation of mixed multi-user / multi-agent virtual spaces populated by affective conversational agents. The agents are able to express themselves through synchronised emotional speech and non-verbal expression, generated from an abstract representation. This is the first time that such expressive capabilities have been featured in Internet applications. The agents' usefulness was evaluated in two concrete application scenarios. From a technical point of view, the NECA platform provides a confederation of dedicated components including an affective reasoner, co-ordinated generation of verbal and nonverbal aspects of communication, and emotional speech synthesis, thus providing a basis for the development of new Internet applications with emotional agents. OFAI was the co-ordinating partner of the project. Moreover, OFAI was responsible for the representation of multimodal information and for text generation in the German versions of the demonstrators. OFAI also contributed to the speech synthesis.