OFAI

Technical Reports - Query Results

Your query term was 'number = 2012'
17 reports found
Reports are sorted by descending number

OFAI-TR-2012-17 ( 281kB PDF file)

The Relation of Hubs to the Doddington Zoo in Speaker Verification

Dominik Schnitzer, Arthur Flexer, Jan Schlueter

In speaker verification systems there exists the well-known phenomenon of speakers who are very problematic to verify and who have been given various metaphoric animal names. Our work connects this so-called 'Doddington zoo' and the animals of the whole 'biometric menagerie' to the problem of 'hubs' in high dimensional data spaces, which was recently the topic of a number of publications in the machine learning literature. Due to a general problem of measuring distances in high dimensional data spaces, hub objects emerge which have a high similarity to a large number of data items. This is a novel aspect of the 'curse of dimensionality' which adversely affects classification and identification performance. In a series of experiments we try to understand the 'Doddington zoo' problem with respect to the notions of hubs and anti-hubs.

Keywords: Speaker Verification, Hubs, Normalization, Machine Learning

Citation: Schnitzer D., Flexer A., Schlüter J.: The Relation of Hubs to the Doddington Zoo in Speaker Verification. Technical Report, Proceedings of the 21st European Signal Processing Conference (EUSIPCO'2013), September 9-13, Marrakech, Morocco, 2013.


OFAI-TR-2012-16 ( 1946kB PDF file)

Structure and stability of online chat networks built on emotion-carrying links

Vladimir Gligorijevic, Marcin Skowron, Bosiljka Tadic

High-resolution data of online chats are studied as a physical system in the laboratory in order to quantify collective behavior of users. Our analysis reveals strong regularities characteristic of natural systems with additional features. In particular, we find self-organized dynamics with long-range correlations in user actions and persistent associations among users that have the properties of a social network. Furthermore, the evolution of the graph and its architecture with specific k-core structure are shown to be related to the type and the emotional arousal of exchanged messages. Partitioning of the graph by deletion of the links which carry high arousal messages exhibits critical fluctuations at the percolation threshold.

Keywords: Publications List Interact, Social structure emerges in online chats, Users associate by emotion-carrying messages, Physics and computer science reveal new dimension of user behaviors

Citation: Gligorijevic V., Skowron M., Tadic B.: Structure and stability of online chat networks built on emotion-carrying links. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-16.


OFAI-TR-2012-15 ( 380kB PDF file)

Local and Global Scaling Reduce Hubs in Space

Dominik Schnitzer, Arthur Flexer, Markus Schedl, Gerhard Widmer

"Hubness" has recently been identified as a general problem of high dimensional data spaces, manifesting itself in the emergence of objects, so-called hubs, which tend to be among the k nearest neighbors of a large number of data items. As a consequence many nearest neighbor relations in the distance space are asymmetric, that is, object y is amongst the nearest neighbors of x but not vice versa. The work presented here discusses two classes of methods that try to symmetrize nearest neighbor relations and investigates to what extent they can mitigate the negative effects of hubs. We evaluate local distance scaling and propose a global variant which has the advantage of being easy to approximate for large datasets and of having a probabilistic interpretation. Both local and global approaches are shown to be effective especially for high-dimensional datasets, which are affected by high hubness. Both methods lead to a strong decrease of hubness in these datasets, while at the same time improving properties like classification accuracy. We evaluate the methods on a large number of public machine learning datasets and synthetic data. Finally we present a real-world application where we are able to achieve significantly higher retrieval quality.

Keywords: local and global scaling, shared near neighbors, hubness, classification, curse of dimensionality, nearest neighbor relation

Citation: Schnitzer D., Flexer A., Schedl M., Widmer G.: Local and Global Scaling Reduce Hubs in Space, Journal of Machine Learning Research, 13(Oct):2871-2902, 2012.
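
Illustrative sketch (not part of the report): the abstract above describes hubs as objects that appear among the k nearest neighbors of many data items, and notes that nearest neighbor relations become asymmetric. The following Python sketch, which assumes Euclidean distance and random Gaussian data purely for illustration, counts k-occurrences and asymmetric neighbor relations. The function names (`knn_lists`, `k_occurrence`, `asymmetric_pairs`) are hypothetical and do not come from the report; the local and global scaling methods the report evaluates are not implemented here.

```python
import math
import random


def knn_lists(points, k):
    """Return, for each point index, the indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Euclidean distance to every other point (the point itself is excluded).
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def k_occurrence(neighbors, n):
    """Count how often each point appears in the k-NN lists of other points."""
    counts = [0] * n
    for nn in neighbors:
        for j in nn:
            counts[j] += 1
    return counts


def asymmetric_pairs(neighbors):
    """Count relations where y is a neighbor of x but x is not a neighbor of y."""
    sets = [set(nn) for nn in neighbors]
    return sum(1 for x, nn in enumerate(sets) for y in nn if x not in sets[y])


if __name__ == "__main__":
    random.seed(0)
    dim, n, k = 50, 200, 5  # assumed toy settings, not taken from the report
    points = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    nn = knn_lists(points, k)
    counts = k_occurrence(nn, n)
    print("max k-occurrence:", max(counts))
    print("points never chosen as neighbor:", counts.count(0))
    print("asymmetric relations:", asymmetric_pairs(nn))
```

Points with a k-occurrence far above k would be hub candidates in this toy setting, and points with a count of zero correspond to what the listing elsewhere calls anti-hubs.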


OFAI-TR-2012-14 ( 14136kB PDF file)

Evolving Topology on the Network of Online Chats

Vladimir Gligorijevic, Marcin Skowron, Bosiljka Tadic

Large amounts of data collected at Web portals contain valuable information for studying human behavior in online communication. Recently, a powerful methodology was developed to study the emergence of collective emotional behaviors of blog users by combining the methods of the statistical physics of complex systems with machine-learning techniques for text analysis. Mapping the high-resolution data onto a suitable network structure is the starting point of this approach, on which the quantitative graph-theoretic analysis is based. In this work we use the network mapping approach to analyse the users' collective behaviors in online chats. Specifically, bearing in mind the character of the dynamics in IRC channels, we analyse the evolution of the network that emerges via user contacts and, in particular, how specific topology features of this network evolve over successive time windows.

Keywords: Publications List Interact, Social and Information Networks, Data Analysis, Text Analysis, Online Communication Affective Computing

Citation: Gligorijevic V., Skowron M., Tadic B.: Evolving Topology on the Network of Online Chats. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-14.


OFAI-TR-2012-13 ( 377kB PDF file)

Affect Listeners - From dyads to group interactions with affective dialog systems

Marcin Skowron, Stefan Rank

Affect Listeners are applied as tools for studying the role of emotions in online communication. They need to interact both in dyads as well as in group settings with multiple users. In this paper, we present the evolution of such affective dialog systems from a focus on dyadic interaction to multi-party interaction on chat networks. Starting from experiments on the use of these dialog systems in virtual dyadic settings, we outline the requirements, design and implementation decisions necessary to apply the systems to affective interactions with multiple users. Finally, we introduce two realisations of Interactive Affective Bots designed for such interaction scenarios that integrate modelling of individuals and groups as part of their decision mechanism.

Keywords: Publications List Interact, affective dialog system, affective human-computer interactions, agent control architecture

Citation: Skowron M., Rank S.: Affect Listeners - From dyads to group interactions with affective dialog systems. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-13.


OFAI-TR-2012-12 ( 1110kB PDF file)

Entropy-growth-based model of emotionally charged online dialogues

Julian Sienkiewicz, Marcin Skowron, Georgios Paltoglou, Janusz Holyst

We analyze emotionally annotated massive data from IRC (Internet Relay Chat) and model the dialogues between its participants by assuming that the driving force for the discussion is the entropy growth of the emotional probability distribution. This process is claimed to be correlated with the emergence of the power-law distribution of discussion lengths observed in the dialogues. We perform numerical simulations based on the observed phenomenon, obtaining good agreement with the real data. Finally, we propose a method to artificially prolong the duration of the discussion that relies on the entropy of the emotional probability distribution.

Keywords: Publications List Interact, Computation and Language, Social and Information Networks, Data Analysis, Statistics and Probability, Physics and Society

Citation: Sienkiewicz J., Skowron M., Paltoglou G., Holyst J.: Entropy-growth-based model of emotionally charged online dialogues. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-12.
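
Illustrative sketch (not part of the report): the abstract above treats the entropy growth of the emotional probability distribution as the driving force of a dialogue. The following Python sketch shows one plausible way to track such an entropy over a dialogue, assuming each message carries a single label from {-1, 0, 1} (negative, neutral, positive) and using the Shannon entropy of the label frequencies observed so far. The label encoding and the function names are assumptions made for illustration; the report's actual model and simulation procedure are not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    if not labels:
        return 0.0
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_trajectory(labels):
    """Entropy of the label distribution after each successive message."""
    return [shannon_entropy(labels[:t]) for t in range(1, len(labels) + 1)]


if __name__ == "__main__":
    # Hypothetical dialogue: -1 = negative, 0 = neutral, 1 = positive.
    dialogue = [1, 1, 0, 1, -1, 0, -1, -1]
    for step, h in enumerate(entropy_trajectory(dialogue), start=1):
        print(f"after message {step}: entropy = {h:.3f} bits")
```

With three possible labels, the entropy is bounded above by log2(3) bits, which is reached only when all three emotional classes are equally frequent.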


OFAI-TR-2012-11 ( 762kB PDF file)

Unsupervised Feature Learning for Speech and Music Detection in Radio Broadcasts

Jan Schlueter, Reinhard Sonnleitner

Detecting speech and music is an elementary step in extracting information from radio broadcasts. Existing solutions either rely on general-purpose audio features, or build on features specifically engineered for the task. Interpreting spectrograms as images, we can apply unsupervised feature learning methods from computer vision instead. In this work, we show that features learned by a mean-covariance Restricted Boltzmann Machine partly resemble engineered features, but outperform three hand-crafted feature sets in speech and music detection on a large corpus of radio recordings. Our results demonstrate that unsupervised learning is a powerful alternative to knowledge engineering.

Keywords: Music Information Retrieval

Citation: Schlueter J., Sonnleitner R.: Unsupervised Feature Learning for Speech and Music Detection in Radio Broadcasts, in Proceedings of the 15th International Conference on Digital Audio Effects (DAFx-12), York, UK, 2012.


OFAI-TR-2012-10 ( 190kB PDF file)

Putting the User in the Center of Music Information Retrieval

Markus Schedl, Arthur Flexer

Personalized and context-aware music retrieval and recommendation algorithms ideally provide music that perfectly fits the individual listener in each imaginable situation and for each of her information or entertainment needs. Although first steps towards such systems have recently been presented at ISMIR and similar venues, this vision is still far away from being a reality. In this paper, we investigate and discuss literature on the topic of user-centric music retrieval and reflect on why the breakthrough in this field has not been achieved yet. Given the different areas of expertise of the authors, we shed light on why this topic is a particularly challenging one, taking a psychological and a computer science view. Whereas the psychological point of view is mainly concerned with proper experimental design, the computer science aspect centers on modeling and machine learning problems. We further present our ideas on aspects vital to consider when elaborating user-aware music retrieval systems, and we also describe promising evaluation methodologies, since accurately evaluating personalized systems is a notably challenging task.

Keywords: Music Information Retrieval, Evaluation, User studies

Citation: Schedl M., Flexer A.: Putting the User in the Center of Music Information Retrieval, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR'12), Porto, Portugal, October 8th-12th, 2012.


OFAI-TR-2012-09

Interactive Entertainment of Elder Persons using Intelligent and Emotional Software Agents

Lisa Szugfil, Robert Trappl

This project tried to broaden the scope of classical digital games for elderly people by developing a game which takes social and emotional aspects into account, gives elderly people the possibility to bring their own experience into the game and puts cognitive training into context. A modified version of the classical memory game was developed, in which a human played against an emotional software agent. An experiment with eighteen participants (mean age = 84.33 years) examined the influence of the game type on the perception of and the interaction with the software agent. Furthermore, the perception of the playing speed of the counter player was investigated. The results showed significantly more comments towards the software agent when playing a personalized memory game than when playing the classical memory game. In addition, the mirrored game speed of the software agent was evaluated as being faster than the human player's own playing speed but also as optimal by the participants.

Keywords: digital game, elderly people, cognitive training in context, memory, software agent, emotions, playing speed, table top

Citation: Szugfil L., Trappl R.: Interactive Entertainment of Elder Persons using Intelligent and Emotional Software Agents. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-09.


OFAI-TR-2012-08 ( 1056kB PDF file)

The Hippocampal-Entorhinal Complex performs Bayesian Localization and Error Correction

T Madl, S Franklin, K Chen, D Montaldi, R Trappl

The mammalian brain updates representations of spatial location with self-motion cues, a process referred to as path integration. Since self-motion information is inherently inexact and subject to neuronal noise, this process leads to errors, which would accumulate over time if not corrected by sensory information about the environment. In this paper, we propose that the hippocampal-entorhinal complex, the major neuronal correlate representing spatial information, corrects such errors by integrating self-motion information and sensory information about the environment in a Bayes-optimal manner. Based on theoretical arguments as well as empirical data, we propose that hippocampal place cells are able to encode probability distributions and uncertainties of allocentric spatial location, and to use them for Bayesian inference to improve the accuracy of the location representation using different sources of information. We hypothesize about possible neuronal correlates of the components and processes required for such inference. Unlike most previously suggested error correction and spatial cue integration mechanisms, we not only provide a plausible neuronal basis for these mechanisms but also generate concrete predictions from our hypotheses and substantiate them with empirical data. We describe a computational model performing Bayesian localization in arbitrary two-dimensional environments in a biologically plausible way, and use it to replicate neuronal recording data as well as behaviour data in published studies in order to strengthen our claims. Our ideas tie in with a growing body of research suggesting that the brain might behave like a Bayesian machine (the Bayesian brain hypothesis [1]), and provide empirical evidence suggesting that it might employ Bayesian processes on the level of neuronal implementation.

Citation: Madl T., Franklin S., Chen K., Montaldi D., Trappl R.: The Hippocampal-Entorhinal Complex performs Bayesian Localization and Error Correction. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-08.
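
Illustrative sketch (not part of the report): the abstract above proposes that errors accumulated during path integration are corrected by combining self-motion and sensory information in a Bayes-optimal manner. The following Python sketch shows the textbook one-dimensional case under the assumption that both sources are Gaussian: a prediction step in which uncertainty grows with movement, and a correction step that fuses two estimates by inverse-variance weighting. The function names and the Gaussian, one-dimensional setting are assumptions for illustration; the report's model works in two-dimensional environments and is not reproduced here.

```python
def path_integrate(mean, var, velocity, dt, motion_noise_var):
    """Prediction step: shift the location estimate and let uncertainty grow."""
    return mean + velocity * dt, var + motion_noise_var * dt


def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Correction step: Bayes-optimal fusion of two independent Gaussian estimates."""
    precision = 1.0 / var_a + 1.0 / var_b
    mean = (mean_a / var_a + mean_b / var_b) / precision
    return mean, 1.0 / precision


if __name__ == "__main__":
    mean, var = 0.0, 0.01  # assumed initial location estimate and variance
    for _ in range(10):
        mean, var = path_integrate(mean, var, velocity=1.0, dt=0.1,
                                   motion_noise_var=0.5)
    print("after path integration:", mean, var)

    # Hypothetical sensory observation of the same location with its own noise.
    mean, var = fuse_gaussian(mean, var, mean_b=0.9, var_b=0.05)
    print("after sensory correction:", mean, var)
```

The fused variance is always smaller than either input variance, which is the sense in which an additional sensory cue corrects the drift of path integration in this simplified setting.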


OFAI-TR-2012-07 ( 290kB PDF file)

Persistent Empirical Wiener Estimation With Adaptive Threshold Selection For Audio Denoising

Kai Siedenburg

Exploiting the persistence properties of signals leads to significant improvements in audio denoising. This contribution derives a novel denoising operator based on neighborhood-smoothed, Wiener-filter-like shrinkage. Relations to the sparse denoising approach via thresholding are drawn. Further, a rationale for adapting the threshold level to a performance criterion is developed. Using a simple but efficient estimator of the noise level, the introduced operators with adaptive thresholds are demonstrated to act as attractive alternatives to the state of the art in audio denoising.

Keywords: Audio denoising

Citation: Siedenburg K.: Persistent Empirical Wiener Estimation With Adaptive Threshold Selection For Audio Denoising, Proceedings of the 9th Sound and Music Computing Conference (SMC 2012), Copenhagen, Denmark, 2012.
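
Illustrative sketch (not part of the report): the abstract above describes a denoising operator based on neighborhood-smoothed, Wiener-filter-like shrinkage. The following Python sketch shows one common form of such shrinkage under stated assumptions: each coefficient is scaled by the gain max(0, 1 - threshold^2 / E), where E is the mean energy of the coefficients in a small neighborhood. The one-dimensional neighborhood, the uniform weighting and the function name `persistent_wiener_shrink` are assumptions for illustration and may differ from the operator derived in the report.

```python
def persistent_wiener_shrink(coeffs, threshold, radius=1):
    """Shrink each coefficient by a Wiener-like gain computed from local energy.

    coeffs    -- sequence of real or complex transform coefficients
    threshold -- shrinkage level (assumed to be tied to the noise level)
    radius    -- half-width of the neighborhood used for energy smoothing
    """
    n = len(coeffs)
    out = []
    for k, c in enumerate(coeffs):
        lo, hi = max(0, k - radius), min(n, k + radius + 1)
        # Mean energy over the neighborhood (truncated at the borders).
        energy = sum(abs(x) ** 2 for x in coeffs[lo:hi]) / (hi - lo)
        gain = max(0.0, 1.0 - threshold ** 2 / energy) if energy > 0 else 0.0
        out.append(gain * c)
    return out


if __name__ == "__main__":
    # Hypothetical coefficients: a strong component surrounded by weak noise.
    noisy = [0.1, -0.2, 3.0, 2.5, 0.15, -0.1]
    print(persistent_wiener_shrink(noisy, threshold=0.5, radius=1))
```

With `radius=0` the operator reduces to coefficient-wise empirical Wiener shrinkage; larger neighborhoods let a weak coefficient survive if its neighbors are strong, which is one way to exploit persistence.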


OFAI-TR-2012-06 ( 139kB PDF file)

A MIREX meta-analysis of hubness in audio music similarity

Arthur Flexer, Dominik Schnitzer, Jan Schlueter

We use results from the 2011 MIREX "Audio Music Similarity and Retrieval" task for a meta-analysis of the hub phenomenon. Hub songs appear similar to an undesirably high number of other songs due to a problem of measuring distances in high dimensional spaces. Comparing 17 algorithms we are able to confirm that different algorithms produce very different degrees of hubness. We also show that hub songs exhibit less perceptual similarity to the songs they are close to, according to an audio similarity function, than non-hub songs. Application of the recently introduced method of "mutual proximity" is able to decisively improve this situation.

Keywords: Music Information Retrieval, Hubs

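Citation: Flexer A., Schnitzer D., Schlueter J.: A MIREX meta-analysis of hubness in audio music similarity, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR'12), Porto, Portugal, October 8th-12th, 2012.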


OFAI-TR-2012-05 ( 1818kB PDF file)

Constructing high-level perceptual audio descriptors for textural sounds

Thomas Grill

This paper describes the construction of computable audio descriptors capable of modeling relevant high-level perceptual qualities of textural sounds. These qualities - all metaphoric bipolar and continuous constructs - have been identified in previous research: high-low, ordered-chaotic, smooth-coarse, tonal-noisy, and homogeneous-heterogeneous, covering timbral, temporal and structural properties of sound. We detail the construction of the descriptors and demonstrate the effects of tuning with respect to individual accuracy or mutual independence. The descriptors are evaluated on a corpus of 100 textural sounds against respective measures of human perception that have been retrieved by use of an online survey. Potential future use of perceptual audio descriptors in music creation is illustrated by a prototypic sound browser application.

Keywords: Music Information Retrieval, Audio descriptor, Perception

Citation: Grill T.: Constructing high-level perceptual audio descriptors for textural sounds, Proceedings of the 9th Sound and Music Computing Conference (SMC 2012), pp. 486-493, Copenhagen, Denmark, 2012


OFAI-TR-2012-04 ( 1973kB PDF file)

Visualization of perceptual qualities in textural sounds

Thomas Grill, Arthur Flexer

We describe a visualization strategy that is capable of efficiently representing relevant perceptual qualities of textural sounds. The general aim is to develop intuitive screen-based interfaces representing large collections of sounds, where sound retrieval is facilitated by exploiting cross-modal mechanisms of human perception. We propose the use of metaphoric sensory properties that are shared between sounds and graphics, constructing a meaningful mapping of auditory to visual dimensions. For this purpose, we have implemented a visualization using tiled maps, essentially combining low-dimensional projection and iconic representation. To demonstrate its suitability, we show detailed results of experiments conducted in the form of an online survey. Potential future use in music creation is illustrated by a prototypic sound browser application.

Keywords: Music Information Retrieval, Visualization, Perception

Citation: Grill T., Flexer A.: Visualization of perceptual qualities in textural sounds, Proceedings of the International Computer Music Conference (ICMC 2012), Ljubljana, Slovenia, 2012


OFAI-TR-2012-03 ( 1633kB PDF file)

Emotional persistence in online chatting communities

Antonios Garas, David Garcia, Marcin Skowron, Frank Schweitzer

How do users behave in online chatrooms, where they instantaneously read and write posts? We analyzed about 2.5 million posts covering various topics in Internet relay channels, and found that user activity patterns follow known power-law and stretched exponential distributions, indicating that online chat activity is not different from other forms of communication. Analysing the emotional expressions (positive, negative, neutral) of users, we revealed a remarkable persistence both for individual users and channels. That is, despite their anonymity, users tend to follow social norms in repeated interactions in online chats, which results in a specific emotional “tone” of the channels. We provide an agent-based model of emotional interaction, which qualitatively recovers both the activity patterns in chatrooms and the emotional persistence of users and channels. While our assumptions about agents' emotional expressions are rooted in psychology, the model allows testing different hypotheses regarding their emotional impact in online communication.

Keywords: Publications List Interact, applied physics, statistical physics, modelling and theory, text analysis, online communication, affective computing

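Citation: Garas A., Garcia D., Skowron M., Schweitzer F.: Emotional persistence in online chatting communities, Nature - Scientific Reports, 2, 402, doi:10.1038/srep00402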


OFAI-TR-2012-02 ( 337kB PDF file)

Creativity in Configuring Affective Agents for Interactive Storytelling

Stefan Rank, Steve Hoffmann, Hans-Georg Struck, Ulrike Spierling, Paolo Petta

Affective agent architectures can be used as control components in Interactive Storytelling systems for artificial autonomous characters. Creative authoring for such systems then involves configuration of these agents that translate part of the creative process to the system’s runtime, necessarily constrained by the capabilities of the specific implementation. Using a framework for presenting configuration options based on literature review; a questionnaire evaluation of authors’ preferences for character creation; and a case study of an author’s conceptualisation of the creative process, we categorise available and potential methods for configuring affective agents in existing systems regarding creative exploration. Finally, we present work-in-progress on exemplifying the different options in the ActAffAct system.

Keywords: Creativity, Authoring, Interactive Storytelling, Affective Characters

Citation: Rank S., Hoffmann S., Struck H., Spierling U., Petta P. (2012) Creativity in Configuring Affective Agents for Interactive Storytelling. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-02; to appear in: Proceedings, International Conference on Computational Creativity, May 30-June 1, University College Dublin, Dublin, Ireland, 2012.


OFAI-TR-2012-01 ( 399kB PDF file)

Towards the development of a conceptual framework for an applied theory of problem structuring for complex agents: Questions to Luhmann's Social System Theory

Karl Neumayer, Paolo Petta

This extended abstract provides a snapshot of the current status of our efforts aimed at the development of a principled approach to corporate strategy consulting. This research is motivated by the need to improve the quality of strategic decision making of enterprises as complex agents. To this end, we take a step back and propose a paradigmatic reconceptualisation of the foundations of decision making in terms of processes underlying Problem Structuring, with implications in particular for the identity of complex agents, the notion of rationality, as well as the shaping of decision processes. The two interrelated main components are the transpersonal Weinhaus conceptual modelling framework and a structured method for the development, implementation, and verification of sound interventions. A key guideline is our aim to enable the identification of relevant, practical, and verifiable interventions. Against this body of work, we can formulate a number of candidate questions to Social Systems Theory to discuss at the Symposium, so as to: critically review our achievements and ascertain the scope of applicability of our model, identify directions and means of improvements, look for answers to open challenges, and understand the potential for a reformulation in Social Systems Theory terms.

Keywords: Corporate strategy consulting, Enterprise modelling, Transpersonal modelling, Action theory, Theory of social systems, Agent-based modelling, Business modelling

Citation: Neumayer K., Petta P.: Towards the development of a conceptual framework for an applied theory of problem structuring for complex agents: Questions to Luhmann's Social System Theory. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-01.
(Extended version of the extended abstract appearing in the proceedings of the 21st European Meeting on Cybernetics and Systems Research (EMCSR 2012), April 10-13, Vienna, Austria (EU), BCSSS Bertalanffy Center for the Study of Systems Science, Vienna, Austria (EU))