OFAI

Technical Reports - Query Results

Your query term was 'authors = Fürnkranz'
43 reports found
Reports are sorted by descending number

OFAI-TR-2003-35 ( 232kB PDF file)

Modeling Rule Precision

Johannes Fürnkranz

This paper reports first results of an empirical study of the precision of classification rules on an independent test set. We generated a large number of rules using a general covering algorithm and recorded their coverage on training and test sets. These meta data are briefly presented and analyzed with respect to their variance among different domains and search heuristics. The main part of the paper describes experiments that aimed at modeling the precision of the learned rules on the test set in dependence of their coverage on the training set. To this end, we trained a neural network as an evaluation function for a rule learner, and present parameter settings for the $m$-heuristic and the generalized $m$-heuristic that are optimal in the sense that they minimize the squared error of predicting the test set precision with training set coverage of positive and negative examples.

Keywords: Inductive Rule Learning, Overfitting, Error Estimates, Meta-Learning

Citation: Fürnkranz J.: Modeling Rule Precision. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-35, 2003


OFAI-TR-2003-26 ( 785kB PDF file)

Preference Learning: Models, Methods, Applications - Proceedings of the KI-2003 Workshop

Eyke Hüllermeier, Johannes Fürnkranz

The preferences of an individual, say, the participant of an electronic auction or the customer of an electronic store, can be expressed in various ways, either explicitly, e.g. in the form of preference statements or implicitly, e.g. through the way of acting in different situations. The problem of finding out about an individual's preferences, or about those of a group of individuals, is referred to as preference elicitation. This requires, among other things, formal models for representing preferences and methods for their (automatic) acquisition. Touching on various aspects of AI, both theoretical and practical, preference elicitation is one of this field's most recent and interesting research topics. This workshop, held at the German Conference for Artificial Intelligence (KI-2003), focused on learning methods for preference elicitation, that is on methods for inducing preferences from given observations. Like other types of complex learning tasks that have recently entered the stage in machine learning and related fields, preference learning deviates strongly from the standard problems of classification and regression. It is particularly challenging as it involves the prediction of complex structures, such as weak or partial order relations, rather than single values. Moreover, training input will not, as is usually the case, be offered in the form of complete examples but may comprise more general types of information, such as relative preferences or different kinds of indirect feedback.

Citation: Hüllermeier E., Fürnkranz J.: Preference Learning: Models, Methods, Applications - Proceedings of the KI-2003 Workshop. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-26, 2003


OFAI-TR-2003-14 ( 159kB PDF file)

Pairwise Preference Learning and Ranking

Johannes Fürnkranz, Eyke Hüllermeier

We consider supervised learning of a ranking function, which is a mapping from instances to total orders over a set of labels (options). The training information consists of examples with partial (and possibly inconsistent) information about their associated rankings. From these, we induce a ranking function by reducing the original problem to a number of binary classification problems, one for each pair of labels. The main objective of this work is to investigate the trade-off between the quality of the induced ranking function and the computational complexity of the algorithm, both depending on the amount of preference information given for each example. To this end, we present theoretical results on the complexity of pairwise preference learning. We also carry out some controlled experiments investigating the predictive performance of our method for different types of preference information, such as top-ranked labels and complete rankings. The domain of this study is the prediction of a rational agent's ranking of actions in an uncertain environment.

Keywords: Ranking, Pairwise Classification, Round Robin Learning

Citation: Fürnkranz J., Hüllermeier E.: Pairwise Preference Learning and Ranking. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-14, 2003


OFAI-TR-2002-37 ( 130kB PDF file)

Round Robin Ensembles

Johannes Fürnkranz

In this paper we investigate the performance of pairwise (or round robin) classification, originally a technique for turning multi-class problems into two-class problems, as a general ensemble technique. In particular, we show that the use of round robin ensembles will also increase the classification performance of decision tree learners, even though they can directly handle multi-class problems. The performance gain is not as large as for bagging and boosting, but on the other hand round robin ensembles have a clearly defined semantics. Furthermore, we investigate whether confidence estimates can be used to improve the accuracy of the predictions of the ensemble. Finally, we show that the advantage of pairwise classification over direct multi-class classification and one-against-all binarization increases with the number of classes, and that round robin ensembles form an interesting alternative for problems with ordered class values.

Keywords: Ensemble Methods, Round Robin Learning, Rule Learning, Decision Tree Learning, Ordered Classification

Citation: Fürnkranz J.: Round Robin Ensembles. Intelligent Data Analysis 7(5), 2003.


OFAI-TR-2002-36 ( 154kB PDF file)

On the Use of Fast Subsampling Estimates for Algorithm Recommendation

Johannes Fürnkranz, Johann Petrak, Pavel Brazdil, Carlos Soares

The use of subsampling for scaling up the performance of learning algorithms has become fairly popular in the recent literature. In this paper, we investigate the use of performance estimates obtained on a subsample of the data for the task of recommending the best learning algorithm(s) for the problem. In particular, we examine the use of subsampling estimates as features for meta-learning, thereby generalizing previous work on landmarking and on direct algorithm recommendation via subsampling. The main goal of the paper is to investigate the influence of various parameter choices on the meta-learning performance, in particular the size of training and test sets and the number of subsamples.

Citation: Fürnkranz J., Petrak J., Brazdil P., Soares C.: On the Use of Fast Subsampling Estimates for Algorithm Recommendation. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-36, 2002


OFAI-TR-2002-33 ( 153kB PDF file)

Web Structure Mining - Exploiting the Graph Structure of the World-Wide Web

Johannes Fürnkranz

The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Web mining is a new research area that tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. In this paper, we will give a brief overview of Web mining, with a special focus on techniques that aim at exploiting the graph structure of the Web for improved retrieval performance and classification accuracy.

Keywords: Web Mining, Web Structure Mining, Survey

Citation: J. Fürnkranz: Web Structure Mining - Exploiting the Graph Structure of the World-Wide Web, ÖGAI Journal 21(2):17-26, 2002.


OFAI-TR-2002-28 ( 218kB PDF file)

A Pathology of Bottom-Up Hill-Climbing in Inductive Rule Learning

Johannes Fürnkranz

In this paper, we close the gap between the simple and straight-forward implementations of top-down hill-climbing that can be found in the literature, and the comparably complex strategies for greedy bottom-up generalization. Our main result is that the simple bottom-up counterpart of the top-down hill-climbing algorithm is unable to learn in domains with comparably dispersed examples. In particular, we show that greedy generalization from a seed example is impossible if it differs from its nearest neighbor in more than one attribute value. We also perform an empirical study of the how frequent this case is in popular benchmark datasets, and present average-case and worst-case results for binary domains.

Keywords: Rule Learning, Hill-Climbing

Citation: Fürnkranz J.: A Pathology of Bottom-Up Hill-Climbing in Inductive Rule Learning. In Proceedings of the 13th International Conference on Algorithmic Learning Theory (ALT-02), Lübeck, Germany. Springer-Verlag 2002.


OFAI-TR-2002-20 ( 212kB PDF file)

Pairwise Classification as an Ensemble Technique

Johannes Fürnkranz

In this paper we investigate the performance of pairwise (or round robin) classification, originally a technique for turning multi-class problems into two-class problems, as a general ensemble technique. In particular, we show that the use of round robin ensembles will also increase the classification performance of decision tree learners, which could directly handle multi-class problems. The performance gain is not as large as for bagging and boosting, but on the other hand round robin ensembles have a clear semantics. Furthermore, we show that the advantage of pairwise classification over direct multi-class classification and one-against-all binarization increases with the number of classes, and that round robin ensembles form an interesting alternative for problems with ordered class values.

Keywords: Class Binarization, Pairwise Classification, Round Robin Learning, Inductive Rule Learning, Ensemble Methods

Citation: Fürnkranz J.: Pairwise Classification as an Ensemble Technique. Proceedings of the 13th European Conference on Machine Learning (ECML'02), Helsinki, Finland.


OFAI-TR-2001-30 ( 80kB g-zipped PostScript file,  2453kB PDF file)

Hyperlink Ensembles: A Case Study in Hypertext Classification

Johannes Fürnkranz

In this paper, we introduce hyperlink ensembles, a novel type of ensemble classifier for classifying hypertext documents. Instead of using the text on a page for deriving features that can be used for training a classifier, we suggest to use portions of texts from all pages that point to the target page. A hyperlink ensemble is formed by obtaining one prediction for each hyperlink that points to a page. These individual predictions for each hyperlink are subsequently combined to a final prediction for the class of the target page. We explore four different ways of combining the individual predictions and four different techniques for identifying relevant text portions. The utility of our approach is demonstrated on a set of Web-pages that relate to Computer Science Departments.

Keywords: web mining, hypertext classification, ensemble techniques, inductive rule learning

Citation: Fürnkranz J.: Hyperlink Ensembles: A Case Study in Hypertext Classification. Information Fusion 3(4):299-312, December 2002, Special Issue on Fusion of Multiple Classifiers.


OFAI-TR-2001-29 ( 417kB g-zipped PostScript file,  644kB PDF file)

User Profiling for the Melvil Knowledge Retrieval System

Johannes Fürnkranz, Christian Holzbaur, Robert Temel

Melvil is an ontology-based knowledge retrieval platform that provides a three-dimensional visualization of search results. The user can tailor the presentation of the search results to her preferences by changing the settings of various parameters on the screen. In this paper, we report on a prototype implementation of a user profiling device that learns to predict appropriate settings for these parameters for the current search results based on previous experiences. In a preliminary study, we evaluated several off-the-shelf machine learning algorithms on parts of the problem. The final implementation required the flexibility of handling both regression and classification problems, being able to deal with set-valued input and output attributes, as well as incorporating Melvil's ontologies for the respective application domain. Thus, we selected a nearest-neighbor approach for the prototype implementation. An evaluation on off-line data collected from several users showed a satisfactory performance.

Citation: Fürnkranz J., Holzbaur C., Temel R.: User Profiling for the Melvil Knowledge Retrieval System. Applied Artificial Intelligence 16(4):243-281, April 2002.


OFAI-TR-2001-18 ( 75kB g-zipped PostScript file,  154kB PDF file)

Round Robin Classification

Johannes Fürnkranz

In this paper, we discuss round robin classification (aka pairwise classification), a technique for handling multi-class problems with binary classifiers by learning one classifier for each pair of classes. We present an empirical evaluation of the method, implemented as a wrapper around the ripper rule learning algorithm, on 20 multi-class datasets from the UCI database repository. Our results show that it is very likely to improve ripper's classification accuracy without having a high risk of decreasing it. More importantly, we give a general theoretical analysis of the complexity of the approach and show that its training effort is below that of the commonly used one-against-all technique. These theoretical results are not restricted to rule learning but are also of interest to other communities where pairwise classification has recently received some attention. Furthermore, we investigate its properties as a general ensemble technique and show that round robin classification with C5.0 may improve C5.0's performance on multi-class problems. However, this improvement does not reach the performance increase of boosting, and a combination of boosting and round robin classification does not produce any gain over conventional boosting. Finally, we show that the performance of round robin classification can be further improved by performing multiple round comparisons, i.e., by integrating it with bagging.

Keywords: pairwise classification, inductive rule learning, multi-class problems, class binarization, ensemble techniques

Citation: Fürnkranz J.: Round Robin Classification. Journal of Machine Learning Research 2:721-747, March 2002.


OFAI-TR-2001-13 ( 50kB g-zipped PostScript file,  206kB PDF file)

An Evaluation of Landmarking Variants

Johannes Fürnkranz, Johann Petrak

Landmarking is a novel technique for data characterization in meta-learning. While conventional approaches typically describe a database with its statistical measurements and properties, landmarking proposes to enrich such a description with quick and easy-to-obtain performance measures of simple learning algorithms. In this paper, we will discuss two novel aspects of landmarking. First, we investigate relative landmarking, which tries to exploit the relative order of the landmark measures instead of their absolute value. Second, we propose to the use of subsampling estimates as a different way for efficiently obtaining landmarks. In general, our results are mostly negative. The most interesting result is a surprisingly simple rule that predicts quite accurately when it is worth to boost decision trees.

Keywords: Meta-Learning, Landmarking, Subsampling

Citation: Fürnkranz J., Petrak J.: An Evaluation of Landmarking Variants. In C. Giraud-Carrier, N. Lavrac, S. Moyle & B. Kavsek (eds.) Proceedings of the ECML/PKDD-01 Workshop Integrating Aspects of Data Mining, Decision Support and Meta-learning, pp.57-68, Freiburg, Germany, 2001.


OFAI-TR-2001-09 ( 88kB g-zipped PostScript file,  238kB PDF file)

Detecting Temporal Change in Event Sequences: An Application to Demographic Data

Hendrik Blockeel, Johannes Fürnkranz, Alexia Prskawetz, Francesco C. Billari

In this paper, we discuss an approach for discovering temporal changes in event sequences, and present first results from a study on demographic data. The data encode characteristic events in a person's life course, such as their birth date, the begin and end dates of their partnerships and marriages, and the birth dates of their children. The goal is to detect significant changes in the chronology of these events over people from different birth cohorts. To solve this problem, we encoded the temporal information in a first-order logic representation, and employed Warmr, an ILP system that discovers association rules in a multi-relational data set, to detect frequent patterns that show significant variance over different birth cohorts. As a case study in multi-relational association rule mining, this work illustrates the flexibility resulting from the use of first-order background knowledge, but also uncovers a number of important issues that hitherto received little attention.

Keywords: Data Mining, Association Rules, Temporal Patterns, Inductive Logic Programming, Demography, Life Course Analysis

Citation: Blockeel H., Fürnkranz J., Prskawetz A., Billari F.: Detecting Temporal Change in Event Sequences: An Application to Demographic Data. In L. De Raedt and A. Siebes (Eds.), Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD-01), Freiburg, Germany. Springer-Verlag 2001.


OFAI-TR-2001-02 ( 57kB g-zipped PostScript file,  130kB PDF file)

Round Robin Rule Learning

Johannes Fürnkranz

In this paper, we discuss a technique for handling multi-class problems with binary classifiers. The idea - learning one classifier for each pair of classes - is known as pairwise classification but - to our knowledge - has not yet been thoroughly investigated in the context of inductive rule learning. We present an empirical evaluation of the method as a wrapper around the Ripper rule learning algorithm on 20 multi-class datasets from the UCI database repository. Our results show that the method is very likely to improve Ripper's classification performance without having a high risk of decreasing it. The size of this improvement is similar to that obtained by boosting C5. In addition, we give a theoretical analysis of the complexity of the approach and show that its training time is within a small constant bound of the training time of the sequential class learning technique that is currently used in Ripper.

Keywords: Rule Learning, Pairwise Classification, Class Binarization

Citation: Fürnkranz J.: Round Robin Rule Learning. In Proceedings of the 18th International Conference on Machine Learning (ICML-01). Williamstown, MA. Morgan Kaufmann, 2001.


OFAI-TR-2001-01 ( 55kB g-zipped PostScript file,  117kB PDF file)

An Evaluation of Grading Classifiers

Alexander K. Seewald, Johannes Fürnkranz

In this paper, we introduce grading, a novel meta-classification scheme. While stacking uses the predictions of the base classifiers as meta-level attributes, we use ``graded'' predictions (i.e., predictions that have been marked as correct or incorrect) as meta-level classes. For each base classifier, one meta classifier is learned whose task is to predict when the base classifier will err. Hence, just like stacking may be viewed as a generalization of voting, grading may be viewed as a generalization of selection by cross-validation and therefore fills a conceptual gap in the space of meta-classification schemes. Grading may also be interpreted as a technique for turning the error-characterizing technique introduced by Bay and Pazzani (2000) into a powerful learning algorithm by resorting to an ensemble of meta-classifiers. Our experimental evaluation shows that this step results in a performance gain that is quite comparable to that achieved by stacking, while both, grading and stacking outperform their simpler counter-parts voting and selection by cross-validation.

Keywords: Machine Learning, Classification, Ensembles

Citation: Seewald A., Fürnkranz J.: An Evaluation of Grading Classifiers. In Advances in Intelligent Data Analysis: Proceedings of the 4th International Symposium (IDA-01). Lisbon, Portugal. Springer-Verlag 2001.


OFAI-TR-2000-31 ( 109kB g-zipped PostScript file,  255kB PDF file)

Machine Learning in Games: A Survey

Johannes Fürnkranz

This paper provides a survey of previously published work on machine learning in game playing. The material is organized around a variety of problems that typically arise in game playing and that can be solved with machine learning methods. This approach, we believe, allows both, researchers in game playing to find appropriate learning techniques for helping to solve their problems as well as machine learning researchers to identify rewarding topics for further research in game-playing domains. The paper covers learning techniques that range from neural networks to decision tree learning in games that range from poker to chess. However, space constraints prevent us from giving detailed introductions to the used learning techniques or games. Overall, we aimed at striking a fair balance between being exhaustive and being exhausting.

Keywords: Machine Learning, Game Playing

Citation: To appear in J. Fürnkranz & M. Kubat (eds.): Machines that Learn to Play Games, Nova Scientific Publishers, Chapter 2, pp. 11--59, Huntington, NY, 2001.


OFAI-TR-2000-30 ( 118kB PDF file)

Timing, Sequencing, and Quantum of Life Course Events: a Machine Learning Approach

Francesco C. Billari, Johannes Fürnkranz, Alexia Prskawetz

In this methodological paper we discuss and apply machine learning techniques, a core research area in the artificial intelligence literature, to analyse simultaneously timing, sequencing, and quantum of life course events from a comparative perspective. We outline the need for techniques which allow the adoption of a holistic approach to the analysis of life courses, illustrating the specific case of the transition to adulthood. We briefly introduce machine learning algorithms to build decision trees and rule sets and then apply such algorithms to delineate the key features which distinguish Austrian and Italian pathways to adulthood, using Fertility and Family Survey data. The key role of sequencing and synchronisation between events emerges clearly from the methodology used.

Keywords: life course, event history, data mining, machine learning, transition to adulthood

Citation: Billari F., Fürnkranz J., Prskawetz A.: Timing, Sequencing, and Quantum of Life Course Events: a Machine Learning Approach. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-30, 2000. Also appeared as Working Paper WP 2000-10, Max-Planck Institute for Demographic Research, Rostock, Germany.


OFAI-TR-2000-02 ( 60kB g-zipped PostScript file,  74kB PDF file)

Searching for Patterns in Political Event Sequences: Experiments with the KEDS Database

Klaus Kovar, Johannes Fürnkranz, Johann Petrak, Bernhard Pfahringer, Robert Trappl, Gerhard Widmer

The paper presents an empirical study on the possibility of discovering interesting event sequences and sequential rules in a large database of international political events. We have implemented and extended a data mining algorithm, first presented by Mannila and Toivonen (1996), which is able to search for generalized episodes in such event databases. Experiments conducted with this algorithm on the KEDS database, an event data set covering interactions between countries in the Persian Gulf region, are described. We report some qualitative and quantitative results and also discuss our experiences with strategies for reducing the problem complexity and focussing the search on interesting subsets of events.

Citation: Kovar, K., Fürnkranz, J., Petrak, J., Pfahringer, B., Trappl, R., and Widmer, G. (2000). Searching for Patterns in Political Event Sequences. Cybernetics and Systems 31(6).


OFAI-TR-99-12 ( 40kB g-zipped PostScript file,  76kB PDF file)

Learning to Make Good Use of Operational Advice

Bernhard Pfahringer, Hermann Kaindl, Stefan Kramer, Johannes Fürnkranz

We address the problem of advice-taking in a given domain, in particular for building a game-playing program. Our approach to solving it strives for the application of machine learning techniques throughout, i.e., for avoiding knowledge elicitation by any other means as much as possible. In particular, we build upon existing work on the operationalization of advice by machine and assume that advice is already available in operational form. The relative importance of this advice is, however, not yet known and can therefore not be utilized well by a program. This paper presents an approach to determine the relative importance for a given situation through reinforcement learning. We implemented this approach for the game of Hearts and gathered some empirical evidence on its usefulness through experiments. The results show that the programs built according to our approach learned to make good use of the given operational advice.

Keywords: Machine Learning, Game Playing, Hearts, Advice Taking, Reinforcement Learning

Citation: Pfahringer B., Kaindl H., Kramer S., Fürnkranz J.: Learning to Make Good Use of Operational Advice. In Proc. ICML-99 WS on Machine Learning in Game Playing, Bled, Slovenia, 1999.


OFAI-TR-98-30 ( 28kB g-zipped PostScript file,  295kB PDF file)

A Study Using n-gram Features for Text Categorization

Johannes Fürnkranz

In this paper, we study the effect of using n-grams (sequences of words of length n) for text categorization. We use an efficient algorithm for generating such n-gram features in two benchmark domains, the 20 newsgroups data set and 21,578 REUTERS newswire articles. Our results with the rule learning algorithm RIPPER indicate that, after the removal of stop words, word sequences of length 2 or 3 are most useful. Using longer sequences reduces classification performance.

Keywords: Machine Learning, Text Categorization

Citation: Fürnkranz J.: A Study Using n-gram Features for Text Categorization, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-30, 1998.


OFAI-TR-98-29 ( 29kB g-zipped PostScript file,  300kB PDF file)

Using Links for Classifying Web-pages

Johannes Fürnkranz

In this paper, we report on a systematic set of experiments that explore the utility of making use of such structural information. Our working hypothesis is that (at least in some domains) it is easier to classify hypertext pages using information provided on pages that point to a page instead of using information that is provided on the page itself. We present a set of experiments that confirm this hypothesis on a set of Web-pages that relate to Computer Science Departments.

Keywords: Text Categorization, Rule Learning, WWW, Hyperlinks

Citation: Fürnkranz J.: Using Links for Classifying Web-pages, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-29, 1998.


OFAI-TR-97-33 ( 55kB g-zipped PostScript file)

Knowledge Discovery in Chess Databases: A Research Proposal

Johannes Fürnkranz

In this paper we argue that chess databases have a significant potential as a test-bed for techniques in the area of Knowledge Discovery in Databases (KDD). Conversely, we think that research in Artificial Intelligence has not yet come up with reasonable solutions for the knowledge representation and reasoning problems that are posed by knowledge-based computer chess programs, and consequently argue that KDD techniques could be useful for the advancement of various types of knowledge-based computer chess systems. Although we cannot present any concrete results, we hope to outline some fruitful directions for further research and exchange of ideas between the KDD and computer chess communities.

Keywords: KDD, Data Mining, Chess, Game Playing

Citation: Fürnkranz J.: Knowledge Discovery in Chess Databases: A Research Proposal, Austrian Research Institute for Artificial Intelligence, Vienna, TR-97-33, 1997.


OFAI-TR-97-26 ( 29kB g-zipped PostScript file)

Dimensionality Reduction in ILP: A Call to Arms

Johannes Fürnkranz

The recent uprise of Knowledge Discovery in Databases (KDD) has underlined the need for machine learning algorithms to be able to tackle large-scale applications that are currently beyond their scope. One way to address this problem is to use technologies for reducing the dimensionality of the learning problem by reducing the hypothesis space and/or reducing the example space. While research in machine learning has devoted considerable attention to such techniques, they have so far been neglected in ILP research. The purpose of this paper is to motivate research in this area and to present some results on windowing techniques.

Citation: Fürnkranz J.: Dimensionality Reduction in ILP: A Call to Arms, Proceedings of the IJCAI-97 Workshop on Frontiers of Inductive Logic Programming, Nagoya, Japan, August 1997


OFAI-TR-97-25 ( 56kB g-zipped PostScript file)

On Effort in AI Research: A Description along Two Dimensions

Franz-Günter Winkler, Johannes Fürnkranz

In this paper we describe Artificial Intelligence as research that moves along two different axes: human-compatible knowledge and machine-compatible processing. An analysis of computer chess research along these dimensions shows that AI more and more diverges into an engineering branch and a cognitive branch. As an explanation, we offer a hypothesis about the dependency of research effort on these dimensions. It becomes obvious that the most rewarding projects are the hardest.

Citation: Winkler F., Fürnkranz J.: On Effort in AI Research: A Description along Two Dimensions, Proceedings of the AAAI-97 Workshop on Deep Blue vs. Kasparov: The Significance for Artificial Intelligence, Providence, R.I., July 1997


OFAI-TR-97-10 ( 64kB g-zipped PostScript file)

Machine Learning and Case-based Reasoning: Their Potential Role in Preventing the Outbreak of Wars or in Ending Them

Robert Trappl, Johannes Fürnkranz, Johann Petrak, Jacob Bercovitch

In a current project we investigate the potential contribution of Artificial Intelligence for the avoidance and termination of crises and wars. This paper reports some results obtained by analyzing international conflict databases using machine learning and case-based reasoning techniques.

Keywords: Machine Learning, Case-Based Reasoning, Data Mining, Knowledge Discovery in Databases, International Relations, Peace

Citation: Trappl R., Fürnkranz J., Petrak J., Bercovitch J.: Machine Learning and Case-based Reasoning: Their Potential Role in Preventing the Outbreak of Wars or in Ending Them, in G. della Riccia, H.-J. Lenz, and R. Kruse (eds.), Learning, Networks and Statistics: Proceedings of the ISSEK-96 Workshop, pp. 209-225, 1997.


OFAI-TR-97-07 ( 55kB g-zipped PostScript file)

Noise-tolerant Windowing

Johannes Fürnkranz

Windowing has been proposed as a procedure for efficient memory use in the ID3 decision tree learning algorithm. However, previous work has shown that it may often lead to a decrease in performance, in particular in noisy domains. Following up on previous work, where we have shown that the ability of separate-and-conquer rule learning algorithms to learn rules independently can be exploited for more efficient windowing procedures, we demonstrate in this paper how this property can be exploited to achieve noise-tolerance in windowing.

Citation: Fürnkranz J.: Noise-tolerant Windowing, in Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97), pp. 852-857, Nagoya, Japan, 1997.


OFAI-TR-97-01 ( 54kB g-zipped PostScript file)

More Efficient Windowing

Johannes Fürnkranz

Windowing has been proposed as a procedure for efficient memory use in the ID3 decision tree learning algorithm. However, previous work has shown that windowing may often lead to a decrease in performance. In this work, we try to argue that separate-and-conquer rule learning algorithms are more appropriate for windowing than divide-and-conquer algorithms, because they learn rules independently and are less susceptible to changes in class distributions. In particular, we will present a new windowing algorithm that achieves additional gains in efficiency by exploiting this property of separate-and-conquer algorithms. While the presented algorithm is only suitable for redundant, noise-free data sets, we will also briefly discuss the problem of noisy data in windowing and present some preliminary ideas how it might be solved with an extension of the algorithm introduced in this paper.

Keywords: Machine Learning, Rule Learning, Efficiency

Citation: Fürnkranz J.: More Efficient Windowing, in Proceedings of the 14th National Conference on Artificial Intelligence (AAAI-97), pp. 509-514, Providence, RI, 1997.


OFAI-TR-96-25 ( 127kB g-zipped PostScript file)

Separate-and-Conquer Rule Learning

Johannes Fürnkranz

This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in Inductive Logic Programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.

Keywords: Machine Learning, Inductive Logic Programming, Rule Learning

Citation: Fürnkranz J.: Separate-and-Conquer Rule Learning, Artificial Intelligence Review 13(1), 1999.


OFAI-TR-96-24 ( 114kB g-zipped PostScript file)

Knowledge Discovery in International Conflict Databases

Johannes Fürnkranz, Johann Petrak, Robert Trappl

Artificial Intelligence is heavily supported by military institutions,while practically no effort goes into the investigation of possible contributions of AI to the avoidance and termination of crises and wars. This paper makes a first step into this direction by investigating the use of machine learning techniques for discovering knowledge in international conflict and conflict management databases. We have applied similarity-based case retrieval to the KOSIMO database of international conflicts. Furthermore, we present results of analyzing the CONFMAN database of successful and unsuccessful conflict management attempts with an inductive decision tree learning algorithm. The latter approach seems to be particularly promising, as conflict management events apparently are more repetitive and thus better suited for machine-aided analysis.

Keywords: Machine Learning, Data Mining, Knowledge Discovery, Peace

Citation: Fürnkranz J., Petrak J., Trappl R.: Knowledge Discovery in International Conflict Databases, Applied Artificial Intelligence 11(2):91-118, March 1997.


OFAI-TR-96-11 ( 57kB g-zipped PostScript file)

Machine Learning in Computer Chess: The Next Generation

Johannes Fürnkranz

Ten years ago the ICCA Journal published an overview of machine learning approaches to computer chess (Skiena,1986). The author's results were rather pessimistic. In particular he concludes that ``with the exception of rote learning in the opening book few results have trickled into competitive programs'' and that ``there appear no research projects on the horizon which offer reason for optimism''. In this paper we will update Skiena's work with research that has been conducted in this area since the publication of his paper. By doing so we hope to show that at least Skiena's second conclusion is no longer valid.

Citation: Fürnkranz J.: Machine Learning in Computer Chess: The Next Generation, International Computer Chess Association Journal 19(3):147-161, September 1996.


OFAI-TR-96-10 ( 54kB g-zipped PostScript file)

Digging for Peace: Using Machine Learning Methods for Assessing International Conflict Databases

Robert Trappl, Johannes Fürnkranz, Johann Petrak

In the last decade research in Machine Learning has developed a variety of powerful tools for inductive learning and data analysis. On the other hand, research in International Relations has developed a variety of different conflict databases that are mostly analyzed with classical statistical methods. As these databases are in general of a symbolic nature, they provide an interesting domain for application of Machine Learning algorithms. This paper gives a short overview of available conflict databases and subsequently concentrates on the application of machine learning methods for the analysis and interpretation of such databases.

Citation: Trappl R., Fürnkranz J., Petrak J.: Digging for Peace: Using Machine Learning Methods for Assessing International Conflict Databases, Proc. of the 12th European Conference on Artificial Intelligence (ECAI-96), 1996.


OFAI-TR-96-07 ( 111kB g-zipped PostScript file)

Pruning Algorithms for Rule Learning

Johannes Fürnkranz

Pre-Pruning and Post-Pruning are two standard methods of dealing with noise in decision tree learning. Pre-Pruning methods deal with noise during learning, while post-pruning methods try to address this problem after an overfitting theory has been learned. This paper shows how pre- and post-pruning algorithms can be used for separate-and-conquer rule learning algorithms. We discuss some fundamental problems and show how to solve them with two new algorithms that combine and integrate pre- and post-pruning.

Citation: Fürnkranz J.: Pruning Algorithms for Rule Learning, Machine Learning 27(2):139-171, May 1997.


OFAI-TR-95-03 ( 68kB g-zipped PostScript file)

A Tight Integration of Pruning and Learning

Johannes Fürnkranz

This paper outlines some problems that may occur with Reduced Error Pruning in rule learning algorithms. In particular we show that pruning complete theories is incompatible with the separate-and-conquer learning strategy that is commonly used in propositional and relational rule learning systems. As a solution we propose to integrate pruning into learning and examine two algorithms, one that prunes at the clause level and one that prunes at the literal level. Experiments show that these methods are not only much more efficient, but also able to achieve small gains in accuracy by solving the outlined problem.

Citation: Fürnkranz J.: A Tight Integration of Pruning and Learning, in N. Lavrac and S. Wrobel (eds.), Proceedings of the 8th European Conference on Machine Learning (ECML-95), pp. 291-294, Crete, Greece, 1995.


OFAI-TR-94-33 ( 91kB g-zipped PostScript file)

Machine Learning Methods for International Conflict Databases: A Case Study in Predicting Mediation Outcome

Johannes Fürnkranz, Johann Petrak, Robert Trappl, Jacob Bercovitch

This paper tries to identify rules and factors that are predictive for the outcome of international conflict management attempts. We use C4.5, an advanced Machine Learning algorithm, for generating decision trees and prediction rules from cases in the CONFMAN database. The results show that simple patterns and rules are often not only more understandable, but also more reliable than complex rules. Simple decision trees are able to improve the chances of correctly predicting the outcome of a conflict management attempt. This suggests that mediation is more repetitive than conflicts per se, where such results have not been achieved so far.

Citation: Fürnkranz J., Petrak J., Trappl R., Bercovitch J.: Machine Learning Methods for International Conflict Databases: A Case Study in Predicting Mediation Outcome , Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-33, 1994.


OFAI-TR-94-32 ( 104kB g-zipped PostScript file)

The Possible Contribution of AI to the Avoidance of Crises and Wars: Using CBR Methods with the KOSIMO Data Base of Conflicts

Johann Petrak, Robert Trappl, Johannes Fürnkranz

This paper presents the application of Case-Based Reasoning methods to the KOSIMO data base of international conflicts. A Case-Based Reasoning tool - VIE-CBR - has been deveolped and used for the classification of various outcome variables, like political, military, and territorial outcome, solution modalities, and conflict intensity. In addition, the case retrieval algorithms are presented as an interactive, user-modifiable tool for intelligently searching the conflict data base for precedent cases.

Citation: Petrak J., Trappl R., Fürnkranz J.: The Possible Contribution of AI to the Avoidance of Crises and Wars: Using CBR Methods with the KOSIMO Data Base of Conflicts, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-32, 1994.


OFAI-TR-94-28 ( 261kB g-zipped PostScript file)

Efficient Pruning Methods for Relational Learning

Johannes Fürnkranz

This thesis is concerned with efficient methods for achieving noise-tolerance in Machine Learning algorithms that are capable of using relational background knowledge. While classical algorithms are restricted to learn propositional concepts in the form of decision trees or decision lists, relational learning algorithms are able to include into the learning process not only knowledge about data attributes and values, but also about relations between the attributes. As these algorithms use a more powerful representation language --- they learn PROLOG programs for classification --- they are part of the recent field of Inductive Logic Programming, a new research area at the intersection of Machine Learning and Logic Programming. In this work we first review several known methods for achieving noise-tolerance and put them into a unified framework and then introduce three new and improved algorithms. The two basic approaches to pruning are either to try to recognize noise in the data during the learning process (pre-pruning) or to first learn a theory from the data as they are and subsequently try to detect and correct the resulting mistakes in this theory (post-pruning). Both approaches having their advantages, the major part of this thesis is devoted to trying to combine and integrate them into new powerful algorithms. A series of experiments with artificial and natural data sets demonstrates the usefulness of these approaches.

Citation: Fürnkranz J.: Efficient Pruning Methods for Relational Learning, Ph.D. Thesis, Technical University of Vienna, Austria, November 1994.


OFAI-TR-94-26 ( 58kB g-zipped PostScript file)

Pruning Methods for Rule Learning Algorithms

Johannes Fürnkranz

In this paper we will shortly review several pruning methods for relational learning algorithms and show how they are related to each other. We then report some experiments in several natural domains and try to analyse the performance of the algorithms in these domains in terms of run-time and accuracy. While some algorithms are clearly faster than others, no safe recommendation for achieving high accuracy can be given.

Citation: Fürnkranz J.: Pruning Methods for Rule Learning Algorithms, Proc. 4th Intl. Inductive Logic Programming Workshop, ILP-94, pp. 321-336, Bad Honnef/Bonn, 1994.


OFAI-TR-94-16 ( 50kB g-zipped PostScript file)

A Comparison of Pruning Methods for Relational Concept Learning

Johannes Fürnkranz

Pre-Pruning and Post-Pruning are two standard methods of dealing with noise in concept learning. Pre-Pruning methods are very efficient, while Post-Pruning methods typically are more accurate, but much slower, because they have to generate an overly specific concept description first. We have experimented with a variety of pruning methods, including two new methods that try to combine and integrate pre- and post-pruning in order to achieve both accuracy and efficiency. This is verified with test series in a chess position classification task.

Citation: Fürnkranz J.: A Comparison of Pruning Methods for Relational Concept Learning, Proc. AAAI-94 Workshop on Knowledge Discovery in Databases, Seattle, WA, 1994.


OFAI-TR-94-09 ( 61kB g-zipped PostScript file)

Incremental Reduced Error Pruning

Johannes Fürnkranz, Gerhard Widmer

This paper outlines some problems that may occur with Reduced Error Pruning in Inductive Logic Programming, most notably efficiency. Thereafter a new method, Incremental Reduced Error Pruning, is proposed that attempts to address all of these problems. Experiments show that in many noisy domains this method is much more efficient than alternative algorithms, along with a slight gain in accuracy. However, the experiments show as well that the use of this algorithm cannot be recommended for domains with a very specific concept description.

Citation: Fürnkranz J., Widmer G.: Incremental Reduced Error Pruning, Proc. 11th Intl. Conf. on Machine Learning, ML-94, pp. 70-77, Rutgers University, NJ, 1994.


OFAI-TR-94-03 ( 49kB g-zipped PostScript file)

Top-Down Pruning in Relational Learning

Johannes Fürnkranz

Pruning is an effective method for dealing with noise in Machine Learning. Recently pruning algorithms, in particular Reduced Error Pruning, have also attracted interest in the field of Inductive Logic Programming. However, it has been shown that these methods can be very inefficient, because most of the time is wasted for generating clauses that explain noisy examples and subsequently pruning these clauses. We introduce a new method which searches for good theories in a top-down fashion to get a better starting point for the pruning algorithm. Experiments show that this approach can significantly lower the complexity of the task as well as increase predictive accuracy.

Citation: Fürnkranz J.: Top-Down Pruning in Relational Learning, Proc. 11th European Conf. on Artificial Intelligence, ECAI-94, pp. 453-457, Amsterdam, John Wiley and Sons, 1994.


OFAI-TR-93-28 ( 88kB g-zipped PostScript file)

FOSSIL: A robust relational learner

Johannes Fürnkranz

The research reported in this paper describes FOSSIL, an ILP system that uses a search heuristic based on statistical correlation. Several interesting properties of this heuristic are discussed, and a it is shown how it naturally can be extended with a simple, but powerful stopping criterion that is independent of the number of training examples. Instead, FOSSIL's stopping criterion depends on a search heuristic that estimates the utility of literals on a uniform scale. After a comparison with FOIL and mFOIL in the KRK domain and on the mesh data, we outline some ideas how FOSSIL can be adopted for top-down pruning and present some preliminary results.

Citation: Fürnkranz J.: FOSSIL: A robust relational learner, An extended version of the paper in Proc. European Conf. on Machine Learning, ECML-94, pp. 122-137, Catania, Italy, Springer-Verlag, 1994.


OFAI-TR-93-16 ( 50kB g-zipped PostScript file)

Avoiding Noise Fitting in a FOIL-like Learning Algorithm

Johannes Fürnkranz

The research reported in this paper describes FOSSIL, an ILP system that uses a search heuristic based on statistical correlation. This algorithm implements a new method for learning useful concepts in the presence of noise. In contrast to FOIL's stopping criterion which allows theories to grow in complexity as the size of the training sets increase, we propose a new stopping criterion that is independent of the number of training examples. Instead, FOSSIL's stopping criterion depends on a search heuristic that estimates the utility of literals on a uniform scale.

Citation: Fürnkranz J.: Avoiding Noise Fitting in a FOIL-like Learning Algorithm, Proc. IJCAI-93 workshop on Inductive Logic Programming, Chambery, France, August 1993.


OFAI-TR-93-09 ( 128kB g-zipped PostScript file)

The Role of Qualitative Knowledge in Machine Learning

Johannes Fürnkranz

This paper analyzes the important role qualitative knowledge plays in Machine Learning. For this purpose important results from research in the fields approximate theory formation, automated qualitative modeling, learning in plausible domain theories and learning with abstractions are reviewed. The analysis of these approaches shows several of the benefits the use of qualitative knowledge can bring to Machine Learning and also points out important problems that have to be dealt with. The need for qualitative knowledge to keep learning tractable is illustrated with examples from the domain of chess. Finally we make some suggestions for further research based on the shortcomings of previous approaches.

Citation: Fürnkranz J.: The Role of Qualitative Knowledge in Machine Learning, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-09, 1993.