OFAI

METAL - Miscellaneous Software

On this page you will find miscellaneous software that was used in the METAL Project. It can be used together with the METAL-MLEE software package to repeat the experiments that produced the meta-data, or to obtain meta-data for new databases. See the Data Mining Advisor web page for more information on the outcome of the project.

NOTE: Precompiled binaries of the programs are provided for convenience. We were careful to avoid potential security and safety problems such as viruses, but we do not take any responsibility for any damage that might occur. To be sure, please download the source code, inspect it, and compile it yourself.


Base-level Classification Algorithm

This program builds a trivial classification model that always assigns a single class label: either the most frequent class label encountered in the training data, or the first class label listed in the "names" file for the database (in effect, an arbitrary label). It can be used to create the class label files and the timing information needed to compare more advanced learning algorithms against a reasonable "minimum" or "base-level" algorithm.

Usage

baseclearn {-t|-e} -f filestem -l N -m modelfile -p predictionfile -n N [-v] [-h]

  • -t: Run the training phase - a model file to be created must be specified with the -m flag.
  • -e: Run the evaluation phase - a model file to be used must be specified with the -m flag and a prediction file to be created with the -p flag.
  • -f filestem: The filestem (full path without the extension) of the database files: for training, the files named <filestem>.names and <filestem>.data will be used; for evaluation, the files named <filestem>.names and <filestem>.test.
  • -l N: N=1: use the first label encountered in the .names file. N=2: use the most frequent label encountered in the .data file (in case of a tie, the most frequent class that is listed first in the .names file is picked).
  • -m modelfile: The full path of the model file to be created or used.
  • -p predictionfile: The full path of the prediction file to be created.
  • -n N: The maximum number of cases to be read during the training phase. The value 0 means no limit.
  • -v: Show additional messages on standard error.
  • -h: Show usage info and exit.
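
The label selection described for the -l and -n options can be sketched as follows. This is only an illustration in Python, not the baseclearn source code: the function name pick_base_label and its parameters are hypothetical, and the labels are assumed to have already been read from the .names and .data files.

```python
from collections import Counter


def pick_base_label(names_order, data_labels, mode, max_cases=0):
    """Return the single label a base-level model would always predict.

    names_order: class labels in the order listed in the .names file
    data_labels: class labels of the training cases, in file order
    mode:        1 = first label in names_order,
                 2 = most frequent label among the training cases
    max_cases:   use at most this many training cases; 0 means no limit
    """
    if mode == 1:
        return names_order[0]
    if mode != 2:
        raise ValueError("mode must be 1 or 2")
    if max_cases > 0:
        data_labels = data_labels[:max_cases]
    counts = Counter(data_labels)
    # max() returns the first maximal element, so iterating in .names order
    # resolves ties in favour of the label that is listed first.
    return max(names_order, key=lambda label: counts[label])


names = ["yes", "no", "maybe"]
labels = ["no", "maybe", "no", "maybe", "yes"]
print(pick_base_label(names, labels, mode=1))  # yes
print(pick_base_label(names, labels, mode=2))  # no ("no" and "maybe" tie; "no" is listed first)
```

Iterating over the labels in .names order, rather than over the counts, is what makes the tie-breaking rule fall out without extra code.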

Download

Source code: baseclearn-src.tgz (~4K)
Optional precompiled binaries for x86/linux: baseclearn (~424K)
Optional precompiled binaries for x86/win32+Cygwin: baseclearn.exe (~124K)
Optional precompiled binaries for Sparc/Solaris2.6: baseclearn (~154K)

License

This software is distributed under the terms of the GNU General Public License.


c5test

This program is a modification for use with METAL-MLEE of the free programs for reading and interpreting C5.0 models as available from the Rulequest download page.

Usage

c5test -f filestem [-r] [-b] -p predictionfile

  • -f filestem: The filestem (full path without the extension) of the database files: for training, the files named <filestem>.names and <filestem>.data will be used; for evaluation, the files named <filestem>.names and <filestem>.test.
  • -r: use rulesets instead of decision trees
  • -b: invoke boosting if possible
  • -p predictionfile: The full path of the prediction file to be created.
The program has been modified to expect a file with the extension .test instead of .cases, accept the additional option -p and output the average error of prediction for the cases processed. Also, the output format of the prediction file has been changed to just contain the predicted class label of each case in each line.
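
Because the prediction file holds one predicted class label per line, it is easy to post-process. The following Python sketch is not part of c5test; it assumes that "average error" means the fraction of misclassified cases and that the true labels are already available as a list. The function names read_predictions and error_rate are hypothetical.

```python
def read_predictions(path):
    """Read one predicted class label per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def error_rate(predicted, actual):
    """Fraction of cases whose predicted label differs from the true label.

    Assumption: this is what "average error" refers to.
    """
    if len(predicted) != len(actual):
        raise ValueError("prediction and reference lists differ in length")
    if not predicted:
        raise ValueError("no cases to evaluate")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(predicted)


# One of the four cases is misclassified.
print(error_rate(["a", "b", "a", "a"], ["a", "b", "b", "a"]))  # 0.25
```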

Download

Source code: c5test-src.tgz (~11K)
Optional precompiled binaries for x86/linux: c5test (~29K)
Optional precompiled binaries for x86/win32+Cygwin: c5test.exe (~35K)
Optional precompiled binaries for Sparc/Solaris2.6: c5test (~62K)

License

This software is distributed as public domain software.


cubist_test

This program is a modification for use with METAL-MLEE of the free programs for reading and interpreting Cubist models as available from the Rulequest download page.

Usage

cubist_test -f filestem -p predictionfile

  • -f filestem: The filestem (full path without the extension) of the database files: for training, the files named <filestem>.names and <filestem>.data will be used; for evaluation, the files named <filestem>.names and <filestem>.test.
  • -p predictionfile: The full path of the prediction file to be created.
The program has been modified to expect a file with the extension .test instead of .cases, accept the additional option -p and output the average absolute error and the mean squared error of prediction for the cases processed. Also, the output format of the prediction file has been changed to just contain the predicted value for each case in each line.
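
The two error measures mentioned above can be recomputed from a prediction file with a few lines of code. This Python sketch is not taken from cubist_test; the function name regression_errors is hypothetical, and the predicted and true target values are assumed to have been parsed into lists of numbers already.

```python
def regression_errors(predicted, actual):
    """Return (average absolute error, mean squared error) for paired values."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and reference lists differ in length")
    if not predicted:
        raise ValueError("no cases to evaluate")
    diffs = [p - a for p, a in zip(predicted, actual)]
    n = len(diffs)
    mean_abs = sum(abs(d) for d in diffs) / n
    mean_sq = sum(d * d for d in diffs) / n
    return mean_abs, mean_sq


# Differences are 0, -1 and 2: average absolute error 1.0, mean squared error 5/3.
mean_abs, mean_sq = regression_errors([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
print(mean_abs, mean_sq)
```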

Download

Source code: cubist_test-src.tgz (~24K)
Optional precompiled binaries for x86/linux: cubist_test (~109K)
Optional precompiled binaries for x86/win32+Cygwin: cubist_test.exe (~110K)
Optional precompiled binaries for Sparc/Solaris2.6: cubist_test (~76K)

License

This software is distributed as public domain software.


fselQ1

This program performs a simple class-aware feature selection by analyzing the maximum differences of the variances of field values when grouped by class (see ...). It is based on the free programs for reading and interpreting See5 models as available from the Rulequest download page.

Usage

fselQ1 -f filestem -o resultfile [-s signlvl] [-v] [-d] [-h]

  • -f filestem: The filestem (full path without the extension) of the database files <filestem>.names and <filestem>.data
  • -o resultfile: The full path of the file where the numbers of the selected attributes will be written.
  • -s signlvl: The significance level (in standard deviations) that must be exceeded to include an attribute (default: 2.0).

Download

Source code: fselQ1-src.tgz (~11K)
Optional precompiled binaries for x86/linux: fselQ1 (~61K)
Optional precompiled binaries for x86/win32+Cygwin: fselQ1.exe (~63K)
Optional precompiled binaries for Sparc/Solaris2.6: fselQ1 (~87K)

License

This software is distributed as public domain software.


atrib_list

This program lists the numbers of the attributes that occur in a decision tree model generated by C5.0. It is a modification for use with METAL-MLEE of the free programs for reading and interpreting C5.0 models as available from the Rulequest download page.

Usage

atrib_list -f filestem [-r] [-b]

  • -f filestem: The filestem (full path without the extension) of the database files

Download

Source code: atrib_list-src.tgz (~11K)
Optional precompiled binaries for x86/linux: atrib_list (~31K)
Optional precompiled binaries for x86/win32+Cygwin: atrib_list.exe (~36K)
Optional precompiled binaries for Sparc/Solaris2.6: atrib_list (~33K)

License

This software is distributed as public domain software.


Last modified: Thu Oct 17 23:20:51 CEST 2002