OFAI

c4.5 - ofai

What it is

c4.5-ofai is a modification of Ross Quinlan's widely used program c4.5, which is available from Ross Quinlan's homepage.

The license and the rules for redistribution of c4.5 seem to have changed over time. Therefore I do NOT distribute a modified version of c4.5, but rather a patch that should be applied to Release 8 of the original c4.5 sources.

Features

  • Additional options for c4.5 and c4.5rules. The additional option -h shows a short help text listing all available options and their default settings.
  • Can compile and run with Cygwin (and possibly VC) under Windows.
  • Representation of missing values for continuous attributes changed from -999 to 1.1E-38
  • Options to suppress the output of trees (which speeds up finalization for large trees), suppress pruning completely, learn a class attribute other than the last one, store the binary tree representations in differently named files, control the treatment of missing values in the class label, and more.
  • New program bconsult: after learning a tree with c4.5, this program can be used to create a file that contains all the predictions for a test file, based on either the pruned or unpruned tree.
  • New program bconsultr: after learning rules with c4.5rules, this program can be used to create a file that contains all the predictions for a test file for these rules.
  • New program c4.5showtree: prints the full generated pruned or unpruned tree (c4.5 itself splits large trees and prints the subtrees separately) with different levels of detail, optionally including instance weights.
  • New program c4.5showrules: to print the rules generated with c4.5rules.
  • Class confusion matrices will print zero values as 0 instead of blank to simplify automatic parsing.

Download and Installation

Version of patch: 1.1 (2003-03-15)

  • Download the patch in the original format (318k) or gzip-compressed format (62k) or ZIP-compressed format (62k) and uncompress it if necessary.
  • Download the original c4.5-Release 8 distribution from Ross Quinlan's homepage and extract it to some directory.
  • Change to the subdirectory of the c4.5 Release 8 package that contains the source files, most likely "Src".
  • Apply the patch with the command
    patch -p0 < path_to_patch_file
  • Edit the Makefile and change things as needed, e.g. change the destination directory for the binaries.
  • Build the binaries with the command make.
  • As a user with sufficient access rights to write into the destination directory for the binaries, install them with the command make install.
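
The patch step above can be wrapped in a small shell helper. This is only a sketch under assumptions: the function name apply_patch and all paths in the usage comment are hypothetical, and the handling of compressed patch files assumes that gzip or unzip is installed. It uses nothing beyond the patch -p0 command given above.

```shell
# apply_patch SRC_DIR PATCH_FILE
# Hypothetical helper: applies PATCH_FILE inside SRC_DIR with "patch -p0".
# The patch file may be plain, gzip-compressed (*.gz) or ZIP-compressed (*.zip).
apply_patch() {
    src_dir=$1
    patch_file=$2

    [ -d "$src_dir" ] || { echo "no such directory: $src_dir" >&2; return 1; }
    [ -f "$patch_file" ] || { echo "no such patch file: $patch_file" >&2; return 1; }

    # Pick a command that writes the uncompressed patch to standard output.
    case $patch_file in
        *.gz)  reader="gzip -dc" ;;
        *.zip) reader="unzip -p" ;;
        *)     reader="cat" ;;
    esac

    # The reader runs in the current directory, so a relative PATCH_FILE works;
    # only the subshell changes into the source directory before patching.
    $reader "$patch_file" | ( cd "$src_dir" && patch -p0 )
}

# Example usage (hypothetical paths, adjust to your setup):
#   apply_patch c4.5/Src /tmp/c4.5-ofai.patch
#   (cd c4.5/Src && make && make install)
```

Because the patch is read from standard input, the helper never modifies the downloaded patch file itself, and it returns a non-zero status if the directory or the patch file is missing.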

License

The patch is licensed under the GNU General Public License. This applies only to the patch, not to c4.5 or to the modified c4.5 generated by applying it; for information on the license for c4.5, please ask its author. If you change the code from the patch to correct bugs or add other features, I would be glad to hear from you.