Class for evaluating machine learning models.
-------------------------------------------------------------------
General options when evaluating a learning scheme from the command-line:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing, a cross-validation is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-no-cv
No cross-validation. If no test file is provided, no evaluation is done.
-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.
-preserve-order
Preserves the order in the percentage split instead of randomizing the data first with the seed value ('-s').
-s seed
Random number seed for the cross-validation and percentage split (default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file. In case the filename ends with ".xml", a PMML file is loaded or, if that fails, options are loaded from XML.
-d filename
Saves the classifier built from the training data into the given file. If the filename ends with ".xml", the options are saved as XML, not the model.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-classifications "weka.classifiers.evaluation.output.prediction.AbstractOutput + options"
Uses the specified class for generating the classification output. E.g.: weka.classifiers.evaluation.output.prediction.PlainText or weka.classifiers.evaluation.output.prediction.CSV
-p range
Outputs predictions for test instances (or the train instances if no test instances are provided and -no-cv is used), along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.
Deprecated: use "-classifications ..." instead.
-distribution
Outputs the distribution instead of only the prediction in conjunction with the '-p' option (only nominal classes).
Deprecated: use "-classifications ..." instead.
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable". Outputs the graph representation of the classifier (and nothing else).
-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.
-threshold-file file
The file to save the threshold data to. The format is determined by the file extension, e.g., '.arff' for ARFF format or '.csv' for CSV.
-threshold-label label
The class label to determine the threshold data for (default is the first label).
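The interaction of '-split-percentage', '-preserve-order' and '-s' described above can be sketched in plain Java. This is a minimal illustration and not Weka's implementation: the class PercentageSplitSketch, its Split holder, and the rounding of the training-set size are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, NOT part of Weka: illustrates a seeded percentage split.
class PercentageSplitSketch {

    /** Holds the two portions of a split. */
    static final class Split<T> {
        final List<T> train;
        final List<T> test;

        Split(List<T> train, List<T> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Splits the data into a training and a test portion.
     *
     * @param data          the instances to split
     * @param percentage    share of instances used for training, e.g. 66
     * @param seed          seed for the shuffle (unused if preserveOrder is true)
     * @param preserveOrder if true, split the data in its original order
     */
    static <T> Split<T> split(List<T> data, double percentage, long seed,
                              boolean preserveOrder) {
        if (percentage <= 0 || percentage >= 100) {
            throw new IllegalArgumentException("percentage must be in (0, 100)");
        }
        // Work on a copy so the caller's list is never reordered.
        List<T> copy = new ArrayList<>(data);
        if (!preserveOrder) {
            // The same seed always yields the same shuffle, hence the same split.
            Collections.shuffle(copy, new Random(seed));
        }
        // Assumption of this sketch: the training size is rounded to the nearest integer.
        int trainSize = (int) Math.round(copy.size() * percentage / 100.0);
        List<T> train = new ArrayList<>(copy.subList(0, trainSize));
        List<T> test = new ArrayList<>(copy.subList(trainSize, copy.size()));
        return new Split<>(train, test);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            data.add(i);
        }
        Split<Integer> s = split(data, 66, 1, false);
        // prints "train=66 test=34"
        System.out.println("train=" + s.train.size() + " test=" + s.test.size());
    }
}
```

With preserveOrder set to true the seed has no effect, which mirrors the way '-preserve-order' is described as bypassing the randomization controlled by '-s'.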
-------------------------------------------------------------------
Example usage as the main of a classifier (called FunkyClassifier):
public static void main(String[] args) {
  runClassifier(new FunkyClassifier(), args);
}
------------------------------------------------------------------
Example usage from within an application:
Instances trainInstances = ... instances got from somewhere
Instances testInstances = ... instances got from somewhere
Classifier scheme = ... scheme got from somewhere
Evaluation evaluation = new Evaluation(trainInstances);
evaluation.evaluateModel(scheme, testInstances);
System.out.println(evaluation.toSummaryString());
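To make concrete the kind of statistics such an evaluation accumulates (overall accuracy, plus per-class precision and recall of the sort the '-i' option reports), here is a self-contained sketch that does not depend on Weka. The class EvaluationSketch and all of its methods are hypothetical stand-ins written for this example; they do not reproduce the actual Evaluation API or its output format.

```java
// Hypothetical, simplified stand-in for an evaluation object; NOT Weka's Evaluation.
class EvaluationSketch {

    // confusion[actual][predicted] counts how often each pair was observed.
    private final int[][] confusion;
    private int total;

    EvaluationSketch(int numClasses) {
        confusion = new int[numClasses][numClasses];
    }

    /** Records one test instance with its actual and predicted class index. */
    void record(int actual, int predicted) {
        confusion[actual][predicted]++;
        total++;
    }

    /** Fraction of correctly classified instances (NaN if nothing was recorded). */
    double accuracy() {
        if (total == 0) {
            return Double.NaN;
        }
        int correct = 0;
        for (int i = 0; i < confusion.length; i++) {
            correct += confusion[i][i];
        }
        return (double) correct / total;
    }

    /** Of all instances predicted as class c, the fraction that really are c. */
    double precision(int c) {
        int predictedAsC = 0;
        for (int[] row : confusion) {
            predictedAsC += row[c];
        }
        return predictedAsC == 0 ? Double.NaN : (double) confusion[c][c] / predictedAsC;
    }

    /** Of all instances that really are class c, the fraction predicted as c. */
    double recall(int c) {
        int actuallyC = 0;
        for (int count : confusion[c]) {
            actuallyC += count;
        }
        return actuallyC == 0 ? Double.NaN : (double) confusion[c][c] / actuallyC;
    }

    public static void main(String[] args) {
        int[] actual = {0, 0, 1, 1, 1};
        int[] predicted = {0, 1, 1, 1, 0};
        EvaluationSketch evaluation = new EvaluationSketch(2);
        for (int i = 0; i < actual.length; i++) {
            evaluation.record(actual[i], predicted[i]);
        }
        System.out.println("accuracy = " + evaluation.accuracy());
        System.out.println("precision(1) = " + evaluation.precision(1));
        System.out.println("recall(1) = " + evaluation.recall(1));
    }
}
```

In this toy run three of the five predictions are correct, so the accuracy is 3/5, and class 1 has a precision and a recall of 2/3 each.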
@author Eibe Frank (eibe@cs.waikato.ac.nz)
@author Len Trigg (trigg@cs.waikato.ac.nz)
@version $Revision: 7228 $