Step #1 - overfitting:
----------------------

Evaluate the J48 classifier on the glass.arff dataset using the same
learning set as your test set (the "Use training set" option in WEKA).
You get a classification accuracy of ___ and a classification error of
just ___.

Now try the same classifier on the same data, but this time evaluate
using a 66/34 train/test split (the "Percentage split (66%)" option in
WEKA). What happened? The classification accuracy dropped to ___ and
the classification error increased to ___.

Try the same experiment using 10-fold cross-validation (WEKA's default
evaluation method). Now you get a classification accuracy of ___ and a
classification error of ___.

Why such a difference in accuracies/errors between the different
evaluation methods? (If you prefer to script the three evaluations
instead of clicking through the Explorer, see Sketch #1 in the
appendix below.)

Step #2 - train/test split and randomization:
---------------------------------------------

When WEKA splits the data into training and test sets -- as in the
"Percentage split" and "Cross-validation" cases -- the data is first
randomized by default. WEKA uses ___ stratified sampling when
selecting the training/test samples from the datasets. This default
randomization can also be turned off if need be.

Now run the J48 classifier on the iris.arff dataset using a 66/34
train/test split. You get a classification accuracy of ___ and a
classification error of just ___. Randomization was switched on in
this case.

Now try switching the randomization off ("More options...", check
"Preserve order for % Split"). Notice that this time you get a
classification accuracy of just ___ and a classification error of ___.
The classification accuracy dropped almost to 0. Why's that? (Sketch
#2 in the appendix below reproduces both splits in code.)

Step #3 - use the Experimenter:
-------------------------------

You performed the statistical comparison test as instructed in the
Assignment. Now it's time to look at the results table. Looking at
this table, you find that (with 95% statistical significance):

- J48 has significantly higher accuracy than ZeroR on ___ out of ___
  datasets;
- J48 has significantly higher accuracy than OneR on ___ out of ___
  datasets;
- although on the ___ dataset the 3 classifiers have very different
  classification accuracies, the difference is not statistically
  significant, because of the high ___ of the results;
- on the ___ dataset all 3 classifiers perform the same -
  classification accuracy is ___, standard deviation is ___.

(Sketch #3 in the appendix below shows the significance test behind
this table.)

THE END :-)
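
Appendix - scripting the same steps (optional):
-----------------------------------------------

Sketch #1 - the three evaluation modes of Step #1. This is a minimal
sketch assuming WEKA's standard Java API (weka.jar on the classpath)
and glass.arff in the working directory; the class name and seed are
illustrative, not part of the assignment.

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class Step1 {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("glass.arff");
            data.setClassIndex(data.numAttributes() - 1); // class = last attribute

            // (a) "Use training set": test on the very data we trained on.
            J48 tree = new J48();
            tree.buildClassifier(data);
            Evaluation onTrain = new Evaluation(data);
            onTrain.evaluateModel(tree, data);
            System.out.printf("train-set accuracy: %.2f%%%n", onTrain.pctCorrect());

            // (b) "Percentage split (66%)": shuffle, then hold out the last 34%.
            Instances shuffled = new Instances(data);
            shuffled.randomize(new Random(1));            // seed 1, as in the Explorer
            int trainSize = (int) Math.round(shuffled.numInstances() * 0.66);
            Instances train = new Instances(shuffled, 0, trainSize);
            Instances test  = new Instances(shuffled, trainSize,
                                            shuffled.numInstances() - trainSize);
            J48 tree2 = new J48();
            tree2.buildClassifier(train);
            Evaluation onSplit = new Evaluation(train);
            onSplit.evaluateModel(tree2, test);
            System.out.printf("66/34 split accuracy: %.2f%%%n", onSplit.pctCorrect());

            // (c) 10-fold cross-validation (WEKA's default).
            Evaluation cv = new Evaluation(data);
            cv.crossValidateModel(new J48(), data, 10, new Random(1));
            System.out.printf("10-fold CV accuracy: %.2f%%%n", cv.pctCorrect());
        }
    }

Mode (a) gives the optimistic, overfitted estimate; (b) and (c) test
on data the tree has never seen, which is why they come out lower.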
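
Sketch #2 - why "Preserve order for % Split" hurts on iris.arff. A
minimal sketch under the same assumptions as Sketch #1. The key fact:
iris.arff ships sorted by class (50 setosa, then 50 versicolor, then
50 virginica), so an unshuffled 66/34 split trains on the first two
classes and tests almost exclusively on the third.

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class Step2 {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("iris.arff");
            data.setClassIndex(data.numAttributes() - 1);
            int trainSize = (int) Math.round(data.numInstances() * 0.66);

            // Randomization ON (the default): shuffle before splitting.
            Instances shuffled = new Instances(data);
            shuffled.randomize(new Random(1));
            System.out.printf("shuffled:  %.2f%%%n",
                              splitAndEvaluate(shuffled, trainSize));

            // Randomization OFF ("Preserve order for % Split"): file order
            // is kept, so the test block is dominated by a class the tree
            // never saw during training.
            System.out.printf("preserved: %.2f%%%n",
                              splitAndEvaluate(data, trainSize));
        }

        static double splitAndEvaluate(Instances ordered, int trainSize)
                throws Exception {
            Instances train = new Instances(ordered, 0, trainSize);
            Instances test  = new Instances(ordered, trainSize,
                                            ordered.numInstances() - trainSize);
            J48 tree = new J48();
            tree.buildClassifier(train);
            Evaluation eval = new Evaluation(train);
            eval.evaluateModel(tree, test);
            return eval.pctCorrect();
        }
    }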
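
Sketch #3 - the significance test behind the Experimenter's table. By
default the Analyse tab uses the corrected resampled t-test (WEKA's
"Paired T-Tester (corrected)", after Nadeau & Bengio). This is a
hand-rolled sketch of that statistic rather than a call into WEKA, and
the variable names and demo inputs are made up for illustration.

    public class Step3 {
        public static void main(String[] args) {
            // Toy per-run accuracies for two classifiers (k = 5 here just
            // for illustration; the Experimenter's default 10 x 10-fold CV
            // gives k = 100 paired results per dataset).
            double[] a = {0.96, 0.94, 0.95, 0.97, 0.93};
            double[] b = {0.90, 0.91, 0.89, 0.92, 0.88};
            System.out.println("t = " + correctedT(a, b, 1.0 / 9.0));
        }

        // Corrected resampled paired t-test over per-run accuracy
        // differences. testTrainRatio is n_test/n_train per run (1/9 for
        // 10-fold CV). A real implementation would guard against zero
        // variance, e.g. when both classifiers score identically on every
        // run, as in the worksheet's last bullet.
        static double correctedT(double[] accA, double[] accB,
                                 double testTrainRatio) {
            int k = accA.length;
            double mean = 0;
            for (int i = 0; i < k; i++) mean += accA[i] - accB[i];
            mean /= k;
            double var = 0;
            for (int i = 0; i < k; i++) {
                double d = (accA[i] - accB[i]) - mean;
                var += d * d;
            }
            var /= (k - 1);  // sample variance of the differences
            // The (1/k + n_test/n_train) factor corrects for the overlap
            // between resampled training sets.
            return mean / Math.sqrt((1.0 / k + testTrainRatio) * var);
        }
    }

A difference counts as significant at the 95% level when |t| exceeds
the Student-t critical value with k-1 degrees of freedom. Note that the
per-run variability sits in the denominator, which is how a large mean
accuracy gap can still fail the test.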