Step #1:
--------
The Rookie-train data set contains ___ instances, ___ attributes (not counting the class attribute) and a binary class attribute with possible values "Yes" and "No". All attributes are nominal and there are no missing values. Attribute A1 has ___ distinct values. Attribute A2 has ___ distinct values. Attribute A3 has ___ distinct values. Attribute A4 has ___ distinct values. Attribute A5 has ___ distinct values. This is a balanced data set, because it contains ___ instances of class "Yes" and ___ instances of class "No".

Step #2:
--------
Applying ZeroR to the Rookie-train data gives an error of ___% on the training data, and it incorrectly classifies ___ instance(s) on the Rookie-test data. On the other hand, applying OneR to the Rookie-train data gives an error of ___% on the training data, and it incorrectly classifies ___ instance(s) on the Rookie-test data.

In "More options...", set "Output predictions" to "PlainText". Re-run OneR on Rookie-train, testing it on Rookie-test. OneR classifies the 3 instances in the test data as ___, ___ and ___, respectively.

Revert to "Output predictions" = "Null" and "Use training set", and run OneR again on the Rookie-train data. OneR chooses ___ as the "best" attribute. Remove this attribute (in the Preprocess panel) and re-run OneR with the same settings as before. Now, ___ shows up as the "best" attribute, with an error of ___% on the training set. By repeating this procedure of removing the "best" attribute and re-running OneR on the remaining data, you will notice that all the remaining attributes give the same ___% error on the training data. When applying this "remove/re-run OneR" procedure, the attributes are removed in the order: ___, ___, ___, ___, and ___. Use the "Undo" functionality in the Preprocess panel to restore the Rookie-train data to its original form.

Step #3:
--------
Applying Id3 to the Rookie-train data, you get a decision tree with attribute ___ at its root and an error of ___% on the training data. Applying J48 to the Rookie-train data, you get a decision tree with ___ leaves, the same attribute ___ at its root, and an error of ___% on the training data. If, instead of testing them on the training data, you test both Id3 and J48 on the Rookie-test data, they both yield a classification accuracy of ___%.

Now, proceed to the "Select attributes" panel in WEKA and choose "Ranker" as the "Search Method". Apply the InfoGainAttributeEval "Attribute Evaluator" to the Rookie-train data (beware: select the "correct" class attribute -- not A5!). The algorithm ranks all five attributes in descending order according to the Information Gain criterion. The "best" attribute in this case is ___, with an information gain of ___ bits.

Now, apply the GainRatioAttributeEval "Attribute Evaluator" to the Rookie-train data. This algorithm ranks all five attributes in descending order according to the Information Gain Ratio criterion. You will notice that this time the order of the attributes is not the same as before. The first ___ attributes come in the same order as before, while the order of the last ___ attributes has changed. Now, the "worst" attribute is ___, with an information gain ratio of ___.

THE END :-)
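
Optional: checking your answers with the WEKA Java API
-------------------------------------------------------
Everything above can be answered from the Explorer GUI alone, but if you want to double-check your numbers programmatically, the sketch below runs the same classifiers and attribute evaluators through WEKA's Java API. It is only a sketch, under a few assumptions: WEKA is on the classpath, the data files are named Rookie-train.arff and Rookie-test.arff (adjust the names to your own copies), and the class attribute is the last one in each file. Id3 is left out because recent WEKA releases ship it in the optional "simpleEducationalLearningSchemes" package rather than in the core distribution. As a reminder of how the two rankings in Step #3 relate: the gain ratio of an attribute is its information gain divided by the entropy of the attribute's own value distribution (the "split information").

// RookieCheck.java -- a minimal sketch; file names and class-attribute position are assumptions.
import weka.attributeSelection.GainRatioAttributeEval;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.rules.OneR;
import weka.classifiers.rules.ZeroR;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RookieCheck {

    public static void main(String[] args) throws Exception {
        // Load the data sets; the last attribute is taken to be the class ("Yes"/"No").
        Instances train = DataSource.read("Rookie-train.arff");
        Instances test  = DataSource.read("Rookie-test.arff");
        train.setClassIndex(train.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        // Steps #2/#3: error on the training set and misclassifications on the test set.
        Classifier[] models = { new ZeroR(), new OneR(), new J48() };
        for (Classifier model : models) {
            model.buildClassifier(train);

            Evaluation onTrain = new Evaluation(train);
            onTrain.evaluateModel(model, train);

            Evaluation onTest = new Evaluation(train);
            onTest.evaluateModel(model, test);

            System.out.printf("%-6s train error: %5.1f%%   test misclassified: %d of %d%n",
                    model.getClass().getSimpleName(),
                    onTrain.pctIncorrect(),
                    (int) onTest.incorrect(),
                    test.numInstances());
        }

        // Step #3: score every attribute by Information Gain and by Gain Ratio.
        InfoGainAttributeEval infoGain = new InfoGainAttributeEval();
        infoGain.buildEvaluator(train);
        GainRatioAttributeEval gainRatio = new GainRatioAttributeEval();
        gainRatio.buildEvaluator(train);

        for (int i = 0; i < train.numAttributes(); i++) {
            if (i == train.classIndex()) continue;   // skip the class attribute itself
            System.out.printf("%-4s info gain = %.4f bits   gain ratio = %.4f%n",
                    train.attribute(i).name(),
                    infoGain.evaluateAttribute(i),
                    gainRatio.evaluateAttribute(i));
        }
    }
}

Compile and run it with weka.jar on the classpath; the printed training-set errors, test-set misclassification counts, information gains and gain ratios should match the values you filled in above.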