
Summary Of Classification Models

28 Oct 2017

This is a summary of our classification experiments on the Bottle Rocket dataset.

If you haven’t read about this dataset, please click on: The Bottle Rocket Pattern.

Most Python packages provide an ensemble method for testing several models at once. We prefer instead to compare models across vendors.
This has already uncovered a glaring deficiency in TensorFlow: its high-level API (TF.Learn, not to be confused with TFLearn) has no class-balancing argument comparable to “balance_classes = True” in H2O or “class_weight = 'balanced'” in Scikit-Learn.

Class balancing matters here because the Bottle Rocket dataset is unbalanced (8:1). We believe that the missing option explains why TensorFlow does so poorly.
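To make the idea of class balancing concrete, here is a minimal sketch of the reweighting heuristic that Scikit-Learn applies when class weights are set to "balanced": each class gets a weight of n_samples / (n_classes * class_count). The function name and the label counts are hypothetical and chosen only to mimic an 8:1 imbalance; they are not taken from the Bottle Rocket dataset. H2O's option works differently (it resamples the training data rather than reweighting), so this illustrates the general idea, not either vendor's implementation.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical labels with an 8:1 imbalance (800 negatives, 100 positives).
labels = [0] * 800 + [1] * 100
weights = balanced_class_weights(labels)
print(weights)  # {0: 0.5625, 1: 4.5}
```

With these weights each class contributes the same total weight to the loss (800 * 0.5625 = 100 * 4.5 = 450), so the model is no longer rewarded for ignoring the minority class.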

Below is a table that compares several metrics across TensorFlow, Scikit-Learn and H2O. The Random Forest models are very attractive. (Also, TensorFlow does not have a Random Forest API.) One of the reasons we like the Random Forest model is that the dataset does not need to be standardized: tree splits depend only on the ordering of feature values, not on their scale.
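The point about standardization can be checked with a small sketch. The code below finds the Gini-optimal threshold split on a single feature, which is the basic step inside a decision tree, and shows that the resulting partition of the samples is the same before and after standardizing. The helper names and the six data points are hypothetical, and this is a toy single-split illustration, not the Scikit-Learn or H2O implementation.

```python
from statistics import mean, pstdev


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split_partition(xs, ys):
    """Return the indices sent to the left branch by the Gini-optimal split."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best_score, best_left = float("inf"), None
    for k in range(1, len(order)):
        if xs[order[k - 1]] == xs[order[k]]:
            continue  # no threshold fits between two equal values
        left, right = order[:k], order[k:]
        score = (
            len(left) * gini([ys[i] for i in left])
            + len(right) * gini([ys[i] for i in right])
        ) / len(xs)
        if score < best_score:
            best_score, best_left = score, frozenset(left)
    return best_left


# Hypothetical feature values and labels.
xs = [7.62, 7.10, 8.40, 6.95, 9.15, 7.80]
ys = [0, 0, 1, 0, 1, 1]

# Standardize the feature to zero mean and unit variance.
mu, sigma = mean(xs), pstdev(xs)
xs_std = [(x - mu) / sigma for x in xs]

raw_partition = best_split_partition(xs, ys)
std_partition = best_split_partition(xs_std, ys)
print(raw_partition == std_partition)  # True
```

Standardizing is a monotonic transformation, so it changes the threshold value but not which samples fall on each side of it. Models based on distances or gradients, such as neural networks, do not share this property.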

This is our first attempt, and we are lucky that the results are reasonably good. They could be better, and more work is needed. Machine Learning research is like walking into quicksand: you get swallowed up very quickly.

╒═══════════╤══════════════╤═══════════╤══════════╤══════════╕
│ Metric    │   TensorFlow │   sklearn │   H2O RF │   H2O NN │
╞═══════════╪══════════════╪═══════════╪══════════╪══════════╡
│ logloss   │       ------ │    3.1555 │   0.2354 │   0.2772 │
├───────────┼──────────────┼───────────┼──────────┼──────────┤
│ accuracy  │       0.6417 │    0.9086 │   0.9054 │   0.9193 │
├───────────┼──────────────┼───────────┼──────────┼──────────┤
│ precision │       0.6218 │    0.5846 │   1.0000 │   1.0000 │
├───────────┼──────────────┼───────────┼──────────┼──────────┤
│ recall    │       0.7823 │    0.6726 │   0.6742 │   0.6750 │
├───────────┼──────────────┼───────────┼──────────┼──────────┤
│ f1        │       0.6929 │    0.1255 │   0.3544 │   0.3903 │
├───────────┼──────────────┼───────────┼──────────┼──────────┤
│ auc       │       0.6368 │    0.8580 │   0.8741 │   0.9203 │
╘═══════════╧══════════════╧═══════════╧══════════╧══════════╛
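For readers who want to reproduce metrics of this kind, the sketch below computes log loss, accuracy, precision, recall and F1 from true labels and predicted probabilities using the standard definitions. The function name, the 0.5 threshold and the nine-sample example are assumptions made for illustration; the numbers in the table above came from each vendor's own tooling, not from this code.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute common binary classification metrics from 0/1 labels and probabilities."""
    eps = 1e-15  # clip probabilities so log() never sees 0 or 1
    tp = fp = tn = fn = 0
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
        p = min(max(p, eps), 1 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    n = len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "logloss": loss / n,
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical 8:1 sample scored by a model that always favours the majority class.
y_true = [0] * 8 + [1]
majority_only = binary_metrics(y_true, [0.1] * 9)
print(majority_only["accuracy"], majority_only["recall"])
```

On 8:1 data a model that never predicts the positive class still reaches 8/9 (about 0.89) accuracy while its recall is zero, which is why accuracy alone says little here. Note also that F1 is the harmonic mean of precision and recall, so it lies between the two whenever all three are computed at the same threshold.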

Putting the Analysis of the Bottle Rocket Dataset to Work

We could not wait for further analysis. We took the trained Scikit-Learn model and put it to work in HedgeTools. In the hands of an experienced day-trader, the results were very encouraging.

One of our Machine Learning models recognized this pattern and gave the buy signal. The green LED shown above the text at the bottom of the price panel (above “7.62”) indicated that a Bottle Rocket pattern was recognized. The buy signal came from a Scikit-Learn Random Forest classifier.
