Microsoft has created a script to run their AutoML. I have used this script extensively, and I found that the best estimator for my dataset was the FastTree estimator. That result compelled me to drill down on this estimator to see if I could squeeze out more performance.
In this notebook, I run AutoML for a much longer time to observe the training error curve.
The other reason for doing this analysis was to obtain a trained model that can make a prediction in milliseconds, and Microsoft's ML.NET delivers this. The previous post on this blog used Python and TPOT; that model is my gold standard, and it took over a week of machine time to train. Because TPOT is Python based, a prediction on new data takes over 4 seconds, so I cannot use that model in production. With ML.NET I now have a model that is an order of magnitude faster.
Charles Brauer CBrauer@CypressPoint.com
#r "nuget:Microsoft.ML"
#r "nuget:Microsoft.ML.AutoML"
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.AutoML;
using Microsoft.ML.Trainers.FastTree;
using XPlot.Plotly;
Installing package Microsoft.ML…done!
Successfully added reference to package Microsoft.ML, version 1.4.0
Installing package Microsoft.ML.AutoML…done!
Successfully added reference to package Microsoft.ML.AutoML, version 0.16.0
// Reports AutoML progress: prints a row only when an iteration improves on the best F1 score seen so far.
public class BinaryExperimentProgressHandler : IProgress<RunDetail<BinaryClassificationMetrics>> {
private int _iterationIndex;
private double best_f1;
public void Report(RunDetail<BinaryClassificationMetrics> iterationResult) {
if (_iterationIndex++ == 0) {
Console.WriteLine(" Trainer Accuracy AUC AUPRC F1-score Duration");
}
var trainerName = iterationResult.TrainerName;
var accuracy = iterationResult.ValidationMetrics.Accuracy;
var auc = iterationResult.ValidationMetrics.AreaUnderRocCurve;
var aupr = iterationResult.ValidationMetrics.AreaUnderPrecisionRecallCurve;
var f1 = iterationResult.ValidationMetrics.F1Score;
var runtimeInSeconds = iterationResult.RuntimeInSeconds;
if (f1 > best_f1) {
best_f1 = f1;
Console.WriteLine("{0, 4} {1, -35} {2, 9:F4} {3, 9:F4} {4, 9:F4} {5, 9:F4} {6, 9:F4}",
_iterationIndex, trainerName, accuracy, auc, aupr, f1, runtimeInSeconds);
} else {
Console.Write("{0, 4}\r", _iterationIndex);
}
}
}
private static void PrintMetrics(BinaryClassificationMetrics metrics) {
Console.WriteLine(" Accuracy........................ {0:f6}", metrics.Accuracy);
Console.WriteLine(" AreaUnderPrecisionRecallCurve... {0:f6}", metrics.AreaUnderPrecisionRecallCurve);
Console.WriteLine(" AreaUnderRocCurve............... {0:f6}", metrics.AreaUnderRocCurve);
Console.WriteLine(" F1Score......................... {0:f6}", metrics.F1Score);
Console.WriteLine(" NegativePrecision............... {0:f6}", metrics.NegativePrecision);
Console.WriteLine(" NegativeRecall.................. {0:f6}", metrics.NegativeRecall);
Console.WriteLine(" PositivePrecision............... {0:f6}", metrics.PositivePrecision);
Console.WriteLine(" PositiveRecall.................. {0:f6}", metrics.PositiveRecall);
Console.WriteLine("\nConfusion Matrix:\n{0}", metrics.ConfusionMatrix.GetFormattedConfusionTable());
}
public class ModelInput {
[ColumnName("BoxRatio"), LoadColumn(0)]
public float BoxRatio { get; set; }
[ColumnName("Thrust"), LoadColumn(1)]
public float Thrust { get; set; }
[ColumnName("Acceleration"), LoadColumn(2)]
public float Acceleration { get; set; }
[ColumnName("Velocity"), LoadColumn(3)]
public float Velocity { get; set; }
[ColumnName("OnBalRun"), LoadColumn(4)]
public float OnBalRun { get; set; }
[ColumnName("vwapGain"), LoadColumn(5)]
public float VwapGain { get; set; }
[ColumnName("Altitude"), LoadColumn(6)]
public bool Altitude { get; set; }
}
public class ModelOutput {
[ColumnName("PredictedLabel")]
public bool Prediction { get; set; }
public float Probability { get; set; }
public float Score { get; set; }
}
The program prints only the improved F1 scores, and then summarizes the results.
var sw = Stopwatch.StartNew();
var mlContext = new MLContext(seed: 1);
Console.WriteLine("\nStep 1: Load the Bottle Rocket dataset.");
var trainDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
path: @"H:\HedgeTools\Datasets\rocket-train-classify-smote.csv",
hasHeader: true,
separatorChar: ',');
var validDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
path: @"H:\HedgeTools\Datasets\rocket-valid-classify.csv",
hasHeader: true,
separatorChar: ',');
var optimizingMetrics = new BinaryClassificationMetric[4];
optimizingMetrics[0] = BinaryClassificationMetric.Accuracy;
optimizingMetrics[1] = BinaryClassificationMetric.F1Score;
optimizingMetrics[2] = BinaryClassificationMetric.AreaUnderRocCurve;
optimizingMetrics[3] = BinaryClassificationMetric.AreaUnderPrecisionRecallCurve;
var trainers = new BinaryClassificationTrainer[2];
trainers[0] = BinaryClassificationTrainer.LightGbm;
trainers[1] = BinaryClassificationTrainer.FastTree;
var bestAccuracy = 0.0;
var bestOptimizingMetric = "";
var bestTrainer = "";
var loop = 0;
ExperimentResult<BinaryClassificationMetrics> bestExperimentResult = null;
foreach (var trainer in trainers) {
foreach (var optimizingMetric in optimizingMetrics) {
var settings = new BinaryExperimentSettings {
MaxExperimentTimeInSeconds = 1 * 30 * 60, // 30 minutes per (trainer, metric) combination
OptimizingMetric = optimizingMetric,
CacheDirectory = null
};
settings.Trainers.Clear();
settings.Trainers.Add(trainer);
Console.WriteLine("\n" + (++loop) + ": Running AutoML binary classification experiment using: " +
trainer.ToString() + ", " + optimizingMetric.ToString()
);
var experimentResult =
mlContext.Auto().CreateBinaryClassificationExperiment(settings)
.Execute(trainData: trainDataView,
labelColumnName: "Altitude",
progressHandler: new BinaryExperimentProgressHandler());
var bestRun = experimentResult.BestRun;
Console.WriteLine("Total models produced.... {0}", experimentResult.RunDetails.Count());
var validDataViewWithBestScore = bestRun.Model.Transform(validDataView);
var validMetrics = mlContext.BinaryClassification.EvaluateNonCalibrated(data: validDataViewWithBestScore,
labelColumnName: "Altitude");
Console.WriteLine("\nMetrics using validation dataset:");
PrintMetrics(validMetrics);
var crossValidationResults =
mlContext.BinaryClassification.CrossValidateNonCalibrated(validDataView,
bestRun.Estimator,
numberOfFolds: 6,
labelColumnName: "Altitude");
// Mean, sample standard deviation, and 95% confidence interval of the per-fold accuracies.
var accuracyValues = crossValidationResults.Select(r => r.Metrics.Accuracy).ToArray();
var AccuracyAverage = accuracyValues.Average();
var sumOfSquaresOfDifferences = accuracyValues.Sum(val => (val - AccuracyAverage) * (val - AccuracyAverage));
var AccuraciesStdDeviation = Math.Sqrt(sumOfSquaresOfDifferences / (accuracyValues.Length - 1));
var AccuraciesConfidenceInterval95 = 1.96 * AccuraciesStdDeviation / Math.Sqrt(accuracyValues.Length - 1);
if (AccuracyAverage > bestAccuracy) {
bestAccuracy = AccuracyAverage;
bestTrainer = bestRun.TrainerName;
bestOptimizingMetric = optimizingMetric.ToString();
bestExperimentResult = experimentResult;
Console.WriteLine("\nBest model's trainer............ {0}", bestTrainer);
Console.WriteLine("Best model's optimizingMetric... {0}", bestOptimizingMetric);
Console.WriteLine("Cross Validation Metric: ");
Console.WriteLine(" Average Accuracy: {0:f4}, Standard deviation: {1:f4}, Confidence Interval 95%: {2:f4}",
AccuracyAverage, AccuraciesStdDeviation, AccuraciesConfidenceInterval95);
var mlModel = bestRun.Model;
mlContext.Model.Save(mlModel, trainDataView.Schema, "./MLModel.zip");
Console.WriteLine("Done. The model is saved.");
}
}
}
Step 1: Load the Bottle Rocket dataset.
1: Running AutoML binary classification experiment using: LightGbm, Accuracy
Trainer Accuracy AUC AUPRC F1-score Duration
1 LightGbmBinary 0.8048 0.8802 0.8925 0.8017 4.1021
2 LightGbmBinary 0.8353 0.9124 0.9137 0.8237 8.8050
8 LightGbmBinary 0.8915 0.9524 0.9581 0.8936 15.6461
62 LightGbmBinary 0.8942 0.9465 0.9556 0.8965 17.4921
79 LightGbmBinary 0.8950 0.9563 0.9611 0.8981 14.8344
92 LightGbmBinary 0.8978 0.9581 0.9652 0.9017 12.2783
Total models produced.... 198
Metrics using validation dataset:
Accuracy........................ 0.836847
AreaUnderPrecisionRecallCurve... 0.787978
AreaUnderRocCurve............... 0.876020
F1Score......................... 0.701342
NegativePrecision............... 0.887395
NegativeRecall.................. 0.888141
PositivePrecision............... 0.702128
PositiveRecall.................. 0.700559
Confusion Matrix:
TEST POSITIVE RATIO: 0.2734 (895.0/(895.0+2378.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 627 | 268 | 0.7006
negative || 266 | 2 112 | 0.8881
||======================
Precision || 0.7021 | 0.8874 |
Best model's trainer............ LightGbmBinary
Best model's optimizingMetric... Accuracy
Cross Validation Metric:
Average Accuracy: 0.7996, Standard deviation: 0.0086, Confidence Interval 95%: 0.0076
Done. The model is saved.
2: Running AutoML binary classification experiment using: LightGbm, F1Score
Trainer Accuracy AUC AUPRC F1-score Duration
1 LightGbmBinary 0.8050 0.8838 0.8876 0.7894 3.5057
2 LightGbmBinary 0.8034 0.8848 0.8993 0.8042 6.1197
8 LightGbmBinary 0.8590 0.9289 0.9375 0.8579 8.9197
19 LightGbmBinary 0.8670 0.9433 0.9503 0.8725 9.9242
24 LightGbmBinary 0.8864 0.9471 0.9557 0.8889 15.3275
25 LightGbmBinary 0.8866 0.9532 0.9598 0.8917 13.3286
63 LightGbmBinary 0.8942 0.9524 0.9581 0.8959 13.0294
91 LightGbmBinary 0.8989 0.9581 0.9629 0.8988 12.7727
Total models produced.... 196
Metrics using validation dataset:
Accuracy........................ 0.833792
AreaUnderPrecisionRecallCurve... 0.781009
AreaUnderRocCurve............... 0.870147
F1Score......................... 0.702407
NegativePrecision............... 0.891880
NegativeRecall.................. 0.877628
PositivePrecision............... 0.688103
PositiveRecall.................. 0.717318
Confusion Matrix:
TEST POSITIVE RATIO: 0.2734 (895.0/(895.0+2378.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 642 | 253 | 0.7173
negative || 291 | 2 087 | 0.8776
||======================
Precision || 0.6881 | 0.8919 |
3: Running AutoML binary classification experiment using: LightGbm, AreaUnderRocCurve
Trainer Accuracy AUC AUPRC F1-score Duration
1 LightGbmBinary 0.8050 0.8838 0.8876 0.7894 4.1777
3 LightGbmBinary 0.8013 0.8848 0.8995 0.8052 2.2853
4 LightGbmBinary 0.8142 0.8981 0.9069 0.8126 3.6899
9 LightGbmBinary 0.8251 0.9150 0.9253 0.8259 6.8250
20 LightGbmBinary 0.8727 0.9430 0.9433 0.8644 9.6666
29 LightGbmBinary 0.8775 0.9462 0.9523 0.8815 9.8766
31 LightGbmBinary 0.8824 0.9459 0.9449 0.8842 10.3500
36 LightGbmBinary 0.8847 0.9496 0.9551 0.8861 9.8303
42 LightGbmBinary 0.8902 0.9490 0.9535 0.8914 12.2504
82 LightGbmBinary 0.8894 0.9556 0.9608 0.8937 15.2517
90 LightGbmBinary 0.8922 0.9530 0.9576 0.8942 13.3331
95 LightGbmBinary 0.8976 0.9569 0.9615 0.8991 15.3605
135 LightGbmBinary 0.8990 0.9546 0.9601 0.9001 15.5735
183 LightGbmBinary 0.8971 0.9538 0.9607 0.9013 12.9670
Total models produced.... 189
Metrics using validation dataset:
Accuracy........................ 0.838375
AreaUnderPrecisionRecallCurve... 0.795228
AreaUnderRocCurve............... 0.879851
F1Score......................... 0.705948
NegativePrecision............... 0.890249
NegativeRecall.................. 0.886880
PositivePrecision............... 0.702434
PositiveRecall.................. 0.709497
Confusion Matrix:
TEST POSITIVE RATIO: 0.2734 (895.0/(895.0+2378.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 635 | 260 | 0.7095
negative || 269 | 2 109 | 0.8869
||======================
Precision || 0.7024 | 0.8902 |
4: Running AutoML binary classification experiment using: LightGbm, AreaUnderPrecisionRecallCurve
Trainer Accuracy AUC AUPRC F1-score Duration
1 LightGbmBinary 0.7905 0.8764 0.8932 0.7829 3.5890
10 LightGbmBinary 0.8382 0.9213 0.9232 0.8389 7.3470
16 LightGbmBinary 0.8615 0.9318 0.9429 0.8656 7.1917
25 LightGbmBinary 0.8793 0.9443 0.9539 0.8808 11.5563
26 LightGbmBinary 0.8929 0.9501 0.9563 0.8948 13.6540
55 LightGbmBinary 0.9024 0.9595 0.9579 0.9040 14.8611
var modelPath = "./MLModel.zip";
var mlContext = new MLContext();
if (!File.Exists(modelPath)) {
Console.WriteLine("Cannot find MLModel.zip");
return;
}
DataViewSchema modelSchema;
var mlModel = mlContext.Model.Load(modelPath, out modelSchema);
var scatters = bestExperimentResult.RunDetails.Where(d => d.ValidationMetrics != null).GroupBy(
r => r.TrainerName,
(name, details) => new Graph.Scatter() {
name = name,
x = details.Select(r => r.RuntimeInSeconds),
y = details.Select(r => r.ValidationMetrics.Accuracy),
mode = "markers",
marker = new Graph.Marker() { size = 5 }
}
);
var chart1 = Chart.Plot(scatters);
chart1.WithXTitle("Training Time (in seconds)");
chart1.WithYTitle("Accuracy");
chart1.Width = 500;
chart1.Height = 500;
display(chart1);
var scatters = bestExperimentResult.RunDetails.Where(d => d.ValidationMetrics != null).GroupBy(
r => r.TrainerName,
(name, details) => new Graph.Scatter() {
name = name,
x = details.Select(r => r.RuntimeInSeconds),
y = details.Select(r => r.ValidationMetrics.F1Score),
mode = "markers",
marker = new Graph.Marker() { size = 5 }
}
);
var chart2 = Chart.Plot(scatters);
chart2.WithXTitle("Training Time (in seconds)");
chart2.WithYTitle("F1Score");
chart2.Width = 500;
chart2.Height = 500;
display(chart2);
var validDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
path: @"H:\HedgeTools\Datasets\rocket-valid-classify.csv",
hasHeader: true,
separatorChar: ',');
var bestRun = bestExperimentResult.BestRun;
var validDataViewWithBestScore = bestRun.Model.Transform(validDataView);
var validMetrics = mlContext.BinaryClassification.EvaluateNonCalibrated(data: validDataViewWithBestScore,
labelColumnName: "Altitude");
Console.WriteLine("\nMetrics of best model on validation data:");
PrintMetrics(validMetrics);
var crossValidationResults = mlContext.BinaryClassification.CrossValidateNonCalibrated(validDataView,
bestRun.Estimator,
numberOfFolds: 6,
labelColumnName: "Altitude");
// Mean, sample standard deviation, and 95% confidence interval of the per-fold accuracies.
var accuracyValues = crossValidationResults.Select(r => r.Metrics.Accuracy).ToArray();
var AccuracyAverage = accuracyValues.Average();
var sumOfSquaresOfDifferences = accuracyValues.Sum(val => (val - AccuracyAverage) * (val - AccuracyAverage));
var AccuraciesStdDeviation = Math.Sqrt(sumOfSquaresOfDifferences / (accuracyValues.Length - 1));
var AccuraciesConfidenceInterval95 = 1.96 * AccuraciesStdDeviation / Math.Sqrt(accuracyValues.Length - 1);
Console.WriteLine("Cross Validation Metrics:");
Console.WriteLine(" Average Accuracy: {0:f4}, Standard deviation: {1:f4}, Confidence Interval 95%: {2:f4}",
AccuracyAverage, AccuraciesStdDeviation, AccuraciesConfidenceInterval95);
Metrics of best model on validation data:
Accuracy........................ 0.838688
AreaUnderPrecisionRecallCurve... 0.720566
AreaUnderRocCurve............... 0.861095
F1Score......................... 0.631340
NegativePrecision............... 0.905006
NegativeRecall.................. 0.888655
PositivePrecision............... 0.611437
PositiveRecall.................. 0.652582
Confusion Matrix:
TEST POSITIVE RATIO: 0.2117 (639.0/(639.0+2380.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 417 | 222 | 0.6526
negative || 265 | 2 115 | 0.8887
||======================
Precision || 0.6114 | 0.9050 |
Cross Validation Metrics:
Average Accuracy: 0.8331, Standard deviation: 0.0150, Confidence Interval 95%: 0.0131
This data was not seen during training of the model; it is used to estimate the performance we can expect when the model is used on new data.
var testDataPath = @"H:\HedgeTools\Datasets\rocket-test-classify.csv";
var testDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
path: testDataPath,
hasHeader: true,
separatorChar: ',');
var testDataViewWithBestScore = bestRun.Model.Transform(testDataView);
var testMetrics = mlContext.BinaryClassification.EvaluateNonCalibrated(data: testDataViewWithBestScore,
labelColumnName: "Altitude");
Console.WriteLine("\nMetrics using the test data:");
PrintMetrics(testMetrics);
Metrics using the test data:
Accuracy........................ 0.846638
AreaUnderPrecisionRecallCurve... 0.716333
AreaUnderRocCurve............... 0.848605
F1Score......................... 0.639689
NegativePrecision............... 0.900504
NegativeRecall.................. 0.904682
PositivePrecision............... 0.645212
PositiveRecall.................. 0.634259
Confusion Matrix:
TEST POSITIVE RATIO: 0.2146 (648.0/(648.0+2371.0))
Confusion table
||======================
PREDICTED || positive | negative | Recall
TRUTH ||======================
positive || 411 | 237 | 0.6343
negative || 226 | 2 145 | 0.9047
||======================
Precision || 0.6452 | 0.9005 |
This simulates the production prediction code running on a single line of new data.
public void Predict(string predictors, double threshold, out int predicted, out double probability) {
try {
// Note: creating a PredictionEngine is expensive; in production it should be created once and reused rather than re-created on every call.
var predEngine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(mlModel);
var modelInput = new ModelInput();
var fields = predictors.Split(',');
modelInput.BoxRatio = Convert.ToSingle(fields[0]);
modelInput.Thrust = Convert.ToSingle(fields[1]);
modelInput.Acceleration = Convert.ToSingle(fields[2]);
modelInput.Velocity = Convert.ToSingle(fields[3]);
modelInput.OnBalRun = Convert.ToSingle(fields[4]);
modelInput.VwapGain = Convert.ToSingle(fields[5]);
var prediction = predEngine.Predict(modelInput);
predicted = prediction.Prediction ? 1 : 0;
probability = Math.Round(prediction.Probability, 8);
if (predicted == 1 && probability < threshold) {
// Console.WriteLine("Corrected prediction 1 to 0 because probability {0:f8} < {1:f2}",
// probability, threshold);
predicted = 0;
}
if (predicted == 0 && probability >= threshold) {
Console.WriteLine("We should never come here.");
predicted = 1;
}
return;
} catch (Exception ex) {
Console.WriteLine("Predict error: " + ex.Message);
}
predicted = 0;
probability = 0.0;
}
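Since the point of moving to ML.NET is millisecond-scale predictions, here is a minimal timing sketch. It assumes the mlContext and mlModel loaded above are in scope, and the feature values are placeholders, not real data. Note that CreatePredictionEngine is expensive, so it is created once outside the loop; the Predict method above re-creates it on every call and would time far slower.
// Rough per-prediction latency check (placeholder inputs; engine hoisted out of the loop).
var predictionEngine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(mlModel);
var sampleInput = new ModelInput {
BoxRatio = 1.0f, Thrust = 1.0f, Acceleration = 1.0f,
Velocity = 1.0f, OnBalRun = 1.0f, VwapGain = 1.0f
};
const int iterations = 1000;
var timer = Stopwatch.StartNew();
for (var i = 0; i < iterations; i++) {
predictionEngine.Predict(sampleInput);
}
timer.Stop();
Console.WriteLine("Average prediction time: {0:f3} ms", timer.Elapsed.TotalMilliseconds / iterations);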
Console.WriteLine("Using dataset:" + testDataPath);
var tableRows = new List<string>();
try {
var reader = new StreamReader(File.OpenRead(testDataPath));
reader.ReadLine(); // Ignore the header
while (!reader.EndOfStream) {
var line = reader.ReadLine();
tableRows.Add(line);
}
reader.Close();
} catch (Exception ex) {
Console.WriteLine("Oh drat! We have an error: " + ex.Message);
return;
}
var nrows = tableRows.Count;
sw = Stopwatch.StartNew();
var best_threshold = 0.0;
var max_accuracy = 0.0;
Console.WriteLine("\nComputing Metrics using the test dataset, nrows: " + nrows);
for (var threshold = 0.50; threshold < 0.96; threshold += 0.02) {
var true_positive = 0;
var false_positive = 0;
var false_negative = 0;
var true_negative = 0;
double goal_met = 0.0, goal_failed = 0.0;
foreach (var predictors in tableRows) {
int altitude, predicted;
double probability;
try {
var fields = predictors.Split(',');
altitude = Convert.ToInt32(fields[6]);
// Drop the "Altitude" field, leaving only the "features'.
var features = string.Join(",", fields.Take(fields.Length - 1));
// An Altitude of 1 means that the goal wa reached. This is a "ground truth" value.
// The probability here is simply the probability that a given prediction is True (i.e. 1).
// If the probability is less than the threshold then it is more likely to be False than True,
// thus we change the predicion to False (i.e. 0).
Predict(features, threshold, out predicted, out probability);
} catch (Exception) {
Console.WriteLine("dataset is not in the proper format. Did you delete the index column?");
return;
}
if (altitude == 1) {
goal_met++;
} else {
goal_failed++;
}
if (predicted == 1 && altitude == 1) {
// We have found a True Positive. The model predicted a gain and we got it.
true_positive += 1;
}
if (predicted == 1 && altitude == 0) {
// We have found a False Positive. The model predicted a gain, but we got a loss.
false_positive += 1;
}
if (predicted == 0 && altitude == 1) {
// We have found a False Negative. The model predicted a loss, but we got a gain.
false_negative += 1;
}
if (predicted == 0 && altitude == 0) {
// We have found a True Negative. The model predicted a loss, and we got it.
true_negative += 1;
}
}
var total = Convert.ToDouble(true_positive + false_positive + true_negative + false_negative);
var accuracy = 100.0 * (true_positive + true_negative) / total;
accuracy = Math.Round(accuracy, 4);
Console.WriteLine("accurancy: {0:f4}, max_accuracy: {1:f4}, threshold: {2:f2}",
accuracy, max_accuracy, threshold);
if (accuracy > max_accuracy) {
max_accuracy = accuracy;
double recall, f1score;
var precision = recall = f1score = 0.0;
var denom = Convert.ToDouble(true_positive + false_negative);
if (denom > 0.0) recall = true_positive / denom;
denom = Convert.ToDouble(true_positive + false_positive);
if (denom > 0.0) precision = true_positive / denom;
if (precision + recall > 0.0) f1score = 2.0 * (precision * recall) / (precision + recall);
precision = Math.Round(precision, 4);
recall = Math.Round(recall, 4);
f1score = Math.Round(f1score, 4);
best_threshold = threshold;
Console.WriteLine("Metrics:");
Console.WriteLine(" threshold................. {0:f2}", threshold);
Console.WriteLine(" No. of True Positive...... {0}/{1}", true_positive, goal_met);
Console.WriteLine(" No. of False Positive..... {0}", false_positive);
Console.WriteLine(" No. of False Negative..... {0}", false_negative);
Console.WriteLine(" No. of True Negative...... {0}/{1}", true_negative, goal_failed);
Console.WriteLine(" Accuracy.................. {0:f4}%", accuracy);
Console.WriteLine(" Precision................. {0:f4}" + precision);
Console.WriteLine(" Recall.................... {0:f4}" + recall);
Console.WriteLine(" F1 Score.................. {0:f4}" + f1score);
}
}
Console.WriteLine("Done. The best threshold is {0:f2}", best_threshold);
Using dataset: H:\HedgeTools\Datasets\rocket-test-classify.csv
Computing Metrics using the test dataset, nrows: 3019
accuracy: 84.6638, max_accuracy: 0.0000, threshold: 0.50
Metrics:
threshold................. 0.50
No. of True Positive...... 411/648
No. of False Positive..... 226
No. of False Negative..... 237
No. of True Negative...... 2145/2371
Accuracy.................. 84.6638%
Precision................. 0.6452
Recall.................... 0.6343
F1 Score.................. 0.6397
accuracy: 84.7300, max_accuracy: 84.6638, threshold: 0.52
Metrics:
threshold................. 0.52
No. of True Positive...... 410/648
No. of False Positive..... 223
No. of False Negative..... 238
No. of True Negative...... 2148/2371
Accuracy.................. 84.7300%
Precision................. 0.6477
Recall.................... 0.6327
F1 Score.................. 0.6401
accuracy: 85.0282, max_accuracy: 84.7300, threshold: 0.54
Metrics:
threshold................. 0.54
No. of True Positive...... 407/648
No. of False Positive..... 211
No. of False Negative..... 241
No. of True Negative...... 2160/2371
Accuracy.................. 85.0282%
Precision................. 0.6586
Recall.................... 0.6281
F1 Score.................. 0.6430
accuracy: 85.2600, max_accuracy: 85.0282, threshold: 0.56
Metrics:
threshold................. 0.56
No. of True Positive...... 405/648
No. of False Positive..... 202
No. of False Negative..... 243
No. of True Negative...... 2169/2371
Accuracy.................. 85.2600%
Precision................. 0.6672
Recall.................... 0.6250
F1 Score.................. 0.6454
accuracy: 85.4256, max_accuracy: 85.2600, threshold: 0.58
Metrics:
threshold................. 0.58
No. of True Positive...... 404/648
No. of False Positive..... 196
No. of False Negative..... 244
No. of True Negative...... 2175/2371
Accuracy.................. 85.4256%
Precision................. 0.6733
Recall.................... 0.6235
F1 Score.................. 0.6474
accuracy: 85.4919, max_accuracy: 85.4256, threshold: 0.60
Metrics:
threshold................. 0.60
No. of True Positive...... 402/648
No. of False Positive..... 192
No. of False Negative..... 246
No. of True Negative...... 2179/2371
Accuracy.................. 85.4919%
Precision................. 0.6768
Recall.................... 0.6204
F1 Score.................. 0.6473
accuracy: 85.5250, max_accuracy: 85.4919, threshold: 0.62
Metrics:
threshold................. 0.62
No. of True Positive...... 399/648
No. of False Positive..... 188
No. of False Negative..... 249
No. of True Negative...... 2183/2371
Accuracy.................. 85.5250%
Precision................. 0.6797
Recall.................... 0.6157
F1 Score.................. 0.6462
accuracy: 85.6575, max_accuracy: 85.5250, threshold: 0.64
Metrics:
threshold................. 0.64
No. of True Positive...... 397/648
No. of False Positive..... 182
No. of False Negative..... 251
No. of True Negative...... 2189/2371
Accuracy.................. 85.6575%
Precision................. 0.6857
Recall.................... 0.6127
F1 Score.................. 0.6471
accuracy: 85.8894, max_accuracy: 85.6575, threshold: 0.66
Metrics:
threshold................. 0.66
No. of True Positive...... 395/648
No. of False Positive..... 173
No. of False Negative..... 253
No. of True Negative...... 2198/2371
Accuracy.................. 85.8894%
Precision................. 0.6954
Recall.................... 0.6096
F1 Score.................. 0.6497
accuracy: 86.0881, max_accuracy: 85.8894, threshold: 0.68
Metrics:
threshold................. 0.68
No. of True Positive...... 392/648
No. of False Positive..... 164
No. of False Negative..... 256
No. of True Negative...... 2207/2371
Accuracy.................. 86.0881%
Precision................. 0.7050
Recall.................... 0.6049
F1 Score.................. 0.6512
accuracy: 86.2206, max_accuracy: 86.0881, threshold: 0.70
Metrics:
threshold................. 0.70
No. of True Positive...... 387/648
No. of False Positive..... 155
No. of False Negative..... 261
No. of True Negative...... 2216/2371
Accuracy.................. 86.2206%
Precision................. 0.7140
Recall.................... 0.5972
F1 Score.................. 0.6504
accuracy: 86.4856, max_accuracy: 86.2206, threshold: 0.72
Metrics:
threshold................. 0.72
No. of True Positive...... 385/648
No. of False Positive..... 145
No. of False Negative..... 263
No. of True Negative...... 2226/2371
Accuracy.................. 86.4856%
Precision................. 0.7264
Recall.................... 0.5941
F1 Score.................. 0.6537
accuracy: 86.4856, max_accuracy: 86.4856, threshold: 0.74
accuracy: 86.4856, max_accuracy: 86.4856, threshold: 0.76
accuracy: 86.4525, max_accuracy: 86.4856, threshold: 0.78
accuracy: 86.3200, max_accuracy: 86.4856, threshold: 0.80
accuracy: 86.4193, max_accuracy: 86.4856, threshold: 0.82
accuracy: 86.4193, max_accuracy: 86.4856, threshold: 0.84
accuracy: 86.5187, max_accuracy: 86.4856, threshold: 0.86
Metrics:
threshold................. 0.86
No. of True Positive...... 340/648
No. of False Positive..... 99
No. of False Negative..... 308
No. of True Negative...... 2272/2371
Accuracy.................. 86.5187%
Precision................. 0.7745
Recall.................... 0.5247
F1 Score.................. 0.6256
accuracy: 86.4856, max_accuracy: 86.5187, threshold: 0.88
accuracy: 86.4193, max_accuracy: 86.5187, threshold: 0.90
accuracy: 86.4193, max_accuracy: 86.5187, threshold: 0.92
accuracy: 85.9225, max_accuracy: 86.5187, threshold: 0.94
Done. The best threshold is 0.86
In this notebook, we performed a deeper analysis of the top two trainers for the Bottle Rocket dataset. We searched across the four optimizing metrics, and the results are displayed above.
We also experimented with modifying the decision threshold for the probability calculation. Choosing a higher threshold lowers the number of False Positives, which is extremely important because a False Positive generally means losing money on a trade. In the run above, raising the threshold from 0.50 to 0.86 cut the False Positives from 226 to 99 and raised the precision from 0.6452 to 0.7745, at the cost of recall dropping from 0.6343 to 0.5247; the sketch below recomputes those numbers from the confusion counts.
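As a quick sanity check, here is a minimal sketch that recomputes precision and recall directly from the confusion counts printed above (TP/FP/FN of 411/226/237 at threshold 0.50 and 340/99/308 at threshold 0.86):
// Precision and recall from raw confusion counts taken from the test-run output above.
(double Precision, double Recall) FromCounts(int tp, int fp, int fn) =>
((double)tp / (tp + fp), (double)tp / (tp + fn));
var at050 = FromCounts(tp: 411, fp: 226, fn: 237); // threshold 0.50
var at086 = FromCounts(tp: 340, fp: 99, fn: 308); // threshold 0.86
Console.WriteLine("0.50: precision {0:f4}, recall {1:f4}", at050.Precision, at050.Recall);
Console.WriteLine("0.86: precision {0:f4}, recall {1:f4}", at086.Precision, at086.Recall);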
I would appreciate your comments on this analysis.
Charles Brauer