Confusion Metrics:
Neural Narrator
Jun 18, 2024
#ModelEvaluation #Series #2
One way to view the various metrics of a classification model is the confusion matrix.
It also helps us understand precision and recall better.
In a classification problem such as spam detection, during the testing phase each example has two possible categories: HAM or SPAM. Keep in mind that an email that is really SPAM could still be predicted as HAM. This means that with two possible classes, you end up with 4 separate groups at the end of testing:
| Real Condition | Predicted Ham | Predicted Spam |
|----------------|---------------|----------------|
| Ham            | True          | False          |
| Spam           | False         | True           |
Now if we expand this table further:
| Real Condition | Predicted Ham (Positive) | Predicted Spam (Negative) |
|----------------|--------------------------|---------------------------|
| Ham            | True Positive            | False Negative            |
| Spam           | False Positive           | True Negative             |
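The same four groups can be tallied in code. Here is a minimal sketch, assuming a handful of made-up emails and scikit-learn's confusion_matrix; the data is illustrative, not from this post:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for six emails (illustrative only).
y_true = ["ham", "ham", "ham", "spam", "spam", "ham"]
y_pred = ["ham", "spam", "ham", "spam", "ham", "ham"]

# Rows follow the real condition and columns the prediction, in the order
# given by `labels`, so the layout matches the table above (Ham as positive).
cm = confusion_matrix(y_true, y_pred, labels=["ham", "spam"])
print(cm)
# [[3 1]   <- 3 true positives, 1 false negative
#  [1 1]]  <- 1 false positive, 1 true negative
```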
The key point with the confusion matrix, and the various metrics calculated from it, is the question: what constitutes a GOOD metric?
It really depends on the specific situation.
Let's use a confusion matrix to evaluate our model. In this example we are going to test for the presence of a disease. This is supervised learning, so before we run the patients through the testing program, we already know their true conditions, i.e. whether they have the disease or not. So imagine this as testing a new diagnostic tool.
For the presence of the disease:
Yes = Positive Test, or True, or 1
No = Negative Test, or False, or 0
Total people in the test = 165, so N = 165
The result is:

| Real Condition | Predicted No (Negative) | Predicted Yes (Positive) |
|----------------|-------------------------|--------------------------|
| Actual No      | 50                      | 10                       |
| Actual Yes     | 5                       | 100                      |
Let's map each value in the table:

| Real Condition | Predicted No (Negative) | Predicted Yes (Positive) |
|----------------|-------------------------|--------------------------|
| Actual No      | 50 (TN)                 | 10 (FP)                  |
| Actual Yes     | 5 (FN)                  | 100 (TP)                 |
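Where does the headline accuracy figure come from? A minimal sketch in Python, using only the four counts from the mapped table above:

```python
# The four counts from the mapped table above.
TN, FP, FN, TP = 50, 10, 5, 100
total = TN + FP + FN + TP            # 165 people in the test

accuracy = (TP + TN) / total         # correct predictions / all predictions
print(f"Accuracy: {accuracy:.2%}")   # Accuracy: 90.91%
```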
Accuracy = (TP + TN) / Total = (100 + 50) / 165 ≈ 0.91, or about 91%. Now, is 91% accuracy good enough?
This depends on the situation. If you are dealing with cancer, that's a high-stakes game, so 91% is not good enough.
The really important statistic here is the False Negatives: these 5 people had cancer that we predicted as safe. That is an extremely dangerous situation to be in, so you have to keep in mind the context of what your ML model is trying to achieve.
So there is always going to be a trade-off between false negatives and false positives.
In this situation we want to minimize the false negatives. What we'd really like to avoid here is telling someone they are clear of the disease when they actually have it.
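Precision and recall, mentioned earlier, put numbers on exactly this trade-off. A hedged sketch using the same counts; the formulas are the standard definitions rather than anything computed in this post:

```python
TP, FN, FP = 100, 5, 10

recall = TP / (TP + FN)        # of all patients who truly have the disease, how many we caught
precision = TP / (TP + FP)     # of all positive predictions, how many were right

print(f"Recall:    {recall:.2%}")     # 95.24% -- the 5 missed patients drag this down
print(f"Precision: {precision:.2%}")  # 90.91%
```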
You can also calculate the Misclassification Rate:
= (FP + FN) / Total
= 15 / 165
≈ 0.09, or a 9% error rate.
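The same number in code, just to show it is the complement of accuracy (again using the example's counts):

```python
FP, FN, total = 10, 5, 165

misclassification_rate = (FP + FN) / total          # wrong predictions / all predictions
print(f"Error rate: {misclassification_rate:.2%}")  # 9.09%, i.e. roughly 1 - accuracy
```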
In statistics, false positives and false negatives are referred to as Type I and Type II errors.
Type I error, i.e. a false positive: telling a man he is pregnant.
Type II error, i.e. a false negative: telling a pregnant woman she is not pregnant.