❓ Question

Maybe related to Project-MONAI/tutorials#1582

Hello,
I observe that when adding negative patients (with no lesion to detect) to the test or cross-validation set, the FROC score always decreases. When I exclude these patients from the test or cross-validation of the same trained model, the score is higher...
I tried to explain this to myself by the fact that adding negative patients can only add false positives (FP) and nothing else (there are no true negatives in detection), so they cannot increase the sensitivity (Se) and bias it toward lower values at a fixed FP/scan. But a colleague challenged this explanation as follows:
For example, if we have a score of Se 80% @ 2 FP/scan on average, and we add negative healthy patients, Se will remain the same, and we could still expect 2 FP/scan on average...
Do you have any answers or explanations for this phenomenon? Have you observed it as well?
Another side question: the X-axis (FP/scan) is computed at the sample level and then averaged, but what about the Y-axis? Is it computed at the lesion level aggregated across samples, or averaged as well? Maybe this could help with understanding.
Thank you very much.
Best,
Thibault
Regarding your side question: both axes are computed from TP and FP counts pooled across the entire dataset (nothing is computed per sample). The TPs are then "normalized" by the total number of ground-truth objects (i.e. classical object-level sensitivity), and the FPs are normalized by the number of images.
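That pooling can be sketched as follows. This is a hypothetical helper for intuition, not the actual MONAI implementation; the function name and signature are assumptions:

```python
import numpy as np

def froc_points(pred_scores, pred_is_tp, num_gt_lesions, num_images):
    """Sketch of FROC curve computation with dataset-level pooling.

    pred_scores: confidence of every predicted box, pooled over the whole dataset
    pred_is_tp:  boolean per prediction (matched to a ground-truth lesion or not)
    num_gt_lesions: total ground-truth lesions across the dataset
    num_images:     total number of images (scans) in the dataset
    """
    order = np.argsort(pred_scores)[::-1]      # sort predictions by descending confidence
    is_tp = np.asarray(pred_is_tp)[order]
    tp_cum = np.cumsum(is_tp)                  # TPs pooled over the whole dataset
    fp_cum = np.cumsum(~is_tp)
    sensitivity = tp_cum / num_gt_lesions      # y-axis: object-level sensitivity
    fp_per_image = fp_cum / num_images         # x-axis: pooled FPs divided by image count
    return fp_per_image, sensitivity
```

Note that negative images only ever enter through `num_images` in the denominator of the x-axis.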
The behaviour when adding negative patients will change depending on the type of problem you are looking at and on the kind of negative patient images that are added.
Some thoughts:
"... can only add false positives (FP) and nothing else (no true negative in detection), not permitting to increase the sensitivity (Se), biasing it toward low values for fixed FP/scan." The general idea is correct: sensitivity won't increase. However, the false positives are computed as an average across images, so adding negative images adds false positives but simultaneously also increases the allowed total number of FPs in the dataset (example: 5 images with 10 FP total across all images => 2 FP/image; 10 images with 20 FP total across all images => still 2 FP/image).
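The same toy numbers as plain arithmetic (no library assumptions), showing how the denominator effect can go either way:

```python
# FP/image divides the pooled FP count by the number of images, so adding
# images grows the denominator.
fp_total, n_images = 10, 5
assert fp_total / n_images == 2.0   # baseline: 2 FP/image

# Add 5 negative images. If the detector keeps producing FPs at the same
# per-image rate, the ratio is unchanged:
assert 20 / 10 == 2.0               # still 2 FP/image

# But if the detector stays quiet on the negatives (no new FPs), the
# FP/image rate drops, which shifts the FROC operating points favourably:
assert 10 / 10 == 1.0               # now only 1 FP/image at the same sensitivity
```

So whether the FROC score drops or rises hinges on how many FPs the detector actually produces in the added negative images.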
The exact behaviour depends on your detection problem. I have quite often (but not always) observed that networks can differentiate healthy and sick patients quite effectively, producing few FPs in the healthy ones -> this would increase your FROC score, since it increases the total number of allowed FPs in the dataset without adding FPs from the detector. (This is generally indicated by problems where the AP is rather low but the FROC scores are high.)
Your reasoning is correct for AP -> AP only counts TPs and FPs at the object level (the number of images never enters the computation), so adding negative patients can only add FPs and AP can only decrease.
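A minimal object-level AP sketch makes this concrete. This is a hypothetical step-interpolated AP for intuition, not any specific library's implementation:

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    # Object-level AP: area under the precision-recall curve, accumulated
    # at each recall increment. The number of images never appears here.
    order = np.argsort(scores)[::-1]
    tp = np.asarray(is_tp)[order].astype(float)
    tp_cum = np.cumsum(tp)
    fp_cum = np.cumsum(1.0 - tp)
    recall = tp_cum / num_gt
    precision = tp_cum / (tp_cum + fp_cum)
    prev_recall = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - prev_recall) * precision))

# Baseline: 2 lesions, detector finds both plus one false positive.
ap_before = average_precision([0.9, 0.8, 0.7], [True, False, True], num_gt=2)

# Add negative patients in which the detector fires once more (score 0.75):
# one extra FP, no new TP -> precision drops before the last TP is reached.
ap_after = average_precision([0.9, 0.8, 0.75, 0.7],
                             [True, False, False, True], num_gt=2)
assert ap_after < ap_before   # negatives can only hurt object-level AP
```

Because there is no per-image normalization, there is no compensating denominator: every extra FP from a negative patient lowers precision somewhere on the curve.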
I highly recommend the Metrics Reloaded and metrics pitfalls papers to build intuition for these things.
In the end, the best way to investigate this more closely is to look at your images and the predictions :)