TY - JOUR
T1 - An uncertainty estimator method based on the application of feature density to classify mammograms for breast cancer detection
AU - Fuentes-Fino, Ricardo
AU - Calderón-Ramírez, Saúl
AU - Domínguez, Enrique
AU - López-Rubio, Ezequiel
AU - Elizondo, David
AU - Molina-Cabello, Miguel A.
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/10
Y1 - 2023/10
N2 - In the area of medical imaging, one of the factors that can negatively influence the performance of prediction algorithms is the limited number of observations for each class within a labeled dataset. Usually, a second set of unlabeled images is used to increase the number of samples. However, this set introduces two new problems: (i) finding patient observations with pathologies different from those observed in the labeled dataset and (ii) finding images belonging to a distribution different from that of the dataset used to train the model. Thus, merging datasets from different sources can have an adverse effect on the distribution of features. Encountering this type of data (better known as out-of-distribution data) in deployment environments can also lead to varying degrees of performance degradation, as the experimental results show. This research studies the behavior of Feature Density as a mathematical model for estimating predictive uncertainty in supervised classification algorithms, with the aim of improving robustness when out-of-distribution data are present in the dataset. The Feature Density method estimates the density of features by means of histogram calculation (an approximation of the probability density function). Its advantage over the baseline approach (Mahalanobis distance) is that it does not assume a Gaussian distribution of the sample features when estimating uncertainty. This work focuses on the binary classification of mammography X-ray images from three different datasets, simulating conditions with different degrees of contamination by out-of-distribution samples. According to the results obtained, the performance of the proposed method depends directly on the architecture of the implemented neural network.
AB - In the area of medical imaging, one of the factors that can negatively influence the performance of prediction algorithms is the limited number of observations for each class within a labeled dataset. Usually, a second set of unlabeled images is used to increase the number of samples. However, this set introduces two new problems: (i) finding patient observations with pathologies different from those observed in the labeled dataset and (ii) finding images belonging to a distribution different from that of the dataset used to train the model. Thus, merging datasets from different sources can have an adverse effect on the distribution of features. Encountering this type of data (better known as out-of-distribution data) in deployment environments can also lead to varying degrees of performance degradation, as the experimental results show. This research studies the behavior of Feature Density as a mathematical model for estimating predictive uncertainty in supervised classification algorithms, with the aim of improving robustness when out-of-distribution data are present in the dataset. The Feature Density method estimates the density of features by means of histogram calculation (an approximation of the probability density function). Its advantage over the baseline approach (Mahalanobis distance) is that it does not assume a Gaussian distribution of the sample features when estimating uncertainty. This work focuses on the binary classification of mammography X-ray images from three different datasets, simulating conditions with different degrees of contamination by out-of-distribution samples. According to the results obtained, the performance of the proposed method depends directly on the architecture of the implemented neural network.
KW - Deep learning
KW - Feature density
KW - Jensen–Shannon distance
KW - Mahalanobis distance
KW - Uncertainty estimation
UR - http://www.scopus.com/inward/record.url?scp=85168566306&partnerID=8YFLogxK
U2 - 10.1007/s00521-023-08904-3
DO - 10.1007/s00521-023-08904-3
M3 - Article
AN - SCOPUS:85168566306
SN - 0941-0643
VL - 35
SP - 22151
EP - 22161
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 30
ER -