TY - JOUR
T1 - Discovering Diagnostic Features Used by a CNN in Plant Species Identification
AU - Figueroa-Mata, Geovanni
AU - Mata-Montero, Erick
AU - Acosta-Vargas, Luis
N1 - Publisher Copyright: © 2024 Instituto Politecnico Nacional. All rights reserved.
PY - 2024
Y1 - 2024
AB - An approach is proposed to improve the explainability (interpretability) of convolutional neural networks that identify plant species from leaf images. Specifically, a methodology is established to discover the most decisive diagnostic features used by a convolutional neural network (CNN) in the identification of 63 native plant species from Costa Rica. The result is a CNN that not only identifies plant species but also explains its decision through a heat map and a translation of that map into a table of diagnostic features used in classical taxonomy (e.g., apex, primary vein, and leaf base), each with a weight that describes its relative importance. To achieve this, a CNN was trained using leaf images from 63 vascular plant species from Costa Rica. Once the network was trained, the Layer-wise Relevance Propagation (LRP) technique was applied to a subset I of 50 leaf images, distributed uniformly across 10 species, to visualize the representations (heat maps) learned by the internal layers of the CNN. A taxonomist was then asked to perform an equivalent task manually, annotating the same 50 leaf images in I by graphically highlighting the features they judged most significant (feature maps). Finally, the heat maps and feature maps were compared algorithmically to determine the similarity between the hottest areas used by the CNN and the features used in classical taxonomy.
KW - automated plant species identification
KW - convolutional neural network
KW - deep learning
KW - heat map
KW - interpretability
KW - layer-wise relevance propagation
UR - http://www.scopus.com/inward/record.url?scp=85213888323&partnerID=8YFLogxK
U2 - 10.13053/CyS-28-4-4720
DO - 10.13053/CyS-28-4-4720
M3 - Article
AN - SCOPUS:85213888323
SN - 1405-5546
VL - 28
SP - 1741
EP - 1755
JO - Computacion y Sistemas
JF - Computacion y Sistemas
IS - 4
ER -