Real-Time Hand Detection using Convolutional Neural Networks for Costa Rican Sign Language Recognition

Juan Zamora-Mora, Mario Chacon-Rivas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Citations (Scopus)

Abstract

Sign language is the natural language of the deaf, a form of non-verbal communication between signers governed by grammars in constant evolution, since the universe of signs represents only a small fraction of all words in Spanish. This limitation, combined with verbal speakers' general lack of knowledge of sign language, creates a separation in which both parties (signers and non-signers) are unable to communicate efficiently. The problem worsens in specific contexts such as emergencies, where first responders such as EMTs, firefighters, or police officers may be unable to attend to an emergency properly, because interaction between the involved parties becomes a barrier to decision making when time is scarce. In this context, developing a cognitive tool that recognizes sign language ubiquitously is essential to reducing barriers between the deaf and emergency corps. Hand detection is the first step toward building a Costa Rican Sign Language (LESCO) recognition framework. Important advances in computing, particularly in deep learning, open a new frontier in object recognition that can be leveraged to build a hand detection module. This study trains the MobileNet V1 convolutional neural network on the EgoHands dataset from Indiana University's IU Computer Vision Lab to determine whether the dataset alone is sufficient to detect hands in LESCO videos of five different signers who wear short-sleeve shirts and appear against complex backgrounds. These requirements are key to determining the usefulness of the solution, as the consulted literature reports tests only with single-color backgrounds and long-sleeve shirts, which ease classification in controlled environments. The two-step experiment obtained 1) a mean average precision of 96.1% on the EgoHands dataset and 2) a 91% average hand detection accuracy across the five LESCO videos. Despite the high accuracy reported in this paper, the hand detection module was unable to detect certain hand shapes, such as closed fists and open hands pointing perpendicular to the camera lens, which suggests that the complex egocentric views captured in the EgoHands dataset might be insufficient for proper hand detection for Costa Rican Sign Language.
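The paper itself does not publish code, but the inference step the abstract describes, a MobileNet V1-based detector (presumably with an SSD-style detection head, as is typical for this backbone) applied frame by frame to LESCO video, can be illustrated with a minimal sketch. The snippet below assumes a frozen TensorFlow graph exported from such a detector and loaded through OpenCV's DNN module; the file names, video path, and confidence threshold are placeholders, not artifacts from the study.

```python
# Minimal sketch: run a frozen SSD-MobileNet V1 hand detector over video frames.
# Assumes a model fine-tuned on EgoHands and exported as a frozen TensorFlow
# graph; all paths below are hypothetical placeholders.
import cv2

MODEL_PB = "hand_detector_frozen_inference_graph.pb"  # hypothetical export
MODEL_PBTXT = "hand_detector.pbtxt"                   # matching graph config
CONF_THRESHOLD = 0.5

net = cv2.dnn.readNetFromTensorflow(MODEL_PB, MODEL_PBTXT)
cap = cv2.VideoCapture("lesco_signer.mp4")            # placeholder video path

while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    # SSD-MobileNet V1 expects 300x300 inputs; swapRB converts BGR to RGB.
    blob = cv2.dnn.blobFromImage(frame, size=(300, 300), swapRB=True)
    net.setInput(blob)
    detections = net.forward()                        # shape: (1, 1, N, 7)
    for det in detections[0, 0]:
        score = float(det[2])
        if score < CONF_THRESHOLD:
            continue
        # Box coordinates are normalized; scale back to frame size.
        x1, y1 = int(det[3] * w), int(det[4] * h)
        x2, y2 = int(det[5] * w), int(det[6] * h)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("hands", frame)
    if cv2.waitKey(1) == 27:                          # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

Each row of the detector output holds (image_id, class_id, confidence, x1, y1, x2, y2) with normalized coordinates, which is why the sketch rescales the boxes to the frame dimensions before drawing.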

Original language: English
Title of host publication: Proceedings - 2019 International Conference on Inclusive Technologies and Education, CONTIE 2019
Editors: Monica Adriana Carreno-Leon, Jesus Andres Sandoval-Bringas, Mario Chacon-Rivas, Francisco Javier Alvarez-Rodriguez
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 180-186
Number of pages: 7
ISBN (electronic): 9781728154367
DOI
Publication status: Published - Oct 2019
Event: 2nd International Conference on Inclusive Technologies and Education, CONTIE 2019 - San Jose del Cabo, Mexico
Duration: 30 Oct 2019 – 1 Nov 2019

Publication series

Name: Proceedings - 2019 International Conference on Inclusive Technologies and Education, CONTIE 2019

Conference

Conference: 2nd International Conference on Inclusive Technologies and Education, CONTIE 2019
Country/Territory: Mexico
City: San Jose del Cabo
Period: 30/10/19 – 1/11/19
