TY - GEN
T1 - Combining Word Embeddings with Fuzzy Logic to Protect Web Applications: Fuzzy VADAS
AU - Lucas, Aurelio Somarriba
AU - Rodriguez, Cesar Garita
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - In the early era of the Internet, webpages contained only static content, such as text and images. However, with the emergence of Web 2.0, a new set of dynamic web applications appeared, such as online banking, e-commerce, social networking, and gaming, that revolutionized the industry. These new technologies have introduced a new set of vulnerabilities that malicious users can exploit for multiple purposes, such as data exfiltration, modification, or deletion, privilege escalation, malware installation, and DDoS (Distributed Denial of Service) attacks. To detect some of these web attacks, companies rely on Web Application Firewalls (WAFs). WAFs depend on complicated regular expressions (REGEX), written by experienced security researchers, to detect malicious signatures in tampered HTTP requests. The goal of this research is to provide an alternative way to detect these web attacks without relying on complicated regular expressions. The VADAS (Valence Aware worD embedding for web Application Security) approach detects web attacks using a set of revised vocabularies (word embeddings created using unsupervised algorithms) that are commonly found in web attack vectors. These embeddings allow us to calculate a valence score for each word, quantifying how positive or negative it is by using cosine similarity. The output of the VADAS system is connected to a fuzzy logic controller in order to produce a final 'maliciousness' classification result (Fuzzy VADAS). Preliminary results show that Fuzzy VADAS is effective, obtaining an accuracy of over 98%. The proposed Fuzzy VADAS approach provides a new way of detecting web application attacks that relies on minimal interaction with security experts (refinement of dictionaries, removing good words, etc.). This is a significant advantage over existing REGEX rule-based systems.
KW - Artificial Intelligence
KW - Fasttext
KW - Fuzzy Logic
KW - n-gram
KW - NLP
KW - WAF
KW - Web Application Firewalls
KW - Web Application Security
KW - Word Embeddings
UR - http://www.scopus.com/inward/record.url?scp=85193303251&partnerID=8YFLogxK
U2 - 10.1109/CONCAPANXLI59599.2023.10517554
DO - 10.1109/CONCAPANXLI59599.2023.10517554
M3 - Conference contribution
AN - SCOPUS:85193303251
T3 - Proceeding of the 2023 IEEE 41st Central America and Panama Convention, CONCAPAN XLI 2023
BT - Proceeding of the 2023 IEEE 41st Central America and Panama Convention, CONCAPAN XLI 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 41st IEEE Central America and Panama Convention, CONCAPAN 2023
Y2 - 8 November 2023 through 10 November 2023
ER -