A Machine Learning-Based Network Intrusion Detection System for Apache Weblogs Using the Random Forest Algorithm
Abstract
With the rapid development of internet technologies, people increasingly rely on the internet, and websites have become essential platforms for information dissemination and service delivery in daily life. However, while websites bring convenience, they have also become primary targets for malicious actors, which has led to a sharp increase in cybersecurity incidents. To identify and detect malicious network attacks effectively, this study proposes a machine learning-based approach that classifies attacks in Apache weblogs using the Random Forest algorithm. A balanced training dataset covering four attack categories was constructed. Eight candidate features were extracted, and Gini feature importance was used as the evaluation metric in ablation studies on different feature combinations.
The experimental results demonstrate that the proposed method reaches a prediction accuracy of up to 99.49%, with precision, recall, and F1-score all exceeding 99%. These results indicate that the proposed approach can be used to build a more reliable attack classifier, and they confirm the potential of combining machine learning with log feature analysis in the field of cybersecurity.
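As a rough illustration of the two ideas the abstract combines, extracting features from Apache log lines and scoring them by Gini impurity, the following minimal Python sketch may help. It is not the authors' implementation. The regular expression assumes the Apache Common/Combined Log Format, and the four features, the helper names (`extract_features`, `gini`, `gini_decrease`), and the sample log lines are all hypothetical and chosen only for demonstration. The abstract does not list the eight features the study actually used.

```python
import re
from urllib.parse import unquote

# Assumed layout: Apache Common/Combined Log Format
# host ident user [time] "METHOD URL PROTOCOL" status size ...
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)(?: \S+)?" (?P<status>\d{3}) (?P<size>\S+)'
)


def extract_features(line):
    """Return hypothetical features for one log line, or None if it does not parse."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    url = unquote(m.group("url"))  # decode %27 -> ', %20 -> space, etc.
    return {
        "url_length": len(url),
        "special_chars": sum(url.count(c) for c in "'\"<>;()"),
        "has_traversal": int("../" in url),
        "status": int(m.group("status")),
    }


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_c ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical sample lines (not taken from the paper's dataset).
SAMPLES = [
    ('10.0.0.1 - - [01/May/2025:10:00:00 +0800] '
     '"GET /index.html HTTP/1.1" 200 1024', "normal"),
    ('10.0.0.2 - - [01/May/2025:10:00:01 +0800] '
     '"GET /item?id=1%27%20OR%20%271%27=%271 HTTP/1.1" 200 512', "sqli"),
]

features = [extract_features(line) for line, _ in SAMPLES]
labels = [label for _, label in SAMPLES]
special = [f["special_chars"] for f in features]
decrease = gini_decrease(special, labels, threshold=0)
```

In a random forest, the Gini importance of a feature is, roughly, the sum of such impurity decreases over every split that uses the feature, weighted by the share of samples reaching the split and averaged across trees. Library implementations such as scikit-learn's `RandomForestClassifier` expose a normalized version of this quantity as `feature_importances_`.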
PIN-RONG CHEN, HAO-YU WENG, CHENG-TA HUANG, "A Machine Learning-Based Network Intrusion Detection System for Apache Weblogs Using the Random Forest Algorithm," Communications of the CCISA, vol. 31, no. 2, pp. 46-59, May 2025.
Published by Chinese Cryptology and Information Security Association (CCISA), Taiwan, R.O.C
CCISA Editorial Office
E-mail: ccisa.editor@gmail.com