Enhanced Ensemble Learning Approaches for Malicious URL Detection: A Comparative Analysis of Advanced Hybrid Models
DOI:
https://doi.org/10.54692/ijeci.2025.0902/261
Keywords:
obfuscation, PICOS-based methodological framing, algorithmic URL generators, AdaBoost, XGBoost, malicious URL detection systems
Abstract
Malicious URLs have become a persistent cybersecurity threat, serving as entry points for phishing campaigns, malware distribution and identity theft. Conventional blacklist and heuristic-based systems are increasingly ineffective at detecting these dynamic URLs, especially those that rely on domain obfuscation, fast-flux hosting and algorithmic URL generators. Machine learning (ML) for URL classification has been studied extensively, but there is little comparative evidence on advanced ensemble learning methods. This paper experimentally compares five ensemble algorithms (Random Forest, Gradient Boosting, XGBoost, Stacking Classifier and AdaBoost) on the Malicious Webpages Dataset, which contains 1,781 samples and 21 lexical, host-based, DNS and network features. Systematic preprocessing, PICOS-based methodological framing and a PRISMA-based literature synthesis strengthen the rigor of the study. XGBoost achieved the best results, with 98.31% accuracy, 97.85% precision, 98.77% recall and a 98.31% F1-score, outperforming the AdaBoost baseline (96.89% accuracy). Confusion matrices, ROC curves, computational-efficiency indicators and feature-importance rankings further confirm XGBoost's strong performance and its suitability for real-time use. The study contributes a comprehensive comparative analysis, greater methodological clarity and practical guidance for building efficient malicious URL detection systems.
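As a quick consistency check on the reported XGBoost metrics, the F1-score can be recomputed from precision and recall as their harmonic mean. The sketch below is illustrative only: the helper name `f1_score` and the variable names are assumptions made for this example, not code from the paper, and the inputs are simply the percentages quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Hypothetical helper for illustration; inputs and output share the
    same units (percentages here).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# XGBoost values as reported in the abstract, in percent.
xgb_precision = 97.85
xgb_recall = 98.77

xgb_f1 = f1_score(xgb_precision, xgb_recall)
print(f"F1 = {xgb_f1:.2f}%")  # F1 = 98.31%
```

The recomputed value agrees with the reported F1-score of 98.31%, so the three figures are mutually consistent.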