A Stacked Heterogeneous Ensemble Learning Framework for Ransomware Network Traffic Detection Using WEKA: An Empirical Study on CIC-IDS2018
DOI:
https://doi.org/10.54692/ijeci.2026.1001/270
Keywords:
Ransomware detection, WEKA, stacked generalisation, CIC-IDS2018, network intrusion detection, machine learning
Abstract
Ransomware-related network intrusions are growing in both number and sophistication, so detection frameworks need to be transparent, reproducible, and generalisable across diverse traffic patterns. This study develops and validates a stacked heterogeneous ensemble in WEKA 3.8.6 for classifying ransomware-associated network traffic in the CIC-IDS2018 dataset. Random Forest, J48, Naïve Bayes, SMO and k-NN serve as base learners and Logistic Regression as the meta-learner; per-fold predictions were exported for further statistical analysis. A preprocessing pipeline following the CRISP-DM framework (cleaning, normalisation and correlation-based feature selection) reduced eighty raw attributes to twenty-two. Within each cross-validation fold, SMOTE was applied independently to the training partition to balance the minority class. Under stratified ten-fold cross-validation with three independent seeds (1, 7, 42), the stacked ensemble achieved the highest accuracy, 99.18 ± 0.18%, with an F1-score of 0.940, an AUC of 0.978 and an MCC of 0.936. Its improvement over the best single classifier (Random Forest, accuracy 98.3%, F1-score 0.910) was statistically significant according to paired t-tests (p = 0.00021) and the Friedman–Nemenyi procedure (χ² = 37.9, p < 0.001, CD = 1.62). The results indicate that heterogeneous stacking captures complementary decision boundaries that individual WEKA classifiers do not. The proposed pipeline is transparent and open-source, making it suitable for resource-constrained network intrusion detection and digital-forensic triage.
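The fold-wise use of SMOTE described in the abstract can be illustrated with a minimal sketch. The study itself was carried out in WEKA; the pure-Python code below is not the authors' implementation, and every name in it (`smote`, `stratified_folds`, `balanced_training_sets`) is hypothetical. It assumes a binary problem, a simplified SMOTE variant with k = 5 nearest neighbours, and balancing to a 1:1 class ratio, none of which is stated in the abstract. The point it shows is that synthetic samples are generated from the training partition only, so each held-out fold contains nothing but original records.

```python
import random

def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance,
        # excluding x itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

def stratified_folds(labels, n_folds, seed):
    """Assign indices to folds so that each fold keeps the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds

def balanced_training_sets(X, y, n_folds=10, seed=1, minority=1):
    """Yield (X_train, y_train, test_idx) per fold, oversampling the
    minority class in the training partition only (assumed 1:1 target)."""
    rng = random.Random(seed)
    for test_idx in stratified_folds(y, n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        minority_pts = [x for x, lab in zip(X_tr, y_tr) if lab == minority]
        # Number of synthetic points needed to match the majority count.
        deficit = (len(y_tr) - len(minority_pts)) - len(minority_pts)
        if deficit > 0 and len(minority_pts) > 1:
            new_pts = smote(minority_pts, deficit, rng=rng)
            X_tr = X_tr + new_pts
            y_tr = y_tr + [minority] * len(new_pts)
        yield X_tr, y_tr, test_idx
```

Calling `balanced_training_sets(X, y, n_folds=10, seed=s)` once for each seed in (1, 7, 42) would mirror the evaluation protocol in the abstract: a classifier is fitted on each balanced `X_tr`, `y_tr` and scored on the untouched records indexed by `test_idx`. Oversampling before splitting would instead let synthetic neighbours of test records leak into training and inflate the reported metrics.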