A Stacked Heterogeneous Ensemble Learning Framework for Ransomware Network Traffic Detection Using WEKA: An Empirical Study on CIC-IDS2018

Saddam Ali; Muhammad Tayyab Waqar; Fareeha Akbar; Abdul Wahid Soomro; Khalid Ali; Muhammad Ali

doi:10.54692/ijeci.2026.1001/270

Authors

Saddam Ali, International Collaborative Research Group, Lahore, Pakistan
Muhammad Tayyab Waqar, Department of Computer Sciences, University of Management and Technology, Lahore, Pakistan
Fareeha Akbar, International Collaborative Research Group, Lahore, Pakistan
Abdul Wahid Soomro, Department of Computer Systems and Technology, Universiti Malaya, Kuala Lumpur 50603, Malaysia
Khalid Ali, Department of Computer Science, University College of Dera Murad Jamali, Lasbela University of Agriculture, Water and Marine Science, Uthal, Balochistan, Pakistan
Muhammad Ali, International Collaborative Research Group, Islamabad, Pakistan

DOI:

https://doi.org/10.54692/ijeci.2026.1001/270

Keywords:

Ransomware detection, WEKA, stacked generalisation, CIC-IDS2018, network intrusion detection, machine learning

Abstract

Ransomware-related network intrusions are growing in both number and sophistication, so a detection framework that is transparent, reproducible, and generalisable across diverse network traffic patterns is essential. This study develops and validates a stacked heterogeneous ensemble in WEKA 3.8.6, with Random Forest, J48, Naïve Bayes, SMO and k-NN as base learners and Logistic Regression as the meta-learner, evaluated on the ransomware-associated network-traffic classification dataset CIC-IDS2018. Per-fold predictions were exported for further statistical analysis. A preprocessing pipeline following the CRISP-DM framework (cleaning, normalisation and correlation-based feature selection) reduced the eighty raw attributes to twenty-two. Within each cross-validation fold, SMOTE was applied independently to the training partition only, to balance the minority class. Under stratified ten-fold cross-validation repeated with three independent seeds (1, 7, 42), the stacked ensemble achieved the highest accuracy, 99.18±0.18%, together with an F1-score of 0.940, an AUC of 0.978 and an MCC of 0.936. Its improvement over the best single classifier (Random Forest, accuracy 98.3%, F1-score 0.910) was statistically significant according to paired t-tests (p = 0.00021) and the Friedman–Nemenyi procedure (χ² = 37.9, p < 0.001, CD = 1.62). These results indicate that heterogeneous stacking captures complementary decision boundaries that the individual WEKA classifiers do not. The proposed pipeline is transparent and built on open-source tooling, making it suitable for resource-constrained network intrusion detection and digital-forensic triage.
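The fold-wise balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' WEKA pipeline: it is a standard-library Python approximation, assuming a binary problem, and the function names (`stratified_folds`, `smote`, `balanced_training_folds`), the toy two-feature data, and the neighbour count of 5 are all hypothetical choices made for illustration. The point it shows is that synthetic minority samples are generated from the training partition of each fold only, so the held-out partition is never touched.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed):
    """Split indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal each class round-robin
    return folds


def smote(X, y, minority, n_neighbors, rng):
    """SMOTE-style oversampling (binary case): interpolate between a
    minority row and one of its nearest minority neighbours until the
    two classes have equal size."""
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(X) - len(minority_rows)) - len(minority_rows)
    synthetic = []
    for _ in range(max(0, n_needed)):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (m for m in minority_rows if m is not base),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
        )[:n_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return X + synthetic, y + [minority] * len(synthetic)


def balanced_training_folds(X, y, minority, k=10, seed=1, n_neighbors=5):
    """Yield (X_train_balanced, y_train_balanced, test_indices) per fold.
    Oversampling sees the training partition only (no test leakage)."""
    rng = random.Random(seed)
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_bal, y_bal = smote(X_tr, y_tr, minority, n_neighbors, rng)
        yield X_bal, y_bal, test_idx


# Toy, synthetic data (NOT CIC-IDS2018): 90 benign rows, 10 ransomware rows.
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = ["benign"] * 90 + ["ransomware"] * 10

results = list(balanced_training_folds(X, y, "ransomware", k=10, seed=1))
for X_bal, y_bal, test_idx in results:
    print(len(test_idx), y_bal.count("benign"), y_bal.count("ransomware"))
```

Repeating the loop with seeds 1, 7 and 42 would mirror the three-seed protocol in the abstract. Balancing inside the fold matters because oversampling before the split would let interpolated copies of held-out rows leak into training and inflate the reported scores.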
