A Survey of Classical and Hybrid Machine Learning Models for Android Malware Detection: Techniques, Taxonomies, Challenges, and Future Research Directions

Mr. Ankit Singh

DOI: https://doi.org/10.17492/computology.v5i1.2506

Journal Press India®

Published Online: September 12, 2025

Author Details ( * ) denotes Corresponding author

1. * Ankit Singh, Student, Computer Engineering, NIT-Kurukshetra, Kurukshetra, Rajasthan, India (emailto.ankit123@gmail.com)

The rapid growth of Android devices and applications has greatly enlarged the attack surface for cyber threats, especially malware. Conventional signature-based detection has become inadequate against advanced evasion strategies such as obfuscation, polymorphism, and encryption. In response, machine learning (ML) has received significant attention as a basis for intelligent, adaptable malware detection systems. This survey reviews recent progress in ML techniques for Android malware detection, deliberately omitting deep learning methods to concentrate on classical and ensemble ML models. It first describes the main forms of malware and the methods they use to avoid detection, then summarizes malware analysis approaches: static, dynamic, and hybrid. Particular emphasis is placed on feature engineering, covering both extraction and selection, since both are vital to the accuracy and efficiency of ML classifiers. The survey classifies and examines commonly used ML algorithms, including Random Forests, Support Vector Machines, Decision Trees, Naive Bayes, and k-Nearest Neighbors, with attention to their usage scenarios, advantages, drawbacks, and reported performance on standard datasets. It also examines the difficulties facing existing ML-based methods, such as class imbalance, dataset diversity, overfitting, and poor generalization to new malware types. Finally, it highlights future research directions, including hybrid analysis methods, explainable ML models, and adaptive learning strategies to address concept drift and adversarial attacks. By synthesizing the current body of work, this review seeks to assist researchers and practitioners in building effective, scalable, and robust ML-based solutions for Android malware detection.
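To make the feature-selection-then-classification workflow described above concrete, the following is a minimal sketch rather than a method taken from the survey. It assumes binary permission features, uses information gain as one possible filter-style selector, and trains a Bernoulli Naive Bayes classifier in plain Python. The feature names, the six-sample dataset, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: each row is a binary vector over four made-up
# permission features; label 1 = malware, 0 = benign.
FEATURES = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "VIBRATE"]
X = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(X, y, j):
    """Reduction in label entropy after splitting on binary feature j."""
    gain = entropy(y)
    for v in (0, 1):
        subset = [label for row, label in zip(X, y) if row[j] == v]
        if subset:
            gain -= len(subset) / len(y) * entropy(subset)
    return gain


def select_top_k(X, y, k):
    """Indices of the k features with the highest information gain."""
    ranked = sorted(
        range(len(X[0])), key=lambda j: information_gain(X, y, j), reverse=True
    )
    return sorted(ranked[:k])


def train_bernoulli_nb(X, y, cols):
    """Fit Bernoulli Naive Bayes with Laplace smoothing on selected columns."""
    model = {}
    for c in set(y):
        rows = [row for row, label in zip(X, y) if label == c]
        prior = len(rows) / len(y)
        probs = {j: (sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in cols}
        model[c] = (prior, probs)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def score(c):
        prior, probs = model[c]
        return math.log(prior) + sum(
            math.log(p if row[j] else 1 - p) for j, p in probs.items()
        )
    return max(model, key=score)


selected = select_top_k(X, y, k=2)           # keep the two most informative features
model = train_bernoulli_nb(X, y, selected)   # train only on the selected columns
prediction = predict(model, [1, 0, 1, 0])    # classify an unseen permission vector
```

On this toy data the selector discards INTERNET, which every sample requests and which therefore carries no information about the label. Real studies work with thousands of features and far larger, often imbalanced, datasets, so a result on six samples says nothing about real-world accuracy.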

Keywords

Android malware; Machine learning; Static analysis; Dynamic analysis; Hybrid analysis; Malware detection; Ensemble learning; Feature selection; Mobile security; Adversarial robustness; Classification algorithms; Cybersecurity
