Author Details
(*) denotes corresponding author
The rapid growth of Android devices and applications has greatly enlarged the attack surface for cyber threats, especially malware. Conventional signature-based detection has become inadequate against advanced evasion strategies such as obfuscation, polymorphism, and encryption. In response, machine learning (ML) has received significant attention as an effective basis for intelligent and adaptable malware detection systems. This review surveys recent progress in ML techniques for detecting Android malware, deliberately omitting deep learning methods in order to concentrate on traditional and ensemble ML models. It begins by describing the main types of malware and the methods they use to evade detection, then summarizes the static, dynamic, and hybrid approaches to malware analysis. Particular emphasis is placed on feature engineering, covering both feature extraction and feature selection, since both are vital to the accuracy and efficiency of ML classifiers. The review categorizes and examines commonly used ML algorithms, including Random Forests, Support Vector Machines, Decision Trees, Naive Bayes, and k-Nearest Neighbors, and discusses their usage scenarios, advantages, drawbacks, and reported performance on standard datasets. We also examine the difficulties facing existing ML-based methods, such as class imbalance, dataset diversity, overfitting, and poor generalization to new malware types. Finally, we highlight directions for future research, including the integration of hybrid analysis methods, explainable ML models, and adaptive learning strategies that address concept drift and adversarial attacks. By synthesizing the current body of work, this review seeks to assist researchers and practitioners in building effective, scalable, and robust ML-based solutions for detecting Android malware.
Keywords
Android malware; Machine learning; Static analysis; Dynamic analysis; Hybrid analysis; Malware detection; Ensemble learning; Feature selection; Mobile security; Adversarial robustness; Classification algorithms; Cybersecurity