Peer-reviewed article

PowerDetector: Malicious PowerShell script family classification based on multi-modal semantic fusion and deep learning

2023; Institute of Electrical and Electronics Engineers; Volume: 20; Issue: 11; Language: English

10.23919/jcc.fa.2022-0509.202311

ISSN

1673-5447

Authors

Xiuzhang Yang, Guojun Peng, Dongni Zhang, Yuhang Gao, Chenguang Li

Topic(s)

Cybercrime and Law Enforcement Studies

Abstract

PowerShell is widely used in fileless malware and advanced persistent threat (APT) attacks because of its stealth and its suitability for living-off-the-land techniques. However, existing work focuses mainly on deobfuscation and malicious script detection, and lacks family classification and behavior analysis of malicious PowerShell scripts. Moreover, state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multi-modal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from characters, tokens, the abstract syntax tree (AST), and a semantic knowledge graph. We then design four embeddings (i.e., Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm that concatenates the feature vectors from the different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to perform PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a precision of 0.9402, a recall of 0.9358, and an F-score of 0.9374. Furthermore, single-modal and multi-modal comparison experiments demonstrate that PowerDetector's multi-modal embedding and deep learning model achieve better accuracy and can even identify more unknown attacks.
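
The fusion step described in the abstract, concatenating per-view feature vectors, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the helper names (`hashed_bag`, `char_view`, `token_view`, `fuse`), the vector width `DIM`, and the use of simple feature hashing in place of the paper's learned Char2Vec and Token2Vec embeddings. The AST and knowledge-graph views and the transformer/CNN-BiLSTM classifier are omitted.

```python
import hashlib
import re

DIM = 8  # hypothetical per-view vector width, chosen only for this sketch


def hashed_bag(items, dim=DIM):
    """Map strings to a fixed-length, L1-normalised count vector (feature hashing)."""
    vec = [0.0] * dim
    for item in items:
        digest = hashlib.md5(item.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def char_view(script):
    """Character-level view: one feature per character occurrence."""
    return hashed_bag(list(script))


def token_view(script):
    """Token-level view: a rough regex tokenizer, not a real PowerShell lexer."""
    tokens = re.findall(r"[A-Za-z_][\w-]*|\$\w+|\S", script)
    return hashed_bag([t.lower() for t in tokens])


def fuse(script):
    """Fuse the views by concatenating their vectors into one feature vector."""
    return char_view(script) + token_view(script)


sample = "IEX (New-Object Net.WebClient).DownloadString($url)"
fused = fuse(sample)
print(len(fused))  # 2 views x DIM entries
```

In a fuller pipeline, each view would be produced by a trained embedding rather than a hash, and the fused vector would be passed to the classifier; concatenation itself stays the same regardless of how the individual views are built.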
