A Comparative Assessment of Malware Classification using Binary Texture Analysis and Dynamic Analysis

Abstract

AI techniques play an important role in automated malware classification. Several machine-learning methods have been applied to classify or cluster malware into families, based on different features derived from dynamic review of the malware. While these approaches demonstrate promise, they are themselves subject to a growing array of countermeasures that increase the cost of capturing these binary features. Further, feature extraction requires a time investment per binary that does not scale well to the daily volume of binary instances reported by those who diligently collect malware. Recently, a new type of feature extraction, used by a classification approach called binary-texture analysis, was introduced in [16]. We compare this approach to previously published malware classification approaches. We find that binary-texture analysis provides classification accuracy comparable to that of contemporary dynamic techniques, while delivering these results 4000 times faster. Surprisingly, the texture-based approach also seems resilient to contemporary packing strategies, and can robustly classify a large corpus of malware containing both packed and unpacked samples. We present our experimental results from three independent malware corpora, comprising over 100,000 malware samples. These results suggest that binary-texture analysis could be a useful and efficient complement to dynamic analysis.
Lakshmanan Nataraj, Vinod Yegneswaran, Phil Porras, Jian Zhang
Workshop on Artificial Intelligence and Security (AISec), Chicago, Oct. 2011.