[ Contact Info ]

This is a legacy project and is no longer maintained.

Malware Obfuscations

This page describes a completed study of prevalent obfuscation techniques used by malware and is supported by grants from DOE, DHS, NSF, and PC anti-virus reviews. The survey paper is available here [PDF].

SD-Dyninst's malware analysis results have moved here.

Project Description

Security analysts' understanding of the behavior and intent of malware samples depends on their ability to build high-level analysis artifacts from the raw bytes of program binaries. Thus, the first step in analyzing defensive malware is understanding what obfuscations are present in real-world malware binaries. To this end, we present a thorough examination of the obfuscation techniques used by the packer tools that are most popular with malware authors [Bustamante 2008]. Though previous studies have discussed the current state of binary packing [Yason 2007], anti-debugging [Falliere 2007], and anti-unpacking [Ferrie 2008] techniques, there have been no comprehensive studies of the obfuscation techniques that are applied to binary code. While some of the individual obfuscations that we discuss have been reported independently, this paper consolidates the discussion while adding substantial depth and breadth to it.

We describe obfuscations that make binary code difficult to discover (e.g., control-transfer obfuscations, exception-based control transfers, incremental code unpacking, code overwriting); to accurately disassemble into instructions (e.g., ambiguous code and data, disassembler fuzz-testing, non-returning calls); to structure into functions and basic blocks (e.g., obfuscated calls and returns, call-stack tampering, overlapping functions and basic blocks); to understand (e.g., obfuscated constants, calling-convention violations, chunked control-flow, do-nothing code); and to manipulate (e.g., self-checksumming, anti-relocation, stolen-bytes techniques). We also discuss how to mitigate the impact of these obfuscations on analysis tools such as disassemblers, decompilers, slicers, instrumenters, and emulators. This work is done in the context of our project to build tools for the analysis [Jacobson et al. 2011][Rosenblum et al. 2008] and instrumentation [Bernat and Miller 2011][Hollingsworth et al. 1994] of binaries, and of our recent work on extending these analyses to malware binaries that are highly defensive [Bernat et al. 2011][Roundy and Miller 2010].
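To make one of the "difficult to manipulate" obfuscations concrete, the following minimal sketch illustrates the idea behind self-checksumming: a program records a checksum of its own code bytes and later recomputes it, so that any patch (such as a breakpoint or an instrumentation jump) changes the result. The byte sequence, the use of CRC-32, and the function name are assumptions chosen for illustration; real packers use their own checksum routines and code regions.

```python
import zlib


def checksum(code: bytes) -> int:
    """Checksum over a code region (CRC-32 is an illustrative choice)."""
    return zlib.crc32(code)


# A small x86 function: push ebp; mov ebp, esp; xor eax, eax; pop ebp; ret
original = bytes([0x55, 0x8B, 0xEC, 0x33, 0xC0, 0x5D, 0xC3])

# The protected program stores the expected checksum of its own code.
expected = checksum(original)

# An analysis tool patches the first byte with int3 (0xCC), as a software
# breakpoint would.
patched = bytearray(original)
patched[0] = 0xCC

# The self-check recomputes the checksum and notices the modification.
tampered = checksum(bytes(patched)) != expected
print("tampering detected:", tampered)
```

A tool that patches code in place therefore has to hide its modifications from such checks, for example by ensuring that reads of patched code see the original bytes.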

Methodology

We use a combination of manual and automated analysis techniques for this study. We began by creating a set of defensive program binaries that incorporate all of the anti-analysis techniques found in real obfuscated malware. We created these binaries by obtaining the latest versions of the binary packer and protector tools that are most popular with malware authors [Bustamante 2008] and applying them to program binaries of various sizes. We then carefully analyzed the binaries, paying special attention to the obfuscated bootstrap code with which the modified program unrolls the original binary payload into the address space, and to any changes that the obfuscation tool made to the payload code itself.

We obtained most of our observations on these obfuscated binaries by adapting the Dyninst binary code analysis and instrumentation tool for the analysis of highly defensive program binaries, and then using it for that purpose [Bernat et al. 2011; Roundy and Miller 2010]. Our design and development process required a detailed understanding of the obfuscation techniques employed by these packers, which resisted our attempts to discover, analyze, monitor, and modify their code. Our ambitious analysis and instrumentation goals made this a significant challenge. Dyninst applies parsing techniques to disassemble obfuscated code and construct control-flow graphs (CFGs) of the program, updating this analysis at runtime by observing the behavior of the monitored program. Based on this analysis of the defensive binaries, we stress-tested our tool's analysis and instrumentation techniques by instrumenting every memory access and every basic block in the program. Our instrumentation tool is designed to be resistant to any errors in the analysis [Bernat et al. 2011]; however, our initial prototype was not, and it therefore ran head-on into nearly every obfuscation technique employed by these programs [Roundy and Miller 2010]. We automatically generate statistical reports of defensive techniques employed by these packer tools with our malware-resistant version of Dyninst and present those results in this study. We also spent considerable time perusing each binary's obfuscated code by hand in the process of getting Dyninst to successfully analyze these binaries, aided by the OllyDbg and IDA Pro interactive debuggers (Dyninst does not have a code-viewing GUI). In particular, we systematically studied the bootstrap code of each packed binary to achieve a thorough understanding of its overall behavior and high-level anti-analysis techniques.
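The general idea of discovering code by following control flow, rather than decoding every byte in sequence, can be sketched on a toy example. The instruction set, the program, and the function names below are hypothetical and are not Dyninst's API or implementation; the sketch only shows why junk bytes interleaved with code are never mistaken for instructions when parsing follows control-transfer targets from a known entry point.

```python
# Toy instruction set (hypothetical): every instruction occupies one address.
#   ("op", None)   ordinary instruction, falls through to the next address
#   ("jmp", t)     unconditional jump to address t
#   ("jcc", t)     conditional branch: target t or fall through
#   ("ret", None)  return, no successors
#   ("junk", None) data bytes placed in the code section
PROGRAM = {
    0: ("op", None),
    1: ("jcc", 4),
    2: ("op", None),
    3: ("jmp", 6),
    4: ("op", None),
    5: ("ret", None),
    6: ("ret", None),
    7: ("junk", None),  # never reached by control flow
    8: ("junk", None),
}


def successors(addr, program):
    """Return the addresses that control may reach directly from addr."""
    kind, target = program[addr]
    if kind == "jmp":
        return [target]
    if kind == "jcc":
        return [target, addr + 1]
    if kind == "ret":
        return []
    return [addr + 1]


def discover_code(entry, program):
    """Worklist traversal from entry; returns reachable addresses and edges."""
    seen, edges = set(), set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in seen or addr not in program:
            continue
        seen.add(addr)
        for succ in successors(addr, program):
            edges.add((addr, succ))
            worklist.append(succ)
    return seen, edges


reachable, edges = discover_code(0, PROGRAM)
print(sorted(reachable))  # addresses 7 and 8 (junk) are not included
```

A purely static traversal like this one misses code that is reached only through obfuscated or computed control transfers, or that is unpacked at runtime, which is why the analysis described above is also updated while the program runs.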

Publications

Kevin A. Roundy and Barton P. Miller. "Binary-Code Obfuscations in Prevalent Packer Tools", Submitted for publication. [PDF]

Andrew R. Bernat, Kevin A. Roundy, and Barton P. Miller. "Efficient, Sensitivity Resistant Binary Instrumentation", International Symposium on Software Testing and Analysis (ISSTA), Toronto, Canada, July 2011. [PDF]

Kevin A. Roundy and Barton P. Miller. "Hybrid Analysis and Control of Malware Binaries", Recent Advances in Intrusion Detection (RAID), Ottawa, Canada, September 2010. [PDF]

References

Bernat, A. R. and Miller, B. P. 2011. Anywhere, Any Time Binary Instrumentation. In Workshop on Program Analysis for Software Tools and Engineering (PASTE). Szeged, Hungary.

Bernat, A. R., Roundy, K. A., and Miller, B. P. 2011. Efficient, Sensitivity Resistant Binary Instrumentation. In International Symposium on Software Testing and Analysis (ISSTA). Toronto, Canada.

Bustamante, P. 2008. Packer (r)evolution. Panda Research web article.

Falliere, N. 2007. Windows anti-debug reference. Infocus web article.

Ferrie, P. 2008. Anti-unpacker tricks. In International CARO Workshop. Amsterdam, Netherlands.

Hollingsworth, J. K., Miller, B. P., and Cargille, J. 1994. Dynamic program instrumentation for scalable performance tools. In Scalable High Performance Computing Conference. Knoxville, TN.

Jacobson, E. R., Rosenblum, N. E., and Miller, B. P. 2011. Labeling library functions in stripped binaries. In Workshop on Program Analysis for Software Tools and Engineering (PASTE). Szeged, Hungary.

Rosenblum, N. E., Zhu, X., Miller, B. P., and Hunt, K. 2008. Learning to analyze binary computer code. In Conference on Artificial Intelligence (AAAI). Chicago, IL.

Roundy, K. A. and Miller, B. P. 2010. Hybrid analysis and control of malware. In Symposium on Recent Advances in Intrusion Detection (RAID). Ottawa, Canada.

Yason, M. V. 2007. The art of unpacking. In Blackhat USA. Las Vegas, NV.
