|
DOC is an abstract interpreter that finds obfuscated calls in x86 executables using a structure known as an Abstract Stack Graph (ASG). It was implemented as part of a graduate-level research project and is still in its prototype stage.

Abstract interpretation is the process of interpreting a program by applying its operations to an abstract domain rather than to concrete values. It differs from ordinary interpretation in that it aims to determine all possible values of a program variable across all possible program paths. Where ordinary interpretation may determine that a program variable has the value 5 if a certain branch is taken and 10 if another branch is taken, abstract interpretation concludes that the variable holds one of the values in the set {5, 10}, regardless of the branch taken. In this case, a set of integers serves as the abstract representation of the program value at that point.

In DOC, the abstract domain for detecting obfuscated calls consists of a reduced-interval congruence (an abstract representation of integers) and an abstract stack graph (an abstract representation of a stack). Using these, we are able to detect places where the call instruction may have been replaced by equivalent instructions. For instance, a call can be replaced by a push/jump combination without changing the meaning of the program.

This guide is intended to help the reader get DOC running. It is only a very brief introduction to DOC and abstract interpretation, and it does not discuss abstract interpretation or the abstract stack graph in depth. For a much more detailed explanation of how this works, please see our past papers.

History

The DOC project began with Eric Uday Kumar (now at Authentium) under the supervision of Dr. Arun Lakhotia (University of Louisiana at Lafayette).
This first implementation performed call obfuscation detection using the abstract stack graph. Michael Venable later added the reduced-interval congruence to the abstract domain, refactored much of the code, and rewrote the UI. The reduced-interval congruence makes it possible to determine the contents of registers and other program variables, and therefore the targets of indirect jumps, among other things. This project is coming to a close at the University of Louisiana at Lafayette, so we have decided to make it open source in the hope that others may benefit from it and contribute improvements. |
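
The {5, 10} example from the introduction can be illustrated with a small sketch. This is not DOC's code: the function names `concrete_run` and `abstract_run` are hypothetical, and a plain set of integers stands in for the abstract domain (DOC itself uses a reduced-interval congruence and an abstract stack graph). It only shows how ordinary interpretation follows a single path, while abstract interpretation joins the results of every path.

```python
from typing import FrozenSet


def concrete_run(take_branch: bool) -> int:
    """Ordinary interpretation: follow one path and produce one value."""
    if take_branch:
        x = 5
    else:
        x = 10
    return x


def abstract_run() -> FrozenSet[int]:
    """Abstract interpretation over a set-of-integers domain (illustrative only)."""
    then_state = frozenset({5})   # abstract value of x after the 'then' branch
    else_state = frozenset({10})  # abstract value of x after the 'else' branch
    # At the point where the two paths merge, the abstract values are joined.
    # For a set domain, the join is simply set union.
    return then_state | else_state


if __name__ == "__main__":
    print(concrete_run(True))      # value on one path
    print(concrete_run(False))     # value on the other path
    print(sorted(abstract_run()))  # every value x may hold, on any path
```

A set domain is exact but can grow without bound, which is one reason analyses of real programs tend to use more compact representations of integers.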