Evaluating the Effect of the Application of a Heuristic on the Accuracy of a Malware Detector. Jessica Turner, TSYS Dept. of Computer Science, Columbus State University. Saturday, April 18th, 2009. CSU Student Colloquium.
MALWARE AND MARKOV CHAINS
Malware (MALicious softWARE) is a program intentionally designed to perform harmful or undesirable acts on an unsuspecting user [1] (e.g., viruses, worms, key-loggers, spyware, adware). Malicious programs are most often variations of once trustworthy programs. These unsafe variants can be made quickly and efficiently using automated program transformation tools, i.e., code morphers. [2] In an effort to combat the spread of these malicious variants, a Markov chain-based framework is evaluated that would offer a fast, approximate method for detecting known morphers. [2]

IMPLEMENTATION
The original malicious program, eve, of the W32.Evol virus was obtained from http://vx.netlux.org/
The left hand side instructions were extracted from eve using the Java programming language, in order to obtain a list of all possible left hand side instructions. In the following instance, the instruction 'mov' was extracted.
Sample code from eve: 'mov DWORD PTR [ebx+12],eax'
30 variants were generated: 10 variations of eve, 10 first gen variants, and 10 second gen variants.
Transition matrices were constructed for the probabilities P(1, α)(*, β), the probability that 1 instance of any instruction would generate any number of instances of another instruction.
Future Steps: Implement the complete matrix.
Sample W32.Evol Transformation Rule:
add : push mov add pop push mov add add pop push mov add mov pop add

TARGET MALWARE
This detection method targets morphing malware whose morphing engine applies a fixed, finite set of transformation rules.
[2] Each rule maps an instruction (the left hand side) to a sequence of multiple instructions (the right hand side), all in assembly language. [2] The engine uses these rules to probabilistically substitute, in the variant being transformed, occurrences of a rule's left hand side with one of its corresponding right hand sides. [2]

Incorporating Markov chain theory into detection methods for filtering malicious variants of programs can significantly hinder malware production by easing the detection of the potentially large volumes of variants that simple morphers could generate. [2] Specifically, the framework provides the mathematical basis for defining filters (with minimal false positives) for variants of known programs built by known morphing engines that obey specific Markov properties. [2] Information about transition probabilities will be used to predict future, never-before-seen states of malware.

According to Markov chain theory, an entity may change from its current state to another state, or remain in the same state, according to a certain probability distribution. These changes of state are called transitions, and the probabilities associated with the various state changes are called transition probabilities. [3]

INSTRUCTION FREQUENCY VECTORS
Definition (IFV):
CODE: MOV ADD SUB MOV MOV PUSH SUB MOV
IFV: (MOV, 4) (ADD, 1) (SUB, 2) (PUSH, 1)
The key to mapping the detection problem to a Markov chain problem is to treat the IFV as a “state” and to model state transitions as predictable changes to the IFV. [2] Known morphers are filtered by encoding the property changes as a transition matrix and then using Markov identities to test whether a given program could in fact be a malicious variant.
[2]

CALCULATING PROBABILITIES
- Probability that 1 instance of instruction I generates β instances of instruction J
- Probability that α instances of instruction I generate β instances of instruction J
- Probability that IFVi generates β instances of instruction J
- Probability that IFVi generates IFVj

MATRIX OPTIMIZATIONS
Possible heuristics for optimizing the transition matrix include, but may not be limited to:
(1) reducing the size of the instruction set by abstracting an assembly language instruction to its opcode mnemonic or by ignoring register names and variable values
(2) imposing an upper bound on the possible frequency of each individual instruction
(3) imposing an upper bound on an IFV’s norm ||.||∞
(4) abstracting each component of an IFV to one of two values (“low” or “high”), depending on whether that component is less than or greater than some threshold

REFERENCES
1. Konstantinou, E. 2008. Metamorphic Virus: Analysis and Detection. Technical Report.
2. Chouchane, M.R., Walenstein, A., and Lakhotia, A. 2008. Using Markov Chains to Filter Machine-morphed Variants of Malicious Programs. 3rd International Conference on Malicious and Unwanted Software.
3. "Markov Chain." Wikipedia, The Free Encyclopedia. Wikimedia Foundation, Inc. 10 April 2009. <http://en.wikipedia.org/wiki/Markov_chain>.
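ILLUSTRATIVE SKETCH
The instruction frequency vector and the optimization heuristics (1), (2), and (4) described in this poster could be sketched roughly as follows. This is a minimal illustration and not the poster's implementation: the poster's own extraction was written in Java, Python is used here only for brevity, and all function names (mnemonic, ifv, cap, sup_norm, low_high) are hypothetical. The poster does not say how a component exactly equal to the threshold is classified in heuristic (4); the sketch assumes "low".

```python
from collections import Counter

def mnemonic(line):
    """Heuristic (1): keep only the opcode mnemonic, dropping operands."""
    return line.split()[0].upper()

def ifv(lines):
    """Build an instruction frequency vector: mnemonic -> occurrence count."""
    return Counter(mnemonic(l) for l in lines if l.strip())

def cap(vec, bound):
    """Heuristic (2): clamp each instruction's frequency to an upper bound."""
    return {op: min(n, bound) for op, n in vec.items()}

def sup_norm(vec):
    """The norm ||.||inf named in heuristic (3): the largest component."""
    return max(vec.values(), default=0)

def low_high(vec, threshold):
    """Heuristic (4): two-valued abstraction of each component.
    Assumption: a component equal to the threshold counts as 'low'."""
    return {op: "high" if n > threshold else "low" for op, n in vec.items()}

# The poster's own example listing, one mnemonic per "line".
code = "MOV ADD SUB MOV MOV PUSH SUB MOV".split()
vec = ifv(code)
print(dict(vec))
print(cap(vec, 3), sup_norm(vec), low_high(vec, 1))
```

Applied to the example listing from the INSTRUCTION FREQUENCY VECTORS section, ifv reproduces the IFV given there. Each heuristic shrinks the set of distinct states, which in turn would shrink the transition matrix.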