
Presented by Ryan Gates

Polygraph: Automatically Generating Signatures for Polymorphic Worms James Newsome, Brad Karp, and Dawn Song Carnegie Mellon University. Presented by Ryan Gates. Overview. Goal Composition of a worm Invariant bytes and Tokens Types of signatures Conjunction Token Subsequence Bayes


Presentation Transcript


  1. Polygraph: Automatically Generating Signatures for Polymorphic Worms • James Newsome, Brad Karp, and Dawn Song • Carnegie Mellon University • Presented by Ryan Gates

  2. Overview • Goal • Composition of a worm • Invariant bytes and Tokens • Types of signatures • Conjunction • Token Subsequence • Bayes • Polygraph Signature Generator • Metrics • Results • Evaluation

  3. Goal • Automate the generation of worm signatures • Specifically polymorphic worms • Prevent polymorphic worms from going undetected • Including perfectly polymorphic instances

  4. Decomposition of a worm Figure 1. Polymorphed ApacheKnacker • Invariant bytes • Wild card bytes • Code bytes

  5. Invariant Bytes • Invariant framing • Reserved keywords or well-known binary constants that are part of the wire protocol • For example "HTTP" or "GET" • Invariant overwrite values • High-order bytes of the overwritten address • For example in BIND-TSIG "\xFF\xBF" • Many invariant substrings are too short to prevent false positives on their own. • The solution is to let each set of invariant bytes be represented by a token

  6. Tokens • Tokens must not be a substring of another token • For example HTTP not TTP • Conjunction Signature • Token Sub-sequence Signature • Bayes Signature • Each token value represents the probability of that token being present in an actual worm flow.
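The rule that no token may be a substring of another token can be sketched as a small filter. This is only an illustration of the rule as stated on the slide, not Polygraph's actual token extraction; the function name `drop_contained_tokens` and the candidate list are hypothetical.

```python
def drop_contained_tokens(candidates):
    """Keep only tokens that are not substrings of another kept token."""
    # Visit longer tokens first so each shorter one is checked against them.
    ordered = sorted(set(candidates), key=lambda t: (-len(t), t))
    kept = []
    for tok in ordered:
        if not any(tok in longer for longer in kept):
            kept.append(tok)
    return kept


# Hypothetical candidate substrings taken from several worm samples.
candidates = [b"HTTP", b"TTP", b"GET", b"HTTP"]
print(drop_contained_tokens(candidates))  # b"TTP" is dropped: it lies inside b"HTTP"
```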

  7. Conjunction Signatures • Every token in the conjunction signature must be found in the payload for there to be a match • All tokens are required to match • Reduce false positives • For example, in the Apache-Knacker signature, ‘GET’, ‘HTTP/1.1\r\n’, and ‘:’ are tokens in a conjunction signature
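Matching a conjunction signature amounts to checking that every token occurs somewhere in the payload, in any order. A minimal sketch, using the three Apache-Knacker tokens named on the slide; the function name and the sample payloads are made up for illustration.

```python
def matches_conjunction(payload: bytes, tokens) -> bool:
    """True only if every token appears somewhere in the payload."""
    return all(tok in payload for tok in tokens)


# Tokens quoted on the slide for the Apache-Knacker conjunction signature.
APACHE_TOKENS = [b"GET", b"HTTP/1.1\r\n", b":"]

# Hypothetical payloads, not real worm traffic.
print(matches_conjunction(b"GET /index.html HTTP/1.1\r\nHost: example\r\n", APACHE_TOKENS))
print(matches_conjunction(b"POST / HTTP/1.1\r\nHost: example\r\n", APACHE_TOKENS))
```

The second payload fails because it lacks the 'GET' token, even though the other two tokens are present.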

  8. Token Subsequence Signatures • Similar to the conjunction signature, but more restrictive. • All tokens must be present, and in the correct order, to reduce false positives • Typically modeled using regular expressions • For example in the Apache-Knacker signature, “GET.*HTTP/1.1\r\n.*…”
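Since a token subsequence signature is an ordered list of tokens, it can be turned into a regular expression by joining the escaped tokens with `.*`. The sketch below uses only the two tokens visible in the slide's truncated example; the helper names and payloads are hypothetical.

```python
import re


def compile_subsequence(tokens):
    """Build a regex that requires the tokens to appear in the given order."""
    pattern = b".*".join(re.escape(tok) for tok in tokens)
    # DOTALL lets ".*" span line breaks inside the payload.
    return re.compile(pattern, re.DOTALL)


def matches_subsequence(payload: bytes, tokens) -> bool:
    return compile_subsequence(tokens).search(payload) is not None


TOKENS = [b"GET", b"HTTP/1.1\r\n"]

# Tokens in order: match.
print(matches_subsequence(b"GET /a HTTP/1.1\r\nHost: x\r\n", TOKENS))
# Same tokens, wrong order: a conjunction signature would match, this one does not.
print(matches_subsequence(b"HTTP/1.1\r\nGET /a", TOKENS))
```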

  9. Bayes Signature • A set of tokens, each with a score • If the sum of the scores of the tokens found in a flow exceeds a threshold, the flow is considered a match. • A sample signature would include ‘\x00\x00\xFA’: 1.7574 • Benefits • Less rigid, which helps prevent false positives for common tokens. • Higher quality signatures with a more diverse suspicious pool.
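The scoring rule can be sketched as a sum over the tokens present in a payload, compared against a threshold. Only the '\x00\x00\xFA': 1.7574 entry comes from the slide; the second token score, the threshold value, and the function names are assumptions chosen for illustration.

```python
def bayes_score(payload: bytes, token_scores: dict) -> float:
    """Sum the scores of all signature tokens that occur in the payload."""
    return sum(score for tok, score in token_scores.items() if tok in payload)


def matches_bayes(payload: bytes, token_scores: dict, threshold: float) -> bool:
    return bayes_score(payload, token_scores) > threshold


SCORES = {
    b"\x00\x00\xFA": 1.7574,  # score quoted on the slide
    b"GET": 0.05,             # hypothetical low score for a common token
}
THRESHOLD = 1.5               # hypothetical threshold

print(matches_bayes(b"abc\x00\x00\xFAdef", SCORES, THRESHOLD))
print(matches_bayes(b"GET / HTTP/1.1\r\n", SCORES, THRESHOLD))
```

A common token such as 'GET' contributes little to the sum, so its presence alone does not push an innocuous flow over the threshold.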

  10. Limitations of Signature Types • The Bayes signature is unaffected by noise until the noise grows beyond 80%; at that point there will be 100% false negatives. • Flow classifier did a very poor job of classifying the flows. • Conjunction and Token Subsequence cannot handle multiple types of worms • The solution is to use clustering to separate the worms into manageable clusters

  11. Clustering • Clustering helps the conjunction and token subsequence signatures deal with variety • Used to divide the suspicious flows into a number of different pools. • Divide the suspicious pool into several clusters which contain types of flows • Clusters should not be too general • Clusters should not be too specific
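One way to picture the clustering step is a greedy bottom-up merge that stops before clusters become too general. The slide does not state Polygraph's merge criterion, so this sketch substitutes a simple one: merge the two clusters that share the most tokens, and stop when the best merge would share fewer than `min_shared` tokens. The vocabulary, flows, and all names are hypothetical.

```python
from itertools import combinations


def shared_tokens(flows, vocabulary):
    """Tokens from the vocabulary that occur in every flow of the group."""
    return {tok for tok in vocabulary if all(tok in flow for flow in flows)}


def cluster_flows(flows, vocabulary, min_shared=2):
    """Greedy agglomerative clustering of suspicious flows (simplified criterion)."""
    clusters = [[flow] for flow in flows]  # start too specific: one flow each
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            common = shared_tokens(clusters[i] + clusters[j], vocabulary)
            if best is None or len(common) > best[0]:
                best = (len(common), i, j)
        if best[0] < min_shared:
            break  # any further merge would be too general
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


vocabulary = [b"GET", b"HTTP/1.1", b"\xff\xbf", b"DNS"]
flows = [b"GET /a HTTP/1.1", b"GET /b HTTP/1.1",
         b"\xff\xbf DNS query", b"\xff\xbf DNS reply"]
for cluster in cluster_flows(flows, vocabulary):
    print(cluster)
```

With these made-up flows, the two HTTP-like flows end up in one cluster and the two DNS-like flows in another, and the final merge is refused because the combined cluster would share no tokens.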

  12. Polygraph Signature Generator • The polygraph monitor must have access to the network's packet flow. • An imperfect flow classifier sorts packet flows into either the suspicious or innocuous pool.

  13. Polygraph Signature Generator • It does not distinguish between different worms, only between suspicious flows and innocuous flows. • Flow classifier is reliable, but imperfect. • The result is noise.

  14. Polygraph Signature Generator • Uses samples to determine appropriate signatures for worms present in the suspicious flow pool. • Resilient to noise in the system

  15. Metrics • Quality • Low percentage of false positives and false negatives • Efficiency in generation • Lower computational cost • Efficiency in matching • Should not inhibit the network traffic • Generate small signature sets • Limit the number of signatures • Robustness • Yield high quality signature even with noise and a variety of worms • Resistance to clever evasion by worms
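The quality metric can be made concrete by measuring false positive and false negative rates of a signature against labelled flows. This is a minimal sketch under the assumption that labelled worm and innocuous test flows are available; the function name, the toy matcher, and the flows are hypothetical.

```python
def error_rates(matcher, worm_flows, innocuous_flows):
    """Return (false_positive_rate, false_negative_rate) for a signature matcher."""
    false_negatives = sum(1 for flow in worm_flows if not matcher(flow))
    false_positives = sum(1 for flow in innocuous_flows if matcher(flow))
    return (false_positives / len(innocuous_flows),
            false_negatives / len(worm_flows))


# Toy single-token signature and made-up evaluation flows.
def toy_matcher(flow: bytes) -> bool:
    return b"GET" in flow


worm_flows = [b"GET x", b"POST y"]
innocuous_flows = [b"GET z", b"abc", b"def", b"ghi"]
fp_rate, fn_rate = error_rates(toy_matcher, worm_flows, innocuous_flows)
print(fp_rate, fn_rate)
```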

  16. Results | Apache-Knacker • Table 1. Apache-Knacker signatures. These signatures were successfully generated for suspicious pools containing at least 3 worm samples. • Best performer was Token Subsequence • The ordering used in the Token Subsequence signature helps reduce the number of false positives.

  17. Results | BIND-TSIG • Table 2. BIND-TSIG signatures. These signatures were successfully generated for suspicious pools containing at least 3 worm samples. • The best performers were Conjunction and Token Subsequence. • Bayes signature quality is degraded when the tokens are common in other innocuous flows.

  18. Results | Coincidental Pattern • The coincidental-pattern attack injects invariant bytes into the wildcard bytes to confuse the signature generator.

  19. Contribution • Polygraph helps to automate signature generation • Examined the effects that implementing polymorphism on worms could have on worm signature generation and matching. • Introduced imperfections in the classifying of network flows

  20. Limitations • Worms that lack invariant code • Requires a flow classifier and at least 3 worm samples • If the innocuous pool is too diverse, there will be too many false positives.

  21. Improvements and Future Work • Take advantage of multiple cores. • Incorporate the design of an efficient flow classifier • Determine how feasible it is to inspect network traffic • Determine an algorithm to choose best signature to use

  22. References • J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In IEEE Security and Privacy Symposium, 2005.
