
Cthulhu



Presentation Transcript


  1. Cthulhu A software analysis framework built on Phoenix

  2. Who am I? • Matt Miller • Leviathan Security Group • Metasploit Framework • Uninformed Journal • Not a static analysis expert 

  3. What’s this talk about? • Cthulhu software analysis framework • Very high-level architectural overview • Interesting features • Case study

  4. Phoenix Overview • Software optimization and analysis • Basis for future Microsoft compilers and tools • Robust and extensible architecture • Plugins • Phases • Check out Richard Johnson’s talk to learn more 

  5. Why extend Phoenix? • RDK/SDK not yet completely solidified • Encapsulation can help here • API is feature rich but verbose • No simplified wrapper • No solution for large-scale analysis • LTCG is not enough

  6. Cthulhu Overview • Software analysis framework • Hobby project started in June, 2006 • Written in C# • Currently around 28KLOC

  7. Cthulhu Goals • Simplified Programming Interface • Simple and extensible API • Fundamental independence • Large-scale analysis • Modeling behavior of large systems • Pie in the sky: Windows Vista • Research Sandbox • A playground for experimentation • Phoenix can also be used directly for this purpose

  8. Cthulhu Architecture [Diagram. Labels: Fundamentals, Phoenix, DB, IDA, Analysis Engine, Peons, Control Flow, Data Flow, Tools, Analysis, Rendering]

  10. Analysis Engine Process • Uses a fundamental to load assemblies • Runs phases • Import • Analyze • Render • Peons register to be notified on certain events
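The phase and notification model on this slide can be sketched in a few lines of C#. This is only an illustration under stated assumptions: `Phase`, `AnalysisEngine`, `Register`, and `Run` are hypothetical names, not Cthulhu's actual API, and loading an assembly through a fundamental is reduced to passing an assembly name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; not Cthulhu's actual API.
public enum Phase { Import, Analyze, Render }

public sealed class AnalysisEngine
{
    // Handlers registered by peons, keyed by the phase they care about.
    private readonly Dictionary<Phase, List<Action<string>>> handlers =
        new Dictionary<Phase, List<Action<string>>>();

    // A peon calls this to be notified when the given phase runs.
    public void Register(Phase phase, Action<string> handler)
    {
        if (!handlers.TryGetValue(phase, out var list))
        {
            list = new List<Action<string>>();
            handlers[phase] = list;
        }
        list.Add(handler);
    }

    // Runs Import, Analyze, and Render in order for one assembly,
    // notifying every peon registered for the current phase.
    public void Run(string assemblyName)
    {
        foreach (Phase phase in new[] { Phase.Import, Phase.Analyze, Phase.Render })
        {
            if (!handlers.TryGetValue(phase, out var list)) continue;
            foreach (var handler in list) handler(assemblyName);
        }
    }
}
```

A peon that only imports control flow information would call `Register(Phase.Import, ...)` and would never be invoked during the other two phases.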

  11. Import Phase [Diagram. Components: Phoenix Fundamental, Analysis Engine, DB, Importing Peons (Basic Types, Control Flow, Data Flow)] • 1. Load Assembly • 2. Assembly Loaded • 3. Import Event • 4. Normalize Information • 5. Import Event

  12. Analysis Phase [Diagram. Components: DB, Database Fundamental, Analysis Engine, Analyzing Peons (Path Discovery, Leak Check)] • 1. Load Assembly • 2. Denormalize Assembly Information • 3. Assembly Loaded • 4. Analysis Event • 5. Normalize and Denormalize Information • 6. Analysis Event

  13. Render Phase [Diagram. Components: DB, Analysis Engine, Rendering Peons, Output Store, Console, GUI] • 1. Render • 2. Denormalize • 3. Display

  14. Database Implications • Extensible and flexible way to represent binary information • May be used to support large-scale analysis • Hundreds of modules • More work needs to be done • Performance overhead is non-trivial • Processing time can be high • Volatile memory usage can be kept low

  15. A few cool features • Simplified API • Version-independent modeling • Conceptual modeling

  16. Simplified API • Abstract classes provide fundamental independence • Abstract classes: Assembly, Module, Data Type, Method, … • Concrete implementations: DB (Assembly, Module, Data Type, Method) and Phoenix (Assembly, Module, Data Type, Method)
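One way to picture fundamental independence is a set of abstract element classes with one concrete implementation per fundamental. The sketch below is an assumption-laden illustration: `AssemblyElement`, `MethodElement`, `DbAssembly`, `DbMethod`, and `Report` are hypothetical names, and the "database" implementation simply holds in-memory data.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstractions; not Cthulhu's real type names.
public abstract class MethodElement
{
    public abstract string Name { get; }
}

public abstract class AssemblyElement
{
    public abstract string Name { get; }
    public abstract IReadOnlyList<MethodElement> Methods { get; }
}

// Stand-in for a database-backed implementation (in-memory here).
public sealed class DbMethod : MethodElement
{
    private readonly string name;
    public DbMethod(string name) { this.name = name; }
    public override string Name => name;
}

public sealed class DbAssembly : AssemblyElement
{
    private readonly string name;
    private readonly List<MethodElement> methods;

    public DbAssembly(string name, IEnumerable<string> methodNames)
    {
        this.name = name;
        methods = methodNames.Select(n => (MethodElement)new DbMethod(n)).ToList();
    }

    public override string Name => name;
    public override IReadOnlyList<MethodElement> Methods => methods;
}

public static class Report
{
    // Depends only on the abstract types, so a Phoenix-backed
    // implementation could be passed in without changing this code.
    public static string Describe(AssemblyElement assembly)
    {
        return assembly.Name + ": " + string.Join(", ", assembly.Methods.Select(m => m.Name));
    }
}
```

Analysis code written against `AssemblyElement` never needs to know which fundamental produced the object it is given.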

  17. Version-independent Modeling • Modeling version-independent relationships between software elements in the database • Appropriate versions can be selected at analysis time • Example: void CallExitProcess() { ExitProcess(0); } • [Diagram. CallExitProcess 1 calls the version-independent kernel32!ExitProcess, which maps to the distinct versions ExitProcess 1, ExitProcess 2, ExitProcess 3, and ExitProcess 4]
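A minimal sketch of the idea, assuming a simple in-memory index: callers refer to a version-independent name such as `kernel32!ExitProcess`, and a concrete version is chosen only when analysis runs. `VersionedElements`, `AddVersion`, and `Resolve` are hypothetical names, and the "newest version not exceeding a limit" selection policy is an assumption made for illustration; the slides do not say how versions are selected.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical index; not Cthulhu's actual database schema.
public sealed class VersionedElements
{
    // Version-independent name -> distinct version numbers known for it.
    private readonly Dictionary<string, SortedSet<int>> versions =
        new Dictionary<string, SortedSet<int>>();

    public void AddVersion(string element, int version)
    {
        if (!versions.TryGetValue(element, out var set))
        {
            set = new SortedSet<int>();
            versions[element] = set;
        }
        set.Add(version);
    }

    // Assumed policy: pick the newest version not exceeding maxVersion.
    // Returns null when the element is unknown or no version qualifies.
    public int? Resolve(string element, int maxVersion)
    {
        if (!versions.TryGetValue(element, out var set)) return null;
        var candidates = set.Where(v => v <= maxVersion).ToList();
        return candidates.Count == 0 ? (int?)null : candidates[candidates.Count - 1];
    }
}
```

Because the call from `CallExitProcess` is recorded against the version-independent name, the same stored relationship can be resolved to different `ExitProcess` versions in different analysis runs.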

  18. Conceptual Modeling [Diagram. Labels: Universe; VPN Client; VPN Server; Device Driver (vpn.sys); Daemon (daemon.exe); User Interface (vpngui.exe, dialogs.dll)]

  19. Case Study: Web Services • Finding inter-component data flow paths

  20. Overview • Web Services is a simple remoting interface • Clients invoke methods hosted on a web server • Server handles requests and provides responses • Problematic for static analysis • Clients pass data to the server indirectly (network) • Limits the scope at which analysis can be performed • Let’s walk through an example

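  21. Example Web Service • Simple web service that invokes a process using the supplied command string

      using System.Diagnostics;    // Process
      using System.Web.Services;   // [WebService], [WebMethod]

      [WebService]
      public class WebService
      {
          [WebMethod]
          public void ExecuteCommand(string command)
          {
              // The caller-supplied string is launched as a process.
              Process.Start(command);
          }
      }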

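  22. Example Web Service Client • Simple web client that wraps the invocation of the web service method

      using System.Web.Services;             // [WebServiceBinding]
      using System.Web.Services.Protocols;   // SoapHttpClientProtocol, [SoapDocumentMethod]

      [WebServiceBinding]
      public class WebClient : SoapHttpClientProtocol
      {
          [SoapDocumentMethod]
          public void ExecuteCommand(string command)
          {
              // Forwards the argument to the remote ExecuteCommand method.
              Invoke("ExecuteCommand", new object[] { command });
          }
      }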

  23. Bridging the gap • To illustrate a relationship, the client invocation and server method must be bridged • Bridging can take a few different forms • Automatic detection of relationships • Manual description of relationships • Bridging is an abstract concept though • How do we make it concrete?

  24. Bridging the gap • A concrete relationship can be shown by linking formal parameters • [Diagram. fin(ExecuteCommand, 0) of WebClient is linked to fin(ExecuteCommand, 0) of WebService]
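The link between the two formal parameters can be sketched as a lookup table from a client-side formal to its server-side counterpart. This is a minimal illustration of a manually described bridge; `Formal.In` and `BridgeTable` are hypothetical names, and the string key format is an assumption made for this example.

```csharp
using System.Collections.Generic;

public static class Formal
{
    // Builds a key for fin(method, index) on a given type (hypothetical format).
    public static string In(string type, string method, int index)
    {
        return type + "::fin(" + method + ", " + index + ")";
    }
}

// Hypothetical store of manually described bridges.
public sealed class BridgeTable
{
    private readonly Dictionary<string, string> links = new Dictionary<string, string>();

    // Records that data entering clientFormal also enters serverFormal.
    public void Link(string clientFormal, string serverFormal)
    {
        links[clientFormal] = serverFormal;
    }

    // Follows a bridge, if one exists, from a client formal to a server formal.
    public bool TryFollow(string clientFormal, out string serverFormal)
    {
        return links.TryGetValue(clientFormal, out serverFormal);
    }
}
```

Linking `Formal.In("WebClient", "ExecuteCommand", 0)` to `Formal.In("WebService", "ExecuteCommand", 0)` gives a data flow analysis an edge to follow across the network boundary.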

  25. Benefits of bridging [Diagram. Web Application contains Web Client (WebClient.dll, WebClient, ExecuteCommand, Enter Block, fin(0)) and Web Service (WebService.dll, WebService, ExecuteCommand, Enter Block, fin(0))]

  26. What’s the point? • Describing indirect relationships improves the quality of analysis information • Widens the scope for control flow and data flow analysis • The Path Discovery peon can help illustrate this

  27. Path Discovery: Overview • Designed to find reachable flow paths • From a set of sources • To a set of sinks • Within a set of target assemblies • Current restrictions • Requires the database fundamental • Only operates on data flow information

  28. Path Discovery: Scenario • Command Injection represents one type of security flaw found in managed applications • This can happen when user-controlled data is used in conjunction with launching a process • For example, data passing… • From HttpRequest.get_QueryString • To Process.Start • This should be easy to detect, right?

  29. Path Discovery: Problem • Finding data flow paths from get_QueryString to Start can be problematic • Lowest level data flow information is conveyed with respect to instructions • What if hundreds of assemblies are being analyzed? • Not enough physical memory!

  30. Path Discovery: Solution • Path Discovery makes use of generalized data flow relationships • Block-tier, method-tier, type-tier, etc… • Reachable paths are identified using a simple algorithm • Progressive Qualified Elaboration (PQE) • PQE is designed to reduce the amount of analysis information that must be considered

  31. Progressive Qualified Elaboration Reachable paths are progressively found between source and sink flow descriptors within a set of target assemblies
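The slides do not spell out the PQE algorithm, so the following is only one possible reading, sketched under stated assumptions: tiers are visited from coarsest to finest, each tier keeps the elements that lie on some source-to-sink Def-Use path, and the next tier considers only elements whose container was kept. `Tier`, `Pqe`, and `Elaborate` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One tier of generalized Def-Use information (hypothetical model).
public sealed class Tier
{
    // Edge From -> To means "To uses data defined by From".
    public List<(string From, string To)> Edges = new List<(string From, string To)>();
    // Maps each element to its container in the previous, coarser tier.
    public Dictionary<string, string> Parent = new Dictionary<string, string>();
    public HashSet<string> Sources = new HashSet<string>();
    public HashSet<string> Sinks = new HashSet<string>();
}

public static class Pqe
{
    // Returns, per tier, the elements on some source-to-sink path, considering
    // only elements whose parent survived the previous tier.
    public static List<HashSet<string>> Elaborate(IList<Tier> tiers)
    {
        var results = new List<HashSet<string>>();
        HashSet<string> kept = null; // null means "no restriction yet"
        foreach (var tier in tiers)
        {
            Func<string, bool> allowed = n =>
                kept == null || (tier.Parent.TryGetValue(n, out var p) && kept.Contains(p));
            var edges = tier.Edges.Where(e => allowed(e.From) && allowed(e.To)).ToList();
            var onPath = Reach(tier.Sources.Where(allowed), edges, e => e.From, e => e.To);
            var reachesSink = Reach(tier.Sinks.Where(allowed), edges, e => e.To, e => e.From);
            onPath.IntersectWith(reachesSink);
            results.Add(onPath);
            if (onPath.Count == 0) break; // nothing left to elaborate
            kept = onPath;
        }
        return results;
    }

    // Breadth-first reachability; tail/head select the traversal direction.
    private static HashSet<string> Reach(
        IEnumerable<string> start,
        List<(string From, string To)> edges,
        Func<(string From, string To), string> tail,
        Func<(string From, string To), string> head)
    {
        var seen = new HashSet<string>(start);
        var queue = new Queue<string>(seen);
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            foreach (var e in edges)
                if (tail(e) == node && seen.Add(head(e)))
                    queue.Enqueue(head(e));
        }
        return seen;
    }
}
```

The point of the narrowing step is that fine-grained tiers, such as instructions, only need to be examined for the small part of the system that coarser tiers have already shown to be connected.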

  32. Flow descriptors for this scenario • Source flow descriptor • Sink flow descriptor

  33. Applying this to web services • Suppose there is some code in the web client that does the following • client.ExecuteCommand(request.QueryString[x]); • Bridging makes it possible to show a complete data flow path from get_QueryString to Start • Let’s see how we get there using PQE • PQE starts from a macro-tier, such as the component tier

  34. Reachability: Component Tier • Data flow Def-Use relationships between components • Interpretation: In at least one situation, v uses data defined by u

  35. Reachability: Assembly Tier Data flow Def-Use relationships between assemblies

  36. Reachability: Data Type Tier Data flow Def-Use relationships between data types

  37. Reachability: Method Tier Data flow Def-Use relationships between methods

  38. Reachability: Basic Block Tier Data flow Def-Use relationships between blocks

  39. Reachability: Instruction Tier Data flow Def-Use relationships between instructions

  40. The end-result • A complete data flow path is identified • Data flows across an indirect boundary • Without bridging, it would not be possible to seamlessly perform this analysis • This means the security issue would be missed • Note that the security issue exists in the web service independent of the web client • Example was meant to show simple indirect data flow

  41. Future Work • Import and analyze large data sets • All PE modules from Windows Vista? • Improve database performance • Optimization work has not started yet • It is currently very slow • Implement additional peons • Leak Check • And the list goes on…

  42. Conclusion • Phoenix is an exciting project • Software analysis is fun & challenging • Hopefully the database stuff pans out • Questions?
