Monitoring VM Security
Authors: Aditya Thakkar, Ameya Karnik, Abhishek Dey Das & Brian Boyle
Introduction–Cloud
• Cloud computing environments allow customers to execute arbitrary code on hardware (HW) owned by the cloud provider.
• Cloud providers use virtualization technology to ensure isolation between customers.
• Key components of virtualization technology in the cloud:
  • Virtual Machine Monitor (VMM): A thin software layer that presents a VM abstraction resembling the HW. It virtualizes all HW resources, allowing multiple virtual machines to transparently multiplex the resources of the physical machine.
  • Virtual Machine (VM): A software simulation of HW that provides a complete system platform, supporting the execution of a complete OS.
Introduction–VM Security
• Virtualization technology is at the heart of cloud computing.
• This brings added security challenges for the provider: a malicious customer can use the provider’s HW to launch attacks from a VM they own, or to compromise the VM itself.
• The goal of this presentation is to explore ways an attacker can exploit virtualization technology in the cloud, and to study the architecture of various Intrusion Detection Systems (IDS) that defend it.
Attacks On Cloud
A cloud system is vulnerable to common attack methods on servers and to some attack methods specific to VMs, such as:
• Buffer overflow.
• Using memory errors to attack a VM.
• Cross-VM side channels and their use to extract private keys.
• Wrapping attack.
• Flooding attack.
• Browser attack.
Buffer Overflow
• Buffer overflow is a method of attack in which the perpetrator subverts a privileged program in order to take control of it. If the program is privileged enough, the attacker can also gain access to the host.
• There are two primary steps in carrying out a buffer overflow attack:
  • Arrange for suitable code to be available in the program’s address space.
  • Get the program to jump to the attack code with suitable parameters loaded into registers and memory.
Buffer Overflow–How?
Step 1: Arrange for suitable code to be available in the program’s address space.
• Injection: There are two types of injection:
  • The attacker provides a string as input to the program, which the program stores in a buffer.
  • The attacker exploits a function that is already available by passing malicious parameters. For instance, if the program contains a line of code such as exec(arg), the attacker can exploit it by arranging for the call to become exec("/bin/tsh"). This gives the attacker access to the tsh shell.
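Buffer Overflow–How?
Step 2: Get the program to jump to the attack code with suitable parameters loaded into registers and memory.
• Method 1: Manipulate the activation records. Overflowing a local buffer can overwrite the return address stored in the function’s activation record (stack frame), so that the function returns into the attack code.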
Buffer Overflow–How?
• Method 2: Manipulate function pointers.
  • Suppose there is a pointer to a function returning void. Function pointers can be allocated on the stack, on the heap, or in the static data area. All the attacker needs to do is find an overflowable buffer adjacent to a function pointer and overflow it to change the function pointer.
• Method 3: Misuse the checkpoint system in C.
  • C has a checkpoint/rollback system called setjmp/longjmp. setjmp(buffer) sets the checkpoint and longjmp(buffer) rolls back to it. If the attacker corrupts the state of the buffer, longjmp(buffer) jumps to the attack code instead. longjmp buffers are similar to function pointers, so the attacker again only needs to find an adjacent overflowable buffer.
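The function-pointer overwrite in Method 2 can be illustrated with a toy memory model. This is a minimal Python simulation under stated assumptions, not exploit code: the flat `memory` list, the slot layout, and the names `make_memory`, `unchecked_copy`, `checked_copy`, `greet`, and `attacker_code` are all hypothetical and do not come from the slides.

```python
# Toy model: an 8-slot buffer sits directly before a function-pointer slot.
BUF_START, BUF_SIZE = 0, 8
FUNC_PTR_SLOT = BUF_START + BUF_SIZE  # the slot adjacent to the buffer


def greet():
    return "normal behaviour"


def attacker_code():
    return "attacker-controlled behaviour"


def make_memory():
    """Build the toy address space with the pointer set to greet."""
    memory = [0] * (BUF_SIZE + 1)
    memory[FUNC_PTR_SLOT] = greet  # stands in for a C function pointer
    return memory


def unchecked_copy(memory, data):
    """Mimics an unbounded C copy: writes past the end of the buffer."""
    for i, item in enumerate(data):
        memory[BUF_START + i] = item


def checked_copy(memory, data):
    """Bounds-checked copy: rejects input larger than the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input exceeds buffer size")
    unchecked_copy(memory, data)


def call_through_pointer(memory):
    """Call whatever the function-pointer slot currently refers to."""
    return memory[FUNC_PTR_SLOT]()
```

Passing nine items to `unchecked_copy` replaces the pointer, so `call_through_pointer` runs `attacker_code`; `checked_copy` rejects the same input and leaves the pointer intact.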
Buffer Overflow–Defense
• Writing-correct-code approach:
  • grep the source code for calls to highly vulnerable C library functions.
  • Have code auditing teams.
  • Use tools such as fault-injection tools.
• Operating system approach:
  • Make the data segment of the program’s address space non-executable.
• Direct compiler approach:
  • Check array bounds.
• Indirect compiler approach:
  • Code pointer integrity checking.
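The "grep the source code" defense can be sketched as a small scanner. This is an illustrative Python sketch only: the shortlist in `RISKY_CALLS` is an assumption (the slides do not name specific functions), and `find_risky_calls` is a hypothetical helper.

```python
import re

# Assumed shortlist of C library functions that perform no bounds checking.
RISKY_CALLS = ("gets", "strcpy", "strcat", "sprintf", "vsprintf", "scanf")
PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def find_risky_calls(source):
    """Return (line_number, function_name) for each risky call in C source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

The word boundary in the pattern keeps bounded variants such as `strncpy` and `snprintf` from being reported. A textual match is only a starting point for an audit; it also fires on comments and string literals.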
Mem Errors to Attack VM
• The goal of this method is to gain access to a machine to which the attacker has no physical access.
• The attacker needs to get a program running on the host machine (through social engineering or physical access to the system).
• The attacker then waits for a memory error, even a single-bit one, caused by radiation or any other natural cause.
• Once that happens, the attacker can take over the JVM or .NET VM.
Mem Error–How?
• Define two classes A and B such that the size of each, including the object header, is a power of 2.
• Next, the attacker inserts the program and waits for a memory error.
    class A { A a1; A a2; B b; A a4; A a5; int i; A a7; };
    class B { A a1; A a2; A a3; A a4; A a5; A a6; A a7; };
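Mem Error–How?
• Suppose a memory fault occurs while there is a pointer variable p of class A containing address x.
• Once the attacker holds pointers p and q, of class A and class B respectively, that refer to the same address, the type system no longer protects memory and the attacker can take over the VM.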
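One reason to make object sizes a power of 2 can be shown with pointer arithmetic: flipping a single bit at position log2(size) or higher in a pointer to the start of an object yields an address that is still a multiple of the object size. The sketch below is a hypothetical Python model; the heap base, object size, object count, and function names are assumptions chosen for illustration, not values from the slides.

```python
OBJECT_SIZE = 64       # assumed object size in bytes (a power of two)
HEAP_BASE = 0x10000    # assumed heap start, aligned to OBJECT_SIZE
NUM_OBJECTS = 1024     # assumed number of objects filling the toy heap


def flip_bit(address, bit):
    """Model a single-bit memory error in a stored pointer."""
    return address ^ (1 << bit)


def lands_on_object_start(address):
    """True if the address is the first byte of some object in the toy heap."""
    offset = address - HEAP_BASE
    in_heap = 0 <= offset < OBJECT_SIZE * NUM_OBJECTS
    return in_heap and offset % OBJECT_SIZE == 0
```

In this model, a pointer to the sixth object (`HEAP_BASE + 5 * OBJECT_SIZE`) still points at the start of some object after a flip in any of bits 6 to 15, while a flip in a lower bit lands inside an object and a flip in bit 16 leaves the heap.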
Mem Error–Example
Consider the following code and its implications. [Code from the original slide is not available in this transcript.]
Mem Error–Defense
• Parity checking.
• Error-correcting codes (ECC).
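The difference between the two defenses can be sketched in a few lines: a parity bit detects a single flipped bit, while an error-correcting code can also locate and repair it. The Python sketch below uses the Hamming(7,4) code as a stand-in; real ECC memory uses wider codes, and the function names here are hypothetical.

```python
def parity_bit(bits):
    """Even-parity bit: 1 if the number of set bits is odd, else 0."""
    return sum(bits) % 2


def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(c):
    """Return (corrected codeword, 1-based error position or 0 if none)."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # the syndrome spells out the faulty position
    fixed = list(c)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed, pos
```

A stored parity bit that no longer matches `parity_bit(word)` signals that an odd number of bits changed, but not which one. `hamming74_correct` recovers the original codeword after any single-bit flip.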
Cross VM Side Channel
• A malicious VM may use this form of attack to extract fine-grained information from a victim VM running on the same physical computer.
Cross VM–Defense
• Avoid co-residency.
• Side-channel-resistant algorithms.
• Core scheduling.
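As a small illustration of what "side-channel-resistant" means, the sketch below contrasts an early-exit comparison, whose running time depends on where the inputs first differ, with Python's `hmac.compare_digest`. This is a timing-channel example chosen for brevity, not the cross-VM channel the slides refer to; `leaky_equal` and `constant_time_equal` are hypothetical names.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time reveals the first mismatch position."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison designed not to leak where the inputs differ."""
    return hmac.compare_digest(a, b)
```

Both functions return the same answers; only the amount of information an observer can infer from timing differs.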
VM Leakage
• Orphan VMs are VMs that persist indefinitely. They exist on node controllers but are unknown to any user.
• Orphan VMs are a type of resource leakage, because the resources assigned to them cannot be used for any other purpose.
• This section looks at results from a VM leakage study conducted on a simulated setup called Koala.
VM Leakage
Simulation setup:
• Koala is a discrete-event simulator inspired by the Amazon Elastic Compute Cloud (EC2) and the Eucalyptus open source software.
VM Leakage
• Demand Layer: Consists of a variable number of users who request a number of instances of one or more VM types. The cloud controller may respond with an allocation of instances (a full or partial grant) or with a NERA fault.
• Supply Layer: Consists of a number of clusters, each of which manages a number of nodes.
• Resource Allocation Layer: Allocates resources using a first-fit algorithm at cluster level and a least-full-first algorithm at cloud level.
• Internet/Intranet Layer: Assigns the cloud controller, cluster controllers, and users to sites randomly located at (x, y) coordinates on a grid.
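The two allocation rules named in the Resource Allocation Layer can be sketched as follows. This is a minimal Python sketch, not Koala's implementation: capacity is reduced to a single number per node, and the names `first_fit`, `fullness`, `allocate`, and `NeraFault` are hypothetical.

```python
class NeraFault(Exception):
    """Raised when no cluster can host the request (the slide's NERA fault)."""


def first_fit(free_per_node, demand):
    """Cluster level: index of the first node with enough free capacity, else None."""
    for index, free in enumerate(free_per_node):
        if free >= demand:
            return index
    return None


def fullness(cluster):
    """Fraction of a cluster's total capacity that is currently in use."""
    return 1 - sum(cluster["free"]) / cluster["capacity"]


def allocate(clusters, demand):
    """Cloud level: try clusters from least full to most full."""
    for name in sorted(clusters, key=lambda n: fullness(clusters[n])):
        node = first_fit(clusters[name]["free"], demand)
        if node is not None:
            clusters[name]["free"][node] -= demand
            return name, node
    raise NeraFault(f"no node can host {demand} units")
```

Here `clusters` maps a cluster name to a dict with a total `capacity` and a list of per-node `free` units; `allocate` returns the chosen cluster and node index and updates the free capacity in place.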
VM Leakage
Possible causes of VM leakage:
• Creation orphans: VMs are created in response to a user request, but any one of three confirmation messages is lost:
  • From the node to the cluster controller, confirming VM creation.
  • From the cluster to the cloud, indicating a successful (full or partial) VM allocation.
  • From the cloud controller to the user, indicating a successful result.
• Termination orphans: After the user receives the requested VMs, subsequent termination operations (TerminateInstances) may fail due to lost messages:
  • The user may receive a termination confirmation and assume all is well. If no confirmation arrives, the user usually makes a limited number of retries and then stops.
  • Eucalyptus makes no provision for the cloud or cluster controllers to retry failed termination requests; such failures are merely logged.
VM Leakage
Orphan control method:
• Creation orphans: The node controller monitors receipt of DescribeInstances requests from users. If none is received for a VM within 2 h of boot-up, the VM is terminated and its resources are released.
• Termination orphans: Provide a persistent termination capability to both the cloud and cluster controllers.
  • A persistent terminator is activated by the cloud or cluster controller when no response to a termination request is received within a timeout.
  • Once activated, the cloud persistent terminator resends termination requests to a cluster controller at specified intervals until one of the desired responses is received.
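Both control rules are simple enough to sketch. The Python below is an illustration under stated assumptions, not the Eucalyptus or Koala code: `send_request` is a hypothetical callable that returns True when a termination is acknowledged, and the function names and the `max_attempts` cap are invented for the example.

```python
import time

BOOT_GRACE_SECONDS = 2 * 60 * 60  # the 2 h window from the slide


def is_creation_orphan(boot_time, last_describe_time, now):
    """True if no DescribeInstances request arrived within the grace window."""
    if last_describe_time is not None:
        return False
    return now - boot_time >= BOOT_GRACE_SECONDS


def persistent_terminate(send_request, interval_seconds=0.0,
                         max_attempts=None, sleep=time.sleep):
    """Resend a termination request at fixed intervals until acknowledged.

    Returns the number of attempts used, or None if max_attempts ran out.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        if send_request():
            return attempt
        sleep(interval_seconds)  # wait before resending
    return None
```

The `sleep` parameter exists so the retry loop can be exercised without real delays; leaving `max_attempts` as None matches the slide's "until one of the desired responses is received".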
Tools To Simulate Attacks
SlowHTTPTest
• SlowHTTPTest is a tool that simulates some application-layer denial-of-service (DoS) attacks.
• It implements the most common low-bandwidth application-layer DoS attacks, such as Slowloris, Slow HTTP POST, and the Slow Read attack.
• Slowloris and Slow HTTP POST rely on the fact that the HTTP protocol requires a request to be completely received by the server before it is processed.
• If an HTTP request is not complete, or if the transfer rate is very low, the server keeps its resources busy waiting for the rest of the data.
• If the server keeps too many resources busy, the result is a denial of service.
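The resource-exhaustion argument above can be made concrete with a toy worker-pool model. This Python sketch generates no network traffic and does not reflect how SlowHTTPTest itself is implemented; `busy_workers`, `can_accept`, and the `header_timeout` mitigation are assumptions introduced for illustration.

```python
def busy_workers(connections, now, header_timeout=None):
    """Count workers held at time `now`.

    connections: list of (start_time, seconds_needed_to_finish_sending).
    A worker is held until the request is complete or, if a header
    timeout is configured, until that timeout expires.
    """
    busy = 0
    for start, needed in connections:
        hold = needed if header_timeout is None else min(needed, header_timeout)
        if start <= now < start + hold:
            busy += 1
    return busy


def can_accept(pool_size, connections, now, header_timeout=None):
    """True if at least one worker is free for a new request."""
    return busy_workers(connections, now, header_timeout) < pool_size
```

With four workers and four connections that each take 600 s to finish sending, no worker is free at t = 30 s; with a 10 s header timeout the same four connections have already been dropped by then.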
Tools To Simulate Attacks
Nmap
• Nmap is a free and open source utility for network discovery and security auditing. It can be used to perform a port-scan attack.
• A port scan is an attack that sends client requests to a range of server port addresses on a host, with the goal of finding an active port and exploiting a known vulnerability of the service behind it.
• Nmap uses raw IP packets to determine what hosts are available on the network, what services (application name and version) those hosts are offering, what operating systems (and OS versions) they are running, and what type of packet filters/firewalls are in use.
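The idea of a port scan can be sketched with ordinary TCP connections. Note the substitution: Nmap works with raw IP packets, whereas this minimal Python sketch performs a plain TCP connect scan using the standard `socket` module; `scan_ports` is a hypothetical helper, not part of Nmap.

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

For example, `scan_ports("127.0.0.1", range(8000, 8010))` lists which of those local ports currently have a listening service.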
Tools To Simulate Attacks
SQL Ninja
• SQL Ninja is a tool used to exploit SQL injection vulnerabilities in a web application that uses Microsoft SQL Server as its back end.
• Its main goal is to provide remote access to the vulnerable DB server.
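The class of vulnerability that such a tool exploits can be shown in a few lines. The sketch below uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Microsoft SQL Server; the table, the data, and the function names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_vulnerable(conn, name):
    """Builds SQL by string concatenation: the input can change the query."""
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_parameterized(conn, name):
    """Passes the input as a bound parameter: it is treated as data only."""
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]
```

With the input `x' OR '1'='1`, the concatenated query returns every user's secret, while the parameterized query returns nothing because no user has that literal name.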
Tools To Simulate Attacks
AirSnort
• AirSnort is a wireless LAN (WLAN) tool that recovers encryption keys.
• AirSnort operates by passively monitoring transmissions and computing the encryption key once enough packets have been gathered.
• AirSnort requires approximately 5-10 million encrypted packets.
• Once enough packets have been gathered, AirSnort can guess the encryption password in under a second.
Tools To Simulate Attacks
Attack Tool Kit (ATK)
• A tool used to check for dedicated vulnerabilities.
IDS
• An Intrusion Detection System (IDS) uses introspection techniques to monitor VMs for signs of misbehavior.
• The following introspection factors define the capabilities and strength of an IDS:
  • Power: Visibility of VM events, i.e. the scope of VM events the IDS can monitor and its ability to interpose on specific events.
  • Unintrusiveness: The amount of interference it introduces in the VM it is monitoring.
  • Robustness: The number of assumptions it makes about the monitored VMs.
  • Fault-recovery: The effect an IDS crash has on the running VM.
IDS–Core Architecture
The key components of an IDS are:
• Physical State Monitor (PSM): Interprets and analyzes the VM’s machine state, such as the symbol table, data types, etc.
• Policy Framework (PF): Records metadata; interacts with the PSM to learn the VM’s system state; monitors system events on behalf of the Policy Modules via polling and event callback mechanisms; and creates VM checkpoints.
[Diagram labels: Policy Engine, Policy Module (x2), Policy Framework, Physical State Monitor]
IDS–Core Architecture
• Policy Module (PM): Implements the actual security policies, such as detecting a malicious program based on its signature, detecting use of raw sockets, detecting tampering with OS code, and checking the integrity of a program binary on disk.
• Policy Engine (PE): The layer that decides whether an intrusion has occurred and interposes on the execution of the compromised VM.
• In some architectures the PE’s role is performed by the PM.
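How the four components fit together can be sketched in code. This is a structural illustration in Python, not any real IDS: the VM state is a plain dict, the framework only polls (event callbacks and checkpoints are left out), and every class, function, and key name is hypothetical.

```python
class PhysicalStateMonitor:
    """Stand-in for the component that interprets a VM's machine state."""

    def __init__(self, vm_state):
        self.vm_state = vm_state  # hypothetical dict describing the VM

    def read(self, key):
        return self.vm_state.get(key)


class PolicyFramework:
    """Polls the VM state on behalf of the registered policy modules."""

    def __init__(self, monitor):
        self.monitor = monitor
        self.modules = []

    def register(self, module):
        self.modules.append(module)

    def poll(self):
        """Run every module once; return the names of those reporting a violation."""
        return [m.__name__ for m in self.modules if m(self.monitor)]


def raw_socket_policy(monitor):
    """Policy module: flag any use of raw sockets."""
    return bool(monitor.read("raw_sockets_open"))


def kernel_integrity_policy(monitor):
    """Policy module: flag OS code whose hash differs from the known-good one."""
    return monitor.read("kernel_hash") != monitor.read("expected_kernel_hash")


def policy_engine(violations):
    """Policy engine: decide how to respond to the reported violations."""
    return "suspend" if violations else "continue"
```

A typical round is `policy_engine(framework.poll())`: the framework gathers verdicts from the modules, and the engine decides whether the VM keeps running.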
IDS–Types
Host-Based Agent: An application that runs within the VM being monitored, either in user space or as a kernel module.
Strength analysis:
• Power: It has high visibility of the VM because it resides within the VM. However, the IDS may need to inject kernel code to increase visibility, at the cost of robustness.
• Unintrusiveness: This type of IDS has poor unintrusiveness, as it breaks the boundary separating the provider realm from the customer realm.
• Robustness: The agent becomes OS dependent.
• Fault-recovery: If the agent fails, the entire VM fails open.
Rating scale: Best >> Good >> Poor >> Worst
IDS–Types
Trap and Inspect: This type of IDS examines the execution of the VM via the VMM or another VM.
Strength analysis:
• Power: With support from the VMM, this type of IDS gains strong visibility of the introspected VM.
• Unintrusiveness: This type of IDS isolates the introspection code from the introspected VM, which avoids tampering with VM code and interfering in the VM’s execution.
• Robustness: Traps must be inserted in the VM code. This requires knowledge of the code being inspected. In addition, trap placement is complex.
• Fault-recovery: If the agent fails, the introspected VM is not affected.
Rating scale: Best >> Good >> Poor >> Worst
IDS–Types
Checkpoint and Rollback: The VMM can checkpoint the state of a VM. This type of IDS uses that feature to checkpoint the state of the introspected VM and inject code for introspection.
Strength analysis:
• Power: With support from the VMM, this type of IDS gains strong visibility of the introspected VM.
• Unintrusiveness: This type of IDS isolates the introspection code from the introspected VM.
• Robustness: Robustness is improved because it can use the OS API for monitoring. However, it still relies on traps for notification.
• Fault-recovery: If the agent fails, the introspected VM is not affected, because the VM state has been checkpointed.
Rating scale: Best >> Good >> Poor >> Worst
IDS–Types
Architectural Introspection: The goal is to restrict monitoring to well-defined interfaces that are difficult or unlikely to change.
Strength analysis:
• Power: Relies on passively monitoring HW events. Its capabilities may be weaker, but they are sufficient for cloud monitoring.
• Unintrusiveness: This type of IDS isolates the introspection code from the introspected VM.
• Robustness: This type has the best robustness, as it monitors only stable low-level interfaces through the VMM.
• Fault-recovery: If the agent fails, the introspected VM is not affected, because only stable low-level interfaces are monitored via the VMM.
Rating scale: Best >> Good >> Poor >> Worst
IDS–Comparison
[Comparison table from the original slide is not available in this transcript. Rating scale: Best >> Good >> Poor >> Worst]
IDS–Example
Livewire: a prototype implementation.
Nagios
• Nagios is a host and service monitor designed to inform you of network problems.
• The monitoring daemon runs intermittent checks on the hosts and services you specify, using external “plugins” that return status information to Nagios.
• When problems are encountered, the daemon can send notifications to administrative contacts.
Nagios Architecture
Nagios has three main components:
• Server: Runs on the introspection host.
  • A core part of the server is the scheduler, which periodically checks the plugins for changes or updates.
  • The server receives results from the plugins and acts as the decision engine that determines whether the network is in a bad state.
  • Disadvantage: Nagios only supports MySQL for saving the results.
• Plugins: User configurable; they run on the introspected hosts being monitored.
  • They check a service and return a result to the server.
• GUI: Displays the results from the server on web pages.
Nagios–Functionality
Nagios provides support for monitoring the following:
• Mail server failure
• Hard drive overload
• Network outage
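A plugin for the "hard drive overload" case could look like the sketch below. It relies on the Nagios plugin convention of returning 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN) together with a one-line status message. The thresholds, the message wording, and the function names are assumptions for illustration, not taken from the slides or from an existing plugin.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin return codes


def evaluate_disk(used_fraction, warn=0.80, crit=0.90):
    """Map a disk usage fraction to a (return code, status line) pair."""
    pct = used_fraction * 100
    if used_fraction >= crit:
        return CRITICAL, f"DISK CRITICAL - {pct:.0f}% used"
    if used_fraction >= warn:
        return WARNING, f"DISK WARNING - {pct:.0f}% used"
    return OK, f"DISK OK - {pct:.0f}% used"


def check_disk(path="/"):
    """Measure usage of `path`, print the status line, return the code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, message = evaluate_disk(usage.used / usage.total)
    print(message)
    return code
```

A script wrapping this would pass the result of `check_disk()` to `sys.exit`, so that the server can read the status from the process exit code.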
Nagios–Observations
Nagios is a host-based Intrusion Detection System (IDS).
• Advantages:
  • The plugins have high visibility, which allows Nagios to monitor network services thoroughly.
  • Robustness is much better than the theoretical analysis on the previous slides suggests, because the plugins do not inject any code into the network.
• Disadvantages:
  • Fault-recovery is poor: a failure of the Nagios server creates a fail-open window that allows an attacker to exploit the network.
  • It is intrusive, as the hosts are required to install the plugins.
  • Nagios uses a polling mechanism to check the state of the network.
Icinga IDS
Icinga is another widely used open source IDS.
• Similarities with Nagios:
  • Icinga has a host-based IDS architecture, similar to Nagios.
  • Like Nagios, it uses polling to get the state of the plugins.
• Advantages over Nagios:
  • Provides support for Oracle Database and PostgreSQL.
  • Provides better fault-recovery: Icinga’s components can be split and distributed across machines. If one component fails, another can take its place without disrupting the entire monitoring system.
Icinga–Architecture
• Icinga Core communicates with the IDODB through IDOMOD and IDO2DB.
• Icinga Web is a standalone component that communicates with the system through the Icinga API to the IDODB.
• Greater security is also possible by distributing the system onto separate machines.
• Therefore, if Icinga Core were to fail, a backup could be set up to take its place.
Icinga–Features
Icinga provides the following additional core features (the plugins are the same as for Nagios):
• IPv6 support
• LDAP/Active Directory for internal authentication
• Database back-end support
• Web interface
• Standardised API for better add-on support
• Performance charts
• Visualization
• Business process monitoring
• Support for various platforms
Real World Attacks
• On March 3, 2013, CloudFlare’s web security service went down for about an hour after it failed to prevent a DDoS attack.
• This occurred due to an outage of the Juniper edge routers, which CloudFlare uses to quickly propagate router rules across a large number of routers.