HACQIT: Hierarchical Adaptive Control of QoS for Intrusion Tolerance. James E. Just, Karl Levitt. The UC Davis Computer Security Laboratory. [Title slide diagram labels: To Users; FW; Mini-LAN; Monitor & Adapter; Sensors; Server 1; Server 2]
Outline • Team • Overview • Goals • Approach • Major risks and risk management • Hypotheses, experiments and metrics • Policy assumptions and enforcement • Schedule • Technology transfer • Conclusions
HACQIT Team • Teknowledge Corporation • Quorum integrator • Distributed intelligent control • Architecture based development • Information assurance and CC2 • UC Davis • Fault tolerance • Intrusion detection and response • Attack modeling
The HACQIT Idea • Utilize robust hierarchical control to achieve IT, i.e., deliver critical COTS services to critical users • Significantly raise adversary work factors • Focus on useful military applications • Policy driven • Leverage current and new technologies • QoS/Quorum – DeSiDeRaTa, AQuA, QCS, others • IA&S – wrappers, intrusion and integrity sensors, active monitoring & response, randomization, VPNs, attack modeling (Jigsaw concepts), honeypots • Fault tolerance – separation, diversity, replication & check-pointing, fail-over • Others – out-of-band signaling, etc • Incrementally deliver capabilities
Project Goals • Prototype HACQIT controlled cluster delivering • 4 hours of intrusion tolerance under • Active Red Team attacks (network based) on hosts • Providing policy determined critical services from • COTS/GOTS applications to • Critical users (also policy determined) at • 75% capacity • Extensible model base for IT control • Focus • COTS HW & SW based for near term utility • Architecture based framework for longer term extensibility (hierarchical and fractal)
Approach • Focus on military applications (e.g., MS Office, email, portions of ACOA ACTD software, RT) • Prototype HACQIT • Extend selected Quorum, IA&S, and other technologies • Migration (DeSiDeRaTa), fault tolerance (AQuA), sensing (wrappers, instrumented connectors) and integration (QoS Condition Service, QCS) to provide the initial intrusion tolerance framework for the initial application • Integrated management and response system • Leverage additional sensors for integrity monitoring and control for operating systems, applications, & data • Test, iterate and extend for various types of applications • Implement extensible model (concepts, rules, etc.) for assessment & response at various levels
HACQIT Scope • Will address • QoS control for critical services to critical users • Hierarchical, extensible object control model • Host attacks on availability and integrity • Variety of COTS/GOTS applications • Policy specification of above • Won’t address • Network infrastructure attacks (e.g., routers or communication flooding attacks) • Development of new sensors or mechanisms, but will leverage them • Integrity of data sources used as inputs
HACQIT “Reference Model” [Diagram labels: Users; LAN; FW; WAN; HACQIT Controlled Cluster; Primaries; Backups & Decoys; Sensors; Monitor & Adapter; Communications with other HACQIT Controllers. Key: Sensors; Attacker; Critical Service; Critical User; Non-Critical User; Non-Critical Service]
Intrusion Tolerance Condition Service [Diagram labels: IT Condition Visualization; HACQIT Responders; IT Event Log; Other Clients; HACQIT IT Condition Service; IT Policies and Specs.; Mediators; Intrusion Sensors; Integrity Sensors; Performance Sensors]
Typical Usage Architecture [Diagram labels: Users; LAN; FW; WAN; VPNs; HACQIT Managed Cluster; FW; Primaries; Backups & Decoys; Sensors; Monitor & Adapter; Communications with other HACQIT Controllers]
Use Case • Assume a backup (hot or cold); • Detect an intrusion: • If intrusion does not constitute a threat to the critical application, then start a procedure to expunge the attack and block future occurrences, return; • If attack threatens the critical application, then switchover to backup, expunge the attack from the primary, block future occurrences of the attack, return; • Detect a performance or integrity problem in critical application (including data files), operating system, or other critical process that indicates an undetected intrusion • Switchover to backup, expunge the attack from the primary, block future occurrences of the attack, return
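The use case above can be read as a small decision procedure. The following is a minimal sketch of that logic only, not HACQIT's implementation: the `Cluster`, `Event`, and `handle` names, the host names, and the choice to expunge on the host that was primary when the event arrived are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class EventKind(Enum):
    INTRUSION = auto()  # reported directly by an intrusion sensor
    ANOMALY = auto()    # performance or integrity problem suggesting an undetected intrusion


@dataclass
class Event:
    kind: EventKind
    threatens_critical_app: bool = False
    signature: str = "unknown"  # hypothetical label used to block recurrences


@dataclass
class Cluster:
    primary: str = "server-1"   # hypothetical host names
    backup: str = "server-2"
    blocked: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def switchover(self):
        # Promote the backup; the old primary becomes the host to clean up.
        self.primary, self.backup = self.backup, self.primary
        self.log.append(f"switchover: {self.primary} is now primary")

    def expunge(self, host, event):
        self.log.append(f"expunge {event.signature} from {host}")

    def block(self, event):
        self.blocked.add(event.signature)
        self.log.append(f"block {event.signature}")


def handle(cluster, event):
    """Apply the use-case rules; return True if a switchover happened."""
    # Anomalies are treated as threatening, matching the third bullet of the use case.
    needs_switchover = event.kind is EventKind.ANOMALY or event.threatens_critical_app
    affected = cluster.primary  # assumption: the event concerns the current primary
    if needs_switchover:
        cluster.switchover()
    cluster.expunge(affected, event)
    cluster.block(event)
    return needs_switchover
```

A non-threatening intrusion leaves the primary in place and only triggers expunge and block; a threatening intrusion or an anomaly swaps primary and backup first.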
DeSiDeRaTa [Diagram labels: Spec. File; Metrics; Enactment; Sensors; Resource discovery; Filter; Eval; Monitor; Analysis; Act; H/W metrics; Actuator; Diagnosis; Select action; Distributed hardware; RT paths]
QoS Condition Service [Diagram labels: QoS Logging Service; QoS Visualization Client; QCS Clients; QCS Clients API; QCS Provider; QoS Condition Service (QCS); QCS Manager; QCS Repository; QCS Agents API; App. 2 In Mediator; App. 2 In Mediator; App. 1 Out Mediator; App. 2 Out Mediator; Desiderata Mediator; QCS Agents; Instrumented Connectors; TAO Event Channels]
Illustrative Extended GlobalGuard IDS Architecture [Diagram labels: Inference Engine Interface; Responses; JIGSAW to I.E. Translator; DBS to I.E. Translator; Attack Model Specification (JIGSAW); Application, System & Network Specification (DBS); Query; Probe; User Interface; CISL; Sensor Array; CIDF Components; Test File; CIDF Other Input; Sensor to I.E. Preprocessor]
JIGSAW Concept Template Extended
[abstract] concept <concept_name> [extends concept_name]
    requires [sensor|capability|config]
        <CapabilityTypes:LABEL_LIST>+
        with <expression list>
    end;
    provides
        <CapabilityTypes:LABEL_LIST>+
        with <assignments>*
    end;
    action
        <external actions>*
        [reportable <when|unless> <condition>]
    end;
    response
        <external response class required to stop “attack” now>*
    end;
end.
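The `requires` and `provides` sections of the template suggest how concepts chain: one concept's provided capabilities can satisfy another's requirements. The sketch below illustrates only that chaining idea, under heavy simplification. It ignores the `with` expressions, `action`, and `response` sections, and every concept and capability name in it is hypothetical rather than taken from a real JIGSAW model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    requires: frozenset  # capability labels that must already hold
    provides: frozenset  # capability labels gained once the concept applies


def reachable_capabilities(initial, concepts):
    """Fire every concept whose requirements are met until nothing changes."""
    have = set(initial)
    fired = []
    changed = True
    while changed:
        changed = False
        for c in concepts:
            if c.name not in fired and c.requires <= have:
                fired.append(c.name)
                have |= c.provides
                changed = True
    return have, fired


# Hypothetical three-step attack scenario, for illustration only.
concepts = [
    Concept("remote_login", frozenset({"stolen_password"}), frozenset({"user_shell"})),
    Concept("local_escalation",
            frozenset({"user_shell", "vulnerable_setuid"}),
            frozenset({"root_shell"})),
    Concept("password_sniff", frozenset({"network_access"}), frozenset({"stolen_password"})),
]

have, fired = reachable_capabilities({"network_access", "vulnerable_setuid"}, concepts)
print(fired)                 # order in which the concepts became applicable
print("root_shell" in have)  # whether the scenario reaches the final capability
```

Removing any initial capability that a chain depends on (here, `network_access`) stops the chain from starting, which is the property a response that targets one link relies on.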
Major Risks • Attacks • Against monitoring & control components • Common mode attacks • Active blocking after unknown attack • Accurate and rapid intrusion sensing & avoidance • Backups • Restoration speed (applications & connections) • Corruption (logical isolation from primary) • Overhead for different types of applications • Recovery of the primary server in a timely manner (not a major focus of base program) • Workload to use and maintain
Major Mitigations • Diversity (hw, sw, versions, etc) • Randomization of initialization/response • Content monitoring (and selective logging) of input/output streams • Out of band control system • VPNs among critical users and servers • Wrappers for low level sensing and control • Adaptive control • Deception (decoys, honey-pots, fishbowls, etc) • Restrictions on services
Hypotheses • The QoS mechanisms developed under the Quorum program can be extended to enable a managed cluster of COTS computers to provide intrusion tolerant services for critical COTS/GOTS applications to selected users on a LAN/ WAN • The proposed architecture will cost-effectively support application requirements ranging from traditional user-oriented client-server applications to mission critical, real-time distributed applications
Testable Goals • Enable 75% of critical function capacity to be maintained under coordinated attack over the network for four hours using COTS hardware and software with emphasis on denial of service and integrity attacks and • Add a new user while under attack • Add a critical service while under attack
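One way to make the first goal checkable is to sample delivered capacity during an attack and compare it with an unattacked baseline. This is a sketch under assumptions the slides do not state: capacity is measured as a throughput figure, the baseline is known, and the function name, parameters, and sampling scheme are illustrative.

```python
def meets_capacity_goal(samples, baseline, min_fraction=0.75, duration_s=4 * 3600):
    """Return True if capacity stayed at or above min_fraction of baseline
    for the first duration_s seconds of the run.

    samples: list of (t_seconds, throughput) pairs sorted by time,
             collected while the attack is in progress.
    """
    if not samples:
        return False
    start, end = samples[0][0], samples[-1][0]
    if end - start < duration_s:
        return False  # the run did not cover the required window
    threshold = min_fraction * baseline
    return all(tp >= threshold for t, tp in samples if t - start <= duration_s)


# Example: one sample every 10 minutes for 4 hours, baseline of 100 requests/s.
steady = [(t, 80.0) for t in range(0, 4 * 3600 + 1, 600)]
dipped = [(t, 70.0 if t == 7200 else 80.0) for t, _ in steady]

print(meets_capacity_goal(steady, baseline=100.0))
print(meets_capacity_goal(dipped, baseline=100.0))
```

The check is deliberately strict: a single sample below the threshold fails the run. A real experiment plan might instead tolerate brief dips during switchover, which would need its own agreed definition.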
Experiment Plan • Test goals at least annually against an active Red Team on incrementally harder applications • Simple file-based applications, then client-server, then near-real-time control • Red Team guidelines need to be worked out
Policy • Enforcement • Application specifications • Performance levels • Response abilities • Definition of critical users and services • Response guidance • Infocon postures • Response mechanisms and availability • Tradeoffs • Response directives
Milestones. Note that these milestones are more aggressive than the official SOW. Depending on the results of the detailed design effort, some adjustments may be necessary. • Year 1 • Applications: Office; Email; Collaboration; Intranet web server • Control: Specification-based performance and integrity; Replication and switchover • Year 2 • Applications: Simple planning application; Network-based military planning; Real-time application • Control: Detected intrusions; Limited restoration • Options: Integration of new ITS technologies; Automatic generation of integrity monitors; Extensions for diagnosis and recovery
Technology Transfer • Who needs intrusion tolerant server capabilities for critical users and services – user pull • Government -- military and civilian • Commercial -- large corporations, ISPs and others who offer out-sourced application services • Development and maintenance organizations for above • Government development efforts (e.g., IO COP, GCCS) • Government ACTDs (e.g., AIDE, ACOA, CINC 21?) • Commercial security product/service providers (including Teknowledge?) -- significant commercialization costs • Mechanisms • Demonstrations • Ongoing communications • Publications • Code availability
Conclusions • Ready to start • Very ambitious goals • Leveraging other efforts will help • Downscoping may be required • Better insights after detailed design • Questions
Technical Plan • Decide on assumptions for the environment to be supported, i.e., decide on the application domain. • Decide on the intrusion types to be accommodated initially. • Develop the overall architecture. • Decide on a set of intrusion tolerance paradigms to be mechanized as part of the architecture. • Using COTS hardware and software, and available tools in support of security (including intrusion detection tools) and fault-tolerance, develop a prototype system. To a significant extent, Quorum tools will be used. • Develop an experimental plan to demonstrate the advances offered by our system.
Intrusion Tolerance Paradigms • Error detection • Error analysis • Intrusion confinement through separation of processes • Checkpointing in support of backups • Switchover to a backup • Achieving attack resistance in components that support intrusion tolerance, i.e., avoiding a hard core. • Recovery of the primary and reinstatement of its service
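Checkpointing and error detection can be combined so that a backup never restores from a corrupted checkpoint. The sketch below assumes application state is JSON-serializable and uses a SHA-256 digest to detect tampering. The `CheckpointStore` class is hypothetical, and a real system would also need to keep the digests out of an attacker's reach.

```python
import hashlib
import json


class CheckpointStore:
    """Keeps serialized checkpoints alongside a digest taken at save time."""

    def __init__(self):
        self._entries = []  # list of (digest, serialized_state), oldest first

    @staticmethod
    def _digest(blob):
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

    def save(self, state):
        blob = json.dumps(state, sort_keys=True)
        self._entries.append((self._digest(blob), blob))

    def latest_valid(self):
        """Return the newest checkpoint whose digest still matches, or None."""
        for digest, blob in reversed(self._entries):
            if self._digest(blob) == digest:
                return json.loads(blob)
        return None


store = CheckpointStore()
store.save({"open_documents": 1})
store.save({"open_documents": 2})

# Simulate corruption of the newest checkpoint after it was written.
digest, _ = store._entries[-1]
store._entries[-1] = (digest, '{"open_documents": 999}')

print(store.latest_valid())  # falls back to the older, intact checkpoint
```

Falling back to an older checkpoint trades lost work for integrity, which is the same tradeoff the backup corruption risk on the Major Risks slide points at.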