Self-Managing Anycast Routing for DNS NLnet Labs & SIDN Labs
Providing Highly Available & Reliable DNS Service • DNS service for important zones (globally) • reliable (trustworthy, secure, …) • highly available • reduced (average) latency • Examples • ccTLDs, gTLDs • … • Common solution • distribute DNS name servers • anycast addressing and routing (BGP and IGP)
Local/Global Anycast Nodes • Local with IGP • RIPv2, OSPF, IS-IS, EIGRP • redundancy, load distribution, low latency within a network • Global with BGP • just BGP-4 • redundancy, load distribution, low latency over global Internet
Research Question • Very generic thesis • distribution mechanism for flexible, adaptive deployment of (authoritative) DNS services • Find optimal placement of nodes • availability (also in relation to DDoS) • reliability (including security, integrity, trust) • low (average) latency • Alternative distribution mechanisms • p2p or some hybrid, e.g., zone files hosted at an ISP • anycast enhanced with self-management to support flexibility and adaptability
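The "optimal placement of nodes" goal can be made concrete as a small optimisation problem. The following is a minimal sketch, not the method proposed in these slides: it assumes a hypothetical latency table (`LATENCY_MS`) between client groups and candidate sites, and uses a simple greedy heuristic (`greedy_placement`, an illustrative name). It also assumes every client group reaches its lowest-latency node, which BGP anycast routing does not guarantee.

```python
from typing import Dict, List

# Hypothetical round-trip times in ms: LATENCY_MS[client_group][site].
LATENCY_MS: Dict[str, Dict[str, float]] = {
    "eu-clients": {"ams": 10.0, "nyc": 80.0, "sin": 160.0},
    "us-clients": {"ams": 85.0, "nyc": 12.0, "sin": 200.0},
    "asia-clients": {"ams": 170.0, "nyc": 210.0, "sin": 15.0},
}


def average_latency(sites: List[str], latency=LATENCY_MS) -> float:
    """Mean latency if every client group reaches its lowest-latency site."""
    return sum(min(row[s] for s in sites) for row in latency.values()) / len(latency)


def greedy_placement(k: int, latency=LATENCY_MS) -> List[str]:
    """Pick up to k sites, each time adding the one that lowers the mean most."""
    candidates = sorted({s for row in latency.values() for s in row})
    chosen: List[str] = []
    for _ in range(min(k, len(candidates))):
        best = min(
            (s for s in candidates if s not in chosen),
            key=lambda s: average_latency(chosen + [s], latency),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    sites = greedy_placement(2)
    print(sites, round(average_latency(sites), 1))
```

A real placement study would also have to weigh availability and security, which this latency-only objective ignores.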
Plan & Approach • Solution should integrate/interoperate with current operational practices • Self-Managing Anycast Routing for DNS (SMARD) • BGP anycast: availability, reduce latency • self-* • configuration: flexibility, adaptability, … • optimization: load distribution, reduce latency, … • healing: recover from failures • protection: security, integrity, trust, …
Plan & Approach cont’d • Anycast & self-* to achieve mentioned goals, but … • Support for self-* loop • monitor, analyse, plan, execute • “Playground” to deploy anycast nodes at various/diverse topological locations • IaaS, …?
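The monitor, analyse, plan, execute loop named above can be sketched as four small functions around shared knowledge. This is only an illustration under stated assumptions: the capacity threshold, the symptom names, and the action names (`shrink_catchment`, `no_op`) are hypothetical, and `execute` merely records the action instead of touching a router.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Knowledge:
    """State shared by all stages of the loop (hypothetical fields)."""
    max_qps: float = 1000.0                      # assumed per-node capacity
    history: List[float] = field(default_factory=list)


def monitor(sample_qps: float, knowledge: Knowledge) -> float:
    """Record a query-rate sample and pass it on."""
    knowledge.history.append(sample_qps)
    return sample_qps


def analyse(qps: float, knowledge: Knowledge) -> str:
    """Turn a raw measurement into a symptom."""
    return "overloaded" if qps > knowledge.max_qps else "ok"


def plan(symptom: str) -> str:
    """Map a symptom to a (hypothetical) corrective action."""
    return "shrink_catchment" if symptom == "overloaded" else "no_op"


def execute(action: str, actions_log: List[str]) -> None:
    """Stand-in for applying the action; here it is only logged."""
    actions_log.append(action)


def mape_step(sample_qps: float, knowledge: Knowledge, actions_log: List[str]) -> str:
    """Run one pass through monitor -> analyse -> plan -> execute."""
    symptom = analyse(monitor(sample_qps, knowledge), knowledge)
    action = plan(symptom)
    execute(action, actions_log)
    return action
```

Calling `mape_step` once per measurement interval gives the closed loop; the self-* properties differ mainly in what `analyse` looks for and which actions `plan` may choose.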
Self-* Autonomic Computing
Autonomic Computing • “The Vision of Autonomic Computing,” J. Kephart and D. Chess, IEEE Computer, January 2003. • “...main obstacle to further progress in IT is a looming software complexity crisis.” • computer systems are becoming too massive and complex to be managed even by the most skilled IT professionals • workload and environment conditions tend to change very rapidly over time
Autonomic Computing cont’d • Systems that can manage themselves given high-level objectives • objectives can be expressed in terms of service-level objectives or utility functions • Analogy: human autonomic nervous system • “responsible for monitoring conditions in the internal environment and bringing about appropriate changes in them” • autonomic nervous system functions in an involuntary, reflexive manner
Centralized vs. Distributed Coordination • [Figure: four control loops, each with monitor, analyse, plan, execute and knowledge components]
Example cont’d • Anycast nodes • M-A-P-E their own operation • monitor own behavior • local actions, global notification • SMARD global • M-A-P-E global operation • receive abstract/strategic monitor information • plan global actions for anycast nodes
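The split between node-level and global operation can be sketched as a message flow: each node acts locally and sends an abstract summary upward, and the global coordinator plans one action per node from those summaries. Everything named here (`Notification`, `local_mape`, `global_plan`, the 0.8 load threshold, and the action strings) is a hypothetical illustration, not the SMARD design itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Notification:
    """Abstract summary a node sends to the global coordinator."""
    node: str
    load: float      # fraction of assumed capacity in use
    healthy: bool    # whether the local DNS service answers


def local_mape(node: str, qps: float, capacity: float,
               dns_ok: bool) -> Tuple[str, Notification]:
    """Node-level loop: decide a local action, then notify globally."""
    local_action = "announce" if dns_ok else "withdraw"
    return local_action, Notification(node, qps / capacity, dns_ok)


def global_plan(reports: List[Notification], high: float = 0.8) -> Dict[str, str]:
    """Global loop: plan one action per node from the received summaries."""
    actions: Dict[str, str] = {}
    for report in reports:
        if not report.healthy:
            actions[report.node] = "keep_withdrawn"
        elif report.load > high:
            actions[report.node] = "reduce_catchment"
        else:
            actions[report.node] = "no_change"
    return actions
```

Keeping the withdraw decision at the node means a failed node leaves the anycast service without waiting for the coordinator, while catchment changes that affect other nodes stay a global decision.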
Operation of Anycast Services, RFC 4786 • Load distribution (not load balancing) • node placement • “catchment” • global/local anycast nodes • … • Monitor availability changes according to location of client • signaling service availability • routing policies and topology changes • DNSMON and RIS/Route Views • … • Consistent service (trustworthy, availability, …) • data synchronization (consistent client response) • node autonomy & self-sufficiency (no cascading failure, but more complex management) • denial-of-service attack mitigation • service compromise • service hijacking • …
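Because routing, not the service, decides which node a client reaches, load is distributed by catchment rather than balanced. A minimal sketch of estimating catchment shares follows, assuming a hypothetical list of (vantage point, answering node) observations; how those observations are collected is outside the sketch, and the function name `catchment_shares` is illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def catchment_shares(observations: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of observed vantage points routed to each anycast node."""
    counts = Counter(node for _vantage, node in observations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {node: n / total for node, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical measurements: which node answered each probe.
    observed = [
        ("probe-1", "ams"),
        ("probe-2", "ams"),
        ("probe-3", "nyc"),
        ("probe-4", "ams"),
    ]
    print(catchment_shares(observed))
```

Shares counted over vantage points only approximate query load, since probes are rarely distributed like real clients.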
Results & Impact • Infrastructure for flexible, adaptive placement and management of DNS authoritative name servers • need a “playground” for placement and operational management • Infrastructure as a Service (IaaS)? • Fully distributed vs. centralized coordination • bounded by the need to be operational or practically deployable • operational costs vs. service and security • DDoS & spoofed traffic • DDoS mitigation • trace spoofed traffic to “real” source