OpenStack HA @ PayPal. OpenStack Summit – Hong Kong - 2013. ABOUT PAYPAL. PayPal offers flexible and innovative payment solutions for consumers and merchants of all sizes. 137,000,000 users; $300,000 payments processed each minute; 193 markets / 26 currencies.
OpenStack HA @ PayPal
OpenStack Summit – Hong Kong - 2013
ABOUT PAYPAL
PayPal offers flexible and innovative payment solutions for consumers and merchants of all sizes.
• 137,000,000 users
• $300,000 payments processed each minute
• 193 markets / 26 currencies
• The World’s Most Widely Used Digital Wallet
Agenda
• Why is HA important for PayPal?
• Our Learning
• Our Solution
• What is not solved?
• Q&A
Why is HA important?
• “No perceived downtime” for cloud users
• Enterprise Class
• Auto Scaling & Flex up/down can never break
• API Integrations always succeed
• Everyone expected to use the cloud
AVAILABILITY REQUIREMENTS
• No SPOF (single point of failure) “Under the Cloud”
• Scale Across the Data Center(s)
• Scale Across Racks & Containers
• Respect natural availability zones within the data centers
• No ‘cloud’ can impact any other ‘cloud’
Infrastructure rack
• Layer 2 versus Layer 3
• Cattle & Puppies
[Diagram labels: Access; LB Active / LB Passive; 1g Mgmt, 10g Passive, and 10g Active links (four of each); Compute Racks; Infrastructure / Controller Racks]
Infrastructure rack
• OpenStack services all run as VMs on KVM
• Every infra component resides on 2+ nodes
• Redundant physical racks
• Redundant power/switches in each rack
• Layer-3 connectivity between racks (no Layer 2)
• Enterprise-grade physical LB (floating VIP)
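The "2+ nodes" and "redundant physical racks" rules above lend themselves to an automated check. The following is a minimal sketch, assuming a flat inventory of (service, node, rack) tuples; the inventory format, the service and rack names, and the `find_spofs` helper are all hypothetical and not taken from the talk.

```python
from collections import defaultdict

# Hypothetical inventory: (service, node, rack). Names are illustrative only.
PLACEMENTS = [
    ("keystone", "infra-vm-01", "rack-a"),
    ("keystone", "infra-vm-02", "rack-b"),
    ("glance", "infra-vm-03", "rack-a"),
    ("glance", "infra-vm-04", "rack-a"),
    ("nova-api", "infra-vm-05", "rack-b"),
]


def find_spofs(placements, min_nodes=2, min_racks=2):
    """Return {service: [issues]} for services that break the redundancy rules."""
    nodes = defaultdict(set)
    racks = defaultdict(set)
    for service, node, rack in placements:
        nodes[service].add(node)
        racks[service].add(rack)

    problems = {}
    for service in sorted(nodes):
        issues = []
        if len(nodes[service]) < min_nodes:
            issues.append(f"only {len(nodes[service])} node(s)")
        if len(racks[service]) < min_racks:
            issues.append(f"only {len(racks[service])} rack(s)")
        if issues:
            problems[service] = issues
    return problems


if __name__ == "__main__":
    for service, issues in find_spofs(PLACEMENTS).items():
        print(f"{service}: {', '.join(issues)}")
```

In this made-up inventory, glance has two nodes but both sit in one rack, and nova-api has a single node, so both are flagged while keystone passes.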
Compute
[Diagram, callouts 1-3. Labels: Access; two LB Active / LB Passive pairs; four compute nodes; 1g Mgmt, 10g Passive, and 10g Active links (eight of each)]
Each compute node: 96 Hyperscale, 16 Core, 256GB RAM, 1.1T Disk
Compute
[Diagram labels: Top Of Rack (Active) and Top Of Rack (Passive); 10g links; two Hyperscale nodes with RAID-10, each with bond0; 1g Management links]
OpenStack considerations
• LB VIP for every service (unless it can’t)
• Connect to LB VIP, not individual nodes
• Script to close server connections
• Pacemaker only works inside a single Layer-2 (not a large enterprise)
• Auto restart using Monit
• MySQL
• Swift Cluster
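One consequence of "connect to LB VIP, not individual nodes" is that a client only ever needs a single endpoint and can retry against it, leaving backend selection to the load balancer. The sketch below illustrates that idea under stated assumptions: the VIP hostname and port, the `call_via_vip` helper, and the `send` callable are hypothetical, and the retry policy is an illustration, not something described in the talk.

```python
import time

# Hypothetical VIP name and port; a real deployment would use its own values.
VIP_ENDPOINT = ("keystone.vip.example.internal", 5000)


def call_via_vip(send, endpoint=VIP_ENDPOINT, attempts=3, backoff=0.0):
    """Call send(endpoint), retrying the same VIP on connection errors.

    The client never learns individual node addresses; after a backend
    failure the load balancer is expected to route the retry elsewhere.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send(endpoint)
        except ConnectionError as exc:
            last_error = exc
            time.sleep(backoff * attempt)  # simple linear backoff
    raise ConnectionError(
        f"VIP {endpoint} unreachable after {attempts} attempts"
    ) from last_error


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_send(endpoint):
        # Simulates one backend dropping the connection before the LB fails over.
        calls["count"] += 1
        if calls["count"] == 1:
            raise ConnectionResetError("backend closed the connection")
        return f"200 OK from {endpoint[0]}"

    print(call_via_vip(flaky_send))
```

Here `send` stands in for whatever actually issues the request (an HTTP client, a database driver), which keeps the retry logic independent of any particular library.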
OpenStack considerations, continued…
• Heat with Corosync/Pacemaker/keepalived (for now)
• Keystone / Nova / Glance / Swift Proxy
• RabbitMQ Cluster
• Cinder Volume Service
Cinder services workflow
The figure shows a typical interaction between Cinder components serving an end-user request (creating a new volume in this example).
[Diagram, steps numbered 1-6. Labels: User request (create volume); Cinder API; Cinder Scheduler; AMQP; Cinder Volume; Storage Back-end 1; Storage Back-end 2]
Cinder services with HA
How HA is implemented for Cinder components:
• API (stateless) – Load Balancer (A/A or A/P);
• Scheduler (stateless) – Pacemaker, Queue itself (A/A or A/P);
• Volume – Pacemaker, Queue itself (A/A or A/P).
[Diagram, steps numbered 1-6. Labels: User request (create volume); Load Balancer; Cinder API A and Cinder API B; Cinder Scheduler A and Cinder Scheduler B; AMQP Cluster; Cinder Volume A and Cinder Volume B; Storage Back-end 1; Storage Back-end 2]
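The "Queue itself" entries above rely on a property of message queues: when several consumers read from one queue, each message goes to exactly one of them, so the remaining consumers keep working if one stops. The sketch below models that behaviour in plain Python with round-robin dispatch. It is a simplified illustration, not RabbitMQ or Cinder code; the `drain` helper, worker names, and message names are hypothetical.

```python
from collections import deque


def drain(queue, workers, down=()):
    """Hand each queued message to the next live worker, round-robin.

    Models competing consumers on one queue: a message is delivered to
    exactly one consumer, and workers listed in `down` are skipped.
    """
    live = [w for w in workers if w not in down]
    if not live:
        raise RuntimeError("no live consumers; messages stay queued")

    handled = []
    index = 0
    while queue:
        message = queue.popleft()
        handled.append((message, live[index % len(live)]))
        index += 1
    return handled


if __name__ == "__main__":
    workers = ["cinder-scheduler-a", "cinder-scheduler-b"]
    requests = ["create-vol-1", "create-vol-2", "create-vol-3"]

    # Active/active: both schedulers share the load.
    print(drain(deque(requests), workers))

    # Scheduler A is down: scheduler B picks up every request.
    print(drain(deque(requests), workers, down={"cinder-scheduler-a"}))
```

The same model covers the active/passive case: with only one worker live at a time, that worker receives every message until it is replaced.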
Unresolved
• VIP-friendly Cinder Volume service
• Seamless Upgrade Flip
• Failed DB TX Reconciliation
• Consistent API Response Time
cloud@paypal.com Confidential and Proprietary
THANK YOU
http://github.com/paypal/aurora
Scott Carlson - @relaxed137
Raj Geda
Zhiteng Huang - irc: winston-d