
PS1 Prototype Systems Design Jan Vandenberg, JHU






  1. PS1 Prototype Systems Design, Jan Vandenberg, JHU. Early PS1 Prototype

  2. Engineering Systems to Support the Database Design • Raw data size • Index size • Most end-user operations are I/O bound • Loading/ingest is more CPU-bound, though we still need solid write performance • Time to do full table scans • Time to do index scans • Need to do most work where the data is; can't sling TBs over the network quickly • …though we can brute-force past 1 Gbit Ethernet if necessary
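The sizing questions above (full-scan time, and why TBs can't be slung over the network) reduce to simple throughput arithmetic. A minimal sketch, with illustrative numbers rather than actual PS1 measurements:

```python
# Back-of-envelope helpers for the scan-time and network questions.
# All figures below are assumptions for illustration, not PS1 specs.

def scan_time_hours(table_tb: float, per_server_mb_s: float, servers: int) -> float:
    """Hours to stream a full table at the aggregate sequential-read rate."""
    total_mb = table_tb * 1024 * 1024
    return total_mb / (per_server_mb_s * servers) / 3600

def gbit_transfer_hours(data_tb: float, link_gbit: float = 1.0) -> float:
    """Hours to move data over a network link (8 bits/byte, ideal link)."""
    mb_per_s = link_gbit * 1000 / 8
    return data_tb * 1024 * 1024 / mb_per_s / 3600

# e.g. a 10 TB table at an assumed 400 MB/s of sequential read per server:
print(f"{scan_time_hours(10, 400, 1):.1f} h on one server")
print(f"{scan_time_hours(10, 400, 8):.1f} h across 8 servers")
print(f"{gbit_transfer_hours(10):.1f} h to push 10 TB over 1 Gbit Ethernet")
```

Scanning locally across several servers wins by an order of magnitude over shipping the data, which is the "do the work where the data is" point.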

  3. Fibre Channel, SAN • Expensive but not-so-fast physical links (4 Gbit, 10 Gbit) • Expensive switch • Potentially very flexible • Industrial strength manageability • Little control over RAID controller bottlenecks

  4. SATA • Fast • Cheap • Ugly, spooky • <cabling pic> • Tough to manage • <dlmsdb/sdssdb drive bay map>

  5. SAS • For our purposes, it’s SATA without the ugliness • Fast: 12 Gbit/s FD building blocks • Cheap: PS1 prototype MD1000 pricing versus Newegg media costs • Not Ugly: IB cables versus rats’ nest • Industrial strength manageability: pretty blinking lights and mgmt apps versus downtime plus white knuckles • <cabling pic>

  6. I/O Performance of Dell SAS Systems in the PS1 Prototype

  7. SAS Performance, Gory Details • SAS v. SATA differences

  8. Per-Controller Performance • Luckily, one controller is fast enough for one SATA disk box • <performance chart>

  9. Resulting PS1 Prototype I/O Topology • <topo diagram> • <aggregate performance chart>

  10. RAID-5 v. RAID-10? • Primer, anyone? • RAID-5 probably feasible with contemporary controller… • …though tough to predict real-world effects of latency… • …and not a ton of redundancy • But after we add enough disks to meet performance goals, we have enough storage to run RAID-10 anyway! • Remember sub-Newegg media costs
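The capacity argument in the last bullet is easy to make concrete. A minimal sketch, assuming a hypothetical drive count and size (not the actual PS1 bill of materials):

```python
# Usable capacity under RAID-5 vs RAID-10, to illustrate why, once
# enough spindles are bought for read throughput, mirroring may still
# leave enough space. Drive counts/sizes here are assumptions.

def usable_tb(disks: int, disk_tb: float, level: str) -> float:
    if level == "raid5":
        return (disks - 1) * disk_tb   # one disk's worth of parity
    if level == "raid10":
        return disks / 2 * disk_tb     # every disk is mirrored
    raise ValueError(f"unknown RAID level: {level}")

# e.g. 14 x 0.5 TB drives bought to hit a sequential-read target:
print(usable_tb(14, 0.5, "raid5"))   # 6.5 TB usable
print(usable_tb(14, 0.5, "raid10"))  # 3.5 TB usable
```

If the dataset fits in the RAID-10 number anyway, the extra RAID-5 capacity buys nothing, and RAID-10 avoids the parity-write latency risk noted above.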

  11. RAID-10 Performance • Executive summary: roughly half of RAID-0 throughput for single-threaded reads, full RAID-0 performance for 2-user/2-thread workloads, and half of RAID-0 throughput for writes
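The summary above follows from a simple model of mirrored stripes: a single reader streams from one mirror half, two concurrent readers can span both halves, and every write lands on both copies. A sketch of that rule of thumb (per-disk rates are assumed; real controllers and latency will differ):

```python
# Rule-of-thumb throughput model for RAID-10 (mirrored stripes).
# Illustrative only: ignores controller and latency effects.

def raid10_throughput_mb_s(disks: int, per_disk_mb_s: float,
                           op: str, threads: int = 1) -> float:
    stripe = (disks // 2) * per_disk_mb_s        # one mirror half = a RAID-0 stripe
    if op == "read":
        # 2+ threads can be served from both mirror halves at once
        return stripe if threads == 1 else 2 * stripe
    if op == "write":
        return stripe                            # each write hits both copies
    raise ValueError(f"unknown op: {op}")

# e.g. 12 disks at an assumed 100 MB/s each (RAID-0 would be 1200 MB/s):
print(raid10_throughput_mb_s(12, 100, "read"))             # single-threaded
print(raid10_throughput_mb_s(12, 100, "read", threads=2))  # two readers
print(raid10_throughput_mb_s(12, 100, "write"))
```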

  12. PS1 Prototype Servers • <diagram of server roles plus storage and network interconnects>

  13. PS1 Prototype Servers • <iron photo (w/Will?)>

  14. Projected PS1 Systems Design • <diagram of 8-slice triply-replicated systems> • <plus geoplex?>

  15. Backup/Recovery/Replication Strategies • No formal backup • …except maybe for mydb’s, f(cost*policy) • 3-way replication • Replication != backup • Little or no history • Replicas can be a bit too cozy: must notice badness before replication propagates it • Replicas provide redundancy and load balancing… • Fully online: zero time to recover • Replicas needed for happy production performance plus ingest, anyway • Off-site geoplex • Provides continuity if we lose HI (local or trans-Pacific network outage, facilities outage) • <lava pic?> • Could help balance trans-Pacific bandwidth needs (service continental traffic locally)
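The "replicas can be a bit too cozy" bullet is the key constraint: changes must be held back until they are validated, which is also why off-the-shelf near-realtime replication tools don't fit (slide 17). A toy delayed-apply queue illustrating the idea; the class and validator names are hypothetical, not part of any PS1 tooling:

```python
# Sketch: hold loaded changes in a validation queue and only let them
# reach the replicas after a sanity check, so badness is caught before
# replication propagates it everywhere. Names here are hypothetical.
from collections import deque

class DelayedReplicator:
    def __init__(self, validate):
        self.pending = deque()     # changes awaiting validation
        self.replica = []          # what the replicas have applied
        self.validate = validate   # e.g. row counts / schema sanity checks

    def submit(self, change):
        self.pending.append(change)

    def tick(self):
        """Apply pending changes in order; stop and quarantine on the first bad one."""
        while self.pending:
            change = self.pending.popleft()
            if not self.validate(change):
                return change      # bad batch: never reaches the replicas
            self.replica.append(change)
        return None

rep = DelayedReplicator(validate=lambda batch: batch != "mangled")
for batch in ["night-1", "mangled", "night-2"]:
    rep.submit(batch)
quarantined = rep.tick()
```

After `tick()`, only `night-1` has propagated and the mangled batch is quarantined; aggressive realtime replication would have shipped it to all three copies.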

  16. Why No Traditional Backups? • Not super pricey… • …but not very useful relative to a replica for our purposes • Time to recover • Money no object… do traditional backups too!!! • Synergy, economy of scale with other collaboration needs (IPP?)… do traditional backups too!!!

  17. Failure Scenarios • Easy, zero-downtime: • Disks • Power supplies • Fans • Not so spooky, maybe some downtime and manual replica cutover: • System board (rare) • Memory (rare, and usually proactively detected and handled via scheduled maintenance) • Disk controller (rare; potentially minimal downtime via a cold-spare controller) • CPU (not utterly uncommon; can be tough and time-consuming to diagnose correctly) • More spooky: • Database mangling by human or pipeline error • Gotta catch this before replication propagates it everywhere • Can’t replicate too aggressively • (and so off-the-shelf near-realtime replication tools don’t help us) • Catastrophic loss of datacenter • Have the geoplex • …but we’re dangling by a single copy until recovery completes • …but are we still screwed? Depending on colo scenarios, did we also lose the IPP and flatfile archive? • Terrifying: • Unrecoverable badness fully replicated before detection • Catastrophic loss of datacenter without geoplex • Can we ever catch back up with the data rate if we need to start over? • At some point in the survey, the answer likely becomes “no”.

  18. State Diagram for Replicas? • Loading • Replicating • Load balancing • Failing • Recovering • Possibly repeat-loading
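The lifecycle above can be made explicit as a small state machine. A minimal sketch; the allowed-transition set is an assumption drawn from the slide's ordering, not a specification:

```python
# Replica lifecycle from the slide, with legal transitions spelled out.
# The transition table is an illustrative assumption.
from enum import Enum, auto

class ReplicaState(Enum):
    LOADING = auto()
    REPLICATING = auto()
    LOAD_BALANCING = auto()
    FAILING = auto()
    RECOVERING = auto()

TRANSITIONS = {
    ReplicaState.LOADING:        {ReplicaState.REPLICATING, ReplicaState.FAILING},
    ReplicaState.REPLICATING:    {ReplicaState.LOAD_BALANCING, ReplicaState.FAILING},
    ReplicaState.LOAD_BALANCING: {ReplicaState.LOADING, ReplicaState.FAILING},
    ReplicaState.FAILING:        {ReplicaState.RECOVERING},
    ReplicaState.RECOVERING:     {ReplicaState.LOADING},   # possibly repeat-loading
}

def step(state: ReplicaState, nxt: ReplicaState) -> ReplicaState:
    """Advance the replica, rejecting transitions the diagram doesn't allow."""
    if nxt not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {nxt.name}")
    return nxt
```

Encoding the diagram this way makes "repeat-loading" a normal edge (RECOVERING → LOADING) rather than a special case.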

  19. Operating Systems, DBMS? • SQL Server 2005 EE x64 • Why? • Why not DB2, Oracle RAC, PostgreSQL, MySQL, <insert your favorite>? • (Windows Server 2003 EE x64) • <Why EE?> • Platform rant from JVV available over beers • <JVV/beer graphic?>

  20. Systems/Database Management • Active Directory infrastructure • Windows patching tools, methodology • Linux patching tools, methodology • Monitoring • Staffing requirements

  21. Facilities/Infrastructure Projections for PS1 • Cooling • Rack space • Network ports • (plus AD/WSUS/monitoring infrastructure above)

  22. Operational Handoff to UofH

  23. Mahalo! (See Ya, Hon!)
