
Successful Bandwidth Management at Carnegie Mellon

  1. Successful Bandwidth Management at Carnegie Mellon Peter Hill & Kevin Miller Internet2 Joint Techs – August 2003

  2. About Us • Responsible for campus backbone, Internet connection, 80% of data ports • Internet connectivity (commodity & research) through Pittsburgh GigaPOP • Historically, net exporter of data to Internet • 8000 students, 1300 faculty, 2000 staff • 3:1 computer:person ratio • 10,000+ registered wireless cards

  3. Timeline – Nov 2001 • Demand rising faster than willingness to supply • P2P file-swapping allowed users to easily create high volume servers

  4. No Bandwidth Management • Outbound bandwidth hits limits of OC-3 ATM interface • High rate of packet loss from single cell drops (losing one ATM cell invalidates the whole AAL5 frame, so the entire packet is lost) • TCP retransmits cause egress meltdown • Upgraded physical connection to GbE • Demand continues rising

  5. Timeline – March 2002 • GbE link to GigaPOP uses 802.1q trunk, separate VLANs for commodity vs. research traffic • Router unable to rate-limit outbound traffic on single VLAN

  6. Emergency Solution • Complex, ‘delicate’ engineering to split commodity vs. research traffic from internal core to border router • Multiple OSPF processes • Research route redistribution to OSPF • Applied rate limits to commodity traffic upon ingress to border router

  7. Emergency Solution • 75Mbps rate limit – single queue, tail drop only • Results weren't pretty: high latency [Graph: outbound vs. inbound traffic under the limit]
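
To see why a single tail-drop queue translates into high latency, a back-of-envelope calculation helps: once demand exceeds the limit, the FIFO sits full, and every packet that isn't dropped waits behind the entire buffer. A minimal Python sketch, assuming an illustrative 1 MB buffer (the talk does not state the router's actual queue depth):

    # Queuing delay behind a full FIFO drained at a fixed rate.
    # Assumption: the 1 MB buffer depth is illustrative only.

    RATE_BPS = 75_000_000        # 75Mbps rate limit from the slide
    BUFFER_BYTES = 1_000_000     # assumed buffer size

    def queuing_delay_ms(queue_bytes: int, rate_bps: int) -> float:
        """Delay a packet sees behind `queue_bytes` of backlog at `rate_bps`."""
        return queue_bytes * 8 / rate_bps * 1000

    # When demand exceeds the limit, the queue sits full, so every
    # admitted packet inherits the whole buffer's drain time.
    print(f"{queuing_delay_ms(BUFFER_BYTES, RATE_BPS):.0f} ms")  # ~107 ms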

  8. Timeline – Summer 2002 • Messy solution to a complicated problem • No discrimination between legitimate and discretionary traffic • Research traffic unaffected, though

  9. Tail Drop Issues • School resumes: hard limits hit immediately • P2P consuming a significant amount of bandwidth • Users reporting problems accessing email from home • Interactive sessions suffer (SSH, Telnet)

  10. (Un)fair Bandwidth Access • High priority traffic? • Nine machines over 0.5% each; 21% of total traffic

  11. (Un)fair Bandwidth Access • On the same day, 47% of all traffic was easily classifiable as P2P • 18% of traffic was HTTP • Other traffic: 35% • Believe ~28% is port-hopping P2P

  12. Researching Solutions • Middlebox Traffic Shapers • Compared Allot NetEnforcer vs. Packeteer PacketShaper • Determined NetEnforcer better matched our requirements • Adds slight delays to constrain TCP • Fair bandwidth distribution among flows • Easier to configure policy • Better performance when demand equals limit

  13. Raised Bandwidth Limit • Nov 2002: Campus limit raised to 120Mbps

  14. Timeline – January 2003 • NetEnforcer deployed in October • Policy developed in late fall • Implemented policy in early January

  15. NetEnforcer Policy • Technical bandwidth policy using NetEnforcer • Used per-flow class-based fair bandwidth queuing, with classes ordered from high to low priority:
  • Network critical traffic (High Priority)
  • Interactive (SSH, telnet); limited per-flow
  • Traffic on well-known service ports (IMAP, HTTP)
  • Non-classified traffic
  • P2P traffic, classified by port number (Low Priority)
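
A minimal sketch of what a port-based tiering policy like the one above might look like. The class names, port lists, and classify() helper are illustrative assumptions, not NetEnforcer's actual configuration; the network-critical class is omitted since it isn't identified by port alone:

    # Illustrative port-based priority tiers (0 = highest, 3 = lowest).
    # Port lists are assumptions based on well-known service/P2P ports.

    INTERACTIVE = {22, 23}                        # SSH, telnet
    WELL_KNOWN = {25, 80, 110, 143, 443}          # SMTP, HTTP, POP3, IMAP, HTTPS
    P2P_PORTS = {1214, 4662, 6346, 6347, 6881}    # KaZaA, eDonkey, Gnutella, BitTorrent

    def classify(dst_port: int) -> int:
        """Return a priority tier for a flow, keyed on destination port."""
        if dst_port in INTERACTIVE:
            return 0          # interactive; also per-flow rate-limited
        if dst_port in WELL_KNOWN:
            return 1          # well-known service ports
        if dst_port in P2P_PORTS:
            return 3          # P2P by port number: low priority
        return 2              # non-classified traffic

    assert classify(22) == 0 and classify(6881) == 3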

  16. NetEnforcer Policy • Improved interactive performance • Fewer complaints about accessing campus services remotely • Traffic consistently at 120Mbps [Graph annotation: Winter Recess]

  17. Limits of Shaping Solution • Per-host fair queuing not possible • User with 100 flows allowed 100 times bandwidth of user with one flow • Poor SSH X forwarding performance • High latency for UDP gaming services • Demand still high – socially, nothing had changed
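
The 100-flow example above is simple arithmetic: per-flow fair queuing gives each flow an equal share, so a host's aggregate share grows linearly with its flow count. A small illustrative Python calculation (host names and numbers assumed):

    # Why per-flow fairness is not per-host fairness.
    LINK_MBPS = 120
    flows_per_host = {"heavy_p2p_host": 100, "single_flow_host": 1}

    total_flows = sum(flows_per_host.values())
    per_flow_share = LINK_MBPS / total_flows      # each flow gets an equal cut

    for host, n in flows_per_host.items():
        print(f"{host}: {n * per_flow_share:.2f} Mbps")
    # heavy_p2p_host ends up with 100x the bandwidth of single_flow_host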

  18. Tighter P2P Limits • Software to classify P2P traffic at Layer 7 available in February 2003 • Added Layer-7 classification • Put absolute bandwidth cap on P2P
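
A rough sketch of Layer-7 classification by payload signature, assuming access to the first bytes of each flow. The BitTorrent and Gnutella handshake strings are well known; a real classifier such as the NetEnforcer update covers many more protocols and handles port-hopping:

    # Match flow payloads against well-known P2P handshake signatures.
    SIGNATURES = {
        b"\x13BitTorrent protocol": "bittorrent",   # BitTorrent handshake
        b"GNUTELLA CONNECT": "gnutella",            # Gnutella handshake
    }

    def classify_payload(payload: bytes):
        """Return the matching P2P protocol name, or None if no match."""
        for sig, proto in SIGNATURES.items():
            if payload.startswith(sig):
                return proto
        return None

    assert classify_payload(b"\x13BitTorrent protocol" + b"\0" * 8) == "bittorrent"
    assert classify_payload(b"GET / HTTP/1.1\r\n") is None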

  19. Timeline – February 2003 • Demand still close to limit • Consider alternate approach: social engineering • Identifying machines using high bandwidth

  20. Solution • Add a targeted user education/dialogue component to strictly technical solutions • Blanket notices were/are ineffective • Why not before? • Usage accounting software was in development • Only anecdotal information on ‘top talkers’ • Unsure of community reaction • Didn’t know how easy it would be

  21. Solution • Created official Campus Bandwidth Usage Guidelines • Established a daily per-host bandwidth usage limit • 1GB per day for wired machines (outbound to commodity Internet) • 250MB per day for wireless (to any destination) • Published guidelines and requested comments from campus community • Began direct notification of hosts exceeding guidelines on February 24, 2003
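
A minimal sketch of the daily guideline check, assuming per-host byte totals are already aggregated; the field names, the wired/wireless flag, and the use of binary (GiB/MiB) units are assumptions:

    # Flag hosts exceeding the daily usage guideline.
    # Assumption: limits interpreted as binary units (GiB/MiB).

    WIRED_LIMIT = 1 * 1024**3        # 1GB/day, outbound to commodity Internet
    WIRELESS_LIMIT = 250 * 1024**2   # 250MB/day, to any destination

    def over_quota(bytes_used: int, wireless: bool) -> bool:
        limit = WIRELESS_LIMIT if wireless else WIRED_LIMIT
        return bytes_used > limit

    usage = [("wired-host.example.edu", 1_400_000_000, False),
             ("wifi-host.example.edu", 300_000_000, True)]
    print([h for h, b, w in usage if over_quota(b, w)])  # both exceed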

  22. Guideline Enforcement • Accounting data generated • ‘Argus’ utility captures raw packet data from egress span port • Post-processing aggregates flows by host, sums bandwidth usage • Nightly top-talkers report, e.g.:

    Hostname          iMB   oMB     iKp    oKp    flows    iMbs  oMbs  ips  ops
    web4.andrew.cmu.  3456  107621  51917  83038  1948667  0.1   2.0   120  192
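
A sketch of the post-processing step, assuming flow records have been exported to CSV as (host, in_bytes, out_bytes) rows; the real pipeline reads Argus output via its clients such as ra, and the exact fields here are assumptions:

    # Aggregate flow records by host and print a top-talkers report.
    import csv
    from collections import defaultdict

    def top_talkers(path: str, n: int = 10) -> None:
        totals = defaultdict(lambda: [0, 0])          # host -> [in, out] bytes
        with open(path) as f:
            for host, in_b, out_b in csv.reader(f):
                totals[host][0] += int(in_b)
                totals[host][1] += int(out_b)
        # Rank by outbound bytes, since outbound commodity traffic was
        # the constrained resource.
        ranked = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)
        for host, (i, o) in ranked[:n]:
            print(f"{host:<24}{i // 2**20:>8} iMB {o // 2**20:>8} oMB")

    top_talkers("flows.csv")  # hypothetical nightly export file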

  23. Guideline Enforcement • Over-usage notification • Initial mail worded carefully, informed owner of quota and overage, and requested information on usage reasons • Approximately a 0.25 FTE process • Automated in July • Disconnections handled like other abuse incidents
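
A sketch of what the July automation might look like, assuming a nightly list of over-quota hosts; the addresses, message wording, and SMTP relay are placeholders, not CMU's actual process:

    # Send a carefully worded over-usage notification to a host's owner.
    import smtplib
    from email.message import EmailMessage

    def notify(owner: str, host: str, used_mb: int, quota_mb: int) -> None:
        msg = EmailMessage()
        msg["From"] = "bandwidth@example.edu"        # placeholder address
        msg["To"] = owner
        msg["Subject"] = f"Bandwidth guideline exceeded by {host}"
        msg.set_content(
            f"{host} transferred {used_mb} MB yesterday; the daily guideline "
            f"is {quota_mb} MB. Could you let us know what the machine is "
            "being used for?"
        )
        with smtplib.SMTP("localhost") as smtp:      # assumed local relay
            smtp.send_message(msg)

    notify("owner@example.edu", "host.example.edu", 1400, 1000)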

  24. Positive Results • Immediately noticed a decline in bandwidth utilization (within a few hours of first notifications) [Graph annotations: Applied Strict Layer-7 P2P Limits; First Usage Notices Sent]

  25. Positive Results • Very few negative responses • Some hosts granted exclusions (official web servers) • Many notified were completely unaware of bandwidth usage • P2P programs left running in background • Compromised machines • Trojan FTP servers serving French films • Favorite responses • “i don't understand--what's bandwidth?? how do i reduce it?? what am i doing wrong?? what's going on???” • “i have no idea what bandwidth is or how one goes about using it too much.”

  26. Positive Results [Graphs: ‘Now (Summer)’ and ‘Summary’ traffic charts]

  27. Timeline – May 2003 • With guideline enforcement, traffic drops by half • NetEnforcer still has an impact on P2P – need to assess legitimate and discretionary P2P uses

  28. Considerations • Per-machine limits might turn IP addresses into an artificial commodity • Role of Enterprise or Service Provider? • Packet shaping tends to apply Enterprise mindset – determine organization priorities • Bandwidth quotas use Service Provider mindset – how are quotas applied (port, machine, user?)

  29. Questions? • Argus: http://www.qosient.com/argus • Related links/resources: http://www.net.cmu.edu/pres/jt0803 • Packet shaper evaluation • Argus accounting • Usage guidelines • Peter Hill: peterjhill@cmu.edu • Kevin Miller: kcm@cmu.edu
