Going Fast(er) On Internet2 • Campus Focused Workshop on Advanced Networks, San Diego, 4/12/2000 • Joe St Sauver (joe@oregon.uoregon.edu), Computing Center, University of Oregon
Disclaimer • What we’re going to tell you today is based on our experiences working primarily with Usenet News at the U of O; it may or may not pertain to other applications elsewhere. • We tend to look for simple, scalable, workable solutions which we can roll out now, e.g., overprovisioning rather than QoS • We tend to be cheap, skeptical, and cynical • We tend to be good at pushing things until they break; it is an acquired/teachable skill.
A Sidenote About This Presentation • It is longer than it should be, but we’ll go until we run out of time and then stop. • Sorry it is so graphically boring. :-) • It is outlined in tedious detail because that way we won’t forget what we wanted to say, and thus you won’t need to take notes. • Hopefully, it can thus be decoded by someone stumbling upon it post hoc.
I. Introduction Or, "Are You Really Sure You Want to Go Fast(er)?"
Now That I'm On I2, Everything Will Get Really Fast… Right? • It is a popular misconception that once your campus gets connected to Internet2, everything you do on the network will suddenly, magically, and painlessly go "really, really fast." • The reality is that going even moderately fast can take patience, detective work, tinkering, and maybe even forklift upgrades.
Do You Really NEED or Even WANT To Go Fast(er)? • Going fast(er) can be a big pain. Huh? … -- It will take a lot of work -- It may cost you some money -- It almost always requires the active assistance of lots of folks -- You may find yourself (in the final analysis) only partially successful, and -- Fast boxes are choice targets for crackers -- Lots of happy people DON’T go fast
As-Is/Out-of-the-Box Might Be Good Enough • Unless you're running into a particular problem (e.g., you HAVE to go fast(er)), one perfectly okay decision might be to just go however fast you happen to go and not worry about anything beyond that. • E.g., a Concorde may be very fast, but it might not be the best way to get to the corner store for a loaf of bread.
What Can I Get By Default? Example: Oregon <-- Oklahoma • At UO, from a relatively vanilla W2K workstation connected via fast ethernet, one can ftp a binary file (hdstg2.img, 2135829 bytes) from the University of Oklahoma's ftp archive (ftp.ou.edu/mirrors/linux/redhat/redhat-6.2/i386/RedHat/base/) in 9.43 sec: 226 Kbyte/second (or 1.8 Mbit/second)
For Comparison, A Second Local-Only Example... • Retrieving that same file from a local ftp mirror (ftp://limestone.uoregon.edu/.1/redhat/redhat-6.2/i386/RedHat/base/) on that same workstation let me get the file in 0.32 seconds, which translates to: 6,653.67 Kbyte/sec (or 53.2 Mbit/sec)
Thinking About Those Examples A Little • As always, closer will usually be faster [mental note… value of replicated content] • Quoted throughput should be considered approximate (e.g., the times aren't exact). • There are start up effects (which will tend to pull the overall throughput down); e.g., if the file was larger, we'd look/be "faster" • Ten seconds or 1/3 of a second, either way you won't have time to go get coffee
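The throughput figures in the two examples above can be reproduced with a few lines of arithmetic. This is just a sketch of that math (it assumes the slides' decimal units, i.e., 1 Kbyte = 1,000 bytes and 1 Mbit = 1,000,000 bits):

```python
# Rough throughput math for the two ftp examples above.
# File size and timings are taken from the slides; decimal K/M units assumed.
SIZE_BYTES = 2135829  # hdstg2.img

def throughput(size_bytes, seconds):
    """Return (Kbyte/sec, Mbit/sec) for a transfer of size_bytes in seconds."""
    kbyte_per_sec = size_bytes / seconds / 1000.0
    mbit_per_sec = size_bytes * 8 / seconds / 1_000_000.0
    return kbyte_per_sec, mbit_per_sec

remote_kb, remote_mb = throughput(SIZE_BYTES, 9.43)  # Oregon <-- Oklahoma
local_kb, local_mb = throughput(SIZE_BYTES, 0.32)    # local UO mirror
print(remote_kb, remote_mb)  # roughly 226 Kbyte/sec, 1.8 Mbit/sec
print(local_kb, local_mb)    # roughly 6,674 Kbyte/sec, 53 Mbit/sec
```

(Small discrepancies versus the quoted numbers are expected; as the next slide notes, the timings themselves aren't exact.)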
Make An Effort to Know How Fast You HAVE to Go • As you try to go fast(er), it will be important for you to know how fast you HAVE to go. • For example: "I need to be able to deliver 1.0Mbps sustained for MPEG1-quality video" or "I need to be able to transfer 180GB of data per day on a routine basis." • Get your requirement into Mbps format so you can readily make comparisons
Converting Data Transfer Requirements Into Mbps • Example: 180 gigabytes/day == (180,000 megabytes)(8 bits per byte) / [(24 hrs/day)(60 mins/hr)(60 secs/min)] == roughly 17 megabits/sec 'round the clock
Be Sure To Remember... • Very few data transfer requirements are "uniformly distributed 'round the clock" -- plan for peaking loads • Best case/theoretical requirements should be considered a lower (not upper) bound on bandwidth requirements. • Plan for system/application downtime. • What's the data transfer rate of growth?
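The conversion above reduces to a one-line helper; a minimal sketch (decimal units assumed, as on the slide):

```python
# GB/day -> Mbps, treating load as uniformly spread over the day.
# (The slides warn this is a LOWER bound: real traffic peaks, systems go down.)
def gb_per_day_to_mbps(gb_per_day):
    # (GB/day * 1,000 MB/GB * 8 bits/byte) / 86,400 seconds/day
    return gb_per_day * 1000 * 8 / 86_400.0

mbps_180 = gb_per_day_to_mbps(180)  # the 180 GB/day example: "roughly 17" Mbps
```

The same helper confirms the later ftp.cdrom.com figure: 30 TB/month (~1,000 GB/day) works out to a steady ~92.6 Mbps.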
It's Not The Volume, It's The Time It Takes To Double... • “It’s not the heat, it’s the humidity…” • Example: Daily Usenet News volume (e.g., ~200GB/day now, doubling every 6 mos.) • Data from http://newsfeed.mesh.ad.jp/flow/
That Implies, For Example... • Today: 200GB/day (e.g., 18.5 Mbps) • 6/2001: 400GB/day (37 Mbps) • 12/2001: 800GB/day (74 Mbps) • 6/2002: 1.6TB/day (148 Mbps) • 12/2002: 3.2TB/day (296 Mbps) • … and of course, that’s assuming we don’t see another upward inflection in the rate of NNTP traffic growth (but trust me, we will).
What does ftp.cdrom.com say? • “Wcarchive is the biggest, fastest, busiest public FTP archive in the world. * * * Each month, more than 10 million people visit wcarchive -- sending out to them more than 30 terabytes of files (as of June, 1999), with the only limit being the Internet backbone(s).” See: ftp://ftp.cdrom.com/archive-info/configuration • 30 TB/mo = “only” a steady ~92.6 Mbps
In Most Cases, The Only Reason You Need to Go Fast Will Be LOTS Of Data…. • By "LOTS" of data, you should be thinking in terms of hundreds of gigabytes/day on a routine/ongoing basis. • Assuming even moderate data retention times (e.g., a week), 100’s of GB/day implies use of what would traditionally be considered a large disk farm.
Again Looking At cdrom.com... • In the “old days,” (two or three years ago?) large capacity disk farms were physically large, expensive and quite uncommon... • For example, Cdrom.com is/was fielding a 1/2 terabyte of disk consisting of 18x18GB plus 20x9.1GB
Terabyte of Data on The Desktop, Anyone? • Now there are 82GB Ultra ATA Maxtors (and for only $300 or so!) and 180GB Ultra160 Barracudas will be shipping soon • A terabyte of data can now happily run from an undergrad’s desktop PC...
The Good News? • In spite of the cheap availability of large disks, there are really very few applications which NEED to go very fast (either for long periods of time or on a frequently recurring basis between any two particular points). • That is, most large flows are non-recurring, and not particularly time sensitive. An example might be one scientist ftp'ing one large data set from one colleague one time.
Got Non-Recurring, Non-Time-Sensitive Flows? Relax... • If you are working with non-recurring, non-time sensitive flows, you have a fair amount of slack: even if you don’t succeed in going fast, the transfer will still get done eventually, one way or the other. • Put plainly, “Sort of slow may still be fast enough.”
The (Sort Of) "Bad" News... • There are LOTS of folks who WANT to go fast(er) (whether they NEED to or not) • There are MANY applications that IN AGGREGATE may need to deliver "lots" of data (e.g., not a tremendous amount to any one user, but some to LOTS of users) • Most apps can't distinguish between Internet2 and the commodity Internet.
Why Would A Broad Interest in Going Fast Be (Sort of) Bad News? • Recall my earlier proposition that going fast(er) is hard/expensive/requires help from lots of people, and often only sorta works. • It wouldn’t take a tremendous number of people going really fast to flattop existing Internet2 capacity. • For now, it is still expensive to buy I2 size pipes to the commodity Internet.
Abilene OC3 Cost vs. Commodity Internet Costs • Abilene (Internet2) OC3: $110,000/year • CWIX OC3: $1,082,400/year • Sprint OC3: $1,489,200/year • Genuity OC3: $2,064,000/year ==> Commodity OC3's are expensive, and it doesn't take many people who're even doing “just” 30 Mbps to fill an OC3. (prices from http://www.boardwatch.com/isp/bb/Backbone_Profiles.htm)
“I asked for a mission, and for my sins they gave me one.” • As you strive to build a campus network enabling high throughput to Internet2, beware: you are ALSO building a network which will deliver high throughput to the commodity Internet. • If you encourage users to go fast to I2, they will go fast everywhere (assuming they go fast anywhere), because users don’t know when they’re using Internet2.
Are We Racing To The Precipice? Probably Not... • Good news is (may be?) coming… • Some vendors (e.g., Cogent Communications) will soon be selling 100Mbps of commodity transit for $3K/month, flat rate… if you're in one of the “NFL cities” where they have a POP. • Perversely, one of the things that determines where carriers build out their POPs is the existing/demonstrated bandwidth demand!
“I can’t get cheap commodity transit where I’m located…” • If you can’t get cheap commodity transit, the only bandwidth provisioning solution that financially scales to the high bandwidth scenarios we’re all moving toward is to go after settlement free peering with large network service providers. Doing this implies you need fiber to one or more exchange points, and you need to be able to convince providers of interest to peer…
Some University-Affiliated Commodity Exchange Points • Oregon IX (http://www.oregon-ix.net/) • Hawaii IX (http://www.lava.net/hix/) • SD-NAP (http://www.caida.org/projects/sdnap/content/) • BC IX (http://www.bcix.net/) • Hong Kong IX (http://www.cuhk.hk/hkix/) • and many more… see http://www.ep.net/
“What if those sorts of strategies aren’t right for us?” • You have (or soon will have) problems • You will spend your time making users go slower, not helping them go fast(er) • Transparent web caching may help (some), but watch out for witch hunt opportunities. • Maybe try going after edge content delivery networks (Akamai, iBeam, etc.)? Maybe try bandwidth management appliances?
But... • Users will go faster, even if you work hard at trying to slow them down • Transparent web caching may reduce your traffic by a factor of two (but if your traffic is doubling every 6 months, that implies doing caching is only going to buy you 6 months worth of breathing room, and then you’re back where you started from...)
But… But… • Edge content delivery networks may help with some specific content, but there’s still a lot of other content that will NOT be getting distributed via those ECDN’s. • Bandwidth management appliances invite user efforts to “beat the system” by exploiting any weaknesses in your traffic management model (just like in the bad old mainframe chargeback days, ugh!)
On The Other Hand... • Everybody may be talking about OC12’s, OC48’s and OC192’s, but even a major NSP like Abovenet still has a lot of OC3’s, fast ethernet and DS3 class links... • See Above.Net’s publicly available traffic reports (http://west-boot.mfnx.net/traffic/) • The lesson of Above.Net’s stats? OC3 class traffic is still relatively rare/a big deal... and not something to treat casually.
Free Advice (And You Know What That’s Worth) • Be sure you really need/want to go fast(er) • Strive to understand your current traffic requirements • Never lose sight of the fact that going fast on Internet2 will mean that you probably need to go fast on the commodity Internet, too • Work to deploy scalable solutions
II. So Who’s Going Fast On Internet2 Right Now? “The All News Network, All The Time.” [CNN motto]
Large TCP/IP Flows • Our focus/interest is on large TCP/IP flows which result in lots of bytes getting transferred. • We’re not worried about/interested in UDP traffic; it will implode on its own. :-) • We ignore brief one-off spikes associated with demonstrations/stunts/denial of service attacks/etc. -- long term real base load is of the greatest interest to us.
We Don’t Have a Per Application Breakdown for Abilene, But…. • … Canarie DOES report the most common applications (including reporting the most popular applications for the three Canarie-Abilene peering points). • See http://www.canet3.net/stats/reports.html (the Abilene/CANet3 peering points are labeled Abilene, AbileneNYC & SNAAP)
Making Traffic Statistics Intuitively Meaningful • While we could compare application traffic in terms of Mbps or percentages or other abstract units, it may help to characterize I2 traffic relative to a common traffic base we all intuitively understand: WWW activity.(excellent idea, CANet, bravo!) • On the commodity Internet, we all know that WWW traffic is the dominant protocol. But what about on Internet2?
Most Popular TCP/IP Apps at CANet/Abilene Peering Points, Relative to HTTP as 1.0X for the week ending 11/5/2000 • Abilene (Chicago): NNTP 2.31X FTP 1.59X • Abilene (NYC): NNTP 4.11X FTP 1.35X • SNNAP (Seattle): NNTP 13.9X FTP 1.23X
Most Popular TCP/IP Apps at Selected CANet3 Sites, 11/05/2000, Relative to HTTP,and As A % of Total Octets • BCNet: NNTP 49.4X 77.1% FTP 2.14X 3.3% • MRNet: NNTP 90.7X 74.6% FTP 8.81X 7.2% • RISQ: NNTP 31.1X 72.0% FTP 1.31X 3.0%
==> Usenet News & FTP Are The Dominant Applications on I2 (Thank God…!) • Usenet News (NNTP) is the dominant TCP/IP application (which is good, since most campuses centrally administer Usenet news, and thus can manage it carefully) • FTP is the second largest TCP/IP application (which is also good since it is typically non-time-sensitive and non-recurring… or IS it non-recurring?)
Why Is Usenet News The Most Successful Application on I2? • News admins have been working hard at making systems go fast for a long time now • NNTP is architected to scale well • News admins have a long history of collaborating well with their peers. :-) • Non-I2 News traffic quickly gateways onto and off of I2 news servers at multiple points • Performance matters (e.g., ‘Freenix effects’)
An Hypothesis About Internet2 FTP Traffic Levels • FTP, as the number two application on Internet2, is also of interest to us. As we began to think about it, we came up with a hypothesis about what that FTP traffic represented. All that FTP traffic *could* be wild-haired misbuttoned boffins happily transferring gigabytes and gigabytes worth of spatial data on the mating habits of Peruvian tree frogs... but we doubted it.
OR That FTP Traffic Could Be Site-to-Site Mirroring Traffic • Just beginning to think about this... • Will we be able to differentiate mirroring traffic from user traffic? Maybe, maybe not. • Some observable flow characteristics:-- both endpoints would be ftp servers (duh)-- chronological patterns (e.g., assume cron’d invocation of mirroring software) • FTP log analysis from major FTP sites? (particularly looking for ls -lR transfers…)
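One of the observable characteristics above -- cron'd invocation -- suggests a crude heuristic. This is purely a hypothetical sketch (the data shape and threshold are assumptions, not anything from the slides): a client whose ftp sessions nearly always start in the same hour of the day looks more like a mirror script than a human.

```python
from collections import Counter

# Hypothetical sketch: flag ftp clients whose session start times cluster in
# one hour of the day, the way a cron'd mirroring job's would.
def looks_scheduled(start_hours, threshold=0.8):
    """start_hours: hour-of-day (0-23) of each session from one client.
    Returns True if at least `threshold` of sessions share one start hour."""
    if not start_hours:
        return False
    _, top_count = Counter(start_hours).most_common(1)[0]
    return top_count / len(start_hours) >= threshold

# A client that almost always connects at 03:00 looks like a mirror script;
# a client connecting at scattered hours looks like an interactive user.
scripted = looks_scheduled([3, 3, 3, 3, 14])
human = looks_scheduled([1, 5, 9, 14, 22])
```

Real log analysis would of course need to handle time zones, DST, and mirrors that run at irregular intervals.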
Interactive vs. Automated FTP Traffic SubHypotheses • SubHypothesis 1: web distribution of files should have virtually replaced anonymous ftp retrieval of files • SubHypothesis 2: scp should be replacing non-anonymous interactive ftp’ing • SubHypothesis 3: cvsup should be replacing traditional development tree mirroring
More SubHypotheses... • SubHypothesis 4: to account for the volume we’re talking about, there should be multi-threaded mirroring tools in use (see, e.g., “Mirror Master” available from ftp://sunsite.org.uk/packages/mirror/) • SubHypothesis 5: user-level semi-automated ftp tools may cloud the analysis (e.g., http://www.ncftp.com/ncftp/); true Windows-based mirroring software also exists (e.g., http://www.netload.com.au/)
Do We Even Know What Mirror’ers Are Doing? • Smart mirroring tools should minimize unnecessary transfers by only transfering that which has “changed” -- but what’s a change? Later mtime and different file size? MD5 hash delta? ==> Varies by package. • Field work opportunity for computer anthropologists: go talk to the guys who run the big ftp servers out there…
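The two notions of "changed" mentioned above can be made concrete. A minimal sketch, assuming the two checks the slide names (size/mtime vs. MD5 hash); as the slide says, what real mirroring packages actually do varies:

```python
import hashlib

# Two "did this file change?" tests a mirroring tool might use.
def stat_changed(size, mtime, known_size, known_mtime):
    """Cheap check: size or modification time differs from what we last saw.
    Misses edits that preserve both; false-positives on touched files."""
    return size != known_size or mtime != known_mtime

def content_changed(data, known_md5):
    """Expensive check: re-hash the content and compare.
    Catches any byte-level change, at the cost of reading the whole file."""
    return hashlib.md5(data).hexdigest() != known_md5
```

The tradeoff is the whole point: the stat check costs one metadata lookup per file, while the hash check costs a full read of every file on every mirror pass.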
III. Thinking About Your Application and I2 Or, "What do you mean I can't make a lemon chiffon cake out of a package of venison T-bones?"
Not All Applications Are Well Suited to Going Fast on I2 • We did an article for the UO Computing Center newsletter describing what sort of applications are well suited to Internet2; the NLANR Application Support Team liked it well enough that they now have a version of it up at http://dast.nlanr.net/Guides/writingapps.html
Mentally Categorizing Applications • Applications where you can control WHO you work with, WHERE they are working from, WHAT they are doing and WHEN they are doing it, tend to work best on I2 • Simplest example: getting one file to one colleague one time via a passworded server • Degenerate case: large video on demand files on a generically accessible web server