How High is High Performance Transaction Processing?

Jim Gray, Microsoft Research • HPTS 99, Asilomar, CA • 1 Oct 1999 • http://research.Microsoft.com/~Gray/Talks/

Outline • Sizing the business: 0B$ or 1T$? • TP is dead: long live HTTP-XML • i.e. TP monitors morph to web servers. • Transactions are C2C (b2b not enough) • Scaleability terminology if there is time.

Where are we? Where are we headed (will we run out of transactions)? • We started out to do 1,000 transactions per second (1985 Datamation article) • 1 Ktps ~ 80 Mtpd • What’s current tpd demand? • What’s ultimate tpd demand? • [Chart: tpd versus time]

TpcA • 1988: • 208 DebitCredit @ 45 k$/tps (Tandem 32x T16) • 1991 Sept • 212 tpsA @ 16,331 $/tpsA (Tandem 64xCLX) • 62 tpsA @ 11,945 $/tpsA (Rdb 6xVAX) • 1993 Sept • 1,002 tpsA @ 9,313 $/tpsA (Oracle 32x Sequent) • 529 tpsA @ 6,341 $/tpsA (Rdb 4xAlpha) • 1995 Peak • 3,692 tpsA @ 4,873 $/tpsA (Rdb 20x Alpha) • 662 tpsA @ 4,401 $/tpsA (Rdb 4xAlpha) • 1999 guess • 4,000 tpsA @ 200 $/tpsA (= 320 M tpd) PER NODE!!
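
The tps-to-tpd figures on these slides are round numbers. As a quick check (a sketch that assumes the rate is sustained over a full 24-hour day, which the slides do not state), the exact conversions come out slightly higher than the quoted 80 Mtpd and 320 Mtpd:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400; assumes load is sustained all day


def tps_to_tpd(tps):
    """Convert a sustained transactions-per-second rate to transactions per day."""
    return tps * SECONDS_PER_DAY


one_ktps = tps_to_tpd(1_000)       # slide rounds this to 80 Mtpd
node_1999 = tps_to_tpd(4_000)      # slide rounds this to 320 Mtpd
node_cost = 4_000 * 200            # 4,000 tpsA at 200 $/tpsA, in dollars

print(one_ktps)    # 86400000
print(node_1999)   # 345600000
print(node_cost)   # 800000
```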

TPC C Progress • Jan 1993: • 23 tpmC @ 2.3k $/tpmC • 269 tpmC @ 3.0k $/tpmC • Nov 1995: • 2,455 tpmC @ 242 $/tpmC (SS 4x P5 133) • 11,456 tpmC @ 286 $/tpmC (Oracle 8xAlpha 350) • Sept 1997 • 12,026 tpmC @ 40 $/tpmC (SS 6x P6 200) • 39,469 tpmC @ 95 $/tpmC (Sybase 16xHPPA 200) • Sept 1999 • 40,368 tpmC @ 19 $/tpmC (SS 8xIntel 550) • 135,461 tpmC @ 97 $/tpmC (Oracle 4x24 Sparc 400)

Prediction • 1 MtpmC @ 10$/tpmC in 3 years (or tpcC disappears) • 25 M$ of stuff today • 8 M$ of stuff in 3 years • That’s ~ 3 Btpd at much less than 0.01$/tpd: under A PENNY PER tpd.

How Many tpds Are There? • 10 second think time • 12 hour days • 5,000 tpd/person • 6 billion people • 30 Ttpd • actual guess is 100x less than that • People think slower, work/play less • Not everyone is wired
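
The demand ceiling above is simple arithmetic. The sketch below reproduces it from the slide's own inputs; the function name is illustrative, and the only liberty taken is showing the unrounded per-person figure next to the slide's rounded 5,000.

```python
SECONDS_PER_HOUR = 3600


def tpd_per_person(think_time_s, hours_per_day):
    """Transactions one person can issue per day at one per think time."""
    return hours_per_day * SECONDS_PER_HOUR / think_time_s


per_person = tpd_per_person(think_time_s=10, hours_per_day=12)  # unrounded
people = 6e9                                                    # 6 billion

ceiling_tpd = 5_000 * people        # slide's rounded 5,000 tpd/person -> 30 Ttpd
realistic_tpd = ceiling_tpd / 100   # the "100x less" guess -> 0.3 Ttpd

print(per_person)                   # 4320.0
print(f"{ceiling_tpd:.1e}")         # 3.0e+13
print(f"{realistic_tpd:.1e}")       # 3.0e+11
```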

Where Are We? • Market is not saturated • 1998 IBM annual report: 20 Btpd on IBM systems (0.1% of demand). • It’s a big market • 40k tpmC @ 20$/tpmC ~ 100 Mtpd @ 1M$ (a penny per tpd) • So: 30 Ttpd ~ 300B$ industry for hw/sw

Wow!! 300 B$ Business, GREAT!! • But…. • What about my 100x over-estimate? • A 3B$ business? • What about Moore’s law: 2x decline/year? • A 0B$ business? • Time to find a second career? • Go into services/consulting/operations? • Count on shadow transactions? • Every human transaction = 100 shadow transactions (B2B)
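
To see how quickly the "300 B$" figure erodes, here is a small sketch using the slide's numbers: a penny per tpd of capacity, 30 Ttpd of demand, a possible 100x over-estimate, and a 2x price decline per year. The function and its parameter names are illustrative, and treating the decline as a clean halving every year is a simplifying assumption.

```python
PRICE_PER_TPD = 0.01  # dollars per tpd of capacity ("a penny per tpd")


def market_dollars(demand_tpd, years_out=0, yearly_decline=2.0):
    """Spend needed to serve demand_tpd if price falls yearly_decline-fold per year."""
    return demand_tpd * PRICE_PER_TPD / (yearly_decline ** years_out)


full_demand = 30e12                 # 30 Ttpd
guess_demand = full_demand / 100    # after the 100x over-estimate

print(f"{market_dollars(full_demand):.2e}")    # about 3e11, i.e. 300 B$
print(f"{market_dollars(guess_demand):.2e}")   # about 3e9, i.e. 3 B$
for years in (5, 10):
    # Ten halvings shrink the 3 B$ case to a few million dollars.
    print(years, f"{market_dollars(guess_demand, years_out=years):.2e}")
```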

Conclusion: Sizing traditional TP • It’s a big business now and for a while • But people will be limited to 30 Ttpd • Ultimately a 0B$ industry • A penny per tpd today (a microdollar per transaction) • Web nearing commercial TP rates

Outline • Sizing the business: 0B$ or 1T$? (30 Ttpd) • TP is dead: long live HTTP-XML • i.e. TP monitors morph to web servers. • Transactions are C2C (b2b not enough) • Scaleability terminology if there is time.

A Brief History of Computing • In the beginning there was batch. • Automated back office • Then came Timesharing/OLTP • Automated front office • Then came the web • Automated the customer • Then came ? • Automated the process • Computers talk to computers • [Slide marker: We are here]

Some busy web servers • AOL: ~ 3 B hits per day (~3B tpd) • Yahoo: ~ 1 B hits per day • Top 10: ~ 6 B hits per day

So, What happened to the TP monitor?
Site        Web server / OS
AOL         NaviServer/IRIX
Yahoo       Apache/FreeBSD, IIS/NT
MSN         IIS/NT
Hotmail     Apache/FreeBSD
IBM         Domino/AIX
Compaq      IIS/NT
Dell        IIS/NT
ATT         Netscape/Solaris
Lucent      Netscape/IRIX
Cisco       Netscape/Solaris
Oracle      Oracle/Solaris
NASDAQ      IIS/NT
NYSE        Netscape/AIX
FedEx       Netscape/Solaris
LL Bean     Netscape/AIX
Schwab      Netscape/Solaris (!!!)
Etrade      Netscape/Solaris
Ebay        IIS/NT
Amazon      Netscape/DEC Unix
???         CICS/NT
???         CICS/AIX
???         CICS/OS390
???         Tuxedo? !!!!
Yes, there are HTTP-SNA front ends, but… why??

Body Count • 7 M servers (IP addresses) • Minus squatters • Plus servers behind the firewall • Plus Intranets • Courtesy of Netcraft: http://www.netcraft.com/survey/

How many TP monitors are there? • Guess: 100,000 nodes • (CICS, CICS, CICS, Tuxedo, IMS, Encina, ACMS, Pathway, …) • IBM estimate of 20 B tpd on IBM gear is impressive • Equals current Internet traffic (ignoring intranets). • At mainframe prices (200$/tpmC ~ 0.25$/tpdC), that is ~ 4 B$, which seems very conservative • Installed base ~ 150B$, which suggests ~ 1 T tpd • Or transactions MUCH bigger than tpmC • Or low utilization (peak:average is 10:1 ?) • Or, systems not doing TP. • Or, more expensive • Probably, most of these statements are true
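
The mismatch this slide points at can be made explicit. The sketch below uses only the slide's figures; the variable names are illustrative, and the exact products differ a little from the slide's rounded "~ 4 B$" and "~ 1 T tpd".

```python
IBM_TPD = 20e9           # slide: 20 B tpd on IBM gear
DOLLARS_PER_TPD = 0.25   # slide: ~0.25 $ per tpd at mainframe prices
INSTALLED_BASE = 150e9   # slide: ~150 B$ installed base

implied_spend = IBM_TPD * DOLLARS_PER_TPD        # what 20 B tpd should cost
implied_tpd = INSTALLED_BASE / DOLLARS_PER_TPD   # what 150 B$ should buy
gap = implied_tpd / IBM_TPD                      # unexplained factor

print(implied_spend)  # 5000000000.0    (slide: ~ 4 B$)
print(implied_tpd)    # 600000000000.0  (slide: ~ 1 T tpd)
print(gap)            # 30.0
```

A factor of about 30 is what the slide's closing bullets (bigger transactions, low utilization, non-TP work, higher prices) would have to account for between them.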

Claim: Climbing the value chain • TP = ORB = HTTP = TPlite = RPC = IPspray • TP monitors multiplex clients to servers & manage servers • ORBs multiplex invocations to methods and… • HTTP servers multiplex GET/POST to pages and… • TPlite multiplex clients to stored procs and… • RPC multiplex callers to callees and… • They are stealing our tp tricks. • The revenge of TPlite: • It’s 3 tier, but your/my stuff is not in the middle tier.
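
The common trick in every line of that claim is multiplexing: many clients share a small pool of servers. A minimal sketch of the idea, using Python's standard thread pool as a stand-in for a server pool (the `handle` function, the client names, and the pool size of 4 are all hypothetical):

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    """Hypothetical server-side work; reports which pooled server ran it."""
    return f"{request}:done by {threading.current_thread().name}"


# 100 client requests are multiplexed onto at most 4 server threads.
requests = [f"client{i}" for i in range(100)]
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="server") as pool:
    replies = list(pool.map(handle, requests))

workers_used = {reply.split(" by ")[1] for reply in replies}
print(len(replies), "requests served by", len(workers_used), "servers")
```

A real TP monitor adds what the slide calls "manage servers" (starting, stopping, and restarting the pool members), which this sketch leaves out.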

Web servers are behind but… • They are learning about manageability. • They are learning about functionality • Take a page from the OO vs OR war: • Easier to add Objects to a DB • Than add DB to Objects. • Guess: • Easier to add HTTP to TP • Than to add TP to HTTP

Web Servers are Learning TP Tricks (1 year’s progress) • Multiplexing server pools: • Fast CGI • COM+ connection pooling • Client context • Cookies • Sessions • Load balancing (various) • Security (SSL, certificates, CR,…) • High availability • Failover, IP mobility, Redirection,… • Queuing

The birth of C2C • So, if the Person2Computer business is 0B$ (30 Ttpd) • Then the B2B business (shadow transactions) is 100B$ • Great, but what about Moore’s law? • Solution: • C2C transactions: computers to computers • No people involved! • Ultimate automation. • Smart dust (Avogadro’s number of users)

C2C TP (actually making it work) • Interop is key (no people to do format translation). • Use a STANDARD protocol • No IBM/Microsoft/Intel “standard” • Standard protocols: • HTTP++ (get/post/queue/dequeue) • XML for data format • Transactions and queues • Authentication/Authorization (certificates/signatures) • Pico-pricing • Does this sound like B2B?
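
As a rough illustration of "HTTP plus XML" as the C2C wire format, the sketch below builds an XML message and wraps it in an HTTP POST using only the Python standard library. The element names, the order content, and the URL are hypothetical; nothing is sent over the network, and the transaction, queueing, and authentication items from the slide are not modeled.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Build a small XML payload (element and attribute names are made up).
order = ET.Element("order", id="42")
ET.SubElement(order, "item", sku="widget").text = "3"
body = ET.tostring(order, encoding="utf-8")

# Wrap it in an HTTP POST; the request is constructed but never sent.
req = urllib.request.Request(
    "http://example.invalid/orders",   # placeholder endpoint
    data=body,
    headers={"Content-Type": "text/xml"},
    method="POST",
)

# The receiving computer would parse the same bytes back into a tree.
parsed = ET.fromstring(req.data)
print(req.get_method(), parsed.tag, parsed.get("id"), parsed.find("item").text)
```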

User Productivity: one person generates 1K C2C transactions (C2C = computer to computer) • No obvious limit to the number of tps. • Obvious need for transaction properties • That’s how social systems work (transactions). • So, it’s not a 0B$ industry (thank goodness!) • But, it’s not your father’s TP monitor.

Some sample XML (outer frame of this talk in XML)

    <xml xmlns:v="urn:schemas-microsoft-com:vml"
         xmlns:o="urn:schemas-microsoft-com:office:office"
         xmlns:p="urn:schemas-microsoft-com:office:powerpoint">
      <p:presentation>
        <p:... slots="title,body,dateTime,footer,slideNumber"> ... </p:master>
        <p:slide id="1" href="slide0001.htm" layout="title_subtitle" slots="centerTitle,subTitle"/>
        ...
        <p:viewstate type="slideView" slidehref="slide0030.htm"/>
        <p:font name="Times New Roman" charset="0" type="4" family="18"/>
        <p:headersfooters noheader="t"/>
        <p:pptdocumentsettings framecolors="WhiteTextOnBlack" hideslideanimation="t"/>
      </p:presentation>
      <o:shapedefaults v:ext="edit" spidmax="34820">
        <o:colormru v:ext="edit" colors="#3cc,#09f,fuchsia,#6f3,#ff6"/>
        <o:colormenu v:ext="edit" fillcolor="#ff6"/>
      </o:shapedefaults>
    </xml>

Go here for DTDs • Anyone can define anything. • “When I use a word it means exactly what I want it to mean, nothing more and nothing less.”

Yes, but….. • XML is WONDERFUL! • But.. XML is no panacea. • XML uses DTDs: syntax & presentation, not schema, not semantics!!! • DTDs as a competitive advantage • SAP will publish its DTDs • Defines customer, employee, … (all business objects) • Other vendors will “compete” for these definitions • Lassettre’s Dog: (www.ibm.com/dog, www.jim.com/dog, ….) no ‘www.top.org/dog’ • World needs a global schema • Data + Methods (= semantics)
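
The "Lassettre's Dog" problem is easy to reproduce. In the sketch below, two parties each define a `dog` element under their own namespace (the two namespace names come from the slide; the child elements `name` and `called` are invented for illustration). Both documents parse without complaint, yet a program has no way to know they describe the same thing.

```python
import xml.etree.ElementTree as ET

ibm_dog = ET.fromstring('<dog xmlns="http://www.ibm.com/dog"><name>Rex</name></dog>')
jim_dog = ET.fromstring('<dog xmlns="http://www.jim.com/dog"><called>Rex</called></dog>')

print(ibm_dog.tag)  # {http://www.ibm.com/dog}dog
print(jim_dog.tag)  # {http://www.jim.com/dog}dog

# Same local name, but the qualified element names differ, and so do the
# child elements that carry the dog's name.
same_local_name = ibm_dog.tag.split("}")[1] == jim_dog.tag.split("}")[1]
same_element = ibm_dog.tag == jim_dog.tag
print(same_local_name, same_element)  # True False
```

Agreeing on syntax gets both documents through the parser; agreeing on what a dog is requires the shared schema the slide asks for.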

Oh!! And by the way…. • B2B and C2C need workflow • Scripts • Execution • Status • Good luck….

Summary • People are a 30 Ttpd business • B2B is a 3 Ptpd business • (shadow transactions) • C2C is an infinite business (smart dust) • The web servers are coming! The web servers are coming! • XML needs schema definitions • Syntax + Semantics. • Formats + Protocols + workflows

Terminology for scaleability • Farms of servers: • Clones: identical • Scaleability + availability • Partitions: • Scaleability • Packs • Partition availability via fail-over • [Diagram labels: Farm, Clone, Partition, Pack]

Unpredictable Growth • The TerraServer Story: • We expected 5 M hits per day • We got 50 M hits on day 1 • We peak at 15-20 M hpd on a “hot” day • Average 5 M hpd after 1 year • Most of us cannot predict demand • Must be able to deal with NO demand • Must be able to deal with HUGE demand

An Architecture for Internet Services? • Need to be able to add capacity • New processing • New storage • New networking • Need continuous service • Online change of all components (hardware and software) • Multiple service sites • Multiple network providers • Need great development tools • Change the application several times per year. • Add new services several times per year.

Premise: Each Site is a Farm • Buy computing by the slice (brick): • Rack of servers + disks. • Grow by adding slices • Spread data and computation to new slices • Two styles: • Clones: anonymous servers • Parts+Packs: Partitions fail over within a pack • In both cases, remote farm for disaster recovery

Scaleable Systems: Scale UP and Scale OUT • Everyone does both. • Choice is • Size of a brick • Clones or partitions • Size of a pack

Everyone scales out: What’s the Brick? • 1M$/slice • IBM S390? • Sun E 10,000? • 100 K$/slice • Wintel 8X • 10 K$/slice • Wintel 4x • 1 K$/slice • Wintel 1x

Clones: Availability+Scalability • Some applications are • Read-mostly • Low consistency requirements • Modest storage requirement (less than 1TB) • Examples: • HTML web servers (IP sprayer/sieve + replication) • LDAP servers (replication via gossip) • Replicate app at all nodes (clones) • Spray requests across nodes. • Grow by adding clones • Fault tolerance: stop sending to that clone. • Growth: add a clone.
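
A minimal sketch of the "spray" idea, assuming simple round-robin routing (the slide does not say which policy a sprayer uses). The class and method names are hypothetical. It shows the two operations the slide names: stop sending to a failed clone, and grow by adding a clone.

```python
class Sprayer:
    """Round-robin request router over a set of identical clones."""

    def __init__(self, clones):
        self.clones = list(clones)
        self.up = set(self.clones)
        self._next = 0

    def mark_down(self, clone):
        # Fault tolerance: stop sending to that clone.
        self.up.discard(clone)

    def add_clone(self, clone):
        # Growth: add a clone.
        self.clones.append(clone)
        self.up.add(clone)

    def route(self):
        # Try each clone at most once per call, skipping those marked down.
        for _ in range(len(self.clones)):
            clone = self.clones[self._next % len(self.clones)]
            self._next += 1
            if clone in self.up:
                return clone
        raise RuntimeError("no clone is up")


sprayer = Sprayer(["a", "b", "c"])
print([sprayer.route() for _ in range(3)])  # ['a', 'b', 'c']
sprayer.mark_down("b")
print([sprayer.route() for _ in range(4)])  # ['a', 'c', 'a', 'c']
```

Because every clone holds the same application and data, the router needs no knowledge of the request itself, which is what makes clones easy to manage.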

Facilities Clones Need • Automatic replication • Applications (and system software) • Data • Automatic request routing • Spray or sieve • Management: • Who is up? • Update management & propagation • Application monitoring. • Clones are very easy to manage: • Rule of thumb: 100’s of clones per admin

Partitions for Scalability • Clones are not appropriate for some apps. • Stateful apps do not replicate well • High update rates do not replicate well • Examples • Email / chat / … • Databases • Partition state among servers • Scalability (online): • Partition split/merge • Partitioning must be transparent to client.
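
One way to keep partitioning transparent to the client is a lookup table that maps each key to the server owning it, so a split changes the table rather than the client. The sketch below assumes range partitioning on a string key such as a mailbox name; the class, the server names, and the choice of range (rather than hash) partitioning are all illustrative, and moving the data during a split is not modeled.

```python
import bisect


class PartitionMap:
    """Range partitioning: partition i owns keys from bounds[i] up to bounds[i+1]."""

    def __init__(self, first_server):
        self.bounds = [""]             # sorted lower bounds; "" precedes every key
        self.servers = [first_server]

    def lookup(self, key):
        # Find the last lower bound that is <= key.
        i = bisect.bisect_right(self.bounds, key) - 1
        return self.servers[i]

    def split(self, at_key, new_server):
        # Keys >= at_key (up to the next bound) move to new_server.
        i = bisect.bisect_left(self.bounds, at_key)
        if i < len(self.bounds) and self.bounds[i] == at_key:
            raise ValueError("already a partition boundary")
        self.bounds.insert(i, at_key)
        self.servers.insert(i, new_server)


mail = PartitionMap("node0")
print(mail.lookup("alice"), mail.lookup("zoe"))  # node0 node0
mail.split("m", "node1")
print(mail.lookup("alice"), mail.lookup("zoe"))  # node0 node1
```

The client keeps calling `lookup` with the same key before and after the split.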

Partitioned/Clustered Apps • Mail servers • Perfectly partitionable • Business Object Servers • Partition by set of objects. • Parallel Databases • Transparent access to partitioned tables • Parallel Query

Packs for Availability • Each partition may fail (independent of others) • Partitions migrate to new node via fail-over • Fail-over in seconds • Pack: the nodes supporting a partition • VMS Cluster • Tandem Process Pair • SP2 HACMP • Sysplex™ • WinNT MSCS (wolfpack) • Cluster In A Box now commodity • Partitions typically grow in packs.
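
The pack idea reduces to "a partition has an ordered set of nodes, and the first healthy one serves it". A minimal sketch under that assumption follows; the class and node names are hypothetical, failure detection and state hand-off are omitted, and the automatic fail-back on repair is a simplification, not something the slide specifies.

```python
class Pack:
    """The nodes able to host one partition, in order of preference."""

    def __init__(self, partition, nodes):
        self.partition = partition
        self.nodes = list(nodes)
        self.down = set()

    def fail(self, node):
        self.down.add(node)

    def repair(self, node):
        self.down.discard(node)

    def owner(self):
        # The partition migrates to the first node in the pack that is up.
        for node in self.nodes:
            if node not in self.down:
                return node
        raise RuntimeError(f"partition {self.partition} is unavailable")


pack = Pack("mail-A-to-L", ["node1", "node2"])
print(pack.owner())   # node1
pack.fail("node1")
print(pack.owner())   # node2
```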

What Parts+Packs Need • Automatic partitioning (in dbms, mail, files,…) • Location transparent • Partition split/merge • Grow without limits (100x10TB) • Simple failover model • Partition migration is transparent • MSCS-like model for services • Application-centric request routing • Management: • Who is up? • Automatic partition management (split/merge) • Application monitoring.

Always UP: Farm pairs • Two farms • Changes from one sent to other • When one farm fails, the other provides service • Masks • Hardware/Software faults • Operations tasks (reorganize, upgrade, move) • Environmental faults (power fail)
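
A toy model of a farm pair, assuming changes are shipped asynchronously from the primary farm to the backup (the slide says only that changes are "sent to other", so the asynchronous shipping and all names here are assumptions). It also shows the cost of that choice: changes not yet shipped when the primary fails are not visible at the backup.

```python
class Farm:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True


class FarmPair:
    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.pending = []   # changes applied at the primary but not yet shipped

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def ship(self):
        # Send accumulated changes to the other farm.
        for key, value in self.pending:
            self.backup.data[key] = value
        self.pending.clear()

    def read(self, key):
        # When one farm fails, the other provides service.
        farm = self.primary if self.primary.up else self.backup
        return farm.data.get(key)


pair = FarmPair(Farm("west"), Farm("east"))
pair.write("x", 1)
pair.ship()
pair.write("y", 2)          # not shipped before the failure
pair.primary.up = False
print(pair.read("x"), pair.read("y"))  # 1 None
```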

Services on Clones & Partitions • Application provides a set of services • If cloned: • Services are on subset of clones • If partitioned: • Services run at each partition • System load balancing routes request to • Any clone • Correct partition. • Routes around failures.

Cluster Scenarios: 3-tier systems • A simple web site • [Diagram labels: Web Clients, Load Balance, Front End, Web File Store, SQL Temp State, SQL Database; annotations: Clones for availability, Packs for availability]

Cluster Scale Out Scenarios • The FARM: Clones and Packs of Partitions • [Diagram labels: Web Clients, Load Balance, Cloned Front Ends (firewall, sprayer, web server), Web File Store A, Web File Store B, Cloned Packed file servers, SQL Temp State, SQL Partition 1, SQL Partition 2, SQL Partition 3, SQL Database, replication; annotation: Packed Partitions: Database Transparency]

Talk 2 (if there is time) • Terminology for scaleability • Farms of servers: • Clones: identical • Scaleability + availability • Partitions: • Scaleability • Packs • Partition availability via fail-over • [Diagram labels: Farm, Clone, Partition, Pack]
