Performance Engineering Bob Dugan, Ph.D. Computer Science Department Rensselaer Polytechnic Institute Troy, New York 12180
The Nightmare Scenario • Product pre-sold by marketing as carrier scalable • Demos are flashy, fast and successful • Product is supposed to ship to big name customers like GM, Fidelity, and AT&T a week after QA • During QA product is performance tested • Performance tests uncover serious scalability problems • Analysis shows a fundamental architecture flaw • Months of redesign and testing necessary to fix
Overview • Background • Methodology • Resources
Incorporate performance into software’s entire life cycle to achieve performance goals.
Background What is software performance?
Background Response Time Resource Utilization Throughput
Background: Response Time • How long does it take for a request to execute? • Example: Web page takes 100ms to return to browser after request. • Interactive applications require 2000ms or less. • Tells us a lot about how system is performing. • Response time has big impact on the holy grail of performance THROUGHPUT.
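Response time is also the easiest of the three metrics to measure directly. A minimal sketch (not from the original slides) that times one web request end to end, assuming Perl with Time::HiRes and LWP::Simple installed; the URL is a placeholder:

#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);
use LWP::Simple qw(get);

# Time one request end to end, the way a browser would experience it.
my $url   = "http://www.example.com/";   # placeholder URL
my $start = [gettimeofday];
my $page  = get($url);
my $ms    = 1000 * tv_interval($start);

printf "Fetched %d bytes in %.1f ms\n", length($page || ""), $ms;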
Background: Throughput • How many requests per second can be processed? • Example: • A server has throughput of 30 requests/sec • Supports roughly 1 million requests/10 hour day • Assume average user makes 10 requests/day • Server will support approximately 100,000 users • Inverse of response time on lightly loaded system. • Combined with user model, can be used for performance requirements, capacity planning, sales, and marketing.
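The capacity arithmetic above is easy to script so the assumptions can be varied. A minimal Perl sketch; every figure is the assumed value from this slide, not a measurement:

#!/usr/bin/perl
use strict;
use warnings;

# Back-of-envelope capacity estimate from throughput plus a user model
# (all figures are the assumed values from the slide above).
my $throughput_rps     = 30;    # measured requests/second
my $busy_hours_per_day = 10;    # 10 hour business day
my $requests_per_user  = 10;    # average requests per user per day

my $requests_per_day = $throughput_rps * $busy_hours_per_day * 3600;
my $users_supported  = $requests_per_day / $requests_per_user;

printf "%d requests/day -> roughly %d users supported\n",
       $requests_per_day, $users_supported;
# Prints: 1080000 requests/day -> roughly 108000 users supported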
Background: Resource Utilization • Resources consumed by code processing request. • Examples: CPU, memory, network, disk • In a closed system, as load increases: • Throughput rises linearly • Resources are consumed • Response time remains near constant • When a resource is completely consumed: • Throughput remains constant • Resource utilization remains near constant • Response time rises linearly with load
Background: Resource Utilization • Resource utilization is critical to determining throughput/response time relationships. • During performance testing, resource utilization helps identify the cause of a performance problem.
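A toy Perl sketch of the relationship these two slides describe, using a single hypothetical bottleneck resource with an assumed 50 ms service time: below saturation, throughput tracks the offered load and response time stays flat; once the resource is fully consumed, throughput caps out and response time grows with the queue. This is an illustrative model only, not a measurement tool.

#!/usr/bin/perl
use strict;
use warnings;

# Toy model of a single bottleneck resource (assumed values, not measurements).
my $service_time = 0.050;             # 50 ms of CPU per request
my $capacity     = 1 / $service_time; # 20 requests/sec at 100% utilization

for my $offered_load (5, 10, 15, 20, 25, 30) {   # arriving requests/sec
    my $utilization = $offered_load * $service_time;
    if ($utilization < 1) {
        # Below saturation: throughput follows load, response time stays flat.
        printf "load %2d/sec -> util %3d%%, throughput %2d/sec, response ~%d ms\n",
               $offered_load, 100 * $utilization, $offered_load, 1000 * $service_time;
    }
    else {
        # Resource fully consumed: throughput caps, requests queue, response time climbs.
        printf "load %2d/sec -> util 100%%, throughput %2d/sec, response grows with the queue\n",
               $offered_load, $capacity;
    }
}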
Performance Engineering Methodology Incorporate performance into software’s entire life cycle to achieve performance goals.
Software Life Cycle Requirements Specification Design Implementation Integration Test Release Maintenance
Requirements • Functional requirements identified. • What are the performance requirements? • Do any functional requirements interfere with performance requirements?
Performance Requirements • What is the capacity planning guide for the system? • How much is a customer willing to pay for performance and scalability? • Hardware • Software licensing (e.g. OS, Oracle, etc.) • System Administration
Example: Internet Bank • View accounts • Search for specific transaction • Transfer money between accounts • Export account to Quicken • 10 million potential users
Performance Model • Make some assumptions (refine later) • Three tier system: browser, web farm, database server • Database updated nightly with day’s transactions (e.g. read mostly) • User logs in once per 5 day work week, between 8AM-6PM EST • Logins evenly distributed • Typical user does 3 things, then logs off • About 20% of customers will actually use online banking
Performance Model
10,000,000 users x 20% adoption rate = 2,000,000 users/week
2,000,000 x 3 requests per user = 6,000,000 requests/week
6,000,000 / 5 day work week = 1,200,000 requests/day
1,200,000 / 10 hour day = 120,000 requests/hour
120,000 / 60 minutes per hour = 2,000 requests/minute
2,000 / 60 seconds per minute ≈ 33 requests/second
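The same model can be written as a small script, so changing any assumption immediately recomputes the design target. A sketch only; the values are the assumptions listed above:

#!/usr/bin/perl
use strict;
use warnings;

# Internet bank performance model (assumptions from the slides above).
my $potential_users    = 10_000_000;
my $adoption_rate      = 0.20;       # 20% actually use online banking
my $requests_per_login = 3;          # typical user does 3 things
my $work_days_per_week = 5;          # one login per user per work week
my $busy_hours_per_day = 10;         # 8AM-6PM EST, logins evenly spread

my $requests_per_week = $potential_users * $adoption_rate * $requests_per_login;
my $requests_per_day  = $requests_per_week / $work_days_per_week;
my $requests_per_sec  = $requests_per_day / ($busy_hours_per_day * 3600);

printf "Peak design target: about %.0f requests/second\n", $requests_per_sec;
# Prints: Peak design target: about 33 requests/second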
Performance Requirements • The customer wants to pay as little as possible for the system hardware. • Your company wants the system to perform well, but there’s a development cost. • YOU must find the balance. • What are reasonable service times and throughput for web and database servers?
Requirements Goal: Identify/eliminate performance problems before they get into Functional/Design/UI specifications.
Functional/Design/UI • Goal: Eliminate performance problems before writing a line of code. • Example: • Requirements say that users should be able to search on account activity using any combination of activity fields (e.g. date, payee, amount, check#). • Functional/Design specification describes an ad-hoc query mechanism with pseudocode that allows users to conduct this search using a single database query. • Performance analysis of prototype ad-hoc query shows a throughput of 2 req/sec with 100% CPU utilization on a two processor database server.
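For illustration, a hedged sketch of the kind of ad-hoc query mechanism the specification describes: one SQL statement built from whatever activity fields the user supplied. The table and column names are hypothetical, and this is the mechanism whose prototype measured 2 req/sec, not a recommended design:

#!/usr/bin/perl
use strict;
use warnings;

# Build one ad-hoc query from whatever search fields the user filled in.
# Table/column names are hypothetical; values are bound, not interpolated.
sub build_activity_query {
    my (%criteria) = @_;                 # e.g. date, payee, amount, check_no
    my (@where, @binds);
    for my $field (sort keys %criteria) {
        push @where, "$field = ?";
        push @binds, $criteria{$field};
    }
    my $sql = "SELECT * FROM account_activity";
    $sql .= " WHERE " . join(" AND ", @where) if @where;
    return ($sql, \@binds);
}

my ($sql, $binds) = build_activity_query(payee => 'ACME', amount => 42.50);
print "$sql\n";   # SELECT * FROM account_activity WHERE amount = ? AND payee = ?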
Prototyping • Great time to play • Investigate competing architectures • Don’t forget performance! Example: HTML Tag Processing Engine for Internet Bank • Initial performance analysis showed 5 tags/sec. Web server CPU 100%. Dependency on size of page. • Second iteration improved to 20 tags/sec. Still too slow! Service time allotted completely consumed by tag processing. • Third iteration at 60 tags/sec. No page size dependency.
Implementation Goal: Identify and eliminate performance problems before they are discovered in QA. • Long duration • Break into drops • Performance assessment of drops • Track progress • A maturing system increases in complexity and jeopardizes performance • Use instrumentation!
Instrumentation • Code must be instrumented by development • Allows self-tuning • Provides execution trace for debugging • Aids performance analysis in lab • Useful for monitoring application in production
Example: Instrumentation Sample code sub unitTest { eCal::Metrics->new()->punchIn(); my $tableName; my $result = tableSelect("users"); print $result."\n"; eCal::Metrics->new()->punchOut(); } Activating instrumentation eCal::Metrics->new()->setEnabled("true"); eCal::Metrics->new()->setShowExecutionTrace("true"); unitTest; Sample instrumentation output PUNCHIN eCal::Metrics::TableStatisticsDB::unitTest [] |PUNCHIN eCal::Metrics::TableStatisticsDB::tableSelect [] ||PUNCHIN eCal::Oracle::prepare [] ||PUNCHOUT eCal::Oracle::prepare [] 131.973028182983 msecs |PUNCHOUT eCal::Metrics::TableStatisticsDB::tableSelect [] 642.809987068176 msecs PUNCHOUT eCal::Metrics::TableStatisticsDB::unitTest [] 643.355011940002 msecs
Testing Goal: Identify and eliminate performance problems before they get into production. • Performance testing and analysis must occur throughout development! • In late-cycle QA, performance testing should be a formality with no surprises. • A surprise at this point will delay the product release or potentially kill the product.
Maintenance Goal: Identify and eliminate performance problems before they are detected by users. • Management console for resource monitoring • Metrics pages • Instrumentation
Conclusion Incorporate performance into software’s entire life cycle to achieve performance goals.
Resources: Books • Smith/Williams, “Software Performance Engineering” • Jain, “The Art of Computer Systems Performance Analysis” • Tanenbaum, “Modern Operating Systems” • Elmasri/Navathe, “Fundamentals of Database Systems” • Baase, “Computer Algorithms: An Introduction to Design and Analysis”
Resources: Software • Resource Monitoring: • Task Manager, Perfmon • Sar/iostat/netstat/stdprocess, SE Toolkit • BMC Best/1, HP OpenView, Precise Insight • Load Generation • LoadRunner, SilkPerformer • Webload • Automated Instrumentation • Numega True Time, Jprobe • Tkprof, Explain Plan, Precise In Depth for Oracle
Resources: Literature/Web • www.perfeng.com - Dr. Connie Smith’s Website • www.spec.org - Benchmarks for computer hardware • www.tpc.org - Benchmarks for databases • Computer Measurement Group – annual conference in December. • Workshop on Software Performance – semi-annual conference in late summer/early fall • ACM SIGMETRICS – annual conference in early summer. • ACM SIGSOFT/SIGMETRICS publications – periodically feature papers on performance engineering.
Case Study: Microsoft VBScript • Website uses IIS, Microsoft ASP, VBScript • Critical page takes 3000 ms, CPU bound • Instrumentation shows 2500 ms in a single subroutine • Subroutine executed just before HTML is returned to the browser • Approximate size of HTML page is 64K
resp = resp & "<ul>"
I = 0
Do While I < MAX
    ' append a 1K chunk; copies the entire string built so far
    resp = resp & "<li> List Element " & I & oneKString
    I = I + 1
Loop
resp = resp & "</ul>"
Case Study: Microsoft VBScript • The more the loop iterates, the longer each iteration takes. • VBScript does not support in-place string concatenation: strings are immutable. • Each concatenation results in a malloc(), a copy, and a free(), with cost dependent on the current size of the HTML string. • Why is that so bad?
Case Study: Microsoft VBScript
On iteration I the string already holds about I kilobytes, so the copy costs roughly I times the cost of a 1K malloc()/copy. Over n iterations the total cost is proportional to:
Sn = 1 + 2 + … + (n-1) + n
Sn = n + (n-1) + … + 2 + 1
2Sn = (n+1) + (n+1) + … + (n+1) + (n+1) = n(n+1)
Sn = n(n+1)/2
The copy cost grows as O(n²) in the number of 1K chunks appended, which is why the 64K page is so expensive to build.
Case Study: Microsoft VBScript Solutions?