Chapter 4

Chapter 4 Infrastructure for Electronic Business

Introduction • How can one plan, manage, and evaluate the infrastructure of an electronic business? • Is the infrastructure of a given e-commerce company sized for peak traffic? • Example: the Thanksgiving-to-Christmas timeframe • Is the infrastructure scalable enough to cope with customer surges during a sales promotion? • Answering these questions requires an understanding of the system architecture and its infrastructure

Infrastructure • The infrastructure of an electronic business • Identifies the functionalities of the hardware and software components • Specifies the corresponding service level requirements • Describes the management and operation of the whole system • The infrastructure is usually shared by many applications that rely on its components and management procedures (e.g., software distribution, backup, recovery, and capacity planning) • Various models are involved in specifying the infrastructure of an electronic business

Infrastructure [Figure labels: Availability; Maintainability; Functional: HW; Business Model; Functional: SW; Performance; Scalability] • Given the following information, one should be able to specify an infrastructure that meets the requirements at minimum cost • Nature of the electronic business • Some quantitative information about the business • Functional model of the applications • Architecture requirements • Key issues • Performance • Availability and Maintainability • Scalability

Business Model • Helps one obtain a detailed characterization of the business system • Consists of several parts that specify the following business characteristics • Situation • Purpose • Outcome • Functions • Resources • Locations • A starting point in the process of specifying the infrastructure requirements

Example: Google

Functional Model • Specifies the business processes and applications needed to accomplish the services and functions offered to customers • Why is that important for infrastructure? • The process of defining and setting service levels is closely related to the very nature of business processes and applications • Business applications are implemented using the infrastructure services • Consider Google

Functional Model: Google

Electronic Business Architecture [Figure labels: Electronic Business Infrastructure; Business Model; Functional Model; Electronic Business Architecture (Functional, Operational)] • Comprises • The services provided by hardware and software components • Third-party services • How services interact • Important issues: • Dynamics of the interaction among services • The notion of service providers • The properties of service providers in the context of electronic business, which involves many participants

Electronic Business Architecture • Consists of two blocks of descriptors • Functional & Operational • Functional (static) • Describes the structure, its components, their interactions, and their interfaces • Operational (dynamic) • Focuses on the operational view of the system • Consists of • Network topology, geographical locations, application service levels • Expressed by • Performance, availability, security requirements

Reference Architecture • An infrastructure model provides a static description of resources and services • The architecture includes the dynamics of the system • A reference architecture of a system • describes its structure, its components, and their interrelationships • provides a framework for its evolution and for making decisions about the future, such as what technologies to adopt and when to change the system

Electronic Business Architecture [Figure labels: N-Tier Site (Web Servers, Application Servers, Database Servers); Internet; Third-Party Services (Ads, Payment Authorization, Certification)] • Components of the e-Business System • Web Server; Application Server; Transaction and Database Servers • Mainframe & Legacy Systems • Proxies and Caches • Internet Service Provider (ISP)

The Google Components: Servers

The Google Components: Servers (2) • In-house rack design • PC-class motherboards • Linux, with no paging • Clusters of tens of thousands of machines

Logical Components – Web Server • A combination of a hardware platform, operating system, networking software, and an HTTP server • Web server software • Also known as an HTTP server or HTTP daemon • Controls the flow of incoming and outgoing data on a computer connected to an intranet or to the Internet • HTTP 1.1 provides persistent connections • One TCP connection may be used to carry multiple HTTP requests • Eliminates the cost of many connection opens and closes • Usually handles more than one request at a time, by • Forking a copy of the HTTP process for each request • Multithreading the HTTP program • Spreading the requests among a pool of running processes
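The third strategy on this slide, spreading requests among workers that are already running, can be sketched as follows. This is only an illustration under stated assumptions: it uses a thread pool from Python's standard library instead of a pool of processes, and `handle_request` and the request paths are hypothetical stand-ins for a real HTTP handler.

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(path: str) -> str:
    # Hypothetical handler: a real server would read a file or run a script.
    return f"200 OK {path}"

def serve(paths, pool_size=4):
    # The pool of workers is created once and reused for every request,
    # which avoids the cost of forking a new process per request.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # map() returns results in the order the requests were submitted.
        return list(pool.map(handle_request, paths))

responses = serve(["/index.html", "/cart", "/checkout"])
```

The pool size caps how many requests are processed at once, which is the main tuning knob of this approach.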

Web Server (2) • Latency and throughput are the two most important performance metrics • Throughput • The rate at which HTTP requests are serviced • Expressed in HTTP operations/sec • Latency • Time required to complete a request at the server • Average latency • Average time for handling the requests • Customer response time • Latency at the server + time spent communicating over the network + processing time on the client machine
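The metrics on this slide reduce to simple arithmetic. The sketch below shows one way to compute them; the function names and all sample numbers are illustrative assumptions, not measurements from the source.

```python
def throughput(completed_requests: int, interval_seconds: float) -> float:
    # HTTP operations per second over an observation interval.
    return completed_requests / interval_seconds

def average_latency(latencies_seconds):
    # Mean time the server took to complete each request.
    return sum(latencies_seconds) / len(latencies_seconds)

def customer_response_time(server_latency, network_time, client_time):
    # What the customer experiences: server + network + client processing.
    return server_latency + network_time + client_time

ops_per_sec = throughput(1200, 60.0)              # hypothetical: 1,200 requests in one minute
avg_latency = average_latency([0.10, 0.20, 0.30])  # hypothetical per-request latencies
response = customer_response_time(avg_latency, network_time=0.15, client_time=0.05)
```

Note that a server can show low latency and still deliver a poor customer response time when the network or client terms dominate.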

Logical Components – Application Server • The software that handles all application operations between browser-based customers and a company’s back-end databases • Example • An application server in a travel agency web site translates search requests for flights into scripts that access the back-end database • Receives client requests, executes the business logic, and interacts with transaction servers and/or database servers • Exhibits the following characteristics • Hosts and processes application logic written in different programming languages • Manages high volumes of transactions with back-end databases • Complies with all existing Web standards (HTTP, HTML, CGI, NSAPI, ISAPI, Java, etc.) • Works with most of the popular Web servers, browsers, and databases

Logical Components – Transaction and Database Servers [Figure labels: Customer Requests; Web Server (CGI, NSAPI, ISAPI, etc.); LAN; Database Server; Stored Procedures; Database] • Database Server • Executes and manages transaction processing applications • Can be a relational database system that supports stored procedures that can issue SQL requests to the database

Transaction and Database Servers: Requirements

Transaction and Database Servers: Performance

Transaction and Database Servers: Availability

Transaction and Database Servers: Scalability

Transaction and Database Servers: Performance Measurements • Transaction Processing Performance Council (TPC) • Defines transaction processing and database benchmarks • TPC-C and TPC-H are commonly used industry benchmarks that measure throughput and price/performance of online transaction processing (OLTP) environments and decision support systems, respectively • TPC-C measures the maximum sustained system performance • Number of New-Order transactions completed per minute while the system is executing four other transaction types (Payment, Order-Status, Delivery, and Stock-Level) • Response-time constraint: 90% of the transactions of each type must complete within five seconds (Stock-Level is allowed 20 seconds) • The TPC-H benchmark represents decision support environments where users do not know in advance which queries will be executed against a database system

Logical Components – Mainframe and Legacy Systems • Very large volumes of business data exist on mainframes • Legacy systems with databases and online transaction applications have been used by companies for decades • A valuable asset • Techniques for integrating the Web with mainframes • Wrapping • Hiding the existing legacy application behind an abstraction layer that represents a programming model • Back-end scripting

Mainframe and Legacy Systems

Logical Components – Proxies and Caches • Techniques used for improving web performance, scalability, and security • Caching • Reduces access time by bringing the data as close to its consumers as possible • Improves access speed and cuts down on network traffic • Reduces the server load and increases availability by replicating objects among many servers • Its use for dynamic objects is still restricted

Proxies and Caches • Proxy server • Special type of web server • Able to act as both a server and a client • Acts as an agent • Represents the server to the client • Represents the client to the server • Originally designed to provide access to the Web for users on private networks

Proxies and Caches • Caching has been used in three ways • Client side • Browsers maintain small caches of previously viewed pages on the user’s local disk • Caching proxy • Located on a machine on the path from multiple clients to multiple servers • Reverse proxy • Located side by side with the servers of a website • Distributes the load among the back-end servers

Proxies and Caches • Measures of caching effectiveness • Hit ratio • The number of requests satisfied by the cache over the total number of requests • Byte hit ratio • Hit ratio weighted by the object size • Data transferred • The total number of bytes transferred between the cache and the outside world during an observation period
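The difference between hit ratio and byte hit ratio is easiest to see on a small request log. The sketch below assumes a hypothetical log format of (object size in bytes, whether the cache served it) pairs; the sample values are made up for illustration.

```python
def cache_metrics(log):
    """log: iterable of (size_bytes, was_hit) pairs, one per request."""
    log = list(log)
    total_requests = len(log)
    total_bytes = sum(size for size, _ in log)
    hit_requests = sum(1 for _, hit in log if hit)
    hit_bytes = sum(size for size, hit in log if hit)
    return {
        # Fraction of requests served from the cache.
        "hit_ratio": hit_requests / total_requests,
        # Fraction of bytes served from the cache (hit ratio weighted by size).
        "byte_hit_ratio": hit_bytes / total_bytes,
    }

# Hypothetical log: three small objects hit, one large object missed.
sample_log = [(1000, True), (1000, True), (8000, False), (2000, True)]
metrics = cache_metrics(sample_log)
```

In this sample the cache satisfies three of four requests, yet only a third of the bytes, because the one miss is the largest object. This is why the two ratios are reported separately.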

Proxies and Caches • Example: Saving of network traffic • The website of an e-retailer receives 3,200,000 (3.2 million) requests per month • Consider using a caching proxy at the ISP • 65% of the requests are for GIF files; the others are for dynamically generated HTML pages • Average size of a GIF file: 7,300 bytes • Average size of an HTML page: 13,500 bytes • The caching proxy holds GIF files only • Estimated hit ratio is 65% • Saved bandwidth to ISP = NoOfRequestsPerPeriod x HitRatio x AverageSize = 3,200,000 x 0.65 x 7,300 = 15,184,000,000 bytes/month = 14,828,125 KBytes/month (1 KByte = 1,024 bytes) = 45.77 Kbps (taking a 30-day month)
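The arithmetic of this example can be checked with a few lines of code. The sketch below assumes, as the slide's figures imply, that 1 KByte is 1,024 bytes and that a month has 30 days; the function name and parameters are illustrative.

```python
def saved_bandwidth(requests_per_month, hit_ratio, avg_size_bytes, days_per_month=30):
    # Bytes that never cross the link to the ISP because the proxy served them.
    saved_bytes = requests_per_month * hit_ratio * avg_size_bytes
    saved_kbytes = saved_bytes / 1024               # assumption: 1 KByte = 1,024 bytes
    seconds_per_month = days_per_month * 24 * 3600  # assumption: 30-day month
    saved_kbps = saved_kbytes * 8 / seconds_per_month
    return saved_kbytes, saved_kbps

kbytes_per_month, kbps = saved_bandwidth(3_200_000, 0.65, 7_300)
print(f"{kbytes_per_month:,.0f} KBytes/month, {kbps:.2f} Kbps")
```

Changing `hit_ratio` or `avg_size_bytes` shows how sensitive the saving is to the mix of cached objects.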

Multi-Tier Architecture • The architecture of modern information systems consists of three layers • Presentation • gathers user’s inputs • provides a standard set of interfaces (browser) • provides access to business services • Business Logic • contains rules for handling data • defines the application business logic • maps business functions to operations on business objects • Data Services • stores data • protects data against failures & inconsistencies • provides access to mainframe databases • Web-based applications can also be framed in a three-tier architecture

Multi-Tier Architecture • Web sites implement the application and data layers of a system architecture

Multi-Tier Architecture • Benefits • Scalability & Availability • Multiple servers at each layer • Security • Use of firewalls that allow only the Web servers to access the application servers and database servers • Use of authentication protocols such as SSL and TLS • Integration with legacy data that reside on mainframes

Dynamic Load Balancing • Problem • Traffic to a Web server may get too high for a single computer to handle effectively • Solution • Put more servers to work • Techniques to split the traffic across servers • DNS-based (Domain Name System) • Dispatcher-based • Server-based

Dynamic Load Balancing • DNS-based • The DNS translates a domain name to an IP address • The DNS answers each name-resolution request with the IP address of one of the servers in the cluster • Round-robin DNS systems • The Web server name is associated with a list of IP addresses • Each IP address maps to a different server • Each server contains a mirrored version of the website or has access to a common file system • When a request is received, the Web server name is translated to the next IP address on the list • By translating Web server names to IP addresses in a round-robin fashion, this technique tries to balance the load among the servers
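Round-robin resolution can be illustrated with a toy resolver. This is a sketch of the idea only, not of how any real DNS server is implemented; the class name, host name, and addresses (taken from the documentation address range) are assumptions.

```python
from itertools import cycle

class RoundRobinDNS:
    def __init__(self, records):
        # records maps a host name to its list of server IP addresses.
        # cycle() yields the addresses in order and wraps around forever.
        self._next_ip = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        # Each lookup returns the next address on the list for that name.
        return next(self._next_ip[name])

dns = RoundRobinDNS({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
answers = [dns.resolve("www.example.com") for _ in range(4)]
```

The fourth lookup wraps around to the first address. The rotation ignores the actual load on each server, which is why the slide says the technique only tries to balance the load.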

Dynamic Load Balancing • Dispatcher-based • Single-IP-image • Uses the address of a special TCP router as the single address of the web cluster • Client requests are addressed to the router that dispatches them to a server according to some scheduling rules • To make the dispatching transparent to users, the selected server returns the response with the router address, rather than its own address
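A dispatcher with a single IP image might look like the sketch below. The slide leaves the scheduling rule open; this example assumes a least-active-connections rule, and the class, field names, and addresses are hypothetical.

```python
class Dispatcher:
    def __init__(self, virtual_ip, servers):
        self.virtual_ip = virtual_ip              # the single address clients see
        self.active = {s: 0 for s in servers}     # open requests per back-end server

    def dispatch(self, request):
        # Assumed scheduling rule: pick the server with the fewest active requests
        # (ties go to the server listed first).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return {
            "handled_by": server,
            # The reply carries the router's address, not the server's own,
            # so the dispatching stays transparent to the client.
            "source_address": self.virtual_ip,
            "body": f"response to {request}",
        }

    def finish(self, server):
        # Called when a server completes a request.
        self.active[server] -= 1

router = Dispatcher("203.0.113.10", ["10.0.0.1", "10.0.0.2"])
first = router.dispatch("GET /")
second = router.dispatch("GET /cart")
router.finish("10.0.0.2")
third = router.dispatch("GET /checkout")
```

Unlike round-robin DNS, the dispatcher sees every request and can therefore react to the current state of each server.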

Dynamic Load Balancing • Server-based • The server responds to a client with a new server address to which the client will resubmit the request • The address can be that of the same computer or of any one of several back-end mirror computers • Load balancing mechanism • Transparent to users • Adds an extra connection to the original request • May increase user response time and network traffic
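One common way to realize server-based balancing, assumed here since the slide does not name a mechanism, is HTTP redirection: an overloaded server answers with a redirect to a mirror. The sketch below models only the decision; the function name, load threshold, and mirror host names are hypothetical.

```python
def respond(path, own_load, max_load, mirror_loads):
    """Return (status_code, headers) for one incoming request."""
    if own_load < max_load:
        # The server has spare capacity and handles the request itself.
        return 200, {}
    # Otherwise point the client at the least-loaded mirror; the client
    # must open a new connection and resubmit the request there.
    target = min(mirror_loads, key=mirror_loads.get)
    return 302, {"Location": f"http://{target}{path}"}

mirrors = {"mirror1.example.com": 40, "mirror2.example.com": 15}
served = respond("/catalog", own_load=10, max_load=100, mirror_loads=mirrors)
redirected = respond("/catalog", own_load=100, max_load=100, mirror_loads=mirrors)
```

The redirected case shows the cost noted on the slide: the client makes a second connection before any content is delivered.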

Third Party Services • The infrastructure for electronic business includes services provided by many independent institutions and companies • A special class of third-party service providers is the trusted third party, defined as: “A trusted third party is an impartial organization delivering business confidence, through commercial and technical security features, to an electronic transaction. It supplies technically and legally reliable means of carrying out, facilitating, producing independent evidence about and/or arbitrating on an electronic transaction. Its services are provided and underwritten by technical, legal, financial and/or structural means”

Third Party Services - Example • The Web servers of a portal company are served by a third-party ad server cluster • Each page sent out by the portal contains three ad banners • A reply to each incoming HTTP request to the portal generates 3 requests to the ad server • Maximum throughput of the ad server cluster is 1,800 ads/sec • What is the maximum number of HTTP requests per second that can be served by the Web portal? • Throughput cannot exceed one third of the throughput of the ad server • PortalThroughput = AdServerThroughput / NumberOfVisitsToAdServer = 1,800 / 3 = 600 requests/sec
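The bound in this example generalizes to any third-party service that is visited a fixed number of times per portal request. A minimal sketch, with an illustrative function name:

```python
def max_portal_throughput(service_capacity_per_sec, visits_per_request):
    # The portal cannot serve pages faster than the third-party service
    # can supply what each page needs.
    return service_capacity_per_sec / visits_per_request

portal_limit = max_portal_throughput(1800, 3)  # 600.0 requests/sec
```

With four banners per page instead of three, the same ad cluster would cap the portal at 450 requests/sec.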

Operational Example: Google Network—Geographical Locations

Operation

Operation: Management

Performance • Example of a performance problem • A large Internet retail store • Customers were locked out of the site because of a surge in shoppers during a sales promotion • The site showed the message: “Due to enormous turnout, the check-out lines are currently full. Please try again later.” • How do you turn customers away and still leave them wanting to come back?

Performance (2) • Performance problems may arise at many points of the WWW • End user • Obsolete system technology • Lack of bandwidth on the link to the ISP • ISP • Inadequate server and network capacity • Backbone provider • Excess traffic brings congestion and delays • E-commerce site

Performance (3) • Bandwidth and server capacity have improved in recent years • Response time continues to be a challenge! • Response time degrades due to • Complex web-based commerce applications • The unpredictable nature of traffic • Web transactions whose execution demands information from other external sites • This calls for techniques and tools to analyze and understand system behavior

Availability and Maintainability • Availability is one of the main service-level goals of any electronic business • Low availability can cost an e-business lost revenue, reduced market share, and bad publicity • How can an electronic business achieve high availability? • Infrastructure reliability • Software robustness

Availability and Maintainability (2) • A starting point toward high availability • Geographically separate sites with multiple levels at each site • Multiple machines at each level • Load balancing mechanism • Redundant networks (recall the undersea cable break in the Pacific Ocean a few years ago) • Permanent system monitoring and measurement procedures can anticipate problems and enhance availability

Availability and Maintainability (3) • Before one establishes availability goals for an e-commerce site, answers to the following questions should be reviewed • Where are the single points of failure? (critical path?) • What is the minimum configuration needed to run the site? • How much self-repairing is the site able to do? (logging) • How much diagnostic and alert information is available to technical and management people? (monitor) • What are the emergency procedures? (man, machines) • What is the historical MTTF (i.e. mean time to failure) and MTTR (i.e. mean time to repair) for the past failures experienced by the site? (agreements)
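The last question matters because historical MTTF and MTTR give a direct estimate of availability through the standard steady-state formula, availability = MTTF / (MTTF + MTTR). The sketch below applies it; the sample hours are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    # Fraction of time the site is up: time between failures divided by
    # the full failure-and-repair cycle.
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical history: a failure every 999 hours, one hour to repair.
site_availability = availability(mttf_hours=999.0, mttr_hours=1.0)
print(f"{site_availability:.3%}")
```

The formula shows the two levers available to a site: fail less often (raise MTTF) or recover faster (lower MTTR).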

Availability and Maintainability (4) • Example • The site of a large portal, visited by millions of people every day • The site is considering several high-availability goals, ranging from 99.9% to 99.999% • In determining the proper availability goal, management is aware that, in order to estimate the downtime hours, the following factors should be taken into account • Average time to shut down and boot the computers (improving?) • Time to discover the problems (diagnosis) • Average time to repair the problems (duplicate available?) • Worst-case situation, which represents the longest time to discover the problems (prediction) • Maintenance hours (warning)
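Each availability goal translates into a yearly downtime budget, which can then be compared with an estimate built from the factors listed on the slide. The sketch below is an illustration under assumptions: the failure count and all durations are hypothetical, and the estimate simply adds per-failure times to planned maintenance.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year

def allowed_downtime_hours(availability_percent):
    # Yearly downtime budget implied by an availability goal.
    return HOURS_PER_YEAR * (1 - availability_percent / 100)

def expected_downtime_hours(failures_per_year, discover_h, repair_h, reboot_h, maintenance_h):
    # Each failure costs discovery + repair + shutdown/boot time;
    # planned maintenance hours are added on top.
    return failures_per_year * (discover_h + repair_h + reboot_h) + maintenance_h

budgets = {goal: allowed_downtime_hours(goal) for goal in (99.9, 99.99, 99.999)}

# Hypothetical site: 4 failures a year, 2 hours of planned maintenance.
estimate = expected_downtime_hours(4, discover_h=0.25, repair_h=1.0,
                                   reboot_h=0.25, maintenance_h=2.0)
feasible_goals = [goal for goal, budget in budgets.items() if estimate <= budget]
```

The budgets come out at roughly 8.76 hours per year for 99.9%, about 53 minutes for 99.99%, and about 5 minutes for 99.999%, so the hypothetical site above could commit only to the lowest goal.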

Availability and Maintainability (5) - Rule of Thumb • In demanding environments, such as electronic commerce, the cost of maintenance and administration is very high and can vary from two to twelve times the hardware cost • Key concept in maintainability • The ease of replacing or upgrading software (??) and hardware components

Availability and Maintainability Example (6) : Google

Availability and Maintainability Example (7): Google
