
Web Prefetch



  1. Web Prefetch 張燕光, Department of Computer Science and Information Engineering, National Cheng Kung University, ykchang@mail.ncku.edu.tw

  2. Introduction • Prefetch a Web page before a user actually requests it. • The ultimate goal of Web prefetching is to reduce what is called User Perceived Latency (UPL) on the Web. • UPL is the delay that an end user (client) actually experiences when requesting a Web resource. • A user perceives Web latency as the time between issuing a request for a resource and the time the Web page is actually displayed in the browser window. • The reduction of UPL does not imply a reduction of actual network latency or of network traffic. • On the contrary, in most cases network traffic increases even when UPL is reduced.

  3. Introduction • Sources of User Perceived Latency: • Round-trip time (RTT) at the lower level • Processing latency in end systems – load of the end system • Communication latency over the network – queuing delay and propagation delay • Bandwidth • Size of the Web pages

  4. Introduction • Besides prefetching, what other methods can reduce the User Perceived Latency (UPL) on the Web? • Increase the size of browser caches. Browser caches typically have a modest default size; increasing the cache size increases the hit ratio and reduces modem traffic. • Use delta compression to transfer modified Web pages between the proxy and clients. That is, if an old copy of a modified page exists in the browser cache, the proxy sends only the difference between the latest version and the old version. • Apply application-level compression to HTML pages. Studies have suggested that HTML text can be compressed first and then transferred from one end to the other. HTTP supports application-level compression via the Transfer-Encoding header.
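As a rough illustration of the compression point, here is a minimal sketch (using Python's standard gzip module; the page content is a made-up placeholder, not from the slides) of how much repetitive HTML shrinks under the kind of compression HTTP can apply end to end:

```python
import gzip

# Hypothetical HTML payload; real pages with repetitive markup compress well.
html = ("<html><body>"
        + "<p>Lorem ipsum dolor sit amet, consectetur.</p>" * 200
        + "</body></html>").encode("utf-8")

compressed = gzip.compress(html)
print(f"original:   {len(html):6d} bytes")
print(f"compressed: {len(compressed):6d} bytes "
      f"({100 * len(compressed) / len(html):.1f}% of original)")
```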

  5. Introduction • Prefetching in a Web system “separates” the time when a resource is actually fetched from the time when the client chooses to view it.

  6. Introduction • Optimize T = t1 − t0 (t0 being the time a resource Ri is prefetched and t1 the time the client requests it) so that T is: • large enough for the resource Ri to be prefetched before the client requests it, and • small enough for the resource Ri not to have expired before the client requests it. • A prediction method is needed to foresee the client’s request before it is placed, since t0 < t1.

  7. Introduction • Every prefetching system must be designed carefully and put to work only after an extensive trial period and after providing adequate answers to the following basic validation questions: • IF prefetching can and should be added to the specific system. Not all Web systems can be facilitated by prefetching. • For example, a highly dynamic Web system may require the time tolerance (i.e., t1 − t0 in the previous figure) to be so small before a resource expires that no prefetching approach would be adequate. • WHO (referring to the Web system’s components) will take part in the prefetching procedure. • Prefetching can be facilitated by all the basic components a Web system consists of (clients, proxies, mediators and servers). • The basic WHO question must be answered up front in order to produce prediction policies, etc.

  8. Introduction • HOW (referring to the procedure) prefetching will be carried out. • Prefetching can be considered an application-level protocol. • This protocol must clearly state (1) the procedures that will be followed in order to initiate, execute and terminate prefetching, and (2) the communication procedures between the Web system components. • For example, in some prefetching approaches proprietary text files containing the most popular URLs are transferred between servers, clients and proxies. In order to implement the software modules that handle these files on the appropriate Web system component, the question of HOW prefetching will be applied must be answered.

  9. Introduction • WHAT (referring to the different types of available resources) will be prefetched. • As argued above, dynamic content is the most “difficult” candidate for prefetching. • Before designing the prefetching system, this question must be answered and the file types that will take part in prefetching clearly specified. • WHEN (referring to the time of prefetching) a resource will be prefetched. • To answer this question, one must apply a suitable prediction algorithm that receives a number of inputs and provides the answer to the WHEN question as its output.

  10. Introduction • The effectiveness of prefetching depends on whether there is a certain predictability in users’ Web page accesses. • The information on access patterns may be derived from servers’ access statistics or from clients’ configurations. • Recent studies of WWW traffic show that there are considerable inter-dependencies among consecutive accesses to some Web pages.

  11. Introduction • The WWW is a hyperlink-based information system. • Constraints on which hyperlinks can be followed from a particular page, together with the page contents, may provide strong leads as to the order in which Web pages will be viewed. • Users’ personal preferences are also an important factor: the sequence of accesses is ultimately decided by each user’s individual selections.

  12. Introduction • Prefetching can be performed in 3 ways: • Between browser clients and Web servers • Between proxies and Web servers • Between browser clients and proxies • Initiating agent: • Client-side (Client-Initiated) prefetch • Server-side (Server-Initiated) prefetch

  13. Server-Initiated Prefetch • The server anticipates which hyperlinks are likely to be followed and preloads the corresponding Web pages to the client. • The client has to be prefetching-aware so it can deal with preloaded pages correctly. • This requires extensions to the current HTTP protocol and modifications to both client and server software.

  14. Client-Initiated Prefetch • It can be done by individual clients in a way that is transparent to the servers. • The implementation is therefore much simpler.

  15. Criteria • Criteria for deciding whether a Web page should be prefetched can be either statistical or deterministic: • Statistical: periodically calculate the inter-dependencies of page accesses based on the most recent access logs, and group Web pages with inter-dependencies higher than a certain threshold for prefetching (see the sketch below). • Deterministic: configured statically by users as part of their personalized user interfaces, or by the page designer as part of the content design (e.g., must-read newspapers).
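A sketch of the statistical criterion (the log format, the first-order "B immediately follows A" definition of inter-dependency, and the threshold value are illustrative assumptions, not from the slides):

```python
from collections import Counter

def interdependent_pairs(log, threshold=0.5):
    """Group page pairs (A, B) where B immediately follows A with
    empirical probability above `threshold`.
    `log` is a list of (user, url) tuples in time order."""
    follows = Counter()   # (A, B) -> count of B seen right after A
    visits = Counter()    # A -> total accesses to A
    last = {}             # user -> last URL (only same-user accesses are related)
    for user, url in log:
        visits[url] += 1
        if user in last:
            follows[(last[user], url)] += 1
        last[user] = url
    return {pair: cnt / visits[pair[0]]
            for pair, cnt in follows.items()
            if cnt / visits[pair[0]] > threshold}

log = [("u1", "/index"), ("u1", "/news"), ("u2", "/index"),
       ("u2", "/news"), ("u1", "/sports")]
print(interdependent_pairs(log))   # {('/index', '/news'): 1.0}
```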

  16. Bandwidth and Delay Tradeoff • Statistical prefetching is easy to automate, but some bandwidth is wasted and total bandwidth consumption increases. • The delay for non-prefetched Web pages may increase as a result of the extra load caused by prefetching. • When traffic is heavy, aggressive prefetching, such as “get all links”, may actually increase the average latency of all Web pages.

  17. Analysis of Bandwidth and Delay Tradeoff • P: the hit rate of prefetching, i.e., the probability of a correct prefetch • Do: the average retrieval delay without prefetching • Assume the retrieval delay is 0 for prefetched pages and Dx is the delay for non-prefetched ones • The average delay with prefetching, Dn, is Dn = P×0 + (1 − P)×Dx = (1 − P)×Dx

  18. Analysis of Bandwidth and Delay Tradeoff • To ensure that prefetching reduces the average delay, i.e., Dn < Do, we need Dx/Do < 1/(1 − P) …………………… (1) • Assume the delays Dx and Do can be calculated based on the M/M/1 queuing model: • Do = 1/(1 − Ro) and Dx = 1/(1 − Rx), where Ro and Rx are the link utilizations without and with prefetching, respectively. • Substituting into (1) gives (1 − Ro)/(1 − Rx) < 1/(1 − P), which rearranges to P/((Rx − Ro)/Ro) > Ro/(1 − Ro)

  19. Analysis of Bandwidth and Delay Tradeoff • We define the efficiency E of prefetching as the ratio of (1) the hit rate of prefetching to (2) the relative traffic increase needed to achieve that hit rate, i.e., E = P/((Rx − Ro)/Ro) > Ro/(1 − Ro) • The above implies that the efficiency of prefetching must be larger than Ro/(1 − Ro); otherwise, the average delay can actually be higher than without prefetching.

  20. Analysis of Bandwidth and Delay Tradeoff • Thus, the above inequality can be rewritten as Ro < E/(1 + E) • If E is known, one can calculate the maximum Ro for which statistical prefetching is useful.
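The following minimal sketch checks these formulas numerically (the variable names P, Ro, Rx, E follow the slides; the example values are assumptions):

```python
def avg_delays(P, Ro, Rx):
    """Delays under the slides' M/M/1 model: Do without prefetching,
    Dn with prefetching (prefetched pages assumed to arrive instantly)."""
    Do = 1 / (1 - Ro)
    Dx = 1 / (1 - Rx)          # delay seen by non-prefetched pages
    Dn = (1 - P) * Dx
    return Do, Dn

def efficiency(P, Ro, Rx):
    """E = hit rate per unit of relative traffic increase."""
    return P / ((Rx - Ro) / Ro)

# Example 1 from the slides: E = 0.5 gives a maximum useful Ro of E/(1+E) = 1/3.
E = 0.5
print("max useful Ro:", E / (1 + E))                  # 0.333...

# Spot-check at Ro = 0.4 (above 1/3), assuming P = 0.2 and the traffic
# increase implied by E: prefetching should now *increase* average delay.
P, Ro = 0.2, 0.4
Rx = Ro * (1 + P / E)
Do, Dn = avg_delays(P, Ro, Rx)
print(f"E = {efficiency(P, Ro, Rx):.2f}, Do = {Do:.2f}, Dn = {Dn:.2f}, "
      f"helps: {Dn < Do}")                            # helps: False
```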

  21. Analysis of Bandwidth and Delay Tradeoff • [Figure] Feasible region for E and Ro (the region above the curve)

  22. Analysis of Bandwidth and Delay Tradeoff • [Figure] Feasible region for E and Ro (the region above the curve)

  23. Analysis of Bandwidth and Delay Tradeoff • It is clear that prefetching is only useful when traffic is very light or prefetching efficiency is very high. • Example 1: for E = 0.5 (i.e., for each 1% traffic increase, the prefetch hit rate improves by 0.5%), Ro must be smaller than E/(1 + E) = 1/3. (See the first curve above.)

  24. Analysis of Bandwidth and Delay Tradeoff • Example 2: for Ro = 0.8, E must be larger than Ro/(1 − Ro) = 4. (See the second curve above.) • This is because when traffic is heavy, very little extra traffic may result in a substantial increase in queuing delay. • Unless the prefetching efficiency is very high, the extra delay experienced by non-prefetched pages may outweigh the decrease in the delay of prefetched pages.

  25. Deterministic Client-Initiated Prefetch • Deterministic prefetch is the most conservative type, as it often has little or no bandwidth overhead. • When users know what needs to be prefetched, it can reduce perceived latency, and even ease congestion, at very little cost. • But its scope of use is limited. • Configured statically by the users. • Can be implemented as part of the browser, or simply as an add-on, without changing client or server software.

  26. Deterministic Client-Initiated Prefetch • Batch prefetching • Many pages are read on a regular basis, such as newspapers, weekly work reports, etc. • Large Web pages – papers with large graphics • Similar to mirroring, but batch prefetching is more flexible because it does not require any central administration.

  27. Deterministic Client-Initiated Prefetch • Start-up prefetching • Start prefetching when the browser is started. • A set of pages the user needs to look at that day may be prefetched in the background. • It can be integrated with planning tools so that a ToDo Web page is constructed each day for the user and the corresponding Web pages are prefetched at start-up for later viewing.

  28. Deterministic Client-Initiated Prefetch • Pipelining with prefetching • The current model for navigation is a series of “click, fetch, and view” operations. • As a user usually spends some time (seconds or minutes) on a page, we can potentially pipeline the operations by fetching the next page while the user is looking at the current page (see the sketch below). • Useful for some information services: • On-line newspapers, stock market prices and headline-tracking services, where users can easily specify the sequence of pages to be viewed.
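A minimal sketch of the pipelining idea (the URLs and the use of Python threads are illustrative assumptions): fetch the next page in a background thread while the user reads the current one:

```python
import threading
import urllib.request

def fetch(url):
    with urllib.request.urlopen(url) as resp:
        return resp.read()

# Hypothetical user-specified reading sequence (e.g., newspaper sections).
sequence = ["https://example.com/front", "https://example.com/sports"]

pages = {}                     # prefetched bodies, keyed by URL
for i, url in enumerate(sequence):
    thread = None
    if i + 1 < len(sequence):
        nxt = sequence[i + 1]  # start fetching the *next* page now
        thread = threading.Thread(
            target=lambda u=nxt: pages.update({u: fetch(u)}))
        thread.start()
    body = pages.get(url) or fetch(url)          # use the prefetched copy if present
    print(f"viewing {url}: {len(body)} bytes")   # the user 'reads' the page here
    if thread:
        thread.join()          # next page is ready before the next click
```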

  29. Server-Initiated Prefetch • Predictive prefetch from Berkeley • Idea (similar to transparent content negotiation): • Typically, there is a pause after each page is loaded, while the user reads the loaded page. • The server computes the likelihood that a particular page will be accessed next and conveys this information to the client. • The client program then decides whether or not to actually prefetch the page.

  30. Predictive Prefetch • The server has the opportunity to observe the pattern of accesses from several clients and use this information to make intelligent predictions. • The client is in the best position to decide whether it should prefetch files, based on whether it already has them cached, or on the cost in terms of CPU time, memory, network bandwidth and so on needed to prefetch the data.

  31. Predictive Prefetch • A dependency graph is constructed to depict the pattern of accesses to the different files stored at the server. • The graph has a node for every file that has ever been accessed. • There is an arc from node A to node B if and only if at some point in time B was accessed within w accesses after A, where w is the lookahead window size.

  32. Predictive Prefetch • The weight on the arc is the ratio of (1) the number of accesses to B within a window after A to (2) the number of accesses to A itself. • This weight is not actually the probability that B will be requested immediately after A. • Consequently, the weights on arcs emanating from a particular node need not add up to 1. The figure on the next page depicts a portion of a hypothetical dependency graph.

  33. Predictive Prefetch • [Figure] A small hypothetical dependency graph. Based on past observations, when home.html is accessed there is some chance that each of the linked image files will be accessed soon afterwards, and if one of those images is accessed there is a further chance that another image will follow soon afterwards; the arc weights encode these chances.

  34. Predictive Prefetch • The dependency graph is dynamically updated by a process, predictd, as the server receives new requests from each httpd process running on the server machine. • predictd maintains a ring buffer, of size equal to the window size w, for each client that is currently connected to the server. • When predictd receives a new request from an httpd, it inserts the ID of the accessed file into the corresponding ring buffer. • Only the entries within the same ring buffer are considered related, so only the corresponding arcs in the dependency graph are updated.
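A condensed sketch of these data structures (the class and variable names are mine, not from the slides; the real predictd is a separate server-side process fed by httpd): per-client ring buffers of the last w accesses drive the arc-count updates:

```python
from collections import defaultdict, deque

class DependencyGraph:
    """Sketch of predictd's bookkeeping: node/arc counts updated from a
    per-client ring buffer of the last w accesses."""
    def __init__(self, w=2):
        self.w = w                              # lookahead window size
        self.node = defaultdict(int)            # file -> access count
        self.arc = defaultdict(int)             # (A, B) -> B seen within w after A
        self.ring = defaultdict(lambda: deque(maxlen=w))  # client -> last w files

    def access(self, client, fname):
        self.node[fname] += 1
        # fname was accessed within w accesses after each file in the buffer;
        # only same-client entries are considered related.
        for prev in self.ring[client]:
            self.arc[(prev, fname)] += 1
        self.ring[client].append(fname)

    def weight(self, a, b):
        # Ratio of accesses to B within a window after A to accesses to A.
        return self.arc[(a, b)] / self.node[a] if self.node[a] else 0.0

g = DependencyGraph(w=2)
for f in ["home.html", "image1.gif", "image2.gif"]:
    g.access("client-1", f)
print(g.weight("home.html", "image1.gif"))   # 1.0 after this single session
```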

  35. Predictive Prefetch • This logically separates out accesses by different clients and thereby avoids the problem of false correlations. • However, in some cases, such as clients located behind a proxy cache, predictd will not be able to distinguish between accesses from different clients. • One way of getting around this problem is to use mechanisms that pass session-state identification between clients and servers even when there is a proxy between them.

  36. Predictive Prefetch • predictd bases its predictions on the dependency graph. • When A is accessed, it makes sense to prefetch B if the arc from A to B has a large weight, which implies that there is a good chance of B being accessed soon afterwards. • In general, predictd declares B a candidate for prefetching if the arc from A to B has a weight higher than the prefetch threshold p. • It is possible to set this threshold differently for each client and also to vary it dynamically.
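Continuing in the same vein, a standalone sketch of the selection rule (the counts and the threshold value here are hypothetical):

```python
def candidates(arc, node, current, p=0.25):
    """Sketch of predictd's selection: B is a prefetch candidate after
    `current` if arc[(current, B)] / node[current] exceeds the prefetch
    threshold p (tunable, possibly per client)."""
    total = node.get(current, 0)
    if not total:
        return []
    weights = {b: c / total for (a, b), c in arc.items() if a == current}
    return sorted((b for b, w in weights.items() if w > p),
                  key=lambda b: -weights[b])

# Hypothetical counts in the style of the dependency graph above.
node = {"home.html": 10}
arc = {("home.html", "image1.gif"): 8, ("home.html", "news.html"): 2}
print(candidates(arc, node, "home.html", p=0.25))   # ['image1.gif']
```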

  37. Server-Initiated Prefetch • Top-10 approach from ICS-FORTH (Greece) • Combines the server’s active knowledge of its most popular pages (the Top-10) with client access profiles. • Based on the cooperation of clients and servers to make successful prefetch operations. • The server side is responsible for periodically calculating a list of its most popular documents (the Top-10) and serving it to its clients. • In fact, quite a few servers today already calculate their most popular documents, among other statistics, on a regular basis.

  38. Server-Initiated Prefetch • Top-10 approach from ICS-FORTH (Greece) • Calculating beyond the most popular documents is an obvious extension to this existing functionality. • Top-10 does not treat all clients equally. • Time is divided into intervals, and prefetching from any server is activated only after the client has made a sufficient number of requests to that server (> THRESHOLD).
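A loose sketch of the two halves of this cooperation (the function names, interval handling, and THRESHOLD value are illustrative assumptions):

```python
from collections import Counter

THRESHOLD = 5   # assumed: minimum requests in an interval before prefetching kicks in
TOP_N = 10

def server_top_n(access_log, n=TOP_N):
    """Server side: periodically compute the n most popular documents."""
    return [url for url, _ in Counter(access_log).most_common(n)]

def client_should_prefetch(requests_to_server):
    """Client side: only prefetch from servers this client uses heavily."""
    return requests_to_server > THRESHOLD

log = ["/a", "/b", "/a", "/c", "/a", "/b"]   # one interval's accesses
top = server_top_n(log, n=2)
print(top)                                   # ['/a', '/b']
if client_should_prefetch(requests_to_server=7):
    print("prefetch:", top)
```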

  39. Proxy-Initiated Prefetch • From SIGMETRICS ’99 • Relies on the proxy to make predictions, and on either the proxy or the browser to perform the prefetch. • Assumptions: • Users have idle times between requests, because users often read some parts of one document before jumping to the next one. • The proxy can predict which Web pages a user will access in the near future, based on reference patterns observed across many users. • The proxy has a cache that holds recently accessed Web pages.

  40. Proxy-Initiated Prefetch • The proxy can then either push the Web pages to the user’s browser, or • piggyback the predictions on regular responses to the browser and let the browser fetch the Web pages. • Only objects that are already in the proxy cache can be prefetched. • Thus, the approach generates no wide-area network traffic.

  41. Proxy-Initiated Prefetch • The proxy maintains a history structure. • Every time the proxy services a request, it updates the history structure, establishing the connection between past accesses made by the same user and the current request. • When the proxy detects that the connection to a user is idle, it uses the history structure to predict pages that the user might access next, checks which of those pages are in its cache, and generates a list of candidates ordered by their probabilities of access.

  42. Proxy-Initiated Prefetch • The candidates are pushed or fetched one by one into the browser cache. • The moment the user issues a new request, prefetching is stopped and any partially fetched object is discarded at the browser end, unless the request is for the object that is currently being fetched. • In addition, the proxy clears the list of candidate pages and recomputes a new one the next time.
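A small runnable sketch of this idle-time push loop (the candidate list, cache contents, and the way renewed user activity is simulated are all assumptions):

```python
import heapq

def idle_push(candidates, cache, new_request_at=None):
    """Sketch: push cache-resident candidates (highest probability first)
    until the user issues a new request. `candidates` maps url -> probability;
    `new_request_at` simulates the user becoming active after k pushes."""
    resident = {u: p for u, p in candidates.items() if u in cache}
    order = heapq.nlargest(len(resident), resident, key=resident.get)
    pushed = []
    for k, url in enumerate(order):
        if new_request_at is not None and k == new_request_at:
            break                      # stop; the proxy recomputes candidates later
        pushed.append(url)             # push/piggyback toward the browser cache
    return pushed

cache = {"/a", "/b"}
cands = {"/a": 0.7, "/b": 0.4, "/c": 0.9}   # /c is predicted but not cached
print(idle_push(cands, cache))               # ['/a', '/b']
```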

  43. PPM Predictors • Based on the Prediction by Partial Matching (PPM) data compressor. • The algorithms observe patterns in past accesses from all clients to predict the future accesses of individual clients. • The patterns captured are of the form: a user is likely to access URL B right after he/she accesses URL A. • Clearly, only accesses from the same user should be connected. • Accesses from different users are not related.

  44. PPM Predictors • The algorithm has three parameters: • m: the number of past accesses used to predict future ones. • It is also called the order of the predictor, or the prefix depth. • l: the number of steps the algorithm tries to predict into the future. • For example, if l = 2, the algorithm not only tries to predict the user’s immediate next access, but also the access after that. • We call l the search depth. • t: a threshold used to weed out candidates. • Only candidates whose probability of access is higher than t, where 0 ≤ t ≤ 1, are considered for prefetching.

  45. PPM Predictors • The algorithm maintains a data structure (typically a collection of trees) that keeps track of the URLs following a single URL, following a sequence of two URLs, and so on, up to a sequence of m URLs. • For prediction, the past reference, the past two references, up to the past m references are matched against the collection of trees to produce the set of URLs for the next l steps. • Only URLs whose frequencies of access are larger than t are included. • Finally, the URLs are sorted, first by giving preference to longer prefixes and then by giving preference to URLs with higher probability within the same prefix.

  46. PPM - history structure • The history structure is a forest of trees of a fixed depth K, where K = m + l. • The history encodes all dynamic sequences of accesses by any one user, up to a maximum length K. • One root node is maintained for every page seen, and in this node is a count of how often the page was seen. • Directly below each root node are all pages ever requested immediately after the root page, with a count of how often that pair of requests occurred. • The next level encodes all series of three pages and a count of how often each particular sequence of three pages occurred.

  47. PPM - history structure • The history structure is updated every time a user makes a request. • For each user there is a list of the last K pages that user requested. • The update involves incrementing counters and possibly adding new nodes to the trees. • Each update changes one node at each level of the history structure. • The figure on the next page shows an example of the history structure. • In this example, K = 3 and the structure is being updated after a user accesses page C following pages A and B. • The sequence ABC is updated, as are the sequences BC and C, with the corresponding counters incremented.

  48. PPM - history structure • [Figure] Example history structure update for K = 3 after the access sequence A, B, C
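A compact sketch of the update step (the nested-dict trie layout is a choice of this sketch, not the paper's; the driver replays the A, B, C example above):

```python
def update_history(trees, context, page, K=3):
    """Record `page` arriving after `context` (this user's previous
    accesses, oldest first). One node is touched at each level:
    the root node for `page`, the node for last-page -> `page`,
    last-two-pages -> `page`, ... up to depth K."""
    ctx = context[-(K - 1):]
    for start in range(len(ctx) + 1):          # every suffix of ctx, incl. empty
        level = trees
        for p in ctx[start:]:                  # walk down the existing prefix
            level = level.setdefault(p, {"count": 0, "next": {}})["next"]
        node = level.setdefault(page, {"count": 0, "next": {}})
        node["count"] += 1

trees, history = {}, []
for page in ["A", "B", "C"]:                   # user accesses A, then B, then C
    update_history(trees, history, page)
    history = (history + [page])[-2:]          # keep this user's last K - 1 pages
print(trees["A"]["next"]["B"]["next"]["C"]["count"])   # 1  (sequence ABC)
print(trees["B"]["next"]["C"]["count"])                # 1  (sequence BC)
print(trees["C"]["count"])                             # 1  (sequence C)
```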

  49. PPM - history structure • Encoded in the history structure is the probability of access for URLs following a given sequence of references. • The predictor looks at a user’s recent m accesses and processes each sequence of the last n accesses, for n = m, …, 1, separately. • For each sequence, it first finds the corresponding tree and node in the history structure. • It then follows all paths down from that node for l levels, listing all the URLs at each level along with their counts.

  50. PPM - history structure • If one URL appears in more than one path, the node counts for the URL are added together. • The predictor then divides the count of each URL by the count of the sequence, yielding the URL’s relative probability of access. • It then sorts the list by probability and deletes entries whose probabilities are less than t. • Finally, the predictor generates the list of candidates by concatenating the lists from the m sequences, putting the list for the longer sequence (i.e., longer prefix) first.
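And a matching sketch of the prediction step over the same nested-dict layout (the hand-built history and the parameter values are illustrative):

```python
def predict(trees, recent, m=2, l=1, t=0.3):
    """Match the last n accesses (n = m, ..., 1) against the trees and
    list URLs for the next l steps whose relative probability exceeds t;
    longer prefixes come first, then higher probability within a prefix."""
    result = []
    for n in range(m, 0, -1):                  # longest prefix first
        node, level = None, trees
        for page in recent[-n:]:               # locate the node for this sequence
            node = level.get(page)
            if node is None:
                break
            level = node["next"]
        if node is None:
            continue
        seq_count = node["count"]
        frontier = [node["next"]]
        for _ in range(l):                     # walk l levels below the node
            counts, nxt = {}, []
            for children in frontier:
                for url, child in children.items():
                    counts[url] = counts.get(url, 0) + child["count"]
                    nxt.append(child["next"])
            frontier = nxt
            result += sorted(((u, c / seq_count) for u, c in counts.items()
                              if c / seq_count > t), key=lambda x: -x[1])
    return result

# Hand-built history: A was seen 4 times; B followed 3 times, C once.
trees = {"A": {"count": 4, "next": {"B": {"count": 3, "next": {}},
                                    "C": {"count": 1, "next": {}}}}}
print(predict(trees, recent=["A"], m=1, l=1, t=0.5))   # [('B', 0.75)]
```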
