Finding Reference Affinity Groups in Traces Using Sampling
Chengliang Zhang, Yutao Zhong, Chen Ding, Mitsunori Ogihara
University of Rochester
08/22/2004, TDM'2004

Computer Memory Hierarchy

Why Reference Affinity
• Reference affinity measures how closely the data in a group are accessed together in an execution.
• Previous research achieved a 12% speedup on Pentium IV PCs using structure splitting and array regrouping based on the results of strict reference affinity analysis.
• Weak reference affinity is a generalized, probabilistic version of strict reference affinity.

Outline
• Weak reference affinity model
  • Data element, trace, reuse distance
  • Weak reference affinity and its properties
• A sampling method
• Comparisons with k-distance analysis

Trace and Reuse Distance
• A data element: a memory cell, file, or block.
• A trace: a sequence of accesses to data elements.
• π: logical time → data element
• Inverse function Г: data element → logical times
• Reuse distance δ(i, j): the number of distinct data elements between two logical times.
• Example: …ABBBC…DEFGH…IJJKL (reuse distances marked in the figure: 2, 4, 3)

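The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: logical times are taken to be 0-based list indices, and δ(i, j) is assumed to count the distinct elements accessed at times i, i+1, …, j-1. The slide does not spell out the endpoint convention; this one was chosen because it reproduces the distances 2, 4, 3 shown in the example. The function names are hypothetical.

```python
def reuse_distance(trace, i, j):
    """Distinct elements accessed at times i, i+1, ..., j-1.

    Assumption: a half-open interval, chosen because it reproduces
    the distances 2, 4, 3 of the slide's example.
    """
    if not 0 <= i <= j <= len(trace):
        raise ValueError("need 0 <= i <= j <= len(trace)")
    return len(set(trace[i:j]))


def access_times(trace):
    """Inverse map: data element -> list of logical times at which it is accessed."""
    times = {}
    for t, element in enumerate(trace):
        times.setdefault(element, []).append(t)
    return times


if __name__ == "__main__":
    for section in ("ABBBC", "DEFGH", "IJJKL"):
        accesses = list(section)
        print(section, reuse_distance(accesses, 0, len(accesses) - 1))
    print(access_times(list("ABBBC")))
```

Here each section is treated as its own small trace, and the distance is measured from its first access to its last.
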
Weak Reference Affinity
• Given a trace, k, and θ, a group G of data elements is a weak reference affinity group if and only if:
  • 1. For any x, y ∈ G, either
    • (a) at least a proportion θ of the occurrences of x are k-linked to an occurrence of y relative to G, or
    • (b) x and y are connected by the transitive closure of (a).
  • 2. No proper superset of G has this property.

Properties of Weak Reference Affinity
• For a given <k,θ>, the weak reference affinity groups form a unique partition of the data elements.
• The partitions for different <k,θ> form a lattice ordered by the finer-partition relationship.
[Figure label: Finer partition]

Properties of Weak Reference Affinity
• Given a trace and a weak affinity group G at link length k and threshold θ, for any x ∈ G there exists a y ∈ G such that more than |Г(x)|θ sections of the trace satisfy both of the following:
  • the section has x and y at its two ends;
  • the reuse distance of the section is within k(|G|-1).
• This property is the basis of our sampling method.
[Figure: trace …x…y…x…y…y…x…x…, with sections of reuse distance ≤ k(|G|-1) marked]

Verification Section
• Verification section for x and y:
  • x and y are at its two ends;
  • its reuse distance is within k(|G|-1).
• Critical verification section from x to y: for a logical time t with π(t) = x, the shortest verification section that has t and an access to y at its two ends.
• There are more than |Г(x)|θ critical verification sections from x to y.

Sampling Method
• Sliding window of size n: a section of the trace containing accesses to at most n distinct data elements.
• Sampling method:
  • Estimate an upper bound g on the group size.
  • Sample sliding windows of size 2gk.
  • Compute confidence(x, y) = (#windows containing both x and y) / min(#windows containing x, #windows containing y).
  • If confidence(x, y) > θ/2, then x and y are in the same group.
[Figure labels: trace, critical verification section, window]

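The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation, and it makes several assumptions the slide does not state: each window starts at a uniformly random trace position and grows forward until adding another access would exceed 2gk distinct elements; pairs whose confidence exceeds θ/2 are merged transitively with a union-find structure; and elements are strings so that pairs can be stored in sorted order. The names `window_elements` and `sample_affinity_groups` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def window_elements(trace, start, n):
    """Distinct elements of the longest window beginning at `start`
    that contains accesses to at most n distinct elements."""
    seen = set()
    for t in range(start, len(trace)):
        if trace[t] not in seen and len(seen) == n:
            break
        seen.add(trace[t])
    return seen


def sample_affinity_groups(trace, g, k, theta, num_windows, rng):
    """Return (groups, confidence) estimated from randomly sampled windows."""
    n = 2 * g * k                      # window size from the slide
    single, pair = Counter(), Counter()
    for _ in range(num_windows):
        elems = window_elements(trace, rng.randrange(len(trace)), n)
        single.update(elems)                          # windows containing x
        pair.update(combinations(sorted(elems), 2))   # windows containing x and y

    parent = {x: x for x in single}    # union-find over sampled elements

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    confidence = {}
    for (x, y), both in pair.items():
        confidence[(x, y)] = both / min(single[x], single[y])
        if confidence[(x, y)] > theta / 2:
            parent[find(x)] = find(y)  # assumed: grouping is transitive

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values()), confidence


if __name__ == "__main__":
    # Hypothetical synthetic trace: "a b" and "c d" always appear together,
    # separated by 10 distinct background elements.
    rng = random.Random(0)
    background = [f"e{i}" for i in range(200)]
    trace = []
    for _ in range(300):
        trace += ["a", "b"] + rng.sample(background, 10)
        trace += ["c", "d"] + rng.sample(background, 10)
    groups, conf = sample_affinity_groups(trace, g=2, k=2, theta=1.0,
                                          num_windows=2000, rng=rng)
    print(conf.get(("a", "b")), conf.get(("a", "c"), 0.0))
```

Only pairs that co-occur in at least one sampled window are stored in `confidence`; a missing pair has confidence 0. Because the denominator is a minimum, an element that is sampled only a few times can show a high confidence with a frequent neighbor, so the number of sampled windows matters in practice.
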
k-distance Analysis [Zhong+’04]
• Compute the reuse signature of each data element as a histogram.
• Compute the Manhattan distance between the signatures of every two data elements.
• If the distance is smaller than kB, the two elements belong to the same group.
[Figure labels: x, y]

Experiment Setup
• Synthetic trace generator

Evaluation Criteria
• Match rate:
  • Match rate = (#correctly predicted groups) / (#actual groups)
• Accuracy:
  • Suppose group G is separated into parts P1, P2, …, Pn and scattered into algorithm-detected groups G1, G2, …, Gn.
  • The overall accuracy is defined as the average of the accuracies of the individual groups.

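The match rate can be computed directly from two lists of groups. The sketch below assumes that a group counts as "correctly predicted" only when a detected group contains exactly the same elements as an actual group; the slide does not define this, so treat it as one possible reading. The accuracy measure is not sketched because its per-group formula does not appear in the text. The function name is hypothetical.

```python
def match_rate(detected, actual):
    """Fraction of actual groups that appear, element for element,
    among the detected groups (assumed meaning of 'correctly predicted')."""
    if not actual:
        raise ValueError("need at least one actual group")
    detected_sets = {frozenset(group) for group in detected}
    correct = sum(1 for group in actual if frozenset(group) in detected_sets)
    return correct / len(actual)


if __name__ == "__main__":
    actual = [{"w", "x"}, {"y", "z"}]
    detected = [{"w", "x"}, {"y"}, {"z"}]
    print(match_rate(detected, actual))
```
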
Comparison - Weakness
Comparison - k
Comparison - Scalability
Comparison - Scalability

Related Work
• Compiler analysis [Thabit’81] [Chilimbi’01] [Ding&Zhong’03] [Ding&Zhong’04]
• Web system [Chinen&Yamaguchi’97] [Duchamp’99] [Pitkow&Pirolli’99] [Su+’00] [Yang&Zhang’01]
• File system [Zhou+’01] [Jiang&Zhang’02] [Jiang&Zhang’04] [Chen+’04]
• Frequent sequence mining [Agrawal&Srikant’95] [Mannila+’97] [Han+’00] [Chudova&Smyth’02] [Pei+’02] [Hirao+’03]

Summary
• A weak reference affinity model
  • Each <k,θ> gives a unique partition.
  • The partitions for different <k,θ> form a lattice.
• A sampling method
  • Better than k-distance analysis in terms of accuracy and scalability.

Example of Strict Reference Affinity
• Trace: w x w x u y z … z y z y v w x w x …
• k = 2: affinity group {w, x, y, z}
• k = 1: affinity groups {w, x} and {y, z}

Difference between Reference Affinity Analysis (RAA) and Frequent Sequence Mining (FSM)
• Reference affinity analysis allows:
  • Repetition within the patterns: ABBBC…ABC…ABCC…
  • Variation within the patterns: …ADBEC…AFBC…ABGC…
  • Any order within the sequence: …ABC…ACB…BAC…

Strict Reference Affinity [Zhong+’04]
• Given a trace and k, a group G of data elements is a strict reference affinity group if and only if:
  • 1. For any x and y in G and for any i ∈ Г(x), there exists a j ∈ Г(y) such that i and j are k-linked relative to the group G.
  • 2. There does not exist a G' such that G ⊂ G' and G' also satisfies condition (1).

Properties of Strict Reference Affinity
• For a given k, the strict reference affinity groups form a unique partition <k> of the data elements.
• <k> is a finer partition than <k'> when k < k'.
• Strict reference affinity therefore gives a hierarchical partition of the data elements.
[Figure label: Finer partition]

k-linked path relative to G
• There is a k-linked path between logical times i and j (suppose i < j) relative to a data element group G if and only if there exists a list of logical times t1, t2, …, tn such that:
  • i < t1 < t2 < … < tn < j;
  • π(i), π(t1), …, π(tn), π(j) ∈ G are distinct;
  • δ(i, t1), δ(t1, t2), …, δ(tn, j) ≤ k.

Example of k-linked path
• k = 2, G = {A, C, D, F, H}, i = 1, j = 8
[Figure: trace with its logical times; the marked distances are 2, 1, 2, 2]

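The k-linked path definition can be checked mechanically. The sketch below is an illustration only and rests on assumptions not stated in the slides: logical times are 0-based indices, δ(i, j) counts the distinct elements accessed at times i, …, j-1, and an empty list of intermediate times (a direct link from i to j) is allowed. The search is a simple depth-first enumeration and may be slow on long traces. The function names and the small trace used in the usage example are hypothetical.

```python
def reuse_distance(trace, i, j):
    """Distinct elements accessed at times i, ..., j-1 (assumed convention)."""
    return len(set(trace[i:j]))


def has_k_linked_path(trace, group, i, j, k):
    """True if logical times i < j are joined by a k-linked path relative to `group`."""
    if i >= j or trace[i] not in group or trace[j] not in group:
        return False

    def extend(cur, used):
        # Try every later time t as the next stop on the path.
        for t in range(cur + 1, j + 1):
            if reuse_distance(trace, cur, t) > k:
                break  # the distance cannot shrink as t moves right
            element = trace[t]
            if element not in group or element in used:
                continue  # stops must be distinct members of the group
            if t == j or extend(t, used | {element}):
                return True
        return False

    return extend(i, {trace[i]})


if __name__ == "__main__":
    trace = list("AXBYC")          # hypothetical trace, not the slide's example
    group = {"A", "B", "C"}
    print(has_k_linked_path(trace, group, 0, 4, k=2))  # path A -> B -> C
    print(has_k_linked_path(trace, group, 0, 4, k=1))  # no link is short enough
```

With k = 2 the path goes through B, since each hop spans two distinct elements; with k = 1 even the first hop is too long.
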
Effect of Sample Size

Strict vs. Weak Reference Affinity
• For a given k, the strict reference affinity groups form a finer partition than the weak reference affinity groups at that k and any θ.
• Even when θ = 1, weak reference affinity groups may not be strict reference affinity groups.
• Example: …XAWBYCWDZ…XEWFYGWHZ
  • When θ = 1 and k = 2, X, W, Y, and Z belong to the same weak reference affinity group {X, W, Y, Z}, but that group is not a strict one.
  • Strict reference affinity groups: {X}, {Z}, {W, Y}.