Location Querying
Challenges
• Managing queries that involve the location of moving objects is challenging because:
  • location is fast-changing data;
  • the result of such a query may depend on the location of the user posing it, e.g. "all the trucks that are within 1 mile of truck ABC123 (which needs assistance)";
  • a query may not directly include location but may still require tracking mobile objects, e.g. queries that involve data produced and located at mobile hosts.
• In general, queries may be initiated by either static or mobile users and may involve databases located at both static and mobile sites.
Types of Location Queries
• Location queries with transient data
  • transient data: data whose value changes while the query is being processed, e.g. a moving user asking for nearby hospitals.
• Continuous queries, e.g. a moving car asking for hotels located within a radius of 5 miles and requesting that the answer be continuously updated.
• Issues include:
  • when and how often should continuous queries be re-evaluated?
  • the possibility of partial or incremental evaluation.
Handling Imprecise Location Data
• Maintaining the precise location of a user is very expensive in terms of communication cost.
• Profile partitioning is a technique for reducing the update volume [Imielinski & Badrinath, VLDB '92]:
  • Partition: glues together cells between which the user relocates very often and separates cells between which the user relocates infrequently.
[Figure: graph of cells with weighted edges, divided into partitions P1 and P2.]
The numbers on the edges indicate the number of times the user relocates between the two locations connected by the edge during a certain interval of time.
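The slide does not say how the partitions are computed. As one possible sketch, the code below glues together any two cells whose relocation count reaches a chosen threshold and keeps the rest apart, using a small union-find. The threshold rule, the function name `partition_cells`, and the cell names and counts are all assumptions made for illustration, not the method of the cited paper.

```python
from collections import defaultdict


def partition_cells(edge_counts, threshold):
    """Glue together cells joined by an edge with count >= threshold.

    edge_counts maps (cell_a, cell_b) to the number of relocations
    observed between the two cells (hypothetical input format).
    """
    parent = {}

    def find(cell):
        parent.setdefault(cell, cell)
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    for (a, b), count in edge_counts.items():
        root_a, root_b = find(a), find(b)  # also registers both cells
        if count >= threshold and root_a != root_b:
            parent[root_a] = root_b  # frequent relocation: same partition

    groups = defaultdict(set)
    for cell in list(parent):
        groups[find(cell)].add(cell)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical relocation counts between six cells.
edges = {
    ("A", "B"): 6, ("B", "C"): 4, ("C", "D"): 1,
    ("D", "E"): 6, ("E", "F"): 4, ("D", "F"): 6,
}
print(partition_cells(edges, threshold=4))
```

With these counts the rarely used edge C-D becomes the boundary between the two partitions. A higher threshold yields more, smaller partitions, which means more updates but a smaller search area.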
Profile Partitioning
• Users do not have to inform the location server about each and every change of their location.
  • The location server maintains which partition the user is currently in.
  • The user informs the location server whenever it moves from one partition to another.
• Supports bounded ignorance:
  • the real position of the mobile user and the position that is known are always in the same partition.
  • "I do not know where you are exactly at this time, but I know that you are in downtown Boston."
• Search is always within a partition.
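The update rule above can be sketched as follows. The class `LocationServer`, its method names, and the movement trace are hypothetical; the point is only that a move inside a partition generates no update message, so the server's knowledge stays bounded by the partition.

```python
class LocationServer:
    """Tracks only the partition each user is in (a minimal sketch)."""

    def __init__(self, partitions):
        self.partitions = [frozenset(p) for p in partitions]
        # Map each cell to the index of the partition that contains it.
        self.cell_to_partition = {
            cell: i for i, cells in enumerate(self.partitions) for cell in cells
        }
        self.current = {}  # user -> partition index
        self.updates = 0   # update messages received so far

    def register(self, user, cell):
        self.current[user] = self.cell_to_partition[cell]

    def move(self, user, new_cell):
        """Simulate a move; the user reports it only on a partition change."""
        new_partition = self.cell_to_partition[new_cell]
        if new_partition != self.current[user]:
            self.current[user] = new_partition
            self.updates += 1

    def known_region(self, user):
        """Bounded ignorance: the user is somewhere in this partition."""
        return self.partitions[self.current[user]]


server = LocationServer([{"Cell1", "Cell2"}, {"Cell3", "Cell4", "Cell5"}])
server.register("John", "Cell1")
for cell in ["Cell2", "Cell1", "Cell3", "Cell4", "Cell2"]:
    server.move("John", cell)
print(server.updates, sorted(server.known_region("John")))
```

In this trace five moves produce two update messages, because only the moves into Cell3 and back into Cell2 cross a partition boundary.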
Modeling Fast Changing Values
• Issues:
  • margin of error
  • how to find a better (or more precise) value
  • cost of the procedure for finding the precise value
• A distinction is made between the actual and the stored value of a dynamic attribute (such as location or temperature).
• Partitions help in defining the margin of error.
• The following locating methods provide exact values at additional cost:
  • paging/broadcasting to all cells in the partition
  • maintaining a list of locations, ordered by the likelihood of the user being at a given location, and searching each location in list order
  • maintaining forwarding pointers
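To compare the first two locating methods, the sketch below counts paging messages under simple assumptions that are not stated on the slide: paging one cell costs one message, the user is certainly somewhere in the partition, and the likelihoods (hypothetical values here) sum to 1.

```python
def broadcast_cost(partition):
    """Paging every cell in the partition costs one message per cell."""
    return len(partition)


def expected_sequential_cost(likelihoods):
    """Expected number of cells paged when searching in likelihood order.

    likelihoods maps cell -> probability that the user is in that cell.
    The search stops at the first cell where the user is found.
    """
    ordered = sorted(likelihoods.values(), reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))


# Hypothetical likelihoods for a three-cell partition.
likelihoods = {"Cell3": 0.5, "Cell4": 0.25, "Cell5": 0.25}
print(broadcast_cost(likelihoods))            # 3 messages, always
print(expected_sequential_cost(likelihoods))  # 1.75 messages on average
```

The ordered search sends fewer messages on average but takes longer in the worst case, since the cells are paged one after another rather than at once.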
Additional Attributes & Methods
• The following additional attributes are associated with a mobile object MO:
  • Partitions: {P1, P2, ...}, where each Pi is a set of cells and Pi ∩ Pj = ∅ for i ≠ j.
  • Correctors: the method to be used for finding the exact location of MO.
  • Predictors: the method or set of rules to be used to find the default location of MO (in case MO has not notified the location server about exceptional behavior).
• The following methods are defined on MO:
  • LOC: MO.LOC returns the stored location of MO.
  • ERR: MO.LOC.ERR returns the partition that is stored in MO's profile and contains MO.LOC.
  • loc: MO.loc returns the actual location of MO.
    • This involves considerable additional communication along with database access.
An Example
• Additional attributes of user John:
  • Object: John
  • Partitions: {Cell1, Cell2}, {Cell3, Cell4, Cell5}
  • Correctors: Paging with pointers
  • Predictors: {John is at home, in Cell2, after 6 P.M.}
• John.LOC: returns the stored location of John, say Cell1.
• John.LOC.ERR: returns {Cell1, Cell2}.
• John.loc:
  • If invoked before 6 P.M.: pages Cell1 and Cell2 to determine the exact location of John and returns that location.
  • If invoked after 6 P.M.: pages Cell2 (John's home) first, before paging Cell1.
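John's profile can be modeled roughly as below. This is a sketch under several assumptions: `MobileObject`, `LOC_ERR` (standing in for MO.LOC.ERR), the `is_present` callback that simulates a paging reply, and the sequential paging order before 6 P.M. are all illustrative choices, not part of the original model.

```python
class MobileObject:
    def __init__(self, partitions, stored_cell, predictor, is_present):
        self.partitions = partitions    # disjoint sets of cells
        self.stored_cell = stored_cell  # value kept by the location server
        self.predictor = predictor      # hour -> likely cell, or None
        self.is_present = is_present    # simulated paging reply per cell

    @property
    def LOC(self):
        """Stored location; no communication needed."""
        return self.stored_cell

    @property
    def LOC_ERR(self):
        """Partition that contains the stored location."""
        return next(p for p in self.partitions if self.stored_cell in p)

    def loc(self, hour):
        """Actual location, found by paging; returns (cell, cells_paged)."""
        order = sorted(self.LOC_ERR)
        guess = self.predictor(hour)
        if guess in order:  # page the predicted cell first
            order.remove(guess)
            order.insert(0, guess)
        paged = []
        for cell in order:
            paged.append(cell)
            if self.is_present(cell):
                return cell, paged
        raise LookupError("object not found in its partition")


john = MobileObject(
    partitions=[{"Cell1", "Cell2"}, {"Cell3", "Cell4", "Cell5"}],
    stored_cell="Cell1",
    predictor=lambda hour: "Cell2" if hour >= 18 else None,
    is_present=lambda cell: cell == "Cell2",  # John is actually at home
)
print(john.LOC, sorted(john.LOC_ERR))
print(john.loc(hour=10))  # pages Cell1, then Cell2
print(john.loc(hour=20))  # predictor applies: pages Cell2 only
```

When the predictor is right, the evening lookup needs a single paging message instead of two.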
Query Processing
• Queries differ in the complexity of the location constraint:
  • "Find me a doctor near the campus" has
    • one non-location-based constraint, and
    • one unary constraint on location.
  • "Find a gas station, fast-food restaurant, and grocery store such that all of them are on the same highway and in the above order" involves a ternary constraint (between) plus three unary constraints on the individual locations ("on the highway").
• Goal: minimize the communication cost of answering the queries (in the presence of imprecise knowledge about the locations of users).
Query Processing: Problem
• The main problem arising in query processing in the presence of imprecise knowledge about the locations of users is:
  • How to minimize the communication cost of "finding out" the missing information necessary to answer the query?
• General query in "pseudo SQL" form:
    SELECT x1.loc, ..., xm.loc
    FROM Users
    WHERE (x1.loc = l1 ∧ ... ∧ xm.loc = lm) ∧ C(l1, ..., lm) ∧ W(x1, ..., xm)
  • C(l1, ..., lm) is an m-ary constraint on the locations l1, ..., lm.
  • W(x1, ..., xm) is a constraint on individual object locations.
  • Users: the class that stores all instances of users in the system.
• The query wants to evaluate a predicate over the actual locations of users, while the only locations available are {x1.LOC, ..., xm.LOC}, with the associated "error" determined by ERR.
Query Processing Strategies
• Naïve strategy:
  • Find objects a1, …, am such that W(a1, …, am) is satisfied.
  • Determine the exact locations a1.loc, ..., am.loc by paging the partitions a1.LOC.ERR, ..., am.LOC.ERR.
  • Check whether the result satisfies C(a1.loc, ..., am.loc).
• The naïve strategy has a high communication cost
  • in terms of the number of cells paged.
  • Computation is less expensive than communication.
• Optimization strategies (to reduce the number of paging messages):
  • Paging can be avoided completely if it can be determined that C(a1.loc, ..., am.loc) is true (or false) for all combinations in the Cartesian product a1.LOC.ERR × … × am.LOC.ERR.
  • Only partial paging may be required in some cases, e.g. after determining the exact locations of a1 and a2 it may be determined that the constraint is true/false irrespective of the locations of the other objects.
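The first optimization amounts to evaluating the constraint over every combination of possible locations before sending any message. A minimal sketch, assuming the partitions are small enough to enumerate; the function name, the numeric cell positions, and the `near` constraint are hypothetical:

```python
from itertools import product


def decide_without_paging(constraint, error_partitions):
    """Return True/False if the constraint is settled for every combination
    in the Cartesian product of the partitions, else None (paging needed)."""
    outcomes = {bool(constraint(*combo)) for combo in product(*error_partitions)}
    if outcomes == {True}:
        return True
    if outcomes == {False}:
        return False
    return None


# Hypothetical setting: cells are positions on a line, and two objects
# satisfy the constraint when their cells are at most one apart.
def near(a, b):
    return abs(a - b) <= 1


print(decide_without_paging(near, [{1, 2}, {2}]))     # True: no paging
print(decide_without_paging(near, [{1, 2}, {7, 8}]))  # False: no paging
print(decide_without_paging(near, [{1, 2}, {2, 3}]))  # None: must page
```

The check trades local computation for communication, which matches the observation above that computation is the cheaper of the two.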
Query Processing: An Example
• Give the names of doctors (possibly mobile) located near John's current location:
  • {d : Near(d.loc, John.loc) ∧ Doctors(d)}, where
  • Near(y, x) is true if x is a neighbor of y, and false otherwise.
• Possible strategies:
  1. Page both John.LOC.ERR and d.LOC.ERR.
  2. First page John.LOC.ERR and then determine whether it is necessary to page d.LOC.ERR.
  3. First page d.LOC.ERR and then determine whether it is necessary to page John.LOC.ERR.
• Strategies 2 and 3 are better than strategy 1, but which of the two is better?
  • It depends on the profiles and locations of John and the doctors.
• The cost is characterized using a classification tree.
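Strategy 2 can be sketched as follows, assuming that paging a partition costs one message per cell and that a paging call returns the object's actual cell. The function names, the numeric cell positions, and the `near` relation are hypothetical and are not taken from the slide's figure.

```python
def strategy_2(near, john_err, d_err, page_john, page_d):
    """Page John first; page the doctor only if the answer is still open.

    Returns (answer, number_of_paging_messages).
    """
    messages = len(john_err)
    john_loc = page_john()
    # With John fixed, test every place the doctor could be.
    outcomes = {bool(near(cell, john_loc)) for cell in d_err}
    if len(outcomes) == 1:
        return outcomes.pop(), messages  # decided without paging d
    messages += len(d_err)
    return bool(near(page_d(), john_loc)), messages


# Hypothetical setting: cells are positions on a line; "near" means the
# two cells are at most one apart.
def near(a, b):
    return abs(a - b) <= 1


john_err = {20, 21, 22}
print(strategy_2(near, john_err, {1, 2, 3}, lambda: 20, lambda: 2))
print(strategy_2(near, john_err, {1, 2, 21}, lambda: 20, lambda: 21))
```

In the first call the doctor's partition is far from every cell John could occupy, so three messages suffice where strategy 1 would send six. In the second call the doctor's partition overlaps John's neighborhood, so both partitions end up being paged. Strategy 3 is the same procedure with the two roles swapped.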
Classification Tree (CT)
• Used for determining the (expected) minimal-cost strategy to evaluate the query.
  • Strategy: the sequence in which partitions are paged.
• A CT for a strategy with variables x1, …, xm:
  • Consists of two special types of terminal nodes:
    • POS: indicates that the constraint is true
    • NEG: indicates that the constraint is false
  • Has edges labeled with conditions:
    • (xi.loc = li), where li ∈ xi.LOC.ERR
  • Associates with each node N a predicate path(N), which is the conjunction of all conditions along the path from the root to N.
  • A path terminates with a POS (NEG) node iff the conjunction of all the conditions along the path implies (does not imply) the constraint C(x1.loc, ..., xm.loc).
Determining Evaluation Cost
• Cost of a node = the number of outgoing edges that do not terminate in a NEG node.
  • This is the cost of paging, since we do not need to page locations that we can determine lead to failure.
• Cost of a path = the sum of the costs of the nodes along the path.
• Expected cost of using the strategy associated with a CT = the sum of the costs of all paths, weighted by the associated probabilities.
  • This corresponds to the expected number of paging messages that will have to be sent in order to determine whether x1.loc, ..., xm.loc satisfy the query.
• The problem of finding a tree corresponding to the optimal strategy is NP-complete.
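The cost definitions above can be computed with a short recursion. The sketch assumes that each edge carries the probability of its location given the path so far, so that a path's probability is the product of its edge probabilities; the `Node` class and the example tree with its probabilities are hypothetical.

```python
POS, NEG = "POS", "NEG"


class Node:
    """Internal CT node: pages one variable; edges are (probability, child)."""

    def __init__(self, variable, edges):
        self.variable = variable
        self.edges = edges


def node_cost(node):
    """Number of outgoing edges that do not lead straight to a NEG leaf."""
    return sum(1 for _, child in node.edges if child != NEG)


def expected_cost(tree):
    """Sum over all root-to-leaf paths of path probability times path cost."""
    if isinstance(tree, str):  # POS or NEG leaf: nothing left to page
        return 0.0
    return node_cost(tree) + sum(p * expected_cost(c) for p, c in tree.edges)


# Hypothetical tree: page John first; one of his two possible cells rules
# the constraint out immediately, the other requires paging the doctor.
tree = Node("John", [
    (0.5, NEG),
    (0.5, Node("d", [(0.25, POS), (0.75, NEG)])),
])
print(expected_cost(tree))  # 1.5
```

Checking by hand: the three paths have costs 1, 2 and 2 with probabilities 0.5, 0.125 and 0.375, giving 0.5 + 0.25 + 0.75 = 1.5. Comparing this value across the trees of different strategies picks the cheaper one, although searching all possible trees is what makes the general problem hard.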
Example (Cont.)
[Figure: locations L1 to L12 with the regions U = d.LOC.ERR and V = John.LOC.ERR marked, and the classification trees for Strategy 2 and Strategy 3. The tree nodes are labeled John and d, the edges are labeled with locations, and the paths end in POS or NEG terminal nodes.]