10 likes | 184 Views
Client. Results. DB. Query Execution Engine. . . . Query. Catalog. Catalog. Catalog. Catalog. Parser. DB. Query Execution Engine. Query Rewrite. DB. Query Execution Engine. Query Optimizer. Plan Generator.
E N D
Client Results DB Query Execution Engine . . . Query Catalog Catalog Catalog Catalog Parser DB Query Execution Engine Query Rewrite DB Query Execution Engine Query Optimizer Plan Generator Ranking Data Sources and Query Processing Sites using Ant Colony Theory Eliana Valenzuela-Andrade Ph. D. Student in Computing and Information Sciences and Engineering Program (CISE) Dr. Manuel Rodríguez-Martínez Advanced Data Management Group ABSTRACT This poster presents AntFinder, a novel approach to the problem of discovering and ranking the characteristics of the data sources and query processing sites in a wide-area mobile distributed database system. We model the system as a graph with nodes representing data sources and query processing sites, some of which might be replicated. We introduce a heuristic technique inspired in Ant Colony Theory to dynamically discover, rank and catalog each data source or query-processing site. Our goal is to find possible paths to access the computational resources or data provided by the highest quality sites. We define this concept of quality in terms of performance, and freshness. We present a simulation results of the system using Java CSIM and also a preliminary performance study designed to analyze the quality of paths found by the Ant Colony algorithm. • The problem of finding these paths is a combinatorial optimization problem. • Ant Algorithms were inspired by the observation of real ant colonies. • Ants are social insects. • One of the more important behaviour is them capacity to find shortest paths between food sources and their nest . • Ants use indirect communication. Figure 3. Data Structures for Ant Path Location • PERFORMANCE EVALUATION • Feasibility • Can our approach find optimal o near optimal solutions to the problem of finding a shortest path (vi, vj )? • We implement Java CSIM program to simulate the a NetTraveler system. MOTIVATION Data Integration Scenario, where we want find which is the best data repository or processing site to use. SYSTEM ARCHITECTURE We choose NETTRAVELER as a database middleware system to use as the use-case environment. The following figure depict the main components of our system. • Efficiency • How efficiently is ACO when is compared with the alternative of dynamically finding the paths at run time. • Can our approach find optimal o near optimal solutions to the problem of finding a shortest path (vi, vj )? • We implement Java CSIM program to simulate the a NetTraveler system. Query Processing Cycle. • QUALITY METRIC • We construct a framework to estimate the cost in the path over the graph . • The cost might represent response time, resource source usage, last update time, or other metric. We initially present the following metrics: • Performance • Characterize the quality of a site • Based on the computational capacity • It is also called Quality of Service (QoS) • Provide a notion of speed or efficiency- • Can be measured base on: • Local resources: CPU time, disk bandwidth, waiting time for attention • Network resources: data transfer cost • We explore different models, and we implement the more complex, exponential weighting moving average. • Freshness • May be defined in terms of the “update state” of the data • Often called Quality of Data (QoD) • We modeled it in a binary or a continuous form • PROBLEM • Catalog is centralized and static • But system is dynamic • Catalog changes continuously • Needs to be • manage in de-centralized fashion • changed dynamically as updates arrive • up-to-date with as many updates as possible. • The ACO average cost is almost the same as the cost of the paths found with Dijkstra´s. • ACO overhead for the path lookup is an order of magnitude smaller. • CONCLUSION • The ranking of data sources and query processing sites allow the system to improve it performance, including: • Quality of results. • Query Optimization process. • Use of alternative techniques as ACO in path solution searching. • We will add another features to NetTraveler System. SEARCHING AND RANKING SITES The Ants visit the nodes and collect data information and used it to rank the sites based on the selected metrics. Initially, ants does the travel forward saving information about the situation, searching for a good path. After that, the ants does travel backward, placing the update in the pheromone trail (like real ants) and statistics. We illustrate the process in the following figure. OVERVIEW OF OUR PROPOSE SOLUTION Our wide-area mobile distributed database system can be represent by a graph G. Let G(V,E) be a directed graph where V is a collection of sites and E is a collection of edges that represent connectivity between sites vi, vjbelong to V. Nodes in V can be classified as data providers, query processing providers or both. Predominant Pheromone trial • REFERENCES • M. Dorigo and T. Stutzle. Ant Colony Optimization. The MIT Press, 2004. • D. Kossmann, “The state of the art in distributed query processing,” ACM Computing Surveys, vol. 32, no. 4, pp. 422–469, Dec. 2000. • M. Rodriguez-Martinez and N. Roussopoulos, “Mocha: A self-extensible database middleware system for distributed data sources,” in Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16-18, 2000, Dallas, Texas, USA, pp. 213–224. • E. A. Vargas-Figueróa and M. Rodríguez-Martínez, “Design and implementation of the NetTraveler middleware system based on web services,” in AICT-ICIW ’06: Proceedings of the Advanced Int’l Conference on Telecommunications and Int’l Conference on Internet and Web Applications and Services. Washington, DC, USA: IEEE Computer Society, 2006, p. 138. NSF CAREER IIS-0448184 NetTraveler – A Database Middleware System for Ubiquitous Data Access on Wide-Area Networks