
Making DADS distributed: a Nordunet2 project

Jochen Hollmann, Chalmers University of Technology <joho@ce.chalmers.se>. Project aims: principles for the design of distributed systems devoted to Digital Libraries (DL).


Presentation Transcript


  1. Making DADS distributed: a Nordunet2 project
     Jochen Hollmann, Chalmers University of Technology <joho@ce.chalmers.se>

  2. Project Aims
     Principles for the design of distributed systems devoted to Digital Libraries (DL)
     Project results will contribute:
     • Tradeoffs for the design of future DL infrastructures
     • Knowledge of how users interact with DLs
     • Algorithms for data replication and prefetching
     • Detailed experience from actual implementations (DADS)
     2000

  3. Agenda
     • Comparison: centralized and distributed approach
     • General techniques for speedup
     • System properties and opportunities for improvement
     • How to distribute?
     • Project plan

  4. Centralized Approach
     Potential Advantages:
     • Low complexity
     • Low total ownership costs
     • Simple administration
     Potential Disadvantages:
     • Single point of failure
     • Latency/Overload
     • Availability
     • Does not scale
     • No parallel activities

  5. Total Replication to all Clients
     Potential Advantages:
     • High availability
     • Minimal latency
     • Data retrieval in parallel
     Potential Disadvantages:
     • Expensive
     • Bandwidth used to distribute
     • Difficult to allow updates everywhere
     • Does not scale

  6. General speedup techniques
     • Prefetching: metadata or heuristics make it possible to request a local copy ahead of time
     • Caching: keep a retrieved copy for future use (and avoid re-transferring it)
     • Replication: select data and distribute copies without a request
     [Figure: three timelines (t) marking Start Prefetching, Point of Replication, Search result available, Request 1, Request 2]
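The caching and prefetching ideas on this slide can be illustrated with a minimal sketch. It is not taken from DADS: the class `ArticleCache`, the function `remote_fetch`, the article ids, and the choice of least-recently-used (LRU) eviction are all assumptions made for illustration.

```python
from collections import OrderedDict


class ArticleCache:
    """Hypothetical LRU cache that can also prefetch articles ahead of a request."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch          # assumed callable: article_id -> content
        self.store = OrderedDict()  # article_id -> content, least recently used first
        self.hits = 0
        self.misses = 0

    def _put(self, article_id, content):
        self.store[article_id] = content
        self.store.move_to_end(article_id)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry

    def get(self, article_id):
        """Caching: serve a local copy if present, otherwise fetch and keep it."""
        if article_id in self.store:
            self.hits += 1
            self.store.move_to_end(article_id)
            return self.store[article_id]
        self.misses += 1
        content = self.fetch(article_id)
        self._put(article_id, content)
        return content

    def prefetch(self, article_ids):
        """Prefetching: pull copies before the user asks for them."""
        for article_id in article_ids:
            if article_id not in self.store:
                self._put(article_id, self.fetch(article_id))


def remote_fetch(article_id):
    # Stand-in for a slow transfer from the central server.
    return f"content of {article_id}"


cache = ArticleCache(capacity=3, fetch=remote_fetch)
cache.prefetch(["a1", "a2"])  # e.g. the top entries of a search result
cache.get("a1")               # hit, thanks to prefetching
cache.get("a3")               # miss, fetched and then cached
cache.get("a3")               # hit, thanks to caching
print(cache.hits, cache.misses)
```

Replication would differ from both methods in that copies are pushed to a site without any request from it; the sketch leaves that out.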

  7. Properties of Articles and the System
     Articles
     • Contain references to related work selected by the author
     • Are catalogued by experts
     • Published articles went through an acceptance process
     • High-quality data
     A Search
     • Reduces the number of articles to a small number
     • Presents the results before retrieving the article
     • May contain patterns to hint replication

  8. General speedup techniques
     [Figure: the timeline from slide 6 (Start Prefetching, Point of Replication, Search result available, Request 1, Request 2), annotated with the labels: manual feedback, selection from the list, get related articles, search in the index, fetch a paper]

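  9. How to distribute?
     [Figure: distribution hierarchy; the pairing between labels is not recoverable from the transcript]
     • Organizational levels: Researcher, Research Group, Department, University, Global Library
     • Scope of content: In-depth knowledge, Research area, Journals, Field, Everything
     • Techniques: Prefetching, Caching on an article basis, Caching on a journal basis, Replication of most used journals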
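One way to picture such a hierarchy is a lookup that tries the nearest level first and falls back to the next one on a miss. The sketch below is only an illustration: the ordering of the levels, the `holdings` data, and the `lookup` function are assumptions, since the transcript does not preserve how the figure connects its labels.

```python
# Levels ordered from closest to the user to farthest away (assumed order).
LEVELS = ["researcher", "research group", "department", "university", "global library"]


def lookup(article_id, holdings):
    """Return (level, hops) for the nearest level that holds the article."""
    for hops, level in enumerate(LEVELS):
        if article_id in holdings.get(level, set()):
            return level, hops
    return None, len(LEVELS)  # not held anywhere in the hierarchy


# Made-up holdings: which article ids are stored at which level.
holdings = {
    "researcher": {"p1"},
    "department": {"p1", "p2"},
    "global library": {"p1", "p2", "p3"},
}

print(lookup("p1", holdings))
print(lookup("p2", holdings))
print(lookup("p3", holdings))
```

The hop count is a crude stand-in for latency: the further up the hierarchy a request has to travel, the longer the user waits.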

  10. Project Plan

  11. Project Plan
      Phase I (Aug 2000 - Apr 2001)
      • Analysis of the current centralized system and construction of a simulation model (using data from DADS)
      Phase II (Apr 2001 - Dec 2001)
      • Design and evaluation of a distributed version and the contained algorithms
      Phase III (Dec 2001 - July 2002)
      • Evaluation and fine-tuning of the algorithms in DADS

  12. Phase I: Analysis of the current system
      Live System
      • Analyze the log files
      • Find locality
      • Find bottlenecks in the current system
      • Hints about what should be logged
      • Hints about what can be replicated
      • Where do latencies occur?
      Understand the system properties
      Simulation
      • Build a trace-driven simulation model
      • Test whether the bottlenecks can be reproduced
      • Measure the simulation with current and future technology parameters (network technology, storage costs)
      • Which problems will remain?
      Goals
      • Develop a benchmark!
      • Develop a metric to quantify the costs.
      • Both systems should behave identically
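A trace-driven simulation replays recorded requests against a model of the system instead of generating synthetic load. As a minimal sketch, the code below replays a made-up trace against an LRU cache of varying size and reports the hit rate, which is one simple way to quantify locality. The trace, the function `simulate_lru`, and the LRU policy are assumptions; the actual DADS log format and simulation model are not described in the slides.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity):
    """Replay a request trace against an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for article_id in trace:
        if article_id in cache:
            hits += 1
            cache.move_to_end(article_id)  # mark as most recently used
        else:
            cache[article_id] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(trace) if trace else 0.0


# Made-up trace; a real one would be parsed from the server log files.
trace = ["a", "b", "a", "c", "a", "b", "d", "a"]
for capacity in (1, 2, 4):
    print(capacity, simulate_lru(trace, capacity))
```

Running the same trace with different parameters (here only the cache size) mirrors the slide's idea of measuring the model under current and future technology assumptions.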
