Functions of a Web Warehouse
Kai Cheng, Yahiko Kambayashi, Seok Tae Lee (Graduate School of Informatics, Kyoto University, Japan) and Mukesh Mohania (Western Michigan University, USA)
Table of Contents
• Survival from “Information Explosion”
• Warehouse-Mediated Content Delivery
• Community-Oriented Web Warehouses
• Technical Issues
• Warehouse-Enhanced Web Caching
• Related Work
• Concluding Remarks
ICDL 2000
Survival from “Information Explosion”
• Web Traffic Doubles Every 3-6 Months
• Exponential Growth of the Web
• 1 Billion Pages, January 2000
• 2 Billion Pages, June 2000
• 100 Times Increase in the Next 2 Years
Information Overload for Both Networks and Users
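As a rough consistency check on these figures, the arithmetic below is a sketch that assumes steady exponential growth and that the 100-times figure refers to the same quantity as the doubling rate (the slide states neither). With a doubling time of T months, growth over 24 months is:

```latex
\[
\text{growth over 24 months} = 2^{24/T}, \qquad
2^{24/6} = 16, \qquad 2^{24/3} = 256,
\]
\[
2^{24/T} = 100 \;\Longrightarrow\; T = \frac{24}{\log_2 100} \approx 3.6 \text{ months}.
\]
```

Under these assumptions, a 100-times increase in two years falls inside the 3-6 month doubling range. The page counts (1 billion in January 2000, 2 billion in June 2000) correspond to a doubling time of about five months.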
Scaling Up the Web and the Internet
• More Bandwidth: Never Keeps Pace with the Traffic Growth
• More Server Capacity: How to Deal with “Hot-Spots”?
• Site Replication: Only Benefits Replicated Servers?
Our Approach
• Tame the Chaotic Information Streams
• Unite the Individual Users
Saving Redundant Data Transfers
Sharing Each Other's Findings and Efforts
Warehouse-Mediated Content Delivery
• Direct Delivery over the Internet
• QoS: Server and Network Overloaded
• Personalized Services Unrealistic
• Information Hunting Difficult
Indirect Content Delivery
[Diagram: a Web Warehouse connected to the WWW, with Input, Storage and Output components. Labeled functions: Resource Discovery, Buffering, Transformation, Clustering, Filtering, Analysis, Searching, Notification, Navigation]
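The buffering role in indirect delivery can be illustrated with a minimal sketch. The class `WebWarehouse`, its `get` method and the `fetch_from_origin` callback are hypothetical names introduced here for illustration, not taken from the slides; the point is only that requests from many users for the same URL cause a single transfer from the origin server.

```python
from typing import Callable, Dict


class WebWarehouse:
    """Hypothetical shared buffer placed between users and origin servers."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        # fetch_from_origin is an assumed callback that retrieves a page
        # from the origin server (stubbed out in any example usage).
        self._fetch = fetch_from_origin
        self._store: Dict[str, str] = {}
        self.origin_fetches = 0  # number of transfers from origin servers

    def get(self, url: str) -> str:
        """Serve a page from storage, fetching from the origin only on a miss."""
        if url not in self._store:
            self._store[url] = self._fetch(url)
            self.origin_fetches += 1
        return self._store[url]
```

If three users request the same page through one warehouse, `origin_fetches` ends at 1 rather than 3, which is the saving in redundant data transfers that the approach aims for. Expiry and consistency with the origin are deliberately left out of this sketch.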
Community-Oriented Web Warehousing
The Community of Users
• People with Special Information Needs/Interests
• Sharing
• Contribution
Examples of User Communities
• Businessmen
• Sports Fans
• Researchers
• Patients
Real/Cyber Communities
(a) Real Communities: Dependent on Location
(b) Cyber Communities: Independent of Location
Technical Issues
• Functions of a Web Warehouse
• Web Caching vs. Web Warehousing
• Data Warehousing vs. Web Warehousing
• Dynamic Hierarchical Web Warehouses
Functions of a Web Warehouse
• Buffering
• Transformation: Transcoding, Summarizing
• Content Analysis
• Notification
[Diagram labels: Resource Discovery, Storage, Reusing; Transform: Format A to Format B, Content A to Content B; Analysis: Data/Information to Knowledge]
Research Program
[Diagram labels: Warehousing, Transformation, Content Analysis, Web Caching]
From Web Caching to Web Warehousing
From Data Warehousing to Web Warehousing
Warehouse as Shared Information Repository
• Real Communities: Centralized Management of Warehouses; Unicast Data Transfer
• Cyber Communities: Distributed Management of Warehouses; Multicast Data Transfer
Hierarchy of Web Warehouses
[Diagram: topic warehouses (Sports, HP Design), sub-topic warehouses (Tennis, Skiing), and member lists (Mr. A, Ms. C, Mrs. D, …; Mr. A, Mr. D, …)]
Dynamic Formation of Web Warehouses (Split)
[Diagram: the Sports warehouse, with users A and B, splits into separate Tennis and Skiing warehouses]
Dynamic Formation of Web Warehouses (Union)
[Diagram: the Painting and Drawing warehouses, with users A and B, unite into a single Painting & Drawing warehouse]
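The split and union operations can be sketched as follows. This is a minimal illustration, not the authors' mechanism: the `Warehouse` class, the `split` and `union` functions, and the assignment of user A to Tennis and user B to Skiing are all assumptions made for the example, since the slides do not say which user joins which warehouse or what triggers a split.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Warehouse:
    topic: str
    # Maps a member name to the sub-topic that member is interested in
    # (a hypothetical representation chosen for this sketch).
    members: Dict[str, str] = field(default_factory=dict)


def split(warehouse: Warehouse) -> List[Warehouse]:
    """Split a warehouse into one warehouse per member sub-topic."""
    groups: Dict[str, Dict[str, str]] = {}
    for member, subtopic in warehouse.members.items():
        groups.setdefault(subtopic, {})[member] = subtopic
    # Sort by topic so the result order is deterministic.
    return [Warehouse(topic, group) for topic, group in sorted(groups.items())]


def union(a: Warehouse, b: Warehouse) -> Warehouse:
    """Merge two warehouses into one that serves both member sets."""
    return Warehouse(f"{a.topic} & {b.topic}", {**a.members, **b.members})
```

For example, `split(Warehouse("Sports", {"A": "Tennis", "B": "Skiing"}))` yields a Skiing warehouse for B and a Tennis warehouse for A, while `union` of a Painting warehouse and a Drawing warehouse yields one named "Painting & Drawing" holding both users.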
Current Status: Content-Sensitive Caching
[Diagram labels: Warehousing, Web Caching, Content-Sensitive Caching]
Content-Sensitive Caching
LRU-SP+: Content-Sensitive Cache Replacement Policy
• Cache Replacement: Keep? Replace?
• Traditional Caching: Long-Time Observation before a Replacement Decision
• 60% One-Access Objects: How to Differentiate?
LRU-SP+: Content-Sensitive, Size-Adjusted & Popularity-Aware LRU
• Daily Indexing: Cache Content to Indices
• Indices to Popular Topics
• How Similar? New Document vs. Popular Topics
• Benefit/Size Model: “Observed” Popularity + “Inherent” Popularity
• Implement This Model
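A benefit/size score of this kind might look like the sketch below. It is not the published LRU-SP+ formula, which the slide does not give: the cosine-similarity measure, the `alpha` weight, the `CachedDoc` class and all function names are assumptions for illustration, and the recency (LRU) component is omitted. "Observed" popularity is taken to be the hit count, and "inherent" popularity the best similarity between a document's terms and the popular topics derived from the cache index.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CachedDoc:
    hits: int          # "observed" popularity: accesses seen so far
    terms: Counter     # term frequencies from the (assumed) daily index
    size_bytes: int


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def inherent_popularity(terms: Counter, popular_topics: List[Counter]) -> float:
    """Best similarity between a document and any popular topic."""
    return max((cosine(terms, topic) for topic in popular_topics), default=0.0)


def benefit_per_size(doc: CachedDoc, popular_topics: List[Counter],
                     alpha: float = 1.0) -> float:
    """(observed + alpha * inherent) popularity divided by object size."""
    inherent = inherent_popularity(doc.terms, popular_topics)
    return (doc.hits + alpha * inherent) / doc.size_bytes


def choose_victim(cache: Dict[str, CachedDoc],
                  popular_topics: List[Counter]) -> str:
    """Return the URL with the lowest benefit/size score (evicted first)."""
    return min(cache, key=lambda url: benefit_per_size(cache[url], popular_topics))
```

The inherent term lets two objects with the same single access be told apart: one whose content matches a popular topic scores higher and is kept, which is one way to address the one-access-object problem without a long observation period.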
Related Work
• LSAM’s Proxy Cache (Push)
• Multicast-Based Virtual Cache
• Affinity Groups and Push Channels
• INTELSAT’s Wormhole Content Delivery
• Warehouse-Kiosk Model
• Satellite-Based Delivery Platform
Concluding Remarks
• Proposed Web Warehouse-Mediated Content Delivery to Cope with the Scaling Problems
• Discussed the Basic Functions of a Web Warehouse: Buffering, Transformation, Notification and Content Analysis
• Introduced Our Current Work: Warehouse-Enhanced Web Caching