Data Mining at Work
Krithi Ramamritham

Dynamics of Web Data
Dynamically created Web pages, generated using scripting languages
[Figure: a page assembled from an ad component, a navigation component, several headline components, and a personalized component]

1. What to deliver?
Page content may be based on
• queries on dynamically changing data (e.g., sports scores, stock prices, environment)
• type of access device
• time and location of access/user
Existing sites may contain new information
New sites (URLs) may come into being

2. How to deliver?
[Figure labels: data sources (sensors, servers); network; proxies/caches; network; end-hosts (wired host, mobile host)]

Keep Data Up-to-date
Example: update the Mumbai temperature whenever it changes by 2 degrees
• The proxy obtains data from the source(s)
• Maintains |U(t) - S(t)| <= 2
[Figure: Source S(t) -> Proxy / DB P(t) -> User U(t)]
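
The coherency requirement |U(t) - S(t)| <= 2 can be illustrated with a minimal sketch. It assumes the proxy sees every source reading (a push-style feed) and forwards a reading to the user only when the user's copy would otherwise drift past the bound. The function name `filter_updates` and the sample readings are hypothetical, not part of the original slides.

```python
from typing import Iterable, List, Optional


def filter_updates(source_values: Iterable[float], bound: float = 2.0) -> List[float]:
    """Return the readings that must be forwarded to the user.

    Assumption: every source reading reaches the proxy. A reading is
    forwarded only when it differs from the last forwarded value by more
    than `bound`, so the user's copy never drifts beyond the bound.
    """
    forwarded: List[float] = []
    last_sent: Optional[float] = None
    for value in source_values:
        if last_sent is None or abs(value - last_sent) > bound:
            last_sent = value          # user's copy is refreshed here
            forwarded.append(value)
    return forwarded


# Hypothetical temperature readings in degrees.
readings = [30.0, 31.0, 32.0, 32.5, 29.9, 28.0, 30.5]
updates = filter_updates(readings, bound=2.0)
```

In this run only three of the seven readings need to travel from proxy to user, which is the point of tolerating a 2-degree error.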

When to poll the source?
• After a specific interval
• Based on temporal data mining (time series analysis) and prediction of when the change will exceed 2 degrees
[Figure: Server, Proxy, User; the proxy pulls from the server]
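
One way to picture prediction-driven polling is the sketch below. It is a deliberately simple stand-in for time series analysis: it fits a least-squares line to recent (time, value) samples and estimates how long the value would take to move by the bound at that rate. The function name `next_poll_delay`, the clamping limits, and the sample data are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def next_poll_delay(
    samples: Sequence[Tuple[float, float]],
    bound: float = 2.0,
    min_delay: float = 1.0,
    max_delay: float = 60.0,
) -> float:
    """Estimate how long to wait before polling the source again.

    `samples` holds (time, value) pairs, oldest first. The delay is in the
    same unit as the timestamps. Assumption: the recent trend is roughly
    linear, so a least-squares slope is a usable rate estimate.
    """
    if len(samples) < 2:
        return min_delay               # not enough history: poll soon
    mean_t = sum(t for t, _ in samples) / len(samples)
    mean_v = sum(v for _, v in samples) / len(samples)
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return min_delay               # all samples share one timestamp
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / denom
    if slope == 0:
        return max_delay               # no observed trend: poll rarely
    predicted = bound / abs(slope)     # time for the value to move by `bound`
    return max(min_delay, min(max_delay, predicted))


# Hypothetical history: temperature rising 0.05 degrees per minute.
history = [(0.0, 30.0), (10.0, 30.5), (20.0, 31.0)]
delay = next_poll_delay(history, bound=2.0)
```

A slowly changing source yields long delays and a fast one yields short delays, so the proxy polls only about as often as the bound requires.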

Where to do the work?
• Diverse client devices
  • Differ in hardware, software, network connectivity, form factor
• Web content needs to be tailored for each client type
• Each response depends
  • not only on the requested URL
  • but also on the capabilities of the client

Transcoding
Conversion of one data version to another
• Decreasing image quality (JPEG quality level) and size, e.g., with the “convert” utility (ImageMagick) in Linux
• Summarizing text
• Transcode => information extraction / retrieval / classification

Who should transcode?
• Download desired version from server
• Transcode higher version locally
Factors influencing decision
• Transcoding complexity
• Proxy-server network connection
• Load on proxy
(Multiple linear) regression: predict based on a (linear) model of overheads
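
The regression-based decision can be sketched as follows. This is an illustration under stated assumptions, not the method used in the talk: the features (image size in KB and proxy load), the training numbers, and the names `fit_linear`, `predict`, and `should_transcode_locally` are all hypothetical. The fit uses ordinary least squares via the normal equations, written in plain Python so the example is self-contained.

```python
from typing import List, Sequence


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares; returns [intercept, w1, ..., wk]."""
    rows = [[1.0, *x] for x in X]              # prepend a constant term
    n = len(rows[0])
    # Augmented normal equations [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def predict(coeffs: Sequence[float], x: Sequence[float]) -> float:
    """Predicted overhead (seconds) for one feature vector."""
    return coeffs[0] + sum(w * v for w, v in zip(coeffs[1:], x))


def should_transcode_locally(coeffs, size_kb, proxy_load, download_seconds):
    """True if predicted local transcoding beats downloading from the server."""
    return predict(coeffs, [size_kb, proxy_load]) < download_seconds


# Hypothetical past observations: (size in KB, proxy load) -> seconds taken.
features = [(100, 0.1), (200, 0.5), (400, 0.2), (800, 0.9), (50, 0.7)]
seconds = [0.1 + 0.002 * s + 0.5 * load for s, load in features]
coeffs = fit_linear(features, seconds)
```

With the fitted coefficients, the proxy compares the predicted transcoding time against the estimated time to fetch the desired version from the server and picks the cheaper option.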

What is new on the Web? How is the monsoon progressing?
Time series analysis: change prediction, pattern mining

‘Bhav Puchiye’
www.broadmoor.com
[Figure: interface for Bhav Puchiye]

Inverted Pyramid Interfaces
[Figure: two orderings of the same sections. First: Conclusion, Discussions, Background & related information, Findings. Second, labeled "Inverted pyramid approach": Findings, Background & related information, Discussions, Conclusion.]

Bhav Puchiye
Pricing module developed
• for selected commodities
• for selected markets
• for selected areas
DEMO

Building Usage Profiles
Estimate access probabilities based on:
• Current user/community navigational patterns over site contents (in the form of click streams)
• Historical user/community access patterns over site contents (in the form of association rules)
Cluster needs based on location, income/age of user, time-of-day
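
Both kinds of evidence can be turned into access probabilities with a few lines of code. The sketch below is one simple reading of the slide: click streams give first-order "next page" probabilities, and historical sessions give the confidence of an association rule A => B. The page names, the session data, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def transition_probabilities(sessions: Sequence[Sequence[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next page | current page) from click streams."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return {
        page: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for page, following in counts.items()
    }


def rule_confidence(sessions: Sequence[Sequence[str]], antecedent: str, consequent: str) -> float:
    """Confidence of the rule antecedent => consequent over whole sessions."""
    with_antecedent = [s for s in sessions if antecedent in s]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for s in with_antecedent if consequent in s)
    return hits / len(with_antecedent)


# Hypothetical click streams, one list of pages per user session.
sessions: List[List[str]] = [
    ["home", "prices", "wheat"],
    ["home", "weather"],
    ["home", "prices", "rice"],
    ["prices", "wheat"],
]
probs = transition_probabilities(sessions)
conf = rule_confidence(sessions, "prices", "wheat")
```

A proxy could use such estimates to decide which pages to prefetch or keep fresh; clustering by location, age, or time of day would amount to computing them separately per group of sessions.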

Data Mining
From data to information to knowledge to money!