Database Issues in Smart Homes. Pervasive Intelligent Environments Spring 2004. Topics: Lecture 1. What’s being done What do you need it for? Issues Where’s the data come from? Data sources DB Communication How do we store the data? Storing LOTS of data: Data warehouses
Database Issues in Smart Homes Pervasive Intelligent Environments Spring 2004 CRESCENT
Topics: Lecture 1
• What’s being done
• What do you need it for?
• Issues
• Where’s the data come from? Data sources
• DB Communication
• How do we store the data?
• Storing LOTS of data: Data warehouses
• Now we’ve got it, what do we do with it? Looking ahead
• Next time: examples, more troubles…
DB in Smart Environments
UTA MavHome DB
• Active
  • Reactive & proactive (e.g., to predict)
• Distributed
  • Information collection agents
• Rules
  • Local Agent: what data they need to collect
  • Distributed: coordinate overall monitoring of collected information
• Continuous monitoring of events
• Extension of SNOOP
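The "active" behaviour listed above (rules that react to monitored events) can be illustrated with a toy event-condition-action store. This is only a sketch and is not MavHome or SNOOP code: the `Rule` and `ActiveStore` classes, the event dictionary layout, and the stove rule are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical event layout: {"device": ..., "state": ...}
Event = Dict[str, str]


@dataclass
class Rule:
    """One event-condition-action rule."""
    name: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], str]


class ActiveStore:
    """Stores every event and fires each rule whose condition matches it."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self.rules: List[Rule] = []
        self.fired: List[str] = []

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def insert(self, event: Event) -> None:
        # Event: a new row arrives. Condition: checked per rule.
        # Action: its result is recorded in self.fired.
        self.events.append(event)
        for rule in self.rules:
            if rule.condition(event):
                self.fired.append(rule.action(event))


store = ActiveStore()
store.add_rule(Rule(
    name="stove_on_alert",
    condition=lambda e: e["device"] == "Stove" and e["state"] == "ON",
    action=lambda e: "notify: stove switched on",
))
store.insert({"device": "Stereo", "state": "ON"})
store.insert({"device": "Stove", "state": "ON"})
print(store.fired)
```

Both events are stored, but only the stove event triggers the rule. A real active database would evaluate such rules inside the DBMS rather than in application code.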
Microsoft Easy Living DB (2002)
• Relational
  • Fast & robust, but awkward for some data
• World Model DB describes:
  • Computing devices
  • People and their personal preferences/settings
  • Services
  • Rooms and doorways
• Serves as an abstraction layer between sensors and the applications that use data from sensors
  • e.g., adding new sensors requires no change to applications
Stanford Interactive Workspace
• Uses LORE
  • A semi-structured XML DB system
  • Still available, but work stopped in 2000
• Data stored is a catalog of (index to):
  • documents, images, 3-D models, application-specific domain models
What do you need it for?
• Kitchen
• Entertainment
• General (many uses)
  • When does Molly usually come home?
  • Where is Rigel now?
  • What’s the rain forecast?
Issues
• Data source
  • Local (sensors, input devices)
  • Outside (weather forecast)
• Data quality
• Data volume
• Data lifetime
  • Do you save images once the info is extracted (e.g., Ian walked in the front door at 2:13pm)?
• Data representation
  • Relational is awkward
Data input
• LOTS AND LOTS OF DATA
  • Required for good prediction, decision making
• Inputs from
  • Sensors
  • Bar code / RF readers
  • Voice
  • PC keyboard
• Sensors
  • Recording media choices
Sensor Databases
• UTA IT Lab and Diane Cook
  • sensor-generated data collection, management, analysis, triggering
  • continuous queries, stream query processing
• Sharma Chakravarthy’s work
  • Active databases
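A continuous query differs from an ordinary query in that it is re-evaluated as each new reading arrives. The sketch below is a simplified stand-in for a stream query processor, not code from any of the systems named above. It reports, for every arriving event, how many events fall inside a trailing time window. The function name and the use of plain numeric timestamps (seconds, in arrival order) are assumptions.

```python
from collections import deque
from typing import Iterable, Iterator


def windowed_counts(timestamps: Iterable[float], window_seconds: float) -> Iterator[int]:
    """For each arriving event, yield how many events lie in (t - window, t].

    Assumes timestamps arrive in non-decreasing order.
    """
    window: deque = deque()
    for t in timestamps:
        window.append(t)
        # Evict events that have slid out of the trailing window.
        while window[0] <= t - window_seconds:
            window.popleft()
        yield len(window)


# Five events; the first three are close together, the last two come later.
counts = list(windowed_counts([0, 10, 20, 70, 75], window_seconds=60))
print(counts)
```

Because only the events inside the window are kept in memory, the query can run indefinitely over an unbounded stream.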
Real Sensor Data Input
• 9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON
• 9/8/2002 1:6:59 AM~A9 (A/C) ON
• 9/8/2002 3:58:52 AM~A0 (Stereo) ON
• 9/8/2002 5:57:0 AM~A2 (Kitchen Light) ON
• 9/8/2002 3:1:42 AM~A5 (Coffee Maker) OFF
• 9/8/2002 7:8:3 AM~A3 (Stove) ON
• 9/8/2002 12:54:52 PM~A10 (Bathroom Light) ON
• 9/8/2002 4:58:5 AM~A0 (Stereo) OFF
• 9/8/2002 8:1:20 AM~A3 (Stove) OFF
• 9/8/2002 9:6:10 AM~A8 (Computer) ON
• 9/8/2002 10:8:19 AM~A4 (Bathtub Heater) ON
• 9/8/2002 11:9:4 AM~A0 (Stereo) ON
• 9/8/2002 9:4:5 AM~A8 (Computer) OFF
• 9/8/2002 10:9:4 AM~A4 (Bathtub Heater) OFF
• 9/8/2002 2:2:5 PM~A10 (Bathroom Light) OFF
• 9/8/2002 2:52:37 PM~A0 (Stereo) OFF
• 9/8/2002 4:2:0 PM~A9 (A/C) OFF
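Lines in the format shown above have to be parsed before they can be stored or queried. The following is a minimal sketch, assuming every line follows the pattern `date time AM/PM~ID (Name) ON|OFF`. The `SensorEvent` type and `parse_event` function are illustrative names, not part of any system described in these slides. Because the listing is not in chronological order, the example also sorts the parsed events by timestamp.

```python
import re
from datetime import datetime
from typing import NamedTuple


class SensorEvent(NamedTuple):
    timestamp: datetime
    device_id: str
    device_name: str
    state: str


# timestamp, "~", device id, "(name)", then ON or OFF
LINE_RE = re.compile(
    r"^(?P<ts>.+?)~(?P<id>\w+) \((?P<name>[^)]*)\) (?P<state>ON|OFF)$"
)


def parse_event(line: str) -> SensorEvent:
    """Parse one log line; raise ValueError if it does not match the pattern."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized sensor line: {line!r}")
    # strptime accepts the non-zero-padded fields used in the log (e.g. 2:0:1).
    ts = datetime.strptime(m.group("ts"), "%m/%d/%Y %I:%M:%S %p")
    return SensorEvent(ts, m.group("id"), m.group("name"), m.group("state"))


lines = [
    "9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON",
    "9/8/2002 1:6:59 AM~A9 (A/C) ON",
    "9/8/2002 12:54:52 PM~A10 (Bathroom Light) ON",
]
# NamedTuples compare field by field, so this sorts by timestamp first.
events = sorted(parse_event(line) for line in lines)
for event in events:
    print(event.timestamp.isoformat(), event.device_id, event.device_name, event.state)
```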
Simulated Sensor Input
11/15/2001 7:3:53 AM (BedRoom Alarm) A9 ON
11/15/2001 7:4:2 AM (Bath Shower) A11 ON
11/15/2001 7:4:8 AM (Bath BathDisplay) A10 ON
11/15/2001 7:4:8 AM (Bath L4) A4 ON
11/15/2001 7:4:45 AM (Kitchen CoffeePot) A8 ON
11/15/2001 7:4:47 AM (Kitchen KitchenDisplay) A12 OFF
11/15/2001 7:4:55 AM (Kitchen KitchenDisplay) A12 ON
11/15/2001 7:4:47 AM (LivingRoom Thermostat) A16 ON
11/15/2001 7:4:49 AM (Kitchen L3) A3 ON
11/15/2001 7:4:50 AM (Garage/Patio Locks) A17 OFF
11/15/2001 9:29:59 AM (Yard Sprinklers) A14 ON
11/15/2001 9:29:59 AM (LivingRoom JanitorRobot) A13 ON
11/15/2001 6:59:53 PM (Garage/Patio Locks) A17 ON
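Once parsed, rows like these fit naturally into a relational table keyed by time, room, and device. The sketch below loads a few of the simulated lines into an in-memory SQLite database and counts ON events per room. The table name `events`, its columns, and the assumption that the first word inside the parentheses is the room are illustrative choices, not something the slides specify.

```python
import re
import sqlite3
from datetime import datetime

# timestamp, "(Room Device)", device id, then ON or OFF
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+ [AP]M) \((?P<room>\S+) (?P<device>[^)]+)\) "
    r"(?P<id>\w+) (?P<state>ON|OFF)$"
)


def to_row(line: str) -> tuple:
    """Convert one simulated log line into a tuple ready for insertion."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    ts = datetime.strptime(m["ts"], "%m/%d/%Y %I:%M:%S %p")
    # ISO text timestamps sort correctly as strings in SQLite.
    return (ts.isoformat(sep=" "), m["room"], m["device"], m["id"], m["state"])


lines = [
    "11/15/2001 7:3:53 AM (BedRoom Alarm) A9 ON",
    "11/15/2001 7:4:2 AM (Bath Shower) A11 ON",
    "11/15/2001 7:4:8 AM (Bath BathDisplay) A10 ON",
    "11/15/2001 7:4:45 AM (Kitchen CoffeePot) A8 ON",
    "11/15/2001 7:4:47 AM (Kitchen KitchenDisplay) A12 OFF",
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (ts TEXT, room TEXT, device TEXT, device_id TEXT, state TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", [to_row(l) for l in lines])

on_per_room = conn.execute(
    "SELECT room, COUNT(*) FROM events WHERE state = 'ON' GROUP BY room ORDER BY room"
).fetchall()
print(on_per_room)
```

The same table could answer questions such as "which device was switched on first this morning?" with an `ORDER BY ts` query.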
Media Viewing Data
What data to collect?
• Digital Silhouettes (Predictive Networks)
  • Predicting web surfing behavior ($$$)
• Microsoft (2002): track TV viewing preferences
  • 140 data items for each user
  • Demographics (50)
    • Subcategories within gender, age, income, education, occupation, and race
  • Content preferences (90)
    • golf, music, yoga
Communication with the DB
• Agent communication languages
  • KQML
  • FIPA
• XML
• SOAP
• UPnP (upnp.org)
• For more information, see slides 11-26 of
  • personal.tcu.edu/~lburnell/SE/SmartHomeAgents.zip
KQML Examples
• Turn the TV on to channel 5
  • (sendCommandToDevice :deviceName TV :type ask :command (alterSettings :isOn 1 :channel 5))
• Can embed into an event
  • (event :year 2001 :month October :dayOfMonth 15 :hour 15 :minute 45 :command (sendCommandToDevice :deviceName TV :type ask :command (alterSettings :isOn 1 :channel 5)))
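Messages like the ones above are parenthesized expressions with `:keyword value` pairs, so a small recursive parser is enough to read them. The sketch below is not a full KQML implementation. It assumes well-formed input and assumes the parameters follow the usual `:keyword value` pattern, with the device name written as `TV` followed by `:type ask`. The helper names `tokenize`, `parse`, and `params` are hypothetical.

```python
import re
from typing import List, Union

Expr = Union[str, List["Expr"]]


def tokenize(text: str) -> List[str]:
    """Split a message into parentheses and bare words."""
    return re.findall(r"[()]|[^\s()]+", text)


def parse(tokens: List[str]) -> Expr:
    """Parse one expression into nested lists, consuming tokens from the front."""
    token = tokens.pop(0)
    if token != "(":
        return token
    expr: List[Expr] = []
    while tokens[0] != ")":
        expr.append(parse(tokens))
    tokens.pop(0)  # drop the closing ")"
    return expr


def params(expr: List[Expr]) -> dict:
    """Return the :keyword value pairs that follow the expression's head."""
    items = expr[1:]
    return {key[1:]: value for key, value in zip(items[0::2], items[1::2])}


message = (
    "(sendCommandToDevice :deviceName TV :type ask "
    ":command (alterSettings :isOn 1 :channel 5))"
)
tree = parse(tokenize(message))
outer = params(tree)
settings = params(outer["command"])
print(tree[0], outer["deviceName"], settings)
```

Values are left as strings here. A receiving agent would convert `isOn` and `channel` to the types the device expects.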
Data Warehouses
• An organization-wide snapshot of data, typically used for decision-making
• Evolved via consultants, RDBMS vendors, and startup companies
  • All had something to prove; to "differentiate their product"
  • Researchers are making progress cleaning up the BIG mess they created
• A DBMS that runs decision-making queries efficiently is sometimes called a "Decision Support System" (DSS)
  • OLAP (on-line analytical processing) is one class of DSS queries
  • DSS systems and warehouses are typically separate from the on-line transaction processing (OLTP) system
• Data Mart
  • A mini-warehouse: typically a DSS for one aspect or branch of a company, with lots of relatively homogeneous data (i.e., a straight DSS)
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
Warehouse/DSS properties
• Very large: 100 gigabytes to many terabytes
• Tends to include historical data
• Workload: mostly complex queries that access lots of data and do many scans, joins, and aggregations. They tend to look for "the big picture".
• Updates are pumped to the warehouse in batches (overnight)
• Data may be heavily summarized and/or consolidated in advance (this must be done in batches too, and must finish overnight)
• Research work has been done (e.g., "materialized views"), but it covers only a small piece of the problem
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
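The idea of summarizing data in advance and then answering "big picture" queries from the summary can be shown on a very small scale. In this sketch, SQLite stands in for a warehouse. The `device_events` and `daily_on_counts` tables and the sample rows are made up for illustration, and a real system would use materialized views or a dedicated OLAP engine instead of a hand-built summary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device_events (day TEXT, device TEXT, state TEXT)")
# Illustrative rows only; not taken from a real log.
conn.executemany("INSERT INTO device_events VALUES (?, ?, ?)", [
    ("2002-09-08", "Stereo", "ON"),
    ("2002-09-08", "Stereo", "ON"),
    ("2002-09-08", "Stove", "ON"),
    ("2002-09-09", "Stereo", "ON"),
    ("2002-09-09", "Stove", "OFF"),
])

# Batch step (e.g. run overnight): precompute a per-day, per-device summary.
conn.execute("""
    CREATE TABLE daily_on_counts AS
    SELECT day, device, COUNT(*) AS on_count
    FROM device_events
    WHERE state = 'ON'
    GROUP BY day, device
""")

# Decision-support query: answered from the small summary, not the raw events.
totals = conn.execute("""
    SELECT device, SUM(on_count)
    FROM daily_on_counts
    GROUP BY device
    ORDER BY device
""").fetchall()
print(totals)
```

The trade-off is the usual one: the summary is cheap to query but is only as fresh as the last batch run.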
Data Warehouses
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
Data Warehouses
• Data Cleaning
  • Data Migration: simple transformation rules (replace "gender" with "sex")
  • Data Scrubbing: use domain-specific knowledge (e.g., zip codes) to modify data. Try parsing and fuzzy matching from multiple sources.
  • Data Auditing: discover rules and relationships (or signal violations thereof). Not unlike data mining.
• Data Loading
  • Can take a very long time! (Sorting, indexing, summarization, integrity constraint checking, etc.) Parallelism is a must.
  • Full load: like one big transaction (xact); the change from old data to new is atomic.
  • Incremental loading ("refresh") makes sense for big warehouses, but the transaction model is more complex: the load has to be broken into lots of transactions that are committed periodically to avoid locking everything. Care is needed to keep metadata & indices consistent along the way.
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
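Incremental loading, as described above, means splitting one large load into many short transactions. The following sketch shows the pattern with SQLite. The `readings` table, the batch size, and the helper names are assumptions, and the sketch ignores the harder part the slide mentions: keeping metadata and indices consistent between commits.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, float]


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def incremental_load(conn: sqlite3.Connection, rows: Iterable[Row], batch_size: int) -> int:
    """Insert rows in separate transactions; return the number of commits."""
    commits = 0
    for chunk in batches(rows, batch_size):
        # "with conn" commits on success and rolls back this batch on error,
        # so locks are held only for one batch at a time.
        with conn:
            conn.executemany("INSERT INTO readings VALUES (?, ?)", chunk)
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
rows = [(f"S{i % 3}", float(i)) for i in range(10)]
commits = incremental_load(conn, rows, batch_size=4)
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(commits, count)
```

A full load would instead wrap all inserts in a single `with conn:` block, giving the atomic old-to-new switch at the cost of holding locks for the whole load.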
Looking Ahead
• Using the data we have
  • Prediction
  • Decision making
  • Problem Solving
• Getting better over time…
  • Reinforcement learning
  • Updating
    • Bayesian networks
    • Neural networks
    • Rules and cases
Looking Ahead: Data Mining & Prediction
• Find patterns
  • Verify user supplied patterns
  • Generate patterns
• Sequences: HARD!
• Noise
• Missing data
Decision Making: Bayes Nets
• What assumptions and methods allow us to turn observations into causal knowledge, and how can even incomplete causal knowledge be used in planning and prediction to influence and control our environment? *
• One solution: Bayesian nets
  • a.k.a. Bayes nets, Bayesian networks, belief networks
• *From “Causation, Prediction, and Search, 2nd Edition”, Spirtes, Glymour & Scheines
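The smallest possible Bayes net has two nodes, and inference on it is just Bayes' rule. The sketch below assumes a hypothetical network `Home -> Motion` (someone being home influences whether a motion sensor fires). The probabilities are invented for illustration and are not measurements from any smart home.

```python
def posterior_home(p_home: float,
                   p_motion_given_home: float,
                   p_motion_given_away: float) -> float:
    """Return P(home | motion) for the two-node network Home -> Motion."""
    # Marginal probability of the evidence, summing over both states of Home.
    p_motion = (p_motion_given_home * p_home
                + p_motion_given_away * (1.0 - p_home))
    # Bayes' rule: P(home | motion) = P(motion | home) * P(home) / P(motion)
    return p_motion_given_home * p_home / p_motion


# Assumed numbers: prior P(home) = 0.3, sensor fires 90% of the time when
# someone is home and 5% of the time when nobody is.
posterior = posterior_home(0.3, 0.9, 0.05)
print(round(posterior, 3))
```

With these numbers, a single motion reading raises the belief that someone is home from 0.3 to roughly 0.89. Larger networks chain many such conditional tables together, which is where storing them in a database becomes relevant.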
Problem Solving
• Rule-based systems
• Case-based reasoning
• Neural networks
• Influence diagrams
Looking Ahead: Reinforcement Learning
• "RL is learning what to do --- how to map situations to actions --- so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most machine learning, but instead must discover which actions yield the most reward by trying them." from Reinforcement Learning: An Introduction.
• MDP & semi-MDP: assumptions about how the world can be described, including that you don’t have to remember the past (the Markov property).
• Agents in a state can choose actions to take in an environment.
• The choice (decision) is rewarded or punished
• The agent learns to make better choices
• The model can be stored in a database. There may be many states/actions/probabilities to store.
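To make the last bullet concrete, the sketch below keeps a table of action values in SQLite and updates it with the standard tabular Q-learning rule, Q(s,a) ← Q(s,a) + α (r + γ max Q(s',a') − Q(s,a)). Q-learning is used here as one common RL method, not as the method of any system in these slides. The state and action names, the learning rate, and the discount factor are all assumptions.

```python
import sqlite3

ALPHA = 0.5   # learning rate (assumed)
GAMMA = 0.9   # discount factor (assumed)
ACTIONS = ("lights_on", "lights_off")  # hypothetical action set

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE q (state TEXT, action TEXT, value REAL, "
    "PRIMARY KEY (state, action))"
)


def get_q(state: str, action: str) -> float:
    """Look up a stored action value; unseen pairs default to 0.0."""
    row = conn.execute(
        "SELECT value FROM q WHERE state = ? AND action = ?", (state, action)
    ).fetchone()
    return row[0] if row else 0.0


def update(state: str, action: str, reward: float, next_state: str) -> float:
    """Apply one Q-learning update and persist the new value."""
    best_next = max(get_q(next_state, a) for a in ACTIONS)
    old = get_q(state, action)
    new = old + ALPHA * (reward + GAMMA * best_next - old)
    conn.execute("INSERT OR REPLACE INTO q VALUES (?, ?, ?)", (state, action, new))
    conn.commit()
    return new


# The inhabitant approves of the lights coming on twice in the same state.
first = update("evening_home", "lights_on", 1.0, "evening_home")
second = update("evening_home", "lights_on", 1.0, "evening_home")
print(first, second)
```

The table grows with the number of state-action pairs, which is why the slide notes that there may be many entries to store.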
More information
• Filip Perich, Anupam Joshi, Tim Finin, and Yelena Yesha, “On Data Management in Pervasive Computing Environments,” IEEE Transactions on Knowledge and Data Engineering, October 12, 2003
  • http://ebiquity.umbc.edu/v2.1/_file_directory_/papers/3.pdf
• Fundamentals of Database Systems, 4th edition. Elmasri and Navathe.
• http://mavhome.uta.edu/publications.html
• Reinforcement learning
  • http://www.aaai.org/Pathfinder/html/reinf.html
  • http://reinforcementlearning.ai-depot.com/Tutorials.html