
Smart Home Technologies




  1. Smart Home Technologies Data Management and Databases

  2. Databases for Smart Homes • Requirements • Database Types • Database Technologies • Smart Home Databases • Data Mining

  3. Data Storage Requirements • Sensor data • Temperature (15 @ 8 Kbps) • Humidity (15 @ 8 Kbps) • Gas (15 @ 8 Kbps) • Light (15 @ 8 Kbps) • Motion (15 @ 8 Kbps) • Pressure (100 @ 8 Kbps) • Microphone (15 @ 500 Kbps) • Camera (15 @ 10 Mbps)

  4. Data Storage Requirements • User data • Multimedia • Phone messages/conversations (500 Kbps – 10 Mbps) • Music (500 Kbps) • TV/Radio broadcasts (500 Kbps – 10 Mbps) • Home movies (10 Mbps) • Images • Computer • Programs • Data files • Operating systems

  5. Data Storage Issues • Issues • Query frequency and type • Sampling/recording rates • 205 sensors (158,900 Kbps) • Multimedia recordings • Simultaneous playback • Analysis, prediction, decision-making queries • Transaction granularity • Historical data, decay • Security and privacy • Centralized vs. distributed
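
The totals on the Data Storage Issues slide (205 sensors, 158,900 Kbps) can be re-derived from the per-sensor figures on the Data Storage Requirements slide. The sketch below does that arithmetic. It assumes 1 Mbps = 1000 Kbps, and the daily-volume figure is an illustrative extrapolation that assumes every sensor streams continuously at its listed rate.

```python
# (count, rate in Kbps) per sensor type, from the Data Storage Requirements slide.
# Assumption: 1 Mbps = 1000 Kbps.
SENSORS = {
    "temperature": (15, 8),
    "humidity": (15, 8),
    "gas": (15, 8),
    "light": (15, 8),
    "motion": (15, 8),
    "pressure": (100, 8),
    "microphone": (15, 500),
    "camera": (15, 10_000),
}


def totals(sensors):
    """Return (number of sensors, aggregate data rate in Kbps)."""
    count = sum(n for n, _ in sensors.values())
    rate_kbps = sum(n * rate for n, rate in sensors.values())
    return count, rate_kbps


count, rate_kbps = totals(SENSORS)
print(count, rate_kbps)  # 205 158900

# Illustrative only: raw volume per day if every sensor streamed non-stop.
gb_per_day = rate_kbps * 1000 / 8 * 86_400 / 1e9
print(round(gb_per_day, 2))  # 1716.12
```

At roughly 1.7 TB of raw data per day, the cameras dominate the total, which is one reason the following slide asks whether to store raw, pre-processed, or compressed data.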

  6. What Data to Store • Type of Data • Raw data • Pre-processed • Compressed • Frequency of Data Storage for Sensor Data • Tradeoff between precision and quantity

  7. Sensor Data Example • 9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON • 9/8/2002 1:6:59 AM~A9 (A/C) ON • 9/8/2002 3:58:52 AM~A0 (Stereo) ON • 9/8/2002 5:57:0 AM~A2 (Kitchen Light) ON • 9/8/2002 3:1:42 AM~A5 (Coffee Maker) OFF • 9/8/2002 7:8:3 AM~A3 (Stove) ON • 9/8/2002 12:54:52 PM~A10 (Bathroom Light) ON • 9/8/2002 4:58:5 AM~A0 (Stereo) OFF • 9/8/2002 8:1:20 AM~A3 (Stove) OFF • 9/8/2002 9:6:10 AM~A8 (Computer) ON • 9/8/2002 10:8:19 AM~A4 (Bathtub Heater) ON • 9/8/2002 11:9:4 AM~A0 (Stereo) ON • 9/8/2002 9:4:5 AM~A8 (Computer) OFF • 9/8/2002 10:9:4 AM~A4 (Bathtub Heater) OFF • 9/8/2002 2:2:5 PM~A10 (Bathroom Light) OFF • 9/8/2002 2:52:37 PM~A0 (Stereo) OFF • 9/8/2002 4:2:0 PM~A9 (A/C) OFF
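
Log lines in the format above can be turned into structured records before they are stored. The following is a minimal sketch, assuming the layout `date time AM/PM~DeviceID (Device Name) STATE` seen in the example; the `Event` type and `parse_event` function are hypothetical names introduced here for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class Event(NamedTuple):
    when: datetime
    device_id: str
    device_name: str
    state: str


def parse_event(line: str) -> Event:
    """Parse one line such as '9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON'."""
    stamp, rest = line.split("~", 1)
    # strptime accepts the non-zero-padded fields used in the log.
    when = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
    device_id, rest = rest.split(" ", 1)
    name, state = rest.rsplit(" ", 1)
    return Event(when, device_id, name.strip("()"), state)


lines = [
    "9/8/2002 12:54:52 PM~A10 (Bathroom Light) ON",
    "9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON",
    "9/8/2002 3:1:42 AM~A5 (Coffee Maker) OFF",
]

# NamedTuples sort field by field, so this orders events by timestamp.
events = sorted(parse_event(line) for line in lines)
on_time = events[1].when - events[0].when
print(events[0].device_name, on_time.total_seconds())  # Coffee Maker 3701.0
```

Sorting by timestamp matters here because the example log is not in chronological order.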

  8. Media Viewing Example

  9. Multimedia Example • Digital Silhouettes (Predictive Networks) • Predicting web surfing behavior ($$$) • Microsoft (2002) track TV viewing preferences • 140 data items for each user • Demographics (50) • Subcategories within gender, age, income, education, occupation, and race • 90 Content preferences • golf, music, yoga

  10. Database Types / Data Models • Relational • OO • Hybrid (Object-Relational) • Temporal • Deductive • Others • Spatial, …

  11. Example Data Representations • Relational • We all know…flat tables of atomic attributes with foreign key relationships • OO • Complex data reps • multivalued, composite • Temporal • Relational model: add valid start, end dates to each table (versions of info and when valid) • Includes time, events, durations…

  12. Operations • DDL/DML (data def/manip languages) • SQL • OQL • Update operations • Built-in insert, delete, update • Stored procedures for triggers, active (ECA) rules

  13. Example Operations for Temporal Databases • INCLUDES • Rows valid in a certain time period • BEFORE/AFTER a time condition • Set operations • Union, intersection of 2 time periods
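
The temporal operations listed above can be sketched over simple validity periods. The slide does not define exact semantics, so the code assumes closed (inclusive) date intervals, and the `Period` type and function names are hypothetical.

```python
from datetime import date
from typing import NamedTuple, Optional


class Period(NamedTuple):
    start: date  # inclusive
    end: date    # inclusive


def includes(outer: Period, inner: Period) -> bool:
    """INCLUDES: inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def before(a: Period, b: Period) -> bool:
    """BEFORE: a ends strictly before b starts (AFTER is the mirror image)."""
    return a.end < b.start


def intersection(a: Period, b: Period) -> Optional[Period]:
    """Overlap of two periods, or None if they do not overlap."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Period(start, end) if start <= end else None


def union(a: Period, b: Period) -> Optional[Period]:
    """Union as a single period; None when the periods do not overlap."""
    if intersection(a, b) is None:
        return None
    return Period(min(a.start, b.start), max(a.end, b.end))


summer = Period(date(2002, 6, 1), date(2002, 8, 31))
vacation = Period(date(2002, 8, 20), date(2002, 9, 8))
print(intersection(summer, vacation))  # 2002-08-20 .. 2002-08-31
print(union(summer, vacation))         # 2002-06-01 .. 2002-09-08
```

In a temporal relational table, the same predicates would be applied to the valid-start and valid-end columns of each row.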

  14. Active DB • Event-Condition-Action rules • Allow for decisions to be made in the database instead of a separate application • Relational • Implemented as triggers • Challenges • Rule consistency • (2+ rules do not contradict) • Guaranteed termination • Trigger loops (T1 <-> T2)
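
An Event-Condition-Action rule implemented as a relational trigger might look like the sketch below. It uses SQLite through Python's standard `sqlite3` module purely for convenience; the table names, the 78-degree threshold, and the rule itself are made up for illustration and are not taken from any system described in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reading (sensor TEXT, temp_f REAL);
CREATE TABLE action  (device TEXT, command TEXT);

-- Event: a row is inserted into reading.
-- Condition: the WHEN clause.
-- Action: the statement in the trigger body.
CREATE TRIGGER too_hot
AFTER INSERT ON reading
WHEN NEW.temp_f > 78
BEGIN
    INSERT INTO action VALUES ('A9 (A/C)', 'ON');
END;
""")

conn.execute("INSERT INTO reading VALUES ('living_room', 72.0)")  # no action
conn.execute("INSERT INTO reading VALUES ('living_room', 81.5)")  # fires rule

actions = conn.execute("SELECT device, command FROM action").fetchall()
print(actions)  # [('A9 (A/C)', 'ON')]
```

The decision is made inside the database, so no separate application has to poll the readings table. The termination challenge shows up as soon as a trigger on `action` writes back into `reading`.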

  15. Smart Home Active DB Example • Java, Postgres, Jess rules • Event classification (local & composite) • Data Manipulation Events • TV show being viewed (channel, time, genre…) • Temporal Events (instance, recurring) • Set temp to 70 degrees at 7:00 am workdays • Exception Events • Power failure • Behavioral Events • Time children home from school; dinner time

  16. Active DB Example (TCU)

  17. Distributed vs. Centralized • Centralized database can produce a bottleneck • Large volume of data input • Large database • Large volume of queries • In distributed databases, data consistency, replication, and retrieval can be more problematic • Consistency of schemas • Retrieval in case the data location is not known • Communication overhead to ensure database consistency

  18. SmartHome Database Architecture • Centralized vs. distributed? • Answer: Both • Central storage of high demand, persistent data • Distributed storage of low demand, dynamic data • Distributed queries • Push processing toward sensors • Adaptive, hierarchical organization • End-effector autonomy (“smart sensor”)

  19. Database Systems • Commercial • DB2 • Empress • Informix • Oracle • MS Access • MS SQL • Sybase • Free • Berkeley DB • PostgreSQL • MySQL

  20. UTA MavHome DB • Active • Reactive & proactive (e.g., to predict) • Distributed • Information collection agents • Rules • Local Agent: what data they need to collect • Distributed: coordinate overall monitoring of collected information • Continuous monitoring of events • Extension of SNOOP

  21. Microsoft Easy Living DB (2002) • Relational • Fast & robust, but awkward for some data • World Model DB Describes: • Computing devices • People and their personal preferences/settings • Services • Rooms and doorways • Serves as Abstraction Layer between sensors and applications that use data from sensors • e.g., new sensors → no change to applications

  22. Stanford Interactive Workspace • Uses LORE • A semi-structured XML DB system • Still available, but work stopped in 2000 • Data stored is catalog of (index to) • documents, images, 3-D models, application-specific domain models

  23. Sensor Database Systems • COUGAR project • www.cs.cornell.edu/database/cougar • Query processing over ad-hoc sensor networks • Small database component (QueryProxy) at each sensor • Sensor clusters provide local aggregations (e.g., min, max, mean) • Assumes centralized index of all data sources
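
Local aggregation at sensor clusters can be illustrated with partial aggregates that are merged on the way up the hierarchy. This is a generic sketch of the idea, not COUGAR's actual API; the `Partial` class and the sample readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Partial aggregate a cluster forwards instead of raw readings."""
    lo: float
    hi: float
    total: float
    count: int

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(readings):
    """Computed locally at one sensor cluster (readings must be non-empty)."""
    return Partial(min(readings), max(readings), sum(readings), len(readings))


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial aggregates higher up in the network."""
    return Partial(min(a.lo, b.lo), max(a.hi, b.hi),
                   a.total + b.total, a.count + b.count)


kitchen = summarize([70.0, 72.0, 74.0])
bedroom = summarize([66.0, 68.0])
house = merge(kitchen, bedroom)
print(house.lo, house.hi, house.mean)  # 66.0 74.0 70.0
```

Means cannot be merged directly, so each cluster forwards a sum and a count. Only four numbers per cluster cross the network, however many readings were taken.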

  24. Siemens Netabase • “The network is the database.” • Navas and Wynblatt, ACM SIGMOD 2001 • Sensor networks • Large number of data sources (10^5) • Volatile data and data organization • “Thin” data servers on scaled-down hardware • Netabase approach • Query decomposition • Characteristic routing (à la IP routing) • Local joins • Query evaluation

  25. Siemens Netabase • www.netabasesoftware.com

  26. Data Warehouses • Repositories for data mining activities • Aggregates/summaries of data help efficiency • Optimized for decision-support, not transaction processing • Definition (Elmasri, page 900) • “A subject-oriented, integrated, non-volatile, time-variant collection of data in support of management’s decisions” • Replace “management” with “smart home agents”

  27. Warehouse Properties • Very large: 100 gigabytes to many terabytes • Tends to include historical data • Workload: mostly complex queries that access lots of data and do many scans, joins, and aggregations. Tend to look for "the big picture". • Updates pumped to warehouse in batches (overnight) • Data may be heavily summarized and/or consolidated in advance (must be done in batches too, must finish overnight). • Research work has been done (e.g. "materialized views") -- a small piece of the problem. 02.15.04 from http://redbook.cs.berkeley.edu/lec28.html

  28. Data Warehouses • Data Cleaning • Data Migration: simple transformation rules (replace "gender" with "sex") • Data Scrubbing: use domain-specific knowledge (e.g. zip codes) to modify data. Try parsing and fuzzy matching from multiple sources. • Data Auditing: discover rules and relationships (or signal violations thereof). Not unlike data mining. • Data Loading • Can take a very long time! (Sorting, indexing, summarization, integrity constraint checking, etc.) Parallelism a must. • Full load: like one big transaction – change from old data to new is atomic. • Incremental loading ("refresh") makes sense for big warehouses, but transaction model is more complex – have to break the load into lots of transactions, and commit them periodically to avoid locking everything. Need to be careful to keep metadata & indices consistent along the way. 02.15.04 from http://redbook.cs.berkeley.edu/lec28.html

  29. Data Warehouses 02.15.04 from http://redbook.cs.berkeley.edu/lec28.html

  30. Data Mining Definition • Discovery of new information in terms of patterns or rules from vast amounts of data • Extracts patterns that can’t readily be found by asking the right questions (queries) • TOO MUCH DATA FOR HUMANS • Emerged from • Artificial Intelligence: machine learning, neural nets, genetic algorithms • Statistics • Operations Research

  31. Data Mining Steps • Data selection -- pick the data needed • Data cleansing • Fix bad data (e.g., spelling, zip codes) • Hard to deal with missing, erroneous, conflicting, redundant data • Enrichment • Add data (e.g., age, gender, income) • Data transformation • Aggregate (e.g., zip codes → regions) • Data mining • Reporting on discovered knowledge

  32. Types of Results • Association rules • Buy diapers → buy lots of beer • Sequential patterns • Buy house → buy furniture within months • Classification trees • Types of buyers (upscale, bargain-conscious, …) • Why do it? • Make more money • Science & medicine
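
An association rule such as the diapers and beer example is usually judged by two standard measures: its support (the fraction of transactions containing all the items) and its confidence (how often the right-hand side appears when the left-hand side does). The sketch below computes both; the transaction data is invented for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Estimated P(rhs | lhs): support of both sides over support of lhs."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)


# Hypothetical market baskets, one set per transaction.
transactions = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

s = support({"diapers", "beer"}, transactions)
c = confidence({"diapers"}, {"beer"}, transactions)
print(s, round(c, 3))  # 0.4 0.667
```

A mining algorithm would enumerate candidate itemsets and keep only the rules whose support and confidence exceed user-chosen thresholds.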

  33. Data Mining Goals • Find patterns to predict future events • Find major groupings • Groupings of buyers, stars, diseases … • Find which group something belongs to • creditworthiness

  34. Data Mining Results • Association rules • Classification hierarchies • Clustering • Sequential patterns • Patterns within time series • Type of result, inputs & algorithms vary • Often interested in some combination of these types of Knowledge

  35. Clustering • Unsupervised learning techniques • Training samples are unclassified • Vs. supervised learning (classification) • Drug categories for depression • Categories of TV viewers • Categories of buyers (likely, unlikely) • Categories of households? • Single male, mother/children, conventional (M/D/kids), DINKs.

  36. Sequential Patterns • Detecting associations among events with certain temporal relationships • Example: • Cardiac bypass for blocked arteries • AND within 18 months, high blood urea • THEN kidney failure likely in next 18 months • Particularly important in smart homes

  37. Sequential Pattern Discovery • Sequence of itemsets • Grocery store purchases by 1 person (3 itemsets) • {soy milk, bread, chocolate}, {bananas, chocolate}, {lettuce, tomato, chocolate} • 2 subsequences • {soy milk, bread, chocolate}, {bananas, chocolate} • {bananas, chocolate}, {lettuce, tomato, chocolate}

  38. Sequential Pattern Discovery • The support for a sequence S is the % of the given set U of sequences of which S is a subsequence. • That is: how many times does S show up? • Find all subsequences from the given sequence sets that have a user-defined minimum support. • The sequence S1, S2, …, Sn is a predictor of the “fact” that a customer who buys itemset S1 is likely to buy itemset S2, then S3, … • Prediction support based on frequency of this sequence in the past • Many research issues remain in creating good algorithms
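
The support definition above can be sketched directly. The code assumes the common definition of a subsequence of itemsets, where each itemset of S must be contained in a distinct itemset of the longer sequence, in the same order. The slides do not spell this out, so treat it as an assumption. The first customer is the grocery example from the previous slide; the other three are made up.

```python
def is_subsequence(s, seq):
    """True if the itemsets of s match, in order, subsets of itemsets in seq."""
    pos = 0
    for itemset in s:
        # Greedily advance to the first remaining itemset that contains it.
        while pos < len(seq) and not itemset <= seq[pos]:
            pos += 1
        if pos == len(seq):
            return False
        pos += 1
    return True


def support(s, sequences):
    """Fraction of the sequences in U of which s is a subsequence."""
    return sum(is_subsequence(s, seq) for seq in sequences) / len(sequences)


U = [
    [{"soy milk", "bread", "chocolate"}, {"bananas", "chocolate"},
     {"lettuce", "tomato", "chocolate"}],       # from the slide
    [{"bread"}, {"bananas", "chocolate"}],      # hypothetical
    [{"chocolate"}, {"lettuce"}],               # hypothetical
    [{"bananas", "chocolate"}, {"bread"}],      # hypothetical (wrong order)
]

S = [{"bread"}, {"bananas", "chocolate"}]
print(support(S, U))  # 0.5
```

Finding all sequences with a minimum support means running this test over many candidate sequences, which is where the algorithmic research issues come in.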

  39. Patterns Within Time Series • Finding 2 patterns that occur over time • 2003 stock prices of Choice Homes and Home Depot • 2 products show same sales pattern in summer but different one in winter • Solar magnetic wind patterns may predict earth atmospheric changes

  40. Time Series Pattern Discovery • Time series are sequences of events • Event could be a transaction (closing daily stock price) • Look at sequences over n days, or • Longest period in which change is no greater than 1% • Comparing • Must define similarity measures
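
One reading of "longest period in which change is no greater than 1%" is the longest run of consecutive days whose day-to-day relative change stays within 1%. The sketch below implements that interpretation, which is an assumption; comparing each day against the start of the window would be another reasonable reading. The price series is invented.

```python
def longest_stable_run(prices, max_change=0.01):
    """Return (start, end) indices of the longest run in which every
    day-to-day relative change is <= max_change.

    Assumes a non-empty list of positive prices.
    """
    best = (0, 0)
    start = 0
    for i in range(1, len(prices)):
        change = abs(prices[i] - prices[i - 1]) / prices[i - 1]
        if change > max_change:
            start = i  # the run is broken; a new one starts here
        if i - start > best[1] - best[0]:
            best = (start, i)
    return best


# Hypothetical daily closing prices.
prices = [100.0, 100.5, 101.0, 105.0, 105.2, 105.1, 105.3, 105.0, 110.0]
run = longest_stable_run(prices)
print(run)  # (3, 7)
```

Segments found this way can then be compared across series once a similarity measure has been chosen.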

  41. Other Approaches in Data Mining • Neural nets • Infer a function from a set of examples • Non-parametric curve-fitting • Interpolates to solve new problems • Supervised & unsupervised algorithms • Capabilities • classification • time-series prediction • Disadvantages • can’t see what it learned (not declarative)

  42. Other Approaches in Data Mining • Genetic algorithms • Set up • Representation (strings over an alphabet) • Evaluation (fitness) function • Parameters: # of generations, cross-over rate, mutation rate, etc. • Randomized (probabilistic operators), parallel search over search space • Used for problem solving and clustering
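
The set-up items above map onto code fairly directly. The following is a minimal sketch of a genetic algorithm on a toy problem (maximize the number of 1s in a bit string). The representation, fitness function, tournament selection, elitism, and all parameter values are illustrative choices, not taken from the slides.

```python
import random


def fitness(bits):
    """Evaluation function: number of '1' characters in the string."""
    return bits.count("1")


def evolve(length=20, pop_size=30, generations=40,
           crossover_rate=0.7, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Representation: strings over the alphabet {0, 1}.
    pop = ["".join(rng.choice("01") for _ in range(length))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        nxt = [pop[0]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            # Tournament selection: best of three random individuals.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)  # one-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = a
            # Mutation: flip each bit with a small probability.
            child = "".join(
                ("1" if c == "0" else "0") if rng.random() < mutation_rate else c
                for c in child
            )
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print(best, history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases.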
