Keigo Yoshida, Minoru Inui, Takehisa Yairi, Kazuo Machida

2nd Workshop on Domain Driven Data Mining, Session I: S2208
Dec. 15, 2008, Palazzo dei Congressi, Pisa, Italy
Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA & Decision Boundary Analysis
Keigo Yoshida, Minoru Inui, Takehisa Yairi, Kazuo Machida (Dept. of Aeronautics & Astronautics, the Univ. of Tokyo), Masaki Shioya, and Yoshio Masukawa (Kajima Corp.)

Main Point of the Presentation • We propose … a supportive method for anomaly cause identification that combines traditional data analysis and domain knowledge • Applied to a real Building Energy Management System (BEMS) • The root cause of energy waste was found successfully

Outline • Introduction • Theories • Experiments for Real Data • Conclusions

Introduction: What is BEMS? • Building Energy Management System • Collects/monitors sensor data in the building (temperature, heat consumption, etc.) • Energy-efficient control • Discovers energy faults (waste)

Introduction: Problem of BEMS • Hard to identify root causes of Energy Faults (EF) • Complex relations between equipment • Data deluge from numerous sensors (approx. 2000 sensors, 20000 points for a 20-story building) • Current EF detection: heuristics based on experts’ empirical knowledge, usually fuzzy “IF-THEN” rules • “Heuristic Diagnostics is Incomplete” • Fuzziness → False Negative Errors • Detection-Only → Cannot Improve Systems

Early Fault Diagnosis Methods
• [Figure: Data-Driven, Knowledge-Based, and Modeling-Based approaches compared along the axes Source (Experts vs. Data), Interpretation (Easy vs. Hard), Modeling Cost (Expensive vs. Low), Versatility (Poor vs. High), and Performance]
• Methods shown: Feature Extraction, Neural Networks…, FTA/FMEA, Bayesian, Filtering, FDA…, Expert System, Fuzzy Logic, Supervised Learning, Unsupervised Learning / Data Mining
• Issues noted: Knowledge Acquisition Bottleneck, Neglecting Useful Knowledge

Proposed Method
• Proposal: Domain Knowledge + Data Analysis
• [Figure: the proposal placed among Data-Driven, Knowledge-Based, and Modeling-Based approaches (Expert System, Fuzzy Logic, Supervised Learning, Unsupervised Learning / Data Mining); axes: Source (Experts vs. Data), Interpretation (Easy vs. Hard), Modeling Cost (Expensive vs. Low), Versatility (Poor vs. High), Performance]
• Characteristics
• Interpretation: exploits domain knowledge
• Cost: not so high, empirical knowledge only
• Versatility: easy to apply to various domains & problems
• Performance: better than heuristics

Conceptual Diagram
• Assumption: incomplete heuristics surely represent abnormal phenomena
• [Figure: Experts supply a Detection Rule; reliable labels are acquired from the data distribution with the given rule; Semi-supervised LDA learns the boundary; DBA identifies variables (contribution to EF per variable #); results are fed back to the experts]

Outline • Introduction • Theories • Semi-Supervised Linear Discriminant Analysis • Decision Boundary Analysis • Experiments for Real Data • Conclusions

Semi-supervised LDA • [Figure: conceptual diagram; labels: Data Distribution, Acquire Reliable Labels with Given Rule, Learning Boundary]

Manifold Regularization [M. Belkin et al. 05]
• Regularized Least Squares (RLS): labeled data only. Objective = squared loss for labeled data + penalty term (usually the squared function norm)
• Laplacian RLS: uses labeled & unlabeled data. Objective = squared loss + penalty term + additional term for the intrinsic geometry (graph Laplacian)
• Assumption: geometrically close ⇒ similar label
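The objectives on this slide did not survive extraction. For reference, the standard forms from the cited manifold regularization work are sketched below. The symbols (l labeled points, u unlabeled points, weights γ_A and γ_I, kernel space H_K, graph Laplacian L = D − W) follow that work and are assumptions here, not recovered from the slide.

```latex
% Regularized Least Squares (labeled data only)
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} \bigl(y_i - f(x_i)\bigr)^2
  + \gamma_A \lVert f \rVert_K^2

% Laplacian RLS (labeled and unlabeled data)
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} \bigl(y_i - f(x_i)\bigr)^2
  + \gamma_A \lVert f \rVert_K^2
  + \frac{\gamma_I}{(l+u)^2}\, \mathbf{f}^{\top} L\, \mathbf{f},
\qquad \mathbf{f} = [f(x_1), \dots, f(x_{l+u})]^{\top}

% Why the extra term encodes "geometrically close => similar label"
\mathbf{f}^{\top} L\, \mathbf{f}
  = \frac{1}{2} \sum_{i,j} W_{ij} \bigl(f(x_i) - f(x_j)\bigr)^2
```

The last identity holds for a symmetric weight matrix W: the term is small exactly when strongly connected neighbors receive similar values.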

Semi-Supervised Linear Discriminant Analysis (SS-LDA) • LDA seeks a projection with small within-class covariance & large between-class covariance • Regularized Discriminant Analysis: [Friedman 89] • Semi-Supervised Discriminant Analysis (SS-LDA) • [Equations not preserved; annotated terms: Between-class, Within-class, Regularizer]
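A minimal sketch of one way to realize a two-class semi-supervised LDA, assuming the regularizer is the sum of a ridge term and a graph-Laplacian term built from all (labeled and unlabeled) points. The function names, the parameters `alpha`, `beta`, `k`, and the synthetic data are illustrative assumptions, not the authors' implementation. For two classes the between-class scatter has rank one, so the discriminant direction reduces to solving one linear system.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph."""
    n = X.shape[0]
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sq_dist, np.inf)           # a point is not its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(sq_dist[i])[:k]] = 1.0  # connect to the k closest points
    W = np.maximum(W, W.T)                      # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def ss_lda_direction(X_lab, y_lab, X_all, alpha=1e-3, beta=1.0, k=5):
    """Binary case: w is proportional to (S_w + alpha*I + beta*X^T L X)^-1 (m1 - m0)."""
    X0, X1 = X_lab[y_lab == 0], X_lab[y_lab == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)  # within-class scatter
    L = knn_graph_laplacian(X_all, k)
    M = S_w + alpha * np.eye(X_all.shape[1]) + beta * X_all.T @ L @ X_all
    w = np.linalg.solve(M, m1 - m0)
    return w / np.linalg.norm(w)

# Synthetic example: 100 points, only 10 of them labeled.
rng = np.random.default_rng(0)
X_all = np.vstack([rng.normal(0.0, 1.0, (50, 3)), rng.normal(3.0, 1.0, (50, 3))])
labeled = np.r_[0:5, 50:55]
y_lab = np.array([0] * 5 + [1] * 5)
w = ss_lda_direction(X_all[labeled], y_lab, X_all)
```

Setting `beta=0` falls back to a ridge-regularized LDA on the labeled points only, which makes it easy to compare the two variants on the same data.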

Decision Boundary Analysis • [Figure: conceptual diagram; labels: Data Distribution, Acquire Reliable Labels with Given Rule, Learning Boundary, Semi-supervised LDA]

Decision Boundary Analysis
• Feature extraction method proposed by Lee & Landgrebe: C. Lee & D. A. Landgrebe. Feature Extraction Based on Decision Boundaries, IEEE Trans. Pattern Anal. Mach. Intell. 15(4): 388-400, 1993
• Extract informative features from normal vectors on the boundary
• [Figure: learned boundary between Class 1 and Class 2 with normal vectors, top view and cross-section view; legend: discriminantly informative, discriminantly redundant]

Decision Boundary Feature Matrix • Define the responsibility of each variable for discrimination • Linear: [equation not preserved] • Nonlinear: [equation not preserved]
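The formulas on this slide were lost in extraction. As a reference point, the decision boundary feature matrix (DBFM) in the cited Lee & Landgrebe formulation and its linear special case can be written as below. The per-variable score in the last line is only one plausible normalization and is an assumption, not the slide's formula.

```latex
% DBFM: N(x) is the unit normal to the decision boundary S at x, p(x) the data density
\Sigma_{\mathrm{DBFM}} = \frac{1}{K} \int_{S} N(x)\, N(x)^{\top} p(x)\, dx,
\qquad K = \int_{S} p(x)\, dx

% Linear boundary w^{\top} x + b = 0: the normal is the same everywhere
N = \frac{w}{\lVert w \rVert}, \qquad \Sigma_{\mathrm{DBFM}} = N N^{\top}

% One possible responsibility score of variable j (assumed normalization)
r_j = \frac{\lvert N_j \rvert}{\sum_{k} \lvert N_k \rvert}
```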

Outline • Introduction • Theories • Experiments • Application to Energy Fault Analysis • Conclusions

Energy Fault Diagnosis Problem
• EF: Inverter overloaded
• Detection Rule: 6h M.A. (moving average) of Inverter output = 100 → EF
• “… but I don’t know the cause”
• Goal: find out the root cause of the inverter overload
• [Figure: Air Handling Unit diagram; labels: Inverter, cold & hot coil, humidity, DATA, RULE]
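The detection rule can be sketched in a few lines. This is a hedged illustration that assumes hourly samples, an inverter output expressed in percent and capped at 100, and a `>=` comparison in place of the slide's "= 100"; the function name and the sample series are made up.

```python
from collections import deque

def detect_energy_fault(inverter_output, window=6, threshold=100.0):
    """Flag each hour whose trailing `window`-hour moving average reaches `threshold`."""
    recent = deque(maxlen=window)      # sliding window of the latest readings
    flags = []
    for value in inverter_output:
        recent.append(value)
        full = len(recent) == window   # no decision before 6 hours of data exist
        mean = sum(recent) / len(recent)
        flags.append(full and mean >= threshold)
    return flags

hourly_output = [80, 100, 100, 100, 100, 100, 100, 90]  # made-up inverter output [%]
flags = detect_energy_fault(hourly_output)
# Only the seventh hour is flagged: its trailing six readings are all 100.
```

A rule of this kind says when the overload happens but nothing about which other variable drives it, which is the gap the rest of the talk addresses.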

Energy Fault Diagnosis - Settings • Air-conditioning time-series sensor data for 1 unit • 13 attributes, all continuous • Instances: 744 • Labeled samples: 10 for each class (3% of all), drawn with probability proportional to distance from the boundary • Hyper-parameters: NN = 5, [remaining values not preserved]

Experimental Results

Results (100 times ave.), Contribution Score [%]
• <LDA> Inverter (96%): trivial
• <SSLDA> Cool water (75%), SA temp. (12%)
• <KDA> Cool water (19%), MA. Pressure (15%), Inverter (15%), …: not distinctive!
• <SSKDA> Inverter (33%), SA temp. (19%), Cool water (17%), SA setting (13%), …
• Variables marked in the chart: [1] SA Temp., [2] SA Setting, [3] Cooling water, Inverter

Energy Fault Diagnosis: Examine Raw Data
• Cooling water valve opening [3]: the valve opens completely, but this is a result of the EF, not its cause
• SSLDA/SSKDA show SA temp. [1] & setting [2] are responsible (figure label: deviation of SA temp.)
• To reduce this deviation…
• Operate inverter at peak power
• Open cooling water valve

Evaluation

Outline • Introduction • Theories • Experiments for Real Data • Conclusions

Conclusions • Introduced a method for identifying causal variables by combining semi-supervised LDA & DBA • Labels are acquired from an imperfect domain-specific rule • SS-LDA/SS-KDA: reflect domain knowledge & avoid over-fitting • DBA: extracts informative features from the normal direction of the boundary • Applied to energy fault cause diagnosis • Succeeded in extracting some responsible features, starting from fuzzy heuristics based on domain knowledge

Room for improvement • Consider temporal continuity • Time-series data are not i.i.d. • Find the true cause among correlated variables

Thank you for your kind attention

Discussions

Minor improvements • Optimize Hyper-parameters • AIC, BIC, … • Cross Validation • Regularization Term • L1-norm will give sparse solution • Comparison to other discrimination methods • SVM • Laplacian SVM… etc.

Extension to Multiple Energy Faults • In real systems, various faults take place • Fault cause varies among phenomena • Need to separate the phenomena and diagnose each one separately <Our Approach> 1. Extract points detected by existing heuristics 2. Reduce dimensionality and visualize data in low-dim. space 3. Cluster the data and assign labels 4. Identify variables discriminating that cluster from normal data
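Steps 1 to 3 of this approach can be sketched with NumPy alone. To keep the example dependency-free it uses PCA and a plain k-means as stand-ins; these are assumptions for illustration and not the dimensionality reduction or clustering used in the talk. The data, group sizes, and names are synthetic. The final lines only prepare the "one cluster vs. normal" labels that step 4 would consume.

```python
import numpy as np

def pca_2d(X):
    """Project rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

def kmeans(Z, k=2, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(k - 1):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])        # farthest point from chosen centers
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Z[assign == j].mean(axis=0)
    return assign

# Synthetic stand-in data: 60 normal points and two fault groups of 20 points each.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 0.5, (60, 4))
fault_a = rng.normal(0.0, 0.5, (20, 4)) + np.array([8.0, 0.0, 0.0, 0.0])
fault_b = rng.normal(0.0, 0.5, (20, 4)) + np.array([0.0, 8.0, 0.0, 0.0])
X = np.vstack([normal, fault_a, fault_b])
detected = np.arange(len(X)) >= 60             # step 1: points flagged by the heuristic

Z = pca_2d(X[detected])                        # step 2: low-dimensional view
clusters = kmeans(Z, k=2)                      # step 3: cluster the detected points

# Input for step 4: "cluster 0 vs. normal", leaving the other detected points out.
target = np.flatnonzero(detected)[clusters == 0]
keep = np.concatenate([np.flatnonzero(~detected), target])
y = np.concatenate([np.zeros((~detected).sum(), dtype=int),
                    np.ones(len(target), dtype=int)])
```

Running the discriminant and contribution-score analysis once per cluster on `X[keep]`, `y` then yields a separate variable ranking for each fault phenomenon.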

Experimental Condition & Results
• Air-conditioning sensor data, 13 attributes, same heuristics
• 748 instances, operating time only (hourly data for 2 months)
• 137 points are detected by heuristics
• Dimensionality reduced by Isomap [J.B. Tenenbaum 00] (kNN = 5)
• Contribution score is given by SS-KDA (kNN = 5, [parameter not preserved])
• <2D representation> 2 major clusters, 4 anomalies
• Contribution score for red points (annotations: Room air Temp., superficial)
• [Figure: deviation of Room Air Temp. around detected points] Detected, this is EF

Data Distribution: Properly Controlled System • [Figure; label: Deviation]

Data Distribution: Linearly Separable for Cooling Water Valve [3] • [Figure; axis: Cooling Water Valve [%]]

Probabilistic Labeling • Points distant from the boundary are reliable as class labels • Keep robustness against outliers • Points are stochastically given labels based on reliability (the distance of each point from the boundary) • [Figure labels: Rule, outlier, Unreliable]
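A hedged sketch of this labeling step: a few points per class are drawn at random, with probability proportional to their distance from the rule boundary, so that clear-cut cases are favored without deterministically picking the most extreme points (which may be outliers). The function name, the scalar `score`, the threshold value, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def sample_reliable_labels(score, threshold, n_per_class=10, seed=0):
    """Draw labeled points per class with probability proportional to |score - threshold|."""
    rng = np.random.default_rng(seed)
    distance = np.abs(score - threshold)        # reliability of the rule-based label
    labels = (score >= threshold).astype(int)   # 1 = flagged by the rule, 0 = normal
    chosen = []
    for cls in (0, 1):
        idx = np.flatnonzero(labels == cls)
        p = distance[idx] / distance[idx].sum() # sampling weights within the class
        chosen.append(rng.choice(idx, size=n_per_class, replace=False, p=p))
    chosen = np.concatenate(chosen)
    return chosen, labels[chosen]

# Made-up rule statistic for 201 instances; the threshold is hypothetical.
score = np.linspace(0.0, 100.0, 201)
chosen, chosen_labels = sample_reliable_labels(score, threshold=80.25)
```

All remaining points stay unlabeled and contribute only through the graph term of the semi-supervised learner.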

Estimate DBFM • Linear Case: [equation not preserved] • Nonlinear Case: difficult to acquire points on the boundary & calculate gradient vectors • Kernelized SSLDA (SS-KDA): the discriminant function is linear in feature space • [Figure: input space and feature space]

DBFM for Nonlinear Distribution (1) 1. Generate points on the boundary in feature space 2. Compute the gradient vector at the corresponding point for the Gaussian kernel [equation not preserved] • Finding the pre-image is generally difficult… but with the kernel trick the pre-image problem is avoidable • [Figure: feature space and input space]
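The gradient expression itself was lost in extraction. For a kernel discriminant with a Gaussian kernel it takes the following general form; the symbols α_i, b, σ, and n are assumed notation, and how the boundary points are generated is not covered here.

```latex
% Kernel discriminant function with a Gaussian kernel
f(x) = \sum_{i=1}^{n} \alpha_i\, k(x_i, x) + b,
\qquad k(x_i, x) = \exp\!\left(-\frac{\lVert x - x_i \rVert^2}{2\sigma^2}\right)

% Gradient with respect to the input-space point x
\nabla_x f(x) = \frac{1}{\sigma^2} \sum_{i=1}^{n} \alpha_i\, k(x_i, x)\, (x_i - x)
```

The gradient needs only kernel evaluations between x and the training points, so it can be computed in input space without an explicit feature map.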

DBFM for Nonlinear Distribution (2) • Finally we have gradient vectors on the boundary for each point 3. Construct the estimated DBFM • Define the responsibility of each variable for discrimination (annotation: Max. eigenvalue)
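A minimal sketch of step 3, assuming the gradient vectors on the boundary are already available as rows of an array. The DBFM is estimated as the average outer product of the unit normals. Reading the per-variable score from the eigenvector of the maximum eigenvalue, normalized to percent, is an assumption based on the slide's "Max. eigenvalue" annotation; the exact definition used in the talk is not preserved.

```python
import numpy as np

def contribution_scores(gradients):
    """Estimate the DBFM from boundary gradients and score each variable in percent."""
    normals = gradients / np.linalg.norm(gradients, axis=1, keepdims=True)
    dbfm = normals.T @ normals / len(normals)   # average of N N^T over boundary points
    eigvals, eigvecs = np.linalg.eigh(dbfm)     # eigenvalues in ascending order
    principal = eigvecs[:, -1]                  # eigenvector of the max. eigenvalue
    weights = np.abs(principal)                 # sign of an eigenvector is arbitrary
    return 100.0 * weights / weights.sum()

# Made-up gradients for 3 variables; variable 0 dominates the normal direction.
gradients = np.array([[1.0, 0.1, 0.0],
                      [1.0, -0.1, 0.0],
                      [2.0, 0.0, 0.2]])
scores = contribution_scores(gradients)
```

For a linear boundary all rows are parallel, the DBFM has rank one, and the score reduces to the normalized absolute components of the boundary's normal vector.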

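Anticipated Questions • What about real-time capability? • Post-hoc (offline) processing is assumed • Did you compare with other methods? Why LDA? • The approach is also applicable to SVM, so we would like to try it • Why did the results turn out this way? • Looking at the data of the causal variables, linear discrimination is difficult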
