Optimizing the Efficiency of the NCAR-Wyoming Supercomputing Center Facility: A Software Perspective
Theophile Nsengimana
Collaborator: Ademola Olarinde
Mentor: Aaron Andersen
August 1, 2014
Project Goals
• Work on software tools to automate:
  • Collection of NWSC building sensor data
  • Quality control
  • Visualization of key building parameters
• Work with Ademola Olarinde:
  • Software tools
  • Requirements
  • Data exploration
Data Collection
• Preplanned method: store the raw sensor data generated by the Johnson Controls Inc. system in a data store, sMAP (Simple Measurement and Actuation Profile) from Berkeley. [1]
• Current method: restructure the raw data into CSV file(s).
sMAP Overview
• Open-source, modular software designed to ease the collection, storage, and retrieval of time series data.
• Time series sources: archival data and real-time data from sensors.
• Time series can be tagged with metadata.
sMAP Components
• sMAP sources
  • Connect to physical sensors and expose the data they generate to the sMAP archiver (repository) over HTTP.
  • Real-time data.
sMAP Components
• sMAP archiver
  • A high-performance data store.
  • Connects to both relational and time series databases (Postgres for metadata storage and Readingdb for time series storage).
• Applications
  • Make use of the data: visualization, computing optimal control strategies, etc.
Challenges with sMAP
• General
  • Documentation
  • Community support
• Specific to this project
  • Could not load archival data with timestamps outside the range [time_now - 24 hours, time_now].
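The 24-hour restriction above can be checked before any load is attempted. The following is a minimal sketch, assuming timestamps are available as Python `datetime` objects; `split_by_window` is a hypothetical helper written for illustration and does not call sMAP.

```python
from datetime import datetime, timedelta

def split_by_window(timestamps, now, window=timedelta(hours=24)):
    """Split timestamps into those inside [now - window, now] and those outside.

    Hypothetical helper for illustration; it does not talk to sMAP.
    """
    start = now - window
    inside = [ts for ts in timestamps if start <= ts <= now]
    outside = [ts for ts in timestamps if not (start <= ts <= now)]
    return inside, outside

now = datetime(2014, 8, 1, 12, 0)
stamps = [
    datetime(2014, 8, 1, 9, 0),    # 3 hours old: inside the window
    datetime(2014, 7, 31, 12, 0),  # exactly 24 hours old: still inside
    datetime(2014, 7, 1, 0, 0),    # a month old: archival, outside
]
inside, outside = split_by_window(stamps, now)
```

Here `outside` holds the archival readings that fall outside the window and would need another route into the data store.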
Quality Control
• Proper formatting
  • Hourly (or daily, monthly, yearly) intervals instead of 15-minute intervals
  • Match the time format across all generated CSV files
• Handle missing data
• Eliminate irrelevant data
• Merge the properly formatted CSV files into one CSV file for faster access
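Two of the steps above, matching the time format and merging the files, can be sketched with the standard library. This is an illustration under stated assumptions, not the project's code: the two timestamp formats in `KNOWN_FORMATS`, the `time,value` column layout, and the sensor names `pump` and `tower` are all hypothetical. The `'M'` marker for missing values follows the convention used in the averaging code in this deck.

```python
import csv
import io
from datetime import datetime

# Assumed input formats; the real sensor exports may use others.
KNOWN_FORMATS = ("%m/%d/%Y %H:%M", "%Y-%m-%d %H:%M:%S")

def normalize_time(text):
    """Parse a timestamp in any known format and return 'YYYY-MM-DD HH:MM:SS'."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).isoformat(sep=" ")
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {text!r}")

def merge_sensor_csvs(sources):
    """Merge {sensor_name: csv_text} into one table keyed by normalized time.

    Each input holds 'time,value' rows; gaps are filled with 'M' (missing).
    """
    merged = {}
    for name, text in sources.items():
        for time_text, value in csv.reader(io.StringIO(text)):
            merged.setdefault(normalize_time(time_text), {})[name] = value
    names = list(sources)
    header = ["time"] + names
    rows = [[t] + [merged[t].get(n, "M") for n in names] for t in sorted(merged)]
    return [header] + rows

sources = {
    "pump": "08/01/2014 00:00,3.5\n08/01/2014 01:00,3.7\n",
    "tower": "2014-08-01 00:00:00,12.0\n",
}
table = merge_sensor_csvs(sources)
```

The resulting `table` has one row per timestamp and one column per sensor, so it can be written out with `csv.writer` as a single file.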
Quality Control

def day_interval(self):
    BY = 'day'
    init_dt, init_value = self.init_data.popitem()
    # Find the datetimes that fall on the same day as init_dt.
    samedts = self.getSameDatetimes(self.init_data, init_dt, BY)
    value = init_value  # running sum of the values of 'samedts'
    missingCounter = 0  # missing data
    zeroCounter = 0  # how many times this sensor has been offline
    for ndt in samedts:
        ch_value = self.init_data.pop(ndt)
        try:
            if float(ch_value) != 0:
                value += float(ch_value)
            else:
                zeroCounter += 1
        except ValueError:
            missingCounter += 1
    # Calculate the average.
    if missingCounter == len(samedts):
        value = 'M'  # every reading is missing
    elif (zeroCounter == len(samedts)) or (zeroCounter > 0 and (zeroCounter + missingCounter == len(samedts))):
        value = 0  # no nonzero reading in this interval
    else:
        value = value / (len(samedts) - (missingCounter + zeroCounter))
    self.fin_data[self.makedt(init_dt, BY)] = value
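The averaging rule in the method above can also be tried in isolation. The sketch below is a standalone variant, not the author's implementation: `interval_average` is a hypothetical name, and it takes all of an interval's readings as one list rather than popping the first reading separately.

```python
def interval_average(readings):
    """Average one interval's readings using the rule on this slide.

    Returns 'M' if every reading is missing, 0 if no reading is nonzero,
    otherwise the mean of the nonzero numeric readings.
    """
    total = 0.0
    missing = 0  # readings that cannot be parsed as numbers
    zeros = 0    # readings equal to zero (sensor offline)
    for raw in readings:
        try:
            number = float(raw)
        except (TypeError, ValueError):
            missing += 1
            continue
        if number != 0:
            total += number
        else:
            zeros += 1
    if missing == len(readings):
        return "M"
    if zeros + missing == len(readings):
        return 0
    return total / (len(readings) - missing - zeros)

mixed = interval_average(["2.0", "4.0", "0", ""])  # two usable readings
all_missing = interval_average(["", "M"])
offline = interval_average(["0", ""])
```

Zeros are excluded from the divisor, as on the slide, so a sensor that was offline for part of the interval does not pull the average down.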
Data Analysis & Visualization
• Python packages
  • Matplotlib
  • NumPy
• Main focus
  • Statistical correlation
  • Basic control charts
  • Plots
Statistical Correlation
• How a sensor's performance relates to various other factors.
• E.g., cooling towers vs. outside air and computer load.
Figure: Cooling towers' correlations against time, wet-bulb temperature, wet-bulb depression, dry-bulb temperature, and computer load.
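A correlation of this kind can be sketched as a Pearson coefficient between two equally long series. The deck does not state which correlation measure was used, so Pearson is an assumption here, and the numbers below are made-up illustrative values, not NWSC measurements. The function is written in plain Python; with NumPy, `np.corrcoef(xs, ys)[0, 1]` computes the same quantity.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only.
wet_bulb = [10.0, 12.0, 14.0, 16.0, 18.0]
tower_power = [20.0, 24.0, 28.0, 32.0, 36.0]    # rises linearly with wet_bulb
tower_margin = [36.0, 32.0, 28.0, 24.0, 20.0]   # falls linearly with wet_bulb

r_up = pearson(wet_bulb, tower_power)
r_down = pearson(wet_bulb, tower_margin)
```

A coefficient near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.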
Control Charts
• Whether a particular sensor's performance is in statistical control.
• E.g., evaluate the performance of the condenser water pumps.
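One basic way to test for statistical control is to compare new readings against limits derived from a baseline sample. The sketch below assumes three-sigma limits around the baseline mean; the deck does not say which chart type or limits were used, and the readings are made-up illustrative values. `control_limits` and `out_of_control` are hypothetical names.

```python
import statistics

def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits from a baseline sample."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread

def out_of_control(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Made-up pump readings for illustration only.
baseline = [50.0, 51.0, 49.0, 50.0, 50.0]
limits = control_limits(baseline)
new_readings = [50.5, 53.0, 47.0, 49.5]
flagged = out_of_control(new_readings, limits)
```

With this baseline the limits are roughly 47.9 and 52.1, so the second and third new readings are flagged.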
Plots
• Visualize and understand the relationship between sensors' performance and other factors (or sensors).
• E.g., condenser water pumps vs. time.
Summary
• Developed software tools to facilitate collecting data from NWSC, performing basic quality control, and analyzing and visualizing key parameters from the collected data.
Future Work
• Two possibilities:
  • Configure sMAP properly and develop a high-level custom application on top of the sMAP archiver for NWSC staff to monitor the facility.
  • Stick with CSV files and take advantage of Google Fusion Tables, an experimental yet powerful data visualization web application developed by Google. [2]
References
[1] sMAP 2.0 documentation. http://www.cs.berkeley.edu/~stevedh/smap2/
[2] Google Fusion Tables Help Center. https://support.google.com/fusiontables/?hl=en
Thank you Theophile Nsengimana nsengimana.theophile@philander.edu