TADS Data Application & Introduction to Statistical Analysis of WECC Transmission Reliability Database. Svetlana Ekisheva and Jessica Bian, NERC TADSWG meeting, Oklahoma City August 13-15, 2012. Topics. TADS data applications Overview of WECC Transmission Reliability Database Inventory
TADS Data Application & Introduction to Statistical Analysis of WECC Transmission Reliability Database. Svetlana Ekisheva and Jessica Bian, NERC TADSWG meeting, Oklahoma City, August 13-15, 2012
Topics • TADS data applications • Overview of WECC Transmission Reliability Database • Inventory • Outages • AC circuit outages: Distribution study • Transmission circuit attributes as explanatory variables for the number of outages (the outage rate) • Correlation analysis and hierarchy of the numerical explanatory variables for the number of outages • Remarks on statistical analysis for character attributes • Next steps and discussion
Applications • TADSWG Report, Sept. 2007, Section 2.6 - Intended Uses and Limitations of Data and Metrics http://www.nerc.com/docs/pc/tadstf/TADS_PC_Revised_Final_Report_09_26_07.pdf 1. Outage cause analysis and outage Event analysis “Event analysis will aid in the determination of credible contingencies and will result in better understanding, and this understanding should be used to improve planning and operations. Ultimately, these improvements should result in improved transmission system performance.”
Applications (cont’d) 2. “Trending each Regional Entity’s performance against its own history will show how that region’s performance is changing over time. It will take a number of years of data collection (five years was suggested by several commenters) before the data can be useful for trend analysis. A through-time comparison is appropriate for evaluating a region’s performance.”
Introduction to Statistical Analysis of WECC Transmission Reliability Database
WECC Transmission Reliability Database (TRD) • Inventory Data • Transmission Circuit • Transformer • Common Structure-Corridor • Outage Data • Transmission Circuit Inventory • For each line 22 columns with attributes (not all filled) • Combination of Transmission Owner (TO) and TO Element ID is unique • Outage Data • Outage entry contains the unique identifier • Otherwise, similar to a TADS entry • Unique Identifier links Inventory and Outage databases and allows one to study them together
TRD Analysis: Data • TRD 2010 • Transmission circuit inventory: 2090 lines • AC circuit outages: 1652 • Average outage rate: 0.79 outages per line • TRD 2011 • Transmission circuit inventory: 2144 lines • AC circuit outages: 1504 • Average outage rate: 0.70 outages per line • TRD 2010 and 2011 combined • Transmission circuit inventory: 1899 lines (common list for 2010 and 2011) • AC circuit outages: 2889 on these lines only • Average outage rate: 0.76 outages per line per year
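The average rates on this slide follow directly from the counts. A minimal Python check, where the function name `outage_rate` is illustrative and not part of the TRD:

```python
def outage_rate(outages, lines, years=1):
    """Average number of outages per line per year."""
    return outages / (lines * years)

# Counts copied from the slide above.
rate_2010 = outage_rate(1652, 2090)
rate_2011 = outage_rate(1504, 2144)
# Combined figure: lines common to both years, two years of outages.
rate_combined = outage_rate(2889, 1899, years=2)

for label, rate in [("2010", rate_2010), ("2011", rate_2011),
                    ("2010-2011", rate_combined)]:
    print(f"{label}: {rate:.2f} outages per line per year")
```

The combined rate divides by two years because the 2889 outages were accumulated over 2010 and 2011 on the common list of 1899 lines.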
2010-2011: Distribution of Outages • The number of outages per line has approximately a Poisson distribution with parameter λ=1.52 • This corresponds to an annual rate per line of λ/2=0.76 (as expected) • For a Poisson distribution, all occurrences (outages) must be independent, which is true only for Single Mode outages • Single Mode outages: about 80% of all outages in 2011 • Outage events could be considered as independent • Will repeat the distribution analysis for outage events
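As a rough sketch of what the Poisson assumption implies, the snippet below computes the expected number of lines with k outages over the two years, using λ=1.52 and the 1899 common lines from the slides. The observed histogram is not reproduced in this transcript, so only the model side of the comparison is shown.

```python
import math

LAMBDA = 1.52    # mean outages per line over 2010-2011 (from the slide)
N_LINES = 1899   # lines common to 2010 and 2011 (from the data slide)

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Expected number of lines with k = 0..5 outages under the Poisson model.
expected_counts = {k: N_LINES * poisson_pmf(k, LAMBDA) for k in range(6)}

for k, count in expected_counts.items():
    print(f"{k} outages: about {count:.0f} lines expected")
```

Comparing these expected counts with the observed counts per line (for example with a chi-square goodness-of-fit test) is one common way to judge how well the Poisson model fits.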
Inventory Attributes 1 • Number of outages N for a given line varies from 0 to 37 • Are the line attributes and the number of outages N connected? • What attributes could be selected as the explanatory variables for N? • Numerical line attributes: • Length (miles) • Age (calculated from In service date) • Conductor per phase • Overhead ground wire • Circuit per structure • Elevation (not actual altitude but a code)
Inventory Attributes 2 • All remaining are character attributes: • From Bus • To Bus • Voltage class • Insulator type • Terrain etc. • Correlation analysis for the number of outages N and the numerical attributes of lines
Line Length: Distribution • Length ranges from 0 to 243 miles • Average line length: 35.3 miles • Standard deviation: 41 miles
Line Length: Correlation with N • Positive correlation of 0.44 between line length and N • Statistically significant correlation (P-value <0.0001) • On average, the lines that are 10*x miles longer have 0.16*x more outages a year than the shorter lines
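The two quantities on this slide, the correlation and the per-mile effect, can be computed as in the sketch below. The data are hypothetical placeholders invented for illustration, not TRD records, and the function names are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical example: line lengths (miles) and outage counts.
lengths = [5.0, 12.0, 30.0, 48.0, 75.0, 120.0]
outages = [0, 1, 0, 2, 1, 4]

r = pearson_r(lengths, outages)
slope = ols_slope(lengths, outages)
print(f"correlation: {r:.2f}, extra outages per 10 miles: {10 * slope:.2f}")
```

Read this way, the slide's figure of 0.16*x more outages a year per 10*x miles corresponds to a slope of about 0.016 outages per mile per year.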
Line Age: Distribution • Line Age ranges from 0.25 to 112.1 years • Average age: 32.5 years • Standard deviation: 23 years
Line Age: Correlation with N • Negative correlation of -0.08 between line age and N • Statistically significant correlation (P-value <0.001) • On average, lines that are 10*x years older have 0.05*x fewer outages a year than the younger lines
Numerical Attributes: Hierarchy of Explanatory Variables • The coefficient of determination R² measures the “explanatory power” of an explanatory variable X (as the regressor in a linear regression model for N on X) • For example, 19.4% of the variability in the observed number of outages for a line can be explained by the variability in line length. • Together, these 5 variables cannot explain 25.9% of variability in N • The best multivariate linear model (the one that minimizes the mean square error) accounts for 19.75% of the variability in N; this model involves Length and Age as regressors for N
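For a linear regression with a single regressor, R² equals the square of the Pearson correlation, which gives a quick consistency check against the correlations reported on the earlier slides. The inputs are rounded, so agreement is only approximate, and the Age figure is derived here rather than quoted from the slides:

```latex
R^2 = r^2, \qquad
R^2_{\text{Length}} \approx (0.44)^2 = 0.1936 \approx 19.4\%, \qquad
R^2_{\text{Age}} \approx (-0.08)^2 = 0.0064 \approx 0.6\%
```

This matches the 19.4% quoted for line length, and it is consistent with Age adding only a small increment (from 19.4% to 19.75%) when it joins Length in the two-variable model.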
Line Age: Distribution 2 (57 112-year-old lines removed) • Line Age ranges from 0.25 to 111.5 years • Average age: 27.6 years • Standard deviation: 19 years
Line Age 2: Correlation with N • Negative correlation of -0.06 between line age and N • Statistically significant correlation (P-value <0.01) • On average, lines that are 10*x years older have 0.05*x fewer outages a year than the younger lines
Numerical Attributes 2: Hierarchy of Explanatory Variables • The coefficient of determination R² measures the “explanatory power” of an explanatory variable X (as the regressor in a linear regression model for N on X) • For example, 19.5% of the variability in the observed number of outages for a line can be explained by the variability in line length. • Together, these 5 variables cannot explain 25.8% of variability in N • The best multivariate linear model (the one that minimizes the mean square error) accounts for 19.84% of the variability in N; this model involves Length and Age as regressors for N
Character Attributes • Character attributes in TRD are not binary variables (with two values that can be coded as 0 and 1) • Voltage class • 200-299 kV (1530 lines): Mean 1.2, Std Dev 2.5 (outages for 2 years) • 300-399 kV (128 lines): Mean 4.1, Std Dev 5.4 (outages for 2 years) • 500-599 kV (241 lines): Mean 2.0, Std Dev 3.5 (outages for 2 years) • The distribution of outages differs by voltage class • The differences are statistically significant (pair-wise t-tests have P-value <0.001) • Should elevation be treated as a character variable (it is a code, not an actual altitude)?
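The pair-wise comparisons can be approximately reproduced from the summary statistics alone. The sketch below assumes an unequal-variance (Welch) t statistic and a normal approximation for the P-value; the slides do not say which t-test variant was actually used, so treat this as an illustration rather than the authors' procedure.

```python
import math
from itertools import combinations

# (number of lines, mean outages over 2 years, std dev), from the slide.
classes = {
    "200-299 kV": (1530, 1.2, 2.5),
    "300-399 kV": (128, 4.1, 5.4),
    "500-599 kV": (241, 2.0, 3.5),
}

def welch_t(a, b):
    """Welch t statistic for two groups given as (n, mean, std dev)."""
    n1, m1, s1 = a
    n2, m2, s2 = b
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / standard_error

def two_sided_p_normal(t):
    """Two-sided P-value using the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))

# Map each pair of voltage classes to (t statistic, approximate P-value).
results = {}
for (name_a, a), (name_b, b) in combinations(classes.items(), 2):
    t = welch_t(a, b)
    results[(name_a, name_b)] = (t, two_sided_p_normal(t))
    print(f"{name_a} vs {name_b}: t = {t:.2f}, "
          f"p < 0.001: {results[(name_a, name_b)][1] < 0.001}")
```

With groups this large the normal approximation is close to the t distribution, and under these assumptions all three pairs fall below the 0.001 threshold quoted on the slide.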
Next Steps and Discussion • Element-Initiated and AC Substation-Initiated outages: separate analysis • Statistical Analysis by Outage Mode • Statistical Analysis by Initiating Cause Code • Extend the analysis to 2008-2009 • To study the total dataset with greater statistical confidence • To track year-over-year changes and time trends • Determine Credible Contingencies • Trend each Regional Entity’s performance