Nonstationarities in teletraffic data which may spoil your statistical tests • Piotr Żuraniewski (UvA/TNO/AGH), Felipe Mata (UAM), Michel Mandjes (UvA), Marco Mellia (POLITO)
Stationarity • Many models assume stationarity: statistical properties do not change over time • Strong stationarity: the joint distribution of any set of observations is unchanged by a shift in time • Weak stationarity: the mean and variance are constant, and the covariance between two observations depends only on the lag between them
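The two definitions can be written out as follows. This is the standard textbook formulation, not notation taken from the slides; the symbols $X_t$, $\mu$ and $\gamma$ are introduced here for illustration.

```latex
% Strong (strict) stationarity: every finite-dimensional joint distribution
% is invariant under a time shift h.
\[
(X_{t_1}, \dots, X_{t_n}) \overset{d}{=} (X_{t_1+h}, \dots, X_{t_n+h})
\quad \text{for all } n,\ t_1, \dots, t_n,\ h .
\]
% Weak (second-order) stationarity: constant mean, finite variance,
% and an autocovariance that depends only on the lag k.
\[
\mathbb{E}[X_t] = \mu, \qquad
\operatorname{Var}(X_t) < \infty, \qquad
\operatorname{Cov}(X_t, X_{t+k}) = \gamma(k)
\quad \text{for all } t,\ k .
\]
```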
Nonstationarity – problems • Real life: things change • Bad news: stationarity of a sample cannot be positively verified • The best answer we can get: ‘we found no evidence of the given type of nonstationarity’ • Some examples: • mean shift • polynomial deterministic trend • variance change
Example • Change in the number of users in a VoIP system • Model: load change in an M/G/∞ queue • Sample ACF suggests very high correlation • slow decay? • long-range dependence?
Example • The changepoint detection procedure we developed allows us to separate the parts with different load • There is no significant correlation in either of these parts • The sample ACF does not estimate the ACF when the data are nonstationary
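The effect can be reproduced with a minimal sketch. It is not the authors' M/G/∞ experiment: as a simplifying assumption, each load level is modelled by independent Poisson counts, and the means (100 and 150), the segment length and the helper `sample_acf` are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, using the global sample mean."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[:len(x) - k], d[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
n = 2000                        # samples per load level (assumed)
low = rng.poisson(100, n)       # independent counts at the lower load
high = rng.poisson(150, n)      # independent counts at the higher load
series = np.concatenate([low, high])   # one mean shift in the middle

acf_full = sample_acf(series, 50)      # computed across the mean shift
acf_low = sample_acf(low, 50)          # computed within each part
acf_high = sample_acf(high, 50)
band = 1.96 / np.sqrt(n)               # approximate 95% band for white noise

print("lag-50 ACF, full series:", acf_full[50])
print("max |ACF| at lags >= 1, low part:", np.abs(acf_low[1:]).max())
print("max |ACF| at lags >= 1, high part:", np.abs(acf_high[1:]).max())
print("white-noise band:", band)
```

The full-series ACF stays large at every lag because the global mean sits between the two load levels, so most pairs of observations deviate from it in the same direction. Within each part the values are near the white-noise band.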
Changepoint detection • A window of 50 samples is presented to the detection procedure • Then add the newest observation, drop the oldest, and repeat the detection procedure • In this example, the true change occurs in window number 51 • Changepoint detection works well – see the output of 500 experiments
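The sliding-window scheme can be sketched as below. The slides do not specify the detection statistic, so this sketch swaps in a generic one: the largest two-sample t-statistic over all split points of the window. The threshold of 6, the minimum segment length, the Gaussian data and the change location are hypothetical values chosen for illustration, not the authors' settings.

```python
import numpy as np

WINDOW = 50        # window length, as on the slide
THRESHOLD = 6.0    # hypothetical alarm level; would be calibrated in practice

def max_split_statistic(window, min_seg=5):
    """Largest absolute two-sample t-statistic over all split points."""
    w = np.asarray(window, dtype=float)
    n = len(w)
    best = 0.0
    for k in range(min_seg, n - min_seg + 1):
        left, right = w[:k], w[k:]
        # Pooled within-segment variance for this candidate split.
        pooled_var = (np.sum((left - left.mean()) ** 2)
                      + np.sum((right - right.mean()) ** 2)) / (n - 2)
        se = np.sqrt(pooled_var * (1.0 / k + 1.0 / (n - k)))
        best = max(best, abs(right.mean() - left.mean()) / se)
    return best

def rolling_alarms(x, window=WINDOW, threshold=THRESHOLD):
    """Slide the window one sample at a time; return statistics and alarm indices."""
    stats = np.array([max_split_statistic(x[i:i + window])
                      for i in range(len(x) - window + 1)])
    return stats, np.flatnonzero(stats > threshold)

rng = np.random.default_rng(1)
change_at = 100    # 0-based index of the first post-change sample (assumed)
x = np.concatenate([rng.normal(0.0, 1.0, change_at),
                    rng.normal(3.0, 1.0, 100)])

stats, alarms = rolling_alarms(x)
print("first window containing post-change data:", change_at - WINDOW + 1)
print("first alarming window (0-based):", alarms[0] if alarms.size else None)
```

Window indices here are 0-based and count from the first sample, which may differ from the numbering convention behind "window number 51" on the slide.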
Changepoint detection • However, if we add a deterministic trend, things go wrong • Observe the high false-alarm ratio after polluting the data with a trend
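A small experiment illustrates the problem. It uses a deliberately simplified stand-in detector (a t-test comparing the two halves of each window), not the authors' procedure; the slope of 0.1 per sample, the threshold of 3.5 and the function name `half_split_t` are assumptions made for this sketch. The data contain no changepoint at all, only noise with or without a linear trend.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def half_split_t(x, window=50):
    """t-statistic comparing the two halves of every sliding window."""
    w = sliding_window_view(np.asarray(x, dtype=float), window)
    h = window // 2
    left, right = w[:, :h], w[:, h:]
    # Pooled variance from the two within-half sums of squares.
    pooled_var = (left.var(axis=1, ddof=1) * (h - 1)
                  + right.var(axis=1, ddof=1) * (window - h - 1)) / (window - 2)
    se = np.sqrt(pooled_var * (1.0 / h + 1.0 / (window - h)))
    return (right.mean(axis=1) - left.mean(axis=1)) / se

rng = np.random.default_rng(2)
n = 500
noise = rng.normal(0.0, 1.0, n)          # stationary data, no changepoint
trended = noise + 0.1 * np.arange(n)     # same noise plus a linear trend (assumed slope)

threshold = 3.5                          # hypothetical alarm level
rate_flat = np.mean(np.abs(half_split_t(noise)) > threshold)
rate_trend = np.mean(np.abs(half_split_t(trended)) > threshold)

print("alarm rate without trend:", rate_flat)
print("alarm rate with trend:   ", rate_trend)
```

Every alarm in either series is a false alarm, since no mean shift was inserted. The trend makes the second half of each window systematically higher than the first, which a mean-shift test cannot distinguish from a genuine change.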
Work in progress • Real VoIP data from an Italian service provider and aggregated IP data from a Spanish university backbone network • Current research: estimate and remove the trend from the traffic • Only then apply the changepoint detection procedure(s)
Work in progress • Trend estimation methods: • moving average? • kernel/wavelet smoothing? • parametric methods? • time series regression? • How to judge whether the estimated trend is really significant? • Models other than M/G/∞?
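Two of the candidate methods listed above can be sketched in a few lines: a moving average and a least-squares linear regression on time, with a t-statistic for the slope as one possible significance check. The function names, the slope of 0.02 and the window width are hypothetical. The slope t-statistic assumes independent residuals, which correlated traffic data may violate, so it is likely to overstate significance there.

```python
import numpy as np

def moving_average_trend(x, width):
    """Moving-average trend; 'valid' mode avoids biased edges but shortens the output."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def linear_trend(x):
    """Least-squares line through the series; returns slope, intercept, slope t-statistic."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n, dtype=float)
    slope, intercept = np.polyfit(t, x, 1)     # highest-degree coefficient first
    resid = x - (intercept + slope * t)
    sigma2 = resid @ resid / (n - 2)           # residual variance
    se_slope = np.sqrt(sigma2 / np.sum((t - t.mean()) ** 2))
    return slope, intercept, slope / se_slope

rng = np.random.default_rng(3)
n = 500
flat = rng.normal(0.0, 1.0, n)                 # no trend
trended = flat + 0.02 * np.arange(n)           # linear trend with assumed slope

slope_t, intercept_t, tstat_t = linear_trend(trended)
slope_f, intercept_f, tstat_f = linear_trend(flat)
detrended = trended - (intercept_t + slope_t * np.arange(n))
smooth = moving_average_trend(trended, 25)

print("trended series: slope", slope_t, "t-statistic", tstat_t)
print("flat series:    slope", slope_f, "t-statistic", tstat_f)
```

The `detrended` residuals are what would then be passed to a changepoint detector. A line fitted over the whole series can also absorb part of a genuine mean shift, so the choice of trend estimator affects what the detector can still see.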
Conclusions • Different types of nonstationarity may severely distort statistical tests and the values of estimators • Even if we try to detect one type of nonstationarity, another type may ruin our original test • We always have to pay attention to the assumptions of the theorems used • Share your experience!