
Trends, seasonality and anomalies: making your time-series talk

Trends, seasonality and anomalies: making your time-series talk. Wladimir J. Alonso, Fogarty International Center / NIH. Goals of this talk: learn how to extract the basic components of epidemiological relevance from a time-series.


Presentation Transcript


  1. Trends, seasonality and anomalies: making your time-series talk. Wladimir J. Alonso, Fogarty International Center / NIH

  2. Goals of this talk • Learn how to extract the basic components of epidemiological relevance from a time-series • Learn how to explore the spatial patterns of those components • Introduce the modeling tool Epipoi (www.epipoi.info)

  3. But before this…

  4. A parenthesis for “Graphical Excellence” • well-designed presentation of interesting data: a matter of substance, statistics and design • consists of complex ideas communicated with clarity, precision and efficiency • is nearly always multivariate • requires telling the truth about the data • provides the viewer with the greatest number of ideas in the shortest time with the least ink in the smallest space (Edward Tufte, 1983)

  5. Napoleon's Retreat from Moscow, 1812, by Illarion Pryanishnikov

  6. Charles Joseph Minard (1869): Losses suffered by Napoleon's army in the Russian campaign of 1812. “It may well be the best statistical graphic ever drawn” (Edward R. Tufte, 1983)

  7. First: organize your dataset in a meaningful way. [Table: a typical mortality dataset]

  8. Structured spreadsheet as a source of instantaneous analysis • Time in a meaningful sequence • Variables in a meaningful sequence: age groups, causes of death, longitude, latitude…

  9. So you can plot in this way: trends, anomalies, seasonality and even spatial patterns can be seen. Alonso et al 2011 Spatio-temporal patterns of diarrhoeal mortality in Mexico. Epidemiol. Infect.

  10. We can use this display to see the shift in the timing of RSV circulation in São Paulo city and its implications for the period of palivizumab immunoprophylaxis. Paiva et al 2012 JMV

  11. And then we can use a different plot for displaying the epidemiologic and putative explanatory series Paiva et al 2012 JMV

  12. In fact, sometimes a simple organization of data in space can generate all the information we need! This is a quick example of how we found that (surprisingly!) the Northern Hemisphere timing of the vaccine would be more efficient for Brazil than the current Southern Hemisphere timing. Mello et al 2010 PLoS One

  13. Influenza viruses isolated monthly from 1999 to 2007 in Belém and São Paulo, plotted exactly at their time of collection [Panels: Belém; São Paulo]. Mello et al (2010)

  14. Now we overlap the Southern and Northern Hemisphere recommendations

  15. And count first the matches obtained with the Southern Hemisphere recommendation… 11 matches

  16. And compare with the matches if the Northern Hemisphere timing of the vaccine and composition were applied 24 matches!

  17. Part 1: How to extract the basic components of epidemiological relevance from a time-series?

  18. Brazilian dataset of deaths coded as pneumonia and influenza. We are going to extract as much information as possible from this series

  19. Brazilian dataset of deaths coded as pneumonia and influenza • Example of analyses performed in Schuck-Paim et al 2012 Were equatorial regions less affected by the 2009 influenza pandemic? The Brazilian experience. PLoS One. • Data source: Department of Vital Statistics from the Brazilian Ministry of Health

  20. Series to be analyzed. A typical epidemiological time series, from which to obtain as many meaningful and useful parameters as possible

  21. Average. Many times this information is all we need! [Legend: mortality at time t]

  22. Average. But it still leaves much of the variation of the series (the “residuals”) unexplained. The first unexplained pattern seems to be an imbalance between the two ends of the series [Legend: mortality at time t]

  23. Linear trend • Better now!

  24. Trend (linear). We can use this information (e.g. is the disease increasing or decreasing? But then the data need to be incidence) [Legend: mortality at time t; mean mortality; linear trends]
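
The step from the average to a linear trend (slides 21 to 24) can be sketched as an ordinary least-squares fit. This is only an illustration: the series below is synthetic (it is not the Brazilian dataset), and the variable names are assumptions made for the example.

```python
import numpy as np

# Synthetic monthly series (hypothetical data, not the Brazilian dataset).
rng = np.random.default_rng(0)
t = np.arange(120)                                  # 10 years of monthly time points
mortality = 100 + 0.3 * t + rng.normal(0, 5, t.size)

# Slide 21: the average alone.
mean_level = mortality.mean()

# Slide 24: least-squares straight line (polyfit returns the highest power first).
slope, intercept = np.polyfit(t, mortality, deg=1)
linear_trend = intercept + slope * t

# Residuals left unexplained by each description of the series.
resid_mean = mortality - mean_level
resid_linear = mortality - linear_trend

print(f"slope per month: {slope:.3f}")
print(f"residual SD, mean only:    {resid_mean.std():.2f}")
print(f"residual SD, linear trend: {resid_linear.std():.2f}")
```

The slope is the "number" that can be compared across sites; the drop in residual spread shows how much of the imbalance between the two ends of the series the trend absorbs.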

  25. Trend (with quadratic term too) • Better definition • It gets more complicated as a parameter to be compared across time-series • But better if our purpose is to eliminate the temporal trend [Legend: mortality at time t; quadratic trends]

  26. Getting rid of the trend Blue line: “detrended series”
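
A minimal sketch of "getting rid of the trend", assuming a quadratic polynomial fitted by least squares and then subtracted. The data are synthetic and all names are hypothetical.

```python
import numpy as np

# Synthetic monthly series with a curved trend plus an annual cycle (hypothetical data).
rng = np.random.default_rng(1)
t = np.arange(120)
seasonal = 10 * np.cos(2 * np.pi * t / 12)
mortality = 100 + 0.5 * t - 0.003 * t**2 + seasonal + rng.normal(0, 3, t.size)

# Fit mean + linear + quadratic terms in a single least-squares step.
coeffs = np.polyfit(t, mortality, deg=2)
trend = np.polyval(coeffs, t)

# The detrended series is whatever the trend does not explain.
detrended = mortality - trend

print(f"mean of detrended series: {detrended.mean():.2e}")
print(f"SD before detrending: {mortality.std():.2f}, after: {detrended.std():.2f}")
```

The detrended series keeps the seasonal rhythm and the anomalies, which are the next components to describe.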

  27. But let’s keep the graphic of the original series for illustrative purposes. Clearly, there are still other interesting epidemiological patterns to describe… [Legend: mortality at time t; mean mortality; linear and quadratic trends]

  28. We can see some rhythm… • The blocks of residuals alternate cyclically • Therefore this is something that can be quantified using few parameters [Legend: mortality at time t; mean mortality; linear and quadratic trends]

  29. Jean Baptiste Joseph Fourier(1768 –1830)

  30. The Fourier theorem states that any periodic waveform can be reproduced by the superposition of a series of sine and cosine waves. As an example, the following Fourier expansion of sine waves provides an approximation of a square wave. Source: http://www.files.chem.vt.edu/chem-ed/data/fourier.html
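
The square-wave example can be reproduced numerically. The sketch below uses the standard expansion (4/π) Σ sin(kx)/k over odd k; the function name and the grid are assumptions made for the illustration.

```python
import numpy as np

def square_wave_partial_sum(x, n_terms):
    """Sum of the first n_terms odd sine harmonics of a unit square wave."""
    ks = 2 * np.arange(n_terms) + 1                 # odd harmonics 1, 3, 5, ...
    waves = np.sin(np.outer(ks, x)) / ks[:, None]   # one row per harmonic
    return (4 / np.pi) * waves.sum(axis=0)

# Grid over one period, offset by half a step to avoid the jump points exactly.
x = (np.arange(1000) + 0.5) * 2 * np.pi / 1000
target = np.sign(np.sin(x))                         # the square wave itself

errors = {}
for n in (1, 3, 10, 50):
    approx = square_wave_partial_sum(x, n)
    errors[n] = np.sqrt(np.mean((approx - target) ** 2))
    print(f"{n:3d} harmonics: RMS error {errors[n]:.3f}")
```

Each added harmonic reduces the overall error, although the overshoot next to the jumps never fully disappears.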

  31. Fourier decomposition • The periodic variability of the monthly mortality time-series is partitioned into harmonic functions. • By summing the harmonics we obtain what can be considered as an average seasonal signature of the original series, where year-to-year variations are removed but seasonal variations within the year are preserved. • This method is not always appropriate when dealing with complex population time series, since it cannot take into account the often-observed changes in the periodic behavior of such series (i.e., they are not “stationary”).
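
One common way to partition the periodic variability into harmonics is a least-squares regression on cosine and sine pairs alongside the trend terms. The sketch below assumes that approach; it is not necessarily how Epipoi implements it, and the data, the helper `harmonic_design` and the chosen periods are hypothetical.

```python
import numpy as np

def harmonic_design(years, periods, trend_degree=2):
    """Design matrix: polynomial trend plus a cosine/sine pair per period (in years)."""
    cols = [years**d for d in range(trend_degree + 1)]
    for p in periods:
        cols.append(np.cos(2 * np.pi * years / p))
        cols.append(np.sin(2 * np.pi * years / p))
    return np.column_stack(cols)

# Synthetic monthly series: trend + annual cycle (peak mid-year) + semi-annual cycle.
rng = np.random.default_rng(2)
years = np.arange(120) / 12.0
mortality = (100 + 2.0 * years
             + 10 * np.cos(2 * np.pi * (years - 0.5))
             + 4 * np.cos(4 * np.pi * years)
             + rng.normal(0, 2, years.size))

X = harmonic_design(years, periods=[1.0, 0.5])      # annual and semi-annual
beta, *_ = np.linalg.lstsq(X, mortality, rcond=None)

# Summing only the harmonic columns gives the average seasonal signature.
n_trend = 3                                         # constant, linear, quadratic
seasonal_signature = X[:, n_trend:] @ beta[n_trend:]
annual_amplitude = np.hypot(beta[3], beta[4])

print(f"estimated annual amplitude: {annual_amplitude:.2f}")
```

Because the signature is built from fixed harmonics, it repeats identically every year, which is exactly why year-to-year variation is removed.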

  32. Before modeling cycles: …so, as a reminder, these are the residuals before Fourier [Legend: mortality at time t; mean mortality; linear and quadratic trends]

  33. … and now with the incorporation of the annual harmonic [Legend: mortality at time t; annual harmonic; mean mortality; trends]

  34. … or with the semi-annual harmonic only? [Legend: mortality at time t; semi-annual harmonic; mean mortality; trends]

  35. Much better when the annual + semi-annual harmonics are considered together! [Legend: mortality at time t; annual and semi-annual harmonics; mean mortality; trends]

  36. Although there is not much difference when the quarterly harmonic is added… [Legend: mortality at time t; periodic (seasonal) components; mean mortality; trends]
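
The visual comparison in slides 33 to 36 can be made numeric by comparing the variance explained by each set of harmonics. This is a sketch on synthetic data; the helper names are hypothetical, and taking the "quarterly" harmonic to have a period of a quarter of a year is an assumption.

```python
import numpy as np

def design(years, periods):
    """Constant + linear trend + one cosine/sine pair per period (in years)."""
    cols = [np.ones_like(years), years]
    for p in periods:
        cols.append(np.cos(2 * np.pi * years / p))
        cols.append(np.sin(2 * np.pi * years / p))
    return np.column_stack(cols)

def r_squared(X, y):
    """Fraction of the variance of y explained by a least-squares fit on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

# Synthetic series with annual and semi-annual cycles only (hypothetical data).
rng = np.random.default_rng(4)
years = np.arange(120) / 12.0
y = (100 + 10 * np.cos(2 * np.pi * years) + 4 * np.cos(4 * np.pi * years)
     + rng.normal(0, 2, years.size))

models = {
    "annual": [1.0],
    "semiannual": [0.5],
    "annual+semiannual": [1.0, 0.5],
    "annual+semiannual+quarterly": [1.0, 0.5, 0.25],
}
r2 = {name: r_squared(design(years, periods), y) for name, periods in models.items()}
for name, value in r2.items():
    print(f"{name:30s} R^2 = {value:.3f}")
```

A harmonic that adds almost nothing to the explained variance, like the quarterly one here, is a candidate for leaving out.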

  37. Average seasonal signature of the original series • We have therefore obtained the average seasonal signature of the original series (where year-to-year variations are removed but seasonal variations within the year are preserved) • Now, let’s extract some parameters of interest (remember, we always need a “number” to compare, for instance, across different sites)

  38. Timing and Amplitude [Plot: average seasonal signature of the original series]
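
For a single annual harmonic written as a·cos(2πt) + b·sin(2πt), with t in years, timing and amplitude follow directly from the two coefficients. The sketch below assumes that parameterization and, as a further assumption, expresses amplitude as a percentage of the mean; the function name is hypothetical.

```python
import math

def peak_timing_and_amplitude(a, b, mean_level):
    """Peak month and relative amplitude of a*cos(2*pi*t) + b*sin(2*pi*t), t in years.

    The harmonic equals A*cos(2*pi*t - phase), so it peaks when 2*pi*t = phase.
    """
    amplitude = math.hypot(a, b)                     # A = sqrt(a^2 + b^2)
    phase = math.atan2(b, a) % (2 * math.pi)         # radians in [0, 2*pi)
    peak_month = 12 * phase / (2 * math.pi)          # 0 = start of January
    relative_amplitude = 100 * amplitude / mean_level
    return peak_month, relative_amplitude

# Hypothetical coefficients: a pure sine term peaks a quarter of a year in.
month, amp = peak_timing_and_amplitude(a=0.0, b=5.0, mean_level=100.0)
print(f"peak at month {month:.1f}, amplitude {amp:.1f}% of the mean")
```

When several harmonics are summed, the major peak no longer has this closed form and is usually located numerically on the seasonal signature.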

  39. Variations in relative peak amplitude of pneumonia and influenza coded deaths with latitude (p < 0.001) [Plot: latitude (degrees, 5 to -35) against amplitude of the major peak (%, 0 to 90)] Alonso et al 2007 Seasonality of influenza in Brazil: a traveling wave from the Amazon to the subtropics. Am J Epidemiol

  40. The seasonal component was found to be most intense in southern states, gradually attenuating towards central states (15°S) and remaining low near the Equator (p < 0.001) [Plot: latitude (degrees, 5 to -35) against amplitude of the major peak (%, 0 to 90)]

  41. Variations in peak timing of influenza with latitude (p < 0.001) [Plot: latitude (degrees, 5 to -35) against phase of the major peak (months of the year, January to December)]

  42. Peak timing was found to be structured spatio-temporally: annual peaks were earlier in the north, and gradually later towards the south of Brazil (p < 0.001) [Plot: latitude (degrees, 5 to -35) against phase of the major peak (months of the year, January to December)]

  43. Such results suggest southward waves of influenza across Brazil, originating from equatorial and low population regions and moving towards temperate and highly populous regions in ~3 months (p < 0.001) [Plot: latitude (degrees, 5 to -35) against phase of the major peak (months of the year, January to December)]

  44. But can we still improve the model? Yes, and in some cases we should, mostly to model excess estimates (e.g. in a pandemic year) [Legend: mortality at time t; periodic (seasonal) components; mean mortality; trends]

  45. Residuals after excluding “atypical” (i.e. pandemic) years from the model. To define what is “normal” it is necessary to exclude the year that we suspect might be “abnormal” from the model
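
Excluding the suspect year can be sketched by fitting the model on the remaining months only and then predicting the baseline for every month. The data, the size of the simulated pandemic bump and all names are hypothetical.

```python
import numpy as np

# Synthetic monthly series with a linear trend and an annual cycle (hypothetical data).
rng = np.random.default_rng(3)
months = np.arange(120)
years = months / 12.0
X = np.column_stack([np.ones_like(years), years,
                     np.cos(2 * np.pi * years), np.sin(2 * np.pi * years)])
observed = 100 + 1.0 * years + 10 * np.cos(2 * np.pi * years) + rng.normal(0, 2, months.size)

# Pretend the last year is the pandemic year and add 15 extra deaths per month.
pandemic = months >= 108
observed[pandemic] += 15

# Fit once on everything and once with the suspect year left out.
beta_all, *_ = np.linalg.lstsq(X, observed, rcond=None)
beta_base, *_ = np.linalg.lstsq(X[~pandemic], observed[~pandemic], rcond=None)

baseline = X @ beta_base                            # "normal" expectation for every month
excess_naive = (observed - X @ beta_all)[pandemic].mean()
excess_clean = (observed - baseline)[pandemic].mean()

print(f"mean monthly excess, pandemic year included in fit: {excess_naive:.1f}")
print(f"mean monthly excess, pandemic year excluded:        {excess_clean:.1f}")
```

When the pandemic year is left in the fit, the model bends towards it and part of the excess is absorbed into the "normal" baseline.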

  46. OK, so now we can count what the impact of the pandemic was here, right?

  47. No! (Unless you consider all the other anomalies to be pandemics, and anti-pandemics…) That is why we need to include the usual residual variance in the model, and calculate excess BEYOND usual variation
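
One way to count only what lies beyond usual variation is to set a threshold above the baseline using the spread of the residuals in the reference (non-pandemic) months. The sketch below assumes a multiplier of 1.96 standard deviations and counts excess only in months above the threshold; both choices, the function name and the toy numbers are assumptions made for the illustration.

```python
import numpy as np

def excess_beyond_usual_variation(observed, baseline, is_reference, z=1.96):
    """Return the upper threshold and the per-month excess above the baseline.

    Excess is counted only in months where the observation exceeds
    baseline + z * SD, with SD taken from residuals of the reference months.
    """
    observed = np.asarray(observed, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    is_reference = np.asarray(is_reference, dtype=bool)

    residuals = observed - baseline
    usual_sd = residuals[is_reference].std(ddof=1)  # usual residual variation
    threshold = baseline + z * usual_sd
    excess = np.where(observed > threshold, residuals, 0.0)
    return threshold, excess

# Toy example: flat baseline of 100, six reference months, two suspect months.
baseline = np.full(8, 100.0)
observed = np.array([98, 102, 100, 101, 99, 100, 120, 102], dtype=float)
is_reference = np.array([True] * 6 + [False] * 2)

threshold, excess = excess_beyond_usual_variation(observed, baseline, is_reference)
print(f"threshold: {threshold[0]:.2f}")
print(f"total excess: {excess.sum():.1f}")
```

Only the month at 120 clears the threshold; the month at 102 is within the usual scatter and contributes nothing, which is the difference between an anomaly and ordinary noise.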
