
δαίμωνες

The Devil in ATM Data: Size, stationarity, and other δαίμωνες. Massimiliano Zanin.


Presentation Transcript


  1. The Devil in ATM Data: Size, stationarity, and other δαίμωνες (demons). Massimiliano Zanin

  2. 螃蟹挖洞根據他們的砲彈的大小 (Crabs dig holes according to the size of their shells)

  3. Richard Ernest Bellman: Curse of dimensionality. "Given the large search space, for every desired significance a hypothesis can be found." (A. Zimek, E. Schubert, and H. P. Kriegel, 2012)
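The quote on this slide can be illustrated with a small simulation. The sketch below is not from the presentation: it assumes 30 instances and 500 features of pure noise (both numbers are arbitrary), and it uses a Fisher z-transform approximation for the p-value instead of an exact test. With nothing but noise, some features should still pass a 0.05 significance threshold.

```python
import math
import random
from statistics import NormalDist

N_INSTANCES = 30   # assumed: few observations
N_FEATURES = 500   # assumed: many candidate features
ALPHA = 0.05


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r via the Fisher z-transform (normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def count_spurious_hits(seed=0):
    """Count noise features that look 'significantly' correlated with a noise target."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(N_INSTANCES)]
    hits = 0
    for _ in range(N_FEATURES):
        feature = [rng.gauss(0, 1) for _ in range(N_INSTANCES)]
        if approx_p_value(pearson_r(feature, target), N_INSTANCES) < ALPHA:
            hits += 1
    return hits


hits = count_spurious_hits()
print(f"{hits} of {N_FEATURES} pure-noise features pass alpha = {ALPHA}")
```

Roughly 5% of the noise features are expected to pass, which is why a large feature space combined with few instances makes "significant" findings cheap.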

  4. Richard Ernest Bellman: Curse of dimensionality

  5. Richard Ernest Bellman: Curse of dimensionality

  6. Richard Ernest Bellman: Curse of dimensionality. J. Cohen, 1962: all papers in a volume of the Journal of Abnormal and Social Psychology. Average power: .18 for small effects, .48 for medium effects, .83 for large effects. Half of the results are not significant!
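To make "power" concrete, here is a minimal sketch of how it depends on effect size. It is an illustration only, not Cohen's calculation: it uses a normal approximation for a two-sided, two-sample comparison, the commonly quoted benchmark effect sizes 0.2, 0.5 and 0.8, and an arbitrarily assumed 30 observations per group, so the printed values are not meant to reproduce the numbers on the slide.

```python
import math
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation).

    effect_size is the standardized mean difference between the two groups.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection region under the alternative.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Assumed benchmark effect sizes and an assumed sample size of 30 per group.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label:6s} effect (d = {d}): power ~ {approx_power(d, 30):.2f}")
```

With a fixed, small sample, power drops quickly as the effect gets smaller, which is the pattern the slide reports.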

  7. John P. A. Ioannidis, Why Most Published Research Findings Are False. Main reasons behind false results: small studies (small sample size); high number of relationships; financial and other interests; hot scientific fields.

  8. What about ATM? Two main problems: high feature space dimension; low number of instances.

  9. What about ATM? Two main problems: high feature space dimension; low number of instances. Feature selection techniques.

  10. What about ATM? Two main problems: high feature space dimension; low number of instances. Analyze events leading to these instances. Get more data!

  11. Divide et impera (Divide and rule)

  12. Simpson's Paradox [plot: Feature 2 vs. Feature 1]. Two groups of events, each one associated with a positive correlation between f1 and f2.

  13. Simpson's Paradox [plot: Feature 2 vs. Feature 1]. Two groups of events, each one associated with a positive correlation between f1 and f2. A spurious negative correlation appears when considering both groups.
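The situation on this slide can be reproduced with synthetic data. The sketch below is an assumption-laden illustration (the group ranges, offsets and noise level are invented, not taken from the presentation): within each group f2 rises with f1, but the group with larger f1 sits at lower f2, so pooling the groups flips the sign of the correlation.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)


def make_group(f1_low, f1_high, offset, n=200):
    """Hypothetical group: f2 = f1 + offset + noise, so f1 and f2 correlate positively."""
    f1 = [rng.uniform(f1_low, f1_high) for _ in range(n)]
    f2 = [x + offset + rng.gauss(0, 0.5) for x in f1]
    return f1, f2


a_f1, a_f2 = make_group(0, 4, 10)    # low f1, high f2
b_f1, b_f2 = make_group(8, 12, -8)   # high f1, low f2

r_a = pearson_r(a_f1, a_f2)
r_b = pearson_r(b_f1, b_f2)
r_all = pearson_r(a_f1 + b_f1, a_f2 + b_f2)

print(f"group A: r = {r_a:+.2f}")
print(f"group B: r = {r_b:+.2f}")
print(f"pooled : r = {r_all:+.2f}")
```

The pooled coefficient is negative even though both groups show a positive relationship, so the grouping variable has to be known before the correlation can be interpreted.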

  14. C. R. Charig et al., Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy. 2 treatments: A - strong surgical intervention; B - percutaneous nephrolithotomy. 2 groups: small stones; large stones.

  15. C. R. Charig et al., Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy

  16. C. R. Charig et al., Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy
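The kidney stone study is a standard worked example of Simpson's paradox. The transcript does not include the numbers, so the counts below are an assumption: they are the success counts usually quoted from Charig et al. (1986) and should be checked against the paper before being reused. The sketch compares success rates per stone size and overall.

```python
# (successes, patients) per treatment and stone size.
# Assumed figures, as commonly quoted from the paper; not present in the slides.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    """Success rate for one treatment in one group."""
    return successes / patients


def overall_rate(treatment):
    """Success rate for one treatment with both stone sizes pooled."""
    groups = outcomes[treatment].values()
    return sum(s for s, _ in groups) / sum(n for _, n in groups)


for size in ("small", "large"):
    for treatment in ("A", "B"):
        r = rate(*outcomes[treatment][size])
        print(f"{size:5s} stones, treatment {treatment}: {r:.1%}")

for treatment in ("A", "B"):
    print(f"all stones,   treatment {treatment}: {overall_rate(treatment):.1%}")
```

With these counts, treatment A does better within each stone-size group while treatment B looks better once the groups are pooled, because A was applied mostly to the harder large-stone cases.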

  17. οὐ λέγουσι τὸ διὰ τί περὶ οὐδενός ("... these do not tell us why a thing is so")

  18. W. M. Briggs, Global Warming Increases Disastrous Music: A Scientific Paper

  19. W. M. Briggs, Global Warming Increases Disastrous Music: A Scientific Paper. Raising temperature might be boiling brains. Global warming will drive the population mad through an intense barrage of awful, atonal, crude, childishly limited and harmful popular music.

  20. Our safety research. Input data: daily average delays. Target: number of safety events. p-value < 0.01; classification score > 61%.

  21. Our safety research. Input data: daily operations (in the previous day). Target: number of safety events. p-value < 0.001; classification score > 83%.

  22. Πάντα ῥεῖ καὶ οὐδὲν μένει (Everything flows, nothing stands still)

  23. Stationarity: the properties of a system do not depend explicitly on time. Mean and variance do not change over time or position.

  24. Stationarity: the properties of a system do not depend explicitly on time. Mean and variance do not change over time or position.

  25. Stationarity: the properties of a system do not depend explicitly on time. Mean and variance do not change over time or position.
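One crude way to probe the definition above is to cut a series into consecutive windows and compare the mean and variance of each. The sketch below is only that, a rough check on synthetic data and not a formal stationarity test. Both series are invented for illustration: one is plain Gaussian noise, the other has a mean and spread that grow with time.

```python
import random
from statistics import mean, pvariance


def window_stats(series, n_windows=4):
    """Return (mean, variance) for each of n_windows consecutive chunks."""
    size = len(series) // n_windows
    stats = []
    for i in range(n_windows):
        chunk = series[i * size:(i + 1) * size]
        stats.append((mean(chunk), pvariance(chunk)))
    return stats


rng = random.Random(1)

# Hypothetical series: constant mean and variance.
stationary = [rng.gauss(0, 1) for _ in range(2000)]
# Hypothetical series: mean and standard deviation both grow with time t.
drifting = [rng.gauss(0.002 * t, 1 + 0.001 * t) for t in range(2000)]

for name, series in [("stationary", stationary), ("drifting", drifting)]:
    print(name)
    for m, v in window_stats(series):
        print(f"  mean = {m:+.2f}, variance = {v:.2f}")
```

For the stationary series the window statistics stay close to each other; for the drifting one they change from window to window, which means a model fitted on early data describes a different system than the one seen later.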

  26. Rara avis in terris nigroque simillima cygno (A rare bird in the lands, very much like a black swan)

  27. Black swan: an event that was unprecedented and unexpected at the point in time it occurred. However, after evaluating the surrounding context, domain experts can usually conclude: "it was bound to happen". (The Black Swan, N. N. Taleb, 2007)

  28. Black swan. Problem: inside vs. outside (a probability distribution).

  29. Black swan. Inside vs. outside; unpredictable vs. predictable; same mechanisms vs. new processes; black swan vs. Dragon-King.

  30. Black swan. Mid-air collision <> loss of separation; Eyjafjallajökull (2010) <> day with extreme delays; runway incursion <> wrong communication. Event aggregation can be dangerous!

  31. In summary: sample size, grouping, causality, stationarity, extreme events.

  32. In summary: know your data. mz@innaxis.org
