Last time…
1. slippery spaces
• First-order effects = context
• Objects break rules
2. granularity grief
• More data ≠ more information
• Sensitivity varies with measure (see the sketch after this list)
3. defective definitions
• Missed pattern ≠ bad algorithm, but bad definition
4. delusive dwarves
• Early work based on small samples – need to revisit
5. baffling bias
• Are all penguins equal?
• Very large datasets still contain bias
6. sinful simulations
• Are patterns a function of model parameters?
• Simulation ≠ validation
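The granularity point is easy to demonstrate numerically. Below is a minimal sketch, using hypothetical simulated data rather than anything from the studies mentioned: the same track is re-measured at progressively coarser sampling rates, and the computed path length (a common movement measure) changes substantially with granularity alone.

```python
import numpy as np

def path_length(xy):
    """Total length of a trajectory given as an (n, 2) array of fixes."""
    return np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1))

rng = np.random.default_rng(42)

# Hypothetical trajectory: a random walk sampled at high frequency.
steps = rng.normal(0.0, 1.0, size=(10_000, 2))
xy = np.cumsum(steps, axis=0)

# Re-measure the same path, keeping only every k-th fix.
for k in (1, 10, 100):
    print(f"every {k:>3}th fix: path length = {path_length(xy[::k]):.1f}")
```

Coarser sampling cuts the corners of the path, so the measured length drops sharply even though the underlying movement is identical: the value of the measure is partly an artefact of the sampling granularity.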
Since 2008 there’s been lots of progress, but…
• We’re all still working more or less independently
• There’s very little reuse of existing work
• A number of initiatives (MOVE, MPA’10) have discussed possible benchmark data and their characteristics
• They’ve made a good start on defining benchmark types and the desirable characteristics of such data…
Open problem
• With Bettina, Judy Shamoun-Baranes and Daniel Weiskopf, I’m organising a workshop at the Lorentz Centre
• We’ll have a data challenge as part of that workshop
• I’d like to discuss, on the basis of the state of the art, what realistic benchmark problems and definitions are, and how we can compare results