Spring Conference 2008: Widening Horizons in Acoustic Research
ANASE: Lessons from ‘Unreliable Findings’
Peter Brooker, Cranfield University, Cranfield, UK
ANASE – What went Wrong? • In late 2007, the ANASE (Attitudes to Noise from Aviation Sources in England) report was published: ANASE claimed that people are increasingly annoyed by aircraft noise, and it estimated how much they would be willing to pay to get rid of it • But its quantitative findings ‘were rejected as unreliable’ by the Department for Transport [DfT] • The project’s managers were warned early on that the work would fail to deliver good value for money and would not meet accepted technical/statistical standards • How and why did it fail? What were the methodological failings? What are the lessons for acoustics professionals?
Background • ANASE project initiated by DfT in 2001: its aim was to explore the relationship between aircraft noise and annoyance, and its monetary valuation via Stated Preference [SP] • The focus here is the annoyance component, intended to update the 1980s Aircraft Noise Index Study (ANIS) • ANIS concluded that no metric correlated better with community annoyance, as measured in social surveys, than the noise energy measure Leq • Government adopted Leq to describe aircraft noise, and decided to use 57 dBA Leq (16-hour period) as the onset of significant community annoyance
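As an aside on the Leq measure: a minimal sketch, with illustrative values (not ANIS/ANASE data), of how the 16-hour Leq aggregates individual aircraft events from their per-event sound exposure levels (SEL):

```python
import math

def leq_16h(sel_values_db):
    """16-hour equivalent continuous sound level (LAeq,16h) computed
    from per-event sound exposure levels (SEL, dBA re 1 second)."""
    period_s = 16 * 3600  # 16-hour day in seconds
    energy = sum(10 ** (sel / 10) for sel in sel_values_db)
    return 10 * math.log10(energy / period_s)

# Illustrative only: 200 overflights, each with SEL 82 dBA, give
# roughly the 57 dBA Leq,16h annoyance-onset value
print(round(leq_16h([82.0] * 200), 1))  # 57.4
```

Because event energies are summed, halving the number of identical events lowers Leq by about 3 dB – the trade-off at the heart of the ‘weight’ dispute below.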
2002 Warnings about ANASE • The warnings: the project would not be robust, nor technically reliable, nor capable of withstanding scrutiny, nor good value for money; and it would be a source of considerable vulnerability for DfT • Thus, ANASE outputs would be poor value for the taxpayer, residents near airports, and the aviation industry • DfT heeded neither these warnings nor the recommendation to use an independent expert audit team (to get the work back on the right track and ensure that the results would command the widest possible confidence)
2007 ANASE – Sadly, much Worse • Time over-run: the ANASE specification said ‘in the order of three years’, but it took five years eleven months: +97% • Over-spend: the original budget was £0.53 million, but actual spend on ANASE (to January 2008) was £1.78 million: +236% • The Peer Reviewers did not endorse the ANASE claims on aircraft noise annoyance – its claims were disregarded in the Consultation Document on Heathrow Airport • ANASE’s outputs failed in three important ways: annoyance, SP valuations, and confusion
Dramatic ANASE Aircraft Annoyance Claims • Claim: “For the same amount of aircraft noise, measured in Leq, people are more annoyed in 2005 than they were in 1982” • Claim: “The results from the attitudinal work and the SP analysis both suggest that Leq gives insufficient weight to aircraft numbers, and a relative weight of 20 appears more supportable from the evidence than a weight of 10, as implied by the Leq formulation” • But such dramatic claims need to be backed up by good evidence
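To make the ‘weight’ point concrete: for N identical events, the Leq formulation above reduces to a constant plus 10·log10(N), so doubling aircraft numbers adds about 3 dB; ANASE’s claimed weight of 20 would make doubling add about 6 dB. A minimal sketch (the index_value helper and the 82 dBA level are illustrative assumptions, not from the report):

```python
import math

def index_value(mean_event_level_db, n_events, weight):
    """Noise index of the form L + weight * log10(N): weight=10
    matches the Leq trade-off, weight=20 matches the ANASE claim."""
    return mean_event_level_db + weight * math.log10(n_events)

for n in (100, 200, 400):
    print(n,
          round(index_value(82.0, n, weight=10), 1),  # +3 dB per doubling
          round(index_value(82.0, n, weight=20), 1))  # +6 dB per doubling
```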
ANASE’s Main Design Bias • When using social surveys, it is critical to avoid marked ‘context effects’, which distort results • For the SP work, ANASE included a ‘foreshadowed laboratory experiment’: noise playback equipment was installed and calibrated in respondents’ homes about 20 minutes before the survey • Two kinds of interviews: • Full: noise playback equipment for SP • Restricted: no equipment – a crude version of the ANIS baseline • If the playback equipment affects people’s normal responses, then the Full data are contaminated
ANASE’s Main Design Bias – Consequences? • If there were no context effects, then no differences would be observed between the Full and Restricted data • If the Full and Restricted data sets show marked differences, then context effects are crucial, so: • Full data are distorted, ‘not fit for purpose’ • Full & Restricted data cannot be combined statistically • Full data are not comparable to ANIS • The basis for the ‘claims’ vanishes • The burden on the ANASE contractors was to eliminate design biases and/or demonstrate that they were not statistically significant
ANASE: Fails Heathrow Data Bias Test • [Chart] Statistically significant difference at the 5% level
ANASE – Failed ‘Heathrow Bias’ Test • Carried out at the instigation of the peer reviewers – who were not shown the raw data for a year – ?? • The ANASE contractors called this ‘extreme’ – to carry out a textbook statistical test on the best quality data available – ?? • What happens if one carries out: • textbook statistical tests • using textbook multiple regression analysis • on a standard international/UK metric for annoyance • and the actual ANASE data on Leq? • The answer is on the next slide
ANASE: Fails ‘All Data’ Bias Test • [Chart] Statistically significant difference at the 5% level
ANASE – ‘Full Dataset Bias’ Summary • The Full and Restricted data are not consistent – by textbook statistical testing • The Full and Restricted regression lines have significantly different slopes – an approximate multiplicative bias • Full data cannot reliably be used to model the annoyance relationship with noise • Full data are not comparable with ANIS, so cannot be used to indicate changes over time • Design bias effects explain the failure to match international data and the ‘implausible’ SP results [peer reviewers] • The evidence for the dramatic claims is not there
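For readers unfamiliar with the slope-comparison test behind the second bullet, a minimal sketch: ordinary least squares with a group × Leq interaction term, run here on synthetic stand-in data (the actual ANASE variables and raw data are not reproduced):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two interview types:
# Full (playback equipment installed) vs Restricted (no equipment)
leq = rng.uniform(45, 70, size=400)           # exposure, dBA Leq
full = (rng.random(400) < 0.5).astype(float)  # 1 = Full interview
annoy = 0.10 * leq + 0.05 * full * leq + rng.normal(0, 0.8, size=400)

# OLS with a group x Leq interaction: a significant interaction
# coefficient means the Full and Restricted slopes differ
X = sm.add_constant(np.column_stack([leq, full, full * leq]))
fit = sm.OLS(annoy, X).fit()
print(fit.pvalues[3])         # p-value for the slope difference (x3)
print(fit.pvalues[3] < 0.05)  # True -> slopes differ at the 5% level
```

A multiplicative bias of the kind summarised above shows up precisely as a non-zero interaction coefficient, rather than as a simple shift in the intercept.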
Good Contracts – General Comments • Obvious key ingredients are: • Competent project managers • Competent contractors • Good quality technical advice • It is vital for the project managers to detect: • Improper award of contract – de-scoping the specification to suit a particular contractor, a biased contract evaluation process • Contract not delivered properly – ineffective monitoring, no effective auditing programme • Contract cost over-runs – ‘lowballing’ by the contractor, ineffective monitoring
Lesson – Get Expert Technical Advice Early • There are right and wrong times to get independent technical advice about a project: the right time is very early, when pilot studies are being designed and analysed, so the work avoids obvious traps/design biases • Fool’s Gold: the ‘Frobisher lesson’ • Martin Frobisher, the 16th-century English explorer, made three voyages to northern Canada, bringing back increasingly large amounts of ‘gold’ ore – 1,100 tons on the last trip • But it was Fool’s Gold – iron pyrites • Frobisher should have taken advice from a reliable metallurgist at the outset (density!)
Lesson: Professional Approach is Vital • The basic things need to be done sensibly, whilst focusing on the goal of meeting the project specification: reputable textbooks and accessible governmental guidance on attitude testing, social science methodology, multiple regression, etc • The aim must be to eliminate or reduce the potential for serious technical/statistical biases, and then to try to uncover latent biases in the data • Aircraft noise estimates from computer models need to be validated against appropriate field data collections
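A minimal sketch of the kind of validation check meant in the last bullet, assuming paired modelled and measured Leq values at monitoring sites (the numbers are illustrative, not ANASE data or real model output):

```python
import math

# Paired modelled vs measured Leq values (dBA) at monitoring sites
modelled = [57.2, 60.1, 54.8, 63.5, 58.9]
measured = [56.5, 61.0, 55.9, 62.8, 60.2]

errors = [m - f for m, f in zip(modelled, measured)]
bias = sum(errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(f"mean bias {bias:+.1f} dB, RMSE {rmse:.1f} dB")
# A large bias or RMSE would mean the modelled contours should not be
# relied on for the annoyance analysis without correction
```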
Don’t just mix Annoyance and SP • The SP element of the study distorted the annoyance survey, which in turn distorted the SP results • The noise research literature has many examples showing that laboratory experiments change people’s disturbance reactions • ANASE Executive Summary: • “Overall, therefore, we do not think that the valuations from either [ANASE SP] method are safe, and it will probably be necessary to rely on sources based on Hedonic Pricing” • So don’t mix annoyance and SP unless there is confidence that bias/distortion effects are eliminated or controlled
ANASE: Heartbreaking – Lessons? • ANASE’s problems were predicted, and its failure to produce cost-effective outputs was preventable • The study failed to achieve the DfT goal of ‘substantial research that commands the widest possible confidence’; and its claims added confusion to the then-current Heathrow Consultation process • The study’s dramatic claims collapse because of design biases: failures to meet textbook statistical tests on the Heathrow/All raw data, results out of line with international data, and ‘implausible’ SP results • Valuable lessons are to be learned from these failings regarding aircraft noise study specifications, contracts, design methodology and data analysis