This course explores the difficulties in turning policy into practice through the development or evolution of real administrative systems. It analyzes case histories to understand the reasons behind project failures and emphasizes key concepts in software engineering, project management, and information economics to identify strategies for successful implementation.
Systems MPP Ross Anderson
Aims • Introduce students to why turning policy into practice by developing or evolving real administrative systems is hard • Illustrate what goes wrong with case histories • Understand why many projects are late, or fail altogether (and public sector particularly bad) • Study basic ideas from software engineering, project management and information economics as a guide to how mistakes can be avoided
Objectives • By the end of the course you should appreciate why policy implementation often involves the construction, outsourcing or modification of complex information systems, and frequently fails. • You should be more able to avoid magical thinking about systems. • You should also be aware of some of the ways in which changing technological possibilities can change the regulatory landscape.
Objectives (2) • You should appreciate the leading causes of project failure • You should understand the waterfall, spiral and agile models of system development • You should appreciate the reasons why outsourcing information systems can be complex, including technical lock-in, contracting practices and the diseconomies of scale.
Resources • Recommended reading: • SW Thames RHA, ‘Report of the Inquiry into the London Ambulance Service’ • C Shapiro, H Varian, ‘Information Rules’ • A King, I Crewe, ‘The Blunders of our Governments’ • Additional reading: • H Thimbleby, ‘Improving safety in medical devices and systems’ • W Curtis, H Krasner, N Iscoe, ‘A field study of the software design process for large systems’ • F Brooks, ‘No Silver Bullet’
Assessment • About 6 groups of about 4 people will each analyse a system problem and come up with a detailed report • Imagine that you are each representing a different department or other stakeholder • You will be expected to argue, divide up the work into units manageable by team members, conduct the research, and combine the individual members’ contributions into a coherent report • Limit: 30 pages, 12 point type
Assessment (2) • In addition, each student will separately write a short briefing for your minister (1 or at most 2 pages) that summarises the contents of the report and sets out options for decision. • You will receive a group mark out of 60 for the report (of which 40 marks will be allocated for the paper and 20 for the presentation). • You will also get an individual mark out of 40 for the briefing paper.
Topics • The DWP programme to introduce Universal Credit • The introduction of smart meters, in the UK and elsewhere • The implementation of Obamacare • The regulation of medical device safety
Topics (2) • The UK Investigatory Powers Bill • The regulation of autonomous vehicles • Suggest a topic (3 or 4 volunteers) • Preferences by email to me today please; teams will be allocated tomorrow and reports are due by noon on February 28th
Outline of Course • Initial lecture: Jan 14 • Guest lectures: • Nick Hunn, smart meter expert, Jan 21 • Tom Loosemore, ex-Cabinet Office, Feb 11 • Veronica Marshall, ex-NAO, Feb 18 • (we might add one more) • Present your own project work: Mar 3rd
The ‘Software Crisis’ • Software lags far behind the hardware’s potential! • Many large projects fail in that they’re late, over budget, don’t work well, or are abandoned (LAS, NPfIT, DWP …) • Some failures cost lives (medical devices) or cause large material losses (NPfIT) • Some cause expensive scares (Y2K) • Some combine the above (LAS)
The London Ambulance Service System • Commonly cited example of project failure because it was thoroughly documented (and the pattern has been frequently repeated) • Attempt to automate ambulance dispatch in 1992 failed conspicuously with London being left without service for a day • Hard to say how many deaths could have been avoided; estimates ran as high as 20 • Led to CEO being sacked, public outrage
Original System • 999 calls written on paper tickets; map reference looked up; conveyor to central point • Controller deduplicates tickets and passes to three divisions – NW / NE / S • Division controller identifies vehicle and puts note in its activation box • Ticket passed to radio controller • This all takes about 3 minutes and 200 staff of 2700 total. Some errors (esp. deduplication), some queues (esp. radio), call-backs tiresome
Project Context • Attempt to automate in 1980s failed – system failed load test • Industrial relations poor – pressure to cut costs • Public concern over service quality • SW Thames RHA decided on fully automated system: responder would email ambulance • Consultancy study said this might cost £1.9m and take 19 months – provided a packaged solution could be found. AVLS would be extra
The Manual Implementation • [Diagram: the manual despatch workflow. Call taking (Control Assistant, Incident Form), resource identification (Resource Allocators, Map Book, Incident Form'), resource mobilisation (Despatcher, Radio Operator, Incident Form'') and resource management (Resource Controller, Allocations Box)]
Dispatch System • Large • Real-time • Critical • Data rich • Embedded • Distributed • Mobile components • [Diagram: the despatch worksystem (call taking, resource identification, resource mobilisation, resource management) acting on the despatch domain]
Bid process • Idea of a £1.5m system stuck; idea of AVLS added; proviso of a packaged solution forgotten; new IS director hired • Tender 7/2/1991 with completion deadline 1/92 • 35 firms looked at tender; 19 proposed; most said timescale unrealistic, only partial automation possible by 2/92 • Tender awarded to consortium of Systems Options Ltd, Apricot and Datatrak for £937,463 – £700K cheaper than next lowest bidder!
First Phase • Design work ‘done’ July • Main contract signed in August • LAS told in December that only partial automation possible by January deadline – front end for call taking, gazetteer, docket printing • Progress meeting in June had already minuted a 6 month timescale for an 18 month project, a lack of methodology, no full-time LAS user, and SO’s reliance on ‘cozy assurances’ from subcontractors
The Goal • [Diagram: the intended CAD system. Call taking, resource identification, resource mobilisation and resource management, supported by a computer-based gazetteer, a resource proposal system, AVLS and a mapping system, with a single operator]
From Phase 1 to Phase 2 • Server never stable in 1992; client and server lockup • Phase 2 introduced radio messaging – blackspots, channel overload, inability to cope with ‘established working practices’ • Yet management decided to go live 26/10/92 • CEO: “No evidence to suggest that the full system software, when commissioned, will not prove reliable” • Independent review had called for volume testing, implementation strategy, change control … It was ignored! • On 26 Oct, the room was reconfigured to use terminals, not paper. There was no backup…
LAS Disaster • 26–27 October vicious circle: • system progressively lost track of vehicles • exception messages scrolled up off screen and were lost • incidents held as allocators searched for vehicles • callbacks from patients increased, causing congestion • data delays → voice congestion → crew frustration → pressing wrong buttons and taking wrong vehicles → many vehicles sent to an incident, or none • slowdown and congestion leading to collapse • Switch back to semi-manual operation on 26th and to full manual on Nov 2 after crash
Collapse • Entire system descended into chaos: • e.g., one ambulance arrived to find the patient dead and taken away by undertakers • e.g., another answered a ‘stroke’ call after 11 hours, 5 hours after the patient had made their own way to hospital • Some people probably died as a result • Chief executive resigns
What Went Wrong – Spec • LAS ignored advice on cost and timescale • Procurers insufficiently qualified and experienced • No systems view • Specification was inflexible but incomplete: it was drawn up without adequate consultation with staff • Attempt to change organisation through technical system • Ignored established work practices and staff skills
What Went Wrong – Project • Confusion over who was managing it all • Poor change control, no independent QA, suppliers misled on progress • Inadequate software development tools • Ditto technical comms, and effects not foreseen • Poor interface for ambulance crews • Poor control room interface
What Went Wrong – Go-live • System commissioned with known serious faults • Slow response times and workstation lockup • Software not tested under realistic loads or as an integrated system • Inadequate staff training • No back up • Loss of voice comms
NHS National Programme for IT • Like LAS, an attempt to centralise power and change working practices • Earlier failed attempt in the 1990s • The February 2002 Blair meeting • Five LSPs plus a bundle of NSP contracts: £12bn • Most systems years late and/or don’t work • Changing goals: PACS, GPSoC, … • Inquiries by PAC, HC; Database State report … • Coalition government: NPfIT ‘abolished’ • See case history written by 2014 MPP students!
Topic 1 – Universal Credit • Idea: unify hundreds of welfare benefits and mitigate poverty trap by tapered withdrawal • Was supposed to go live Oct 2013! • General: big systems take 7 years not 3 • They hoped ‘agile’ development would fix this … • Needed real-time feed of tax data from HMRC • Descended into chaos; NAO report • IDS looking for yet another project manager (7th?)
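A minimal sketch of what 'tapered withdrawal' means, for readers who want to see the arithmetic. All names and figures here (`tapered_award`, an award of 400, an allowance of 100, a taper of 0.65) are illustrative assumptions, not the actual Universal Credit rules. With a taper of 1.0 every extra pound earned is clawed back in full, which is the poverty trap; with a lower taper, working more always leaves the claimant better off.

```python
def tapered_award(max_award: float, earnings: float,
                  allowance: float, taper: float) -> float:
    """Benefit remaining after withdrawing `taper` per unit earned above `allowance`.

    All parameters are hypothetical illustration values, not real benefit rates.
    """
    excess = max(0.0, earnings - allowance)
    return max(0.0, max_award - taper * excess)


def net_income(max_award: float, earnings: float,
               allowance: float, taper: float) -> float:
    """Earnings plus whatever benefit survives the taper."""
    return earnings + tapered_award(max_award, earnings, allowance, taper)


if __name__ == "__main__":
    for taper in (1.0, 0.65):
        low = net_income(400, 100, 100, taper)
        high = net_income(400, 300, 100, taper)
        print(f"taper={taper}: net at 100 earned = {low}, net at 300 earned = {high}")
```

The hard part of the real programme was not this formula but feeding it accurate, timely earnings data, which is why the real-time HMRC feed mattered.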
Topic 2 – Smart Meters • Idea: expose consumers to market prices, get peak demand shaving, make use salient • EU Electricity Directive 2009: 80% by 2020 • Labour 2009: £10bn centralised project to save the planet and help fix supply crunch in 2017 • Coalition government: need big deployment by next election in 2015! So the plan: build central system Mar–Sep 2013 (then: Sep 2014 …) • Contracts tendered while spec still fluid… • Still floundering as some of the tech now obsolete • Ontario project similar; went way over budget and failed to save any energy
Topic 3 – Obamacare • Affordable Care Act – Obama’s big project • Key website to provide an insurance market • The Act made the spec fragile to legal challenge and obstructive state governors • Responsibility for making it all work was divided; everyone could pass the buck • Rescued by heroics from geek volunteers who’d helped Obama get elected in 2008
Topic 4 – medical device safety • Some industries, such as cars and aerospace, make steady progress on safety • Incentives are well aligned there – accidents are visible, drivers/pilots don’t want to die and insurance statistics are good • But this is not the case everywhere • Research by Harold Thimbleby: in the UK, hospital safety usability failures kill about 2000 p.a. (about the same as road accidents) • [Figure: Leipzig]
Where the problem’s being fixed • [Figure: Leipzig]
Managing Complexity • Engineering is about managing complexity at a number of levels • At the micro level, bugs arise in protocols etc because they’re hard to understand • As programs get bigger, interactions between components grow more than linearly • … • With complex socio-technical systems, we can’t predict reactions to new functionality • Most failures of really large systems are due to wrong, changing, or contested requirements
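One simple way to see why interactions grow 'more than linearly': count the pairs. The sketch below assumes, as a simplification not made in the slides, that any two components might interact; the function name `pairwise_interactions` is hypothetical. Real systems are usually sparser than this, but interactions among three or more components push the other way.

```python
def pairwise_interactions(n: int) -> int:
    """Number of distinct pairs among n components: n(n-1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Ten times the components gives roughly a hundred times the pairs.
    for n in (10, 100, 1000):
        print(n, pairwise_interactions(n))
```

Going from 10 to 100 components raises the number of potential pairs from 45 to 4950.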
What’s to be done? • US FDA has two engineers, who have to process 500 applications a year • Not allowed to test actual devices • UK MHRA has no engineers at all • Also fails to publish clinical study reports that are unhelpful to vendors • Has retired chief scientists from drug, appliance makers as nonexecs …
Topic 5 – the IP Bill • Draft Investigatory Powers Bill is the hot UK tech policy topic right now • Snowden alerted us to intelligence abuses • US government set up NSA review group, reformed FISA, cut data retention… • UK government just wants to legalise it all • Industry, NGOs, not happy; EU courts ... • What policy might be implementable?
Topic 6 – autonomous vehicles • Self-driving cars are almost here • Drones are already causing safety and privacy problems • No government really confident here! • FAA has cautious rules on drones; Nevada allows driverless cars • How should the UK work given the EU single market, including liability rules?
Topic 7 – choose your own • Many public-sector project failures are reported • You can choose your own, provided it’s likely to lead to a decent case study • Key criterion: is there enough information publicly available? • Most failures (e.g. Olympic overspend, Addenbrookes IT meltdown) hushed up
Nineteenth Century • Charles Babbage, ‘On Contriving Machinery’ • “It can never be too strongly impressed upon the minds of those who are devising new machines, that to make the most perfect drawings of every part tends essentially both to the success of the trial, and to economy in arriving at the result”
1960s – The Software Crisis • In the 1960s, large powerful mainframes made even more complex systems possible • People started asking why project overruns and failures were so much more common than in mechanical engineering, shipbuilding… • ‘Software engineering’ was coined in 1968 • The hope was that we could get things under control by using disciplines such as project planning, documentation and testing
How is Software Different? • Large systems become qualitatively more complex, unlike big ships or long bridges • The tractability of software leads customers to demand ‘flexibility’ and frequent changes • Thus systems also become more complex to use over time as ‘features’ accumulate • The structure can be hard to visualise or model • Feature interactions are just like the loopholes that proliferate in complex tax codes or regulations
Structured Design • The only practical way to build large complex programs is to chop them up into modules • Sometimes task division seems straightforward (bank = tellers, ATMs, dealers, …) • Sometimes it isn’t (tax and welfare systems interact) • Sometimes it just seems to be straightforward • There are several common methodologies • US DoD specifies the ‘waterfall model’
The Waterfall Model • Requirements → Specification → Implementation & Unit Testing → Integration & System Test → Operations & Maintenance
The Waterfall Model (2) • Requirements are written in the user’s language • The specification is written in system language • There can be many more steps than this – system spec, functional spec, programming spec … • The philosophy is progressive refinement of what the user wants • Warning – when Winston Royce published this in 1970 he cautioned against naïve use • But it became a US DoD standard …
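The Waterfall Model (3) • Requirements → Specification → Implementation & Unit Testing → Integration & System Test → Operations & Maintenance • [Diagram: the same five stages with feedback arrows, labelled ‘validate’ at Specification and at Implementation & Unit Testing, and ‘verify’ at Integration & System Test and at Operations & Maintenance]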