
Embracing the future - a retrospective look

Embracing the future - a retrospective look. Michael Jones, OPS-G Forum, 5th September 2008. Contents: 1. “Governance”; 2. Change can be slower than you think!; 3. FFP contracts – the magic bullet?; 4. The Black Swan: the improbable in operations; 5. Software Dependability.


Presentation Transcript


  1. Embracing the future - a retrospective look Michael Jones OPS-G Forum 5th September 2008

  2. Contents 1. “Governance” 2. Change can be slower than you think! 3. FFP contracts – the magic bullet? 4. The Black Swan: the improbable in operations 5. Software Dependability Embracing the Future: a Retrospective Look - 5th Sept. 2008 - M. Jones OPS-GD

  3. Governance • “The use of institutions, structures of authority and even collaboration to allocate resources and coordinate or control activity [in society or the economy].” (Wikipedia)

  4. Governance in the SSA Programme • What is the SSA Programme? • Provides a systematic capability for surveillance of man-made objects in the space around the Earth; • Provides warnings of collisions that may endanger space activities or even life on Earth. • Governance = making decisions on how the programme and deployed assets are to be run.

  5. Data Systems Governance: Data Systems Task Force The Data Systems Task Force (DSTF) must: • “Ensure the availability of adequate strategy and plans for the mission data infrastructure and monitor the execution of those plans in order to ensure timely availability of the mission data infrastructure.” • This means that the DSTF in effect carries out governance of the data systems infrastructure.

  6. Conclusions on Governance • Governance is a buzzword that you will continue to hear! • Recent or emerging examples are: • Establishment of the ESA Security Office; • Software licence governance for ESA and Third Party Software. • Mike Jones’s proposed definition of “governance” to fit its usage in ESA: “The process of making decisions, the oversight of the results of those decisions and also the oversight of organisations or structures of authority for decision making.”

  7. Change can be slower than you think: Example 1: SCOS-2000 • SCOS-2 (which became SCOS-2000) was a new MCS infrastructure developed from scratch. • A very brief summary of the timeline of the project up to 2002: • Start of project as SCOS-2: 1992; • Version 1 (as SCOS-2) used for Huygens, MTP and Teamsat: late 1997; • Re-engineering of SCOS-2 (mainly TC chain): 1997-1998; • Parallel production of architectural designs for both SCOS-1 and SCOS-2 baselines for the Integral MCS: 2nd half of 1998; • Adoption of SCOS-2 as the Integral MCS baseline: January 1999; • Integral was the first major ESA science spacecraft based on SCOS-2; • SCOS-2 renamed SCOS-2000: 2000; • Supported INTEGRAL LEOP: 17th October 2002, using SCOS-2000 rel. 2.3. • So it took 10 years to reach the point at which the new infrastructure became generally accepted – original plan was 5 years.

  8. Change can be slower than you think: Example 1: SCOS-2000 - Conclusion • Developing a new mission control system infrastructure from scratch is difficult and time-consuming. • First lesson – try to avoid building new MCS infrastructures – “evolution, not revolution”. • Second lesson – if you have to do it, develop a simple version first.

  9. Change can be slower than you think: Example 2: Intel/Linux MCS Infrastructure • In 2001, it was decided to port SCOS-2000 to Linux. • This was straightforward: by 2002 a SCOS-2000 version was available which could run on either SUN Solaris platforms or on Linux. • Outside ESOC, S2K became popular as a licensable product and, with one exception, external (non-ESOC) projects using SCOS-2000 have been based on Linux. • At ESOC, the move to the Linux version proceeded cautiously in two stages: • 1. a pilot project with Linux server and SUN clients (Herschel Planck, S2K rel. 4); • 2. a Linux transition project to install Linux clients in all the common areas. • Stage 1 was successfully completed ca. 2006. Stage 2, started in 2007, has been completed for the MCR. • Intel workstations for the remaining common areas will be procured this year together with a reserve of spares. • We are now aiming at supporting Herschel Planck LEOP using the new Linux infrastructure installed by the LIT project.

  10. Change can be slower than you think: Conclusion – Linux • It has taken more than 6 years to reach the point of having a common Intel/Linux infrastructure. • Where you have a large installed park of workstations (ca. 1400 in this case) change is quite slow, since the missions already installed on the old platforms will not want to, or be able to, change.

  11. FFP Contracts – the Magic Bullet? • Before 1996 most software development at ESOC was done under fixed-unit price conditions. • Implications: • ESOC “owned the risk” for the software requirements and their implementation. • The contractor companies took no responsibility - they simply provided man-hours of staff. • In 1996 firm-fixed price (FFP) contracts for development of spacecraft control systems and simulators were introduced with the new frame contracts. • Prime motivation: • Move contract staff off-site to their own companies’ premises; • FFP regime much more suitable for off-site work. • FFP became the rule for most work awarded under these frame contracts, achieving: • Far more rigorous scrutiny of requirements by frame contractors; • Better competition; • Equitable risk sharing between ESA and its suppliers; • Formal change control (contract change notices - CCNs).

  12. FFP Contracts – the Magic Bullet? • Firm-fixed price contracts have been rather successful for MCS, simulator and station back-end software. But: • Firm Fixed Price does not mean Firm Fixed Schedule! • Contractor can underestimate the work to be done. • Recent example: Herschel Planck MPS, where the cheapest offer was taken and the contractor had underestimated the budget by a factor of nearly 10.

  13. FFP Contracts – The Magic Bullet? Conclusions • 1. The lowest acceptable offer may not always be the right choice, particularly if the schedule is important. • A careful evaluation of management plan and technical solution is needed to ensure that the schedule can be met. • 2. For schedule-critical developments, a look at more sophisticated techniques such as Earned Value Analysis may be needed.
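
The Earned Value Analysis mentioned in conclusion 2 can be sketched with its standard textbook indicators. This is a minimal illustration and is not taken from the presentation: the class name, the field names and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EarnedValueSnapshot:
    """Status of one work package at a reporting date (hypothetical figures)."""

    budget_at_completion: float  # BAC: agreed total budget for the work
    planned_fraction: float      # share of the work scheduled to be done by now
    earned_fraction: float       # share of the work actually completed by now
    actual_cost: float           # AC: cost incurred so far

    @property
    def planned_value(self) -> float:
        return self.budget_at_completion * self.planned_fraction

    @property
    def earned_value(self) -> float:
        return self.budget_at_completion * self.earned_fraction

    @property
    def schedule_performance_index(self) -> float:
        # SPI < 1 means less work has been done than was planned by this date.
        return self.earned_value / self.planned_value

    @property
    def cost_performance_index(self) -> float:
        # CPI < 1 means the completed work has cost more than budgeted.
        return self.earned_value / self.actual_cost

    @property
    def estimate_at_completion(self) -> float:
        # Simple projection assuming cost efficiency stays at the current CPI.
        return self.budget_at_completion / self.cost_performance_index


# Hypothetical work package: half the work was planned, a quarter is done.
snapshot = EarnedValueSnapshot(
    budget_at_completion=1000.0,
    planned_fraction=0.5,
    earned_fraction=0.25,
    actual_cost=400.0,
)
print(f"SPI = {snapshot.schedule_performance_index:.2f}")
print(f"CPI = {snapshot.cost_performance_index:.3f}")
print(f"EAC = {snapshot.estimate_at_completion:.0f}")
```

For a schedule-critical development, an SPI well below 1 early in the contract is the kind of warning such a technique is meant to give, long before the delivery date is missed.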

  14. Black Swans • “The Black Swan” - title of a book by Nassim Nicholas Taleb. • A black swan is a large-impact, hard-to-predict, and rare event beyond the realm of normal expectations. • The term comes from the ancient Western conception that 'All swans are white'.

  15. Black Swan: The Turkey Example

  16. The Black Swan: How we deal with the unexpected in operations Try to make operations fully predictable: • Plan and prepare ground segments very carefully. • Technically validate them thoroughly. • Prepare procedures and plans for operations. • Operationally validate them through an extensive simulations programme aimed at training all the teams and ensuring systems, documentation and operations staff all work together. • The operations validation also includes contingencies or anomaly cases to ensure the unexpected can be handled. This is the discipline of Operations Engineering.

  17. The Black Swan and Software • ESOC uses systems containing lots of software. • In the real world much software is complex - • no single person can understand it completely. • “Complex” in this case means “Big” - • complexity varies as a power of the size. • Behaviour of any complex software system cannot be fully understood - • highly improbable or “black swan” events may occur.

  18. Black Swan: The MCS Incident during the MSG-1 LEOP (28th August 2002) • A number of the client workstations in the MCR, PSR and SSR suddenly became unusable - went to the SUN login. • Softcoor logged into the A server from the SSR and restarted the system. • This appeared to work, but then • a SCOS-2000 communications task stopped processing on the server; • two telecommanding tasks (multiplexer and releaser) crashed. • Attempts to switch clients to the redundant B server also failed. • Fortunately in the meantime the spacecraft was safe - • despite the problems with the clients, telemetry was received and processed on both A and B servers. • Softcoor then took the decision to move to a third chain, the C-system. • He was then able to log out all clients on the A and B chains and to restart the servers on both of them. • The systems were made available to the flight control and project teams about 20 minutes later.

  19. Black Swan: MSG-1 Diagnosis and Conclusions Diagnosis: • The server had been started as a foreground task remotely from a SUN WS in the SSR – this created a dependency between the server and the SSR SUN WS. • For reasons unknown, this SUN had a problem and went to “login” status, resulting in the stopping of the server tasks started directly from this SUN. • There was an implementation error in the MISCdynamic server relating to CORBA event processing. Problem resolution: • Start the server as a background task. • Correct one CORBA call in the MISCdyn server. • A full explanation of everything that happened was not possible - for example, why the SSR SUN went to “login” in the first place - since the logs were inadequate.
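
The first resolution step, starting the server as a background task, removes the dependency on the workstation session that launched it. The slides do not say how this was done at ESOC; the sketch below only illustrates the general POSIX idea in Python, with a hypothetical helper name (`start_detached`) and a stand-in command in place of a real server.

```python
import os
import subprocess
import sys


def start_detached(command, log_path=os.devnull):
    """Start `command` in its own session, detached from the caller's terminal.

    Hypothetical helper for illustration (POSIX only). The child does not read
    from the launching terminal and writes its output to `log_path`, so the
    loss of the launching login session does not take the process down with it.
    """
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # no input from the launching terminal
            stdout=log,                 # output goes to a log file instead
            stderr=subprocess.STDOUT,
            start_new_session=True,     # child calls setsid(): new session
        )
    finally:
        log.close()  # the child keeps its own copy of the file descriptor


if __name__ == "__main__":
    # Stand-in for a real server process: sleeps briefly, then exits.
    server = start_detached([sys.executable, "-c", "import time; time.sleep(1)"])
    print("server pid:", server.pid, "session id:", os.getsid(server.pid))
    server.wait()
```

Writing the output to a real log file rather than the default null device also addresses the last bullet: an incident can only be explained afterwards if adequate logs exist.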

  20. MSG-1 Incident: Discussion • Problems in complex software cannot be excluded. • ESOC approach is very practical and sound: • a software coordinator thoroughly familiar with the system; • assisted by a very qualified software support team; • both fully involved in the sim campaign; • Ensured quick recovery in MSG case. • An operations engineering technique is applied to software engineering.

  21. Software Dependability • “Software dependability” seeks to quantify how much we can rely on a software system to function as required. • However, it is impossible with any reasonable effort to ensure there are no errors in a large software system, e.g. • SCOS-2000, which comprises several millions of lines of software code written since the mid-1990s. • There is a widespread misapprehension that it is possible to quantify the errors in computer code.

  22. Software Dependability: Example 1 – Misunderstanding Software Bugs • “Even if the tools are better, the number of bugs in newly written code has remained constant at around five per “function point”. . . Worse,. . . only about 85% of these bugs are eliminated before software is put into use.” [my underlining] (Economist Technology Quarterly, March 6, 2008) • You can measure the number of bugs found before putting the software into use; • But you cannot know how many bugs remain, unless the software is very simple.
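
The point of the last two bullets can be made with simple arithmetic: the number of bugs found is a measurement, whereas the number remaining follows only from an assumed removal rate. The sketch below uses a hypothetical bug count and a hypothetical function name; the 85% figure is the one quoted from the Economist, and the other two rates are arbitrary alternatives.

```python
def implied_remaining_bugs(bugs_found: int, removal_efficiency: float) -> float:
    """Bugs left in the code IF the assumed removal efficiency were true."""
    if not 0.0 < removal_efficiency <= 1.0:
        raise ValueError("removal_efficiency must be in (0, 1]")
    total_injected = bugs_found / removal_efficiency
    return total_injected - bugs_found


bugs_found = 4250  # hypothetical count of bugs fixed before release

# The same measurement yields very different "remaining" figures, depending
# entirely on an efficiency value that cannot itself be measured.
for assumed in (0.70, 0.85, 0.95):
    remaining = implied_remaining_bugs(bugs_found, assumed)
    print(f"assumed efficiency {assumed:.0%}: about {remaining:.0f} bugs remain")
```

The spread between the three results is the argument of the slide: quoting a single percentage hides the fact that the remaining-bug count is an assumption, not an observation.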

  23. Black Swan: Example 2 – Misunderstanding Software Bugs • It is impossible to demonstrate a negative proposition such as this: • e.g. no run-time errors. Absence of evidence is not evidence of absence. “The supplier shall verify the software code ensuring: . . . 7. absence of run-time errors; 8. absence of memory leaks . . .” (Source: ECSS-E-40C)

  24. Black Swan: Conclusion on Example 2 “There are known unknowns. That is to say, we know there are some things we do not know. But there are also unknown unknowns, the ones we don't know we don't know.” (Donald Rumsfeld, U.S. Secretary of Defense, 2001 to 2006)

  25. Software Dependability: Example 3 – Software Criticality • ECSS-E-40C puts tailoring according to software criticality in a normative annex. For example, the standard requires 100% path coverage testing for Class B criticality software. Critique: • 100% coverage testing is, in practice, impossible for very complex systems; • Even if you ensure 100% coverage testing, there is still no guarantee that the software is free from error. Discussion: • For on-board software it is reasonable that quite heavy measures are taken in development to ensure dependable software. • ECSS-E-40C: • Shows a very strong influence from on-board development practice. • Does not take into account the impact on ground software systems, which are typically much bigger and more complex.
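
Both points of the critique can be illustrated with a small sketch. It does not reproduce the coverage definitions of ECSS-E-40C; it assumes the simplest possible model, a sequence of independent two-way branches, and both function names are hypothetical.

```python
def path_count(independent_branches: int) -> int:
    """Number of distinct paths through a sequence of independent if-statements.

    Each two-way branch doubles the number of paths, so the count grows
    exponentially with the size of the code.
    """
    return 2 ** independent_branches


def mean(values):
    """Straight-line code: a single test exercises its only path."""
    return sum(values) / len(values)


# First critique: the number of paths to test explodes with size.
for n in (10, 20, 30):
    print(f"{n} branches -> {path_count(n)} paths")

# Second critique: full coverage is reached with one test, yet a fault remains.
assert mean([2, 4]) == 3.0
try:
    mean([])
except ZeroDivisionError:
    print("fully covered code still fails on an empty list")
```

Coverage measures which code the tests have executed, not which inputs the code can handle, which is why even complete coverage gives no guarantee of freedom from error.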

  26. Final Conclusions • You can: • Have good governance; • Develop your ground systems in a careful way, piloting new technology and taking plenty of time; • Ensure your industrial partners are fully motivated via competitive firm-fixed price contracts; • But you can still be hit by unexpected problems in operations, especially in complex software. • The way to successfully tackle these unpredictable anomalies or incidents is to have a skilled team, fully familiar with the software and fully involved in the sims campaign.

  27. Questions?
