GDAAS II Algorithm Implementation Review


  1. GDAAS II Algorithm Implementation Review

  2. Introduction • GAIA data analysis is understood to be a complex task: - Data volume: 20-50 TB of data over 5 yrs - 10^20 flop - Complexity: data ‘mixed’ in time and space due to the scanning motion, ... • Hipparcos approach (flat files/sequential processing) inappropriate • Major software engineering infrastructure required GAIA Data Analysis and Access System
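To give a feel for the figures on this slide, the sketch below converts them into average rates. It assumes (this is not stated in the slides) that the 10^20 flop would be spread evenly over the same 5-year span as the data, and that 1 TB = 10^12 bytes; the function and variable names are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def average_rate(total, years):
    """Average rate per second if `total` is spread evenly over `years`."""
    return total / (years * SECONDS_PER_YEAR)


# Totals come from the slide; the even spread over 5 years is an assumption.
flop_rate = average_rate(1e20, 5)    # flop per second
low_rate = average_rate(20e12, 5)    # bytes per second for 20 TB
high_rate = average_rate(50e12, 5)   # bytes per second for 50 TB

print(f"sustained compute: {flop_rate / 1e9:.0f} Gflop/s")
print(f"average data rate: {low_rate / 1e3:.0f}-{high_rate / 1e3:.0f} kB/s")
```

Under these assumptions the sustained compute rate is several hundred Gflop/s, while the average raw data rate is only a few hundred kB/s, which suggests the load is dominated by computation rather than by input bandwidth.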

  3. GDAAS Phase I Objectives • Define an efficient, scalable, maintainable and usable system for populating the GAIA mission database from the satellite data stream: • The GAIA Data Access and Analysis System should allow the reduction processes to be run optimally. • Data must be readily accessible in the time and spatial domains. Challenge: Establish the technical baseline concepts for the system on a realistic basis and prove the feasibility of the approach chosen for the reduction of the mission.

  4. Objectives • Investigate feasibility of an Object-Oriented Approach to Data Manipulation • Analyse an appropriate Data Storage, Processing and Access Concept • Simulate an end-to-end data processing environment for the Operational Phase • Devise and implement a mechanism to incorporate critical algorithms produced by the GAIA Science Community • Investigate Performance Issues of the System

  5. Consortium • Project team organisation: ESA/ESTEC; GMV (Prime Contractor); GMV Team, UB Team, CESCA Team • GMV: Management and software development • UB: Scientific support and customisation of the GAIA simulator • CESCA: Hardware infrastructure, processing power and on-line support.

  6. The Context • Objectivity was chosen as the Database Management System for the Study • Only the ASTRO Instruments were considered • A Data Model was developed, based on GAIA’s specifications • Some critical Algorithms were identified and implemented

  7. System specification • Definition of a satellite model • Definition of the DB data model • Processing requirements • Interface requirements

  8. Prototype Development • Design: - Design of the GAIA database - Data Model Refinement - Data Manipulation Layer - Design of the Processing Framework • Implementation: - Database Data Model and Database Manipulation Layer - Processing algorithms - Processing Framework • Testing: - Integration and Validation at CESCA premises

  9. System Evaluation • Based on three main processes: Ingestion, Cross Matching (M. Lattanzi) and GIS (L. Lindegren) • Data storage: sizing, overheads and access time • Distributed processing performance • Resource requirements (memory/disk/CPU) • Precision of the processed data

  10. Data processing structure: prototype [Diagram; labels: GASS simulator; Telemetry Stream; Ingestion & Initial data treatment (Raw Observations, Obs2Elem., Centroiding); Cross-Matching; The GAIA Database; Global Iterative Solution (Sources: Astrometric updating, Attitude Updating, Calibration, Global)]

  11. GDAAS Phase I Conclusions • The approach chosen has proved quite successful • O-O + UML tools demonstrated their advantages in the implementation of this complex system • Java has proved to be ideal for the problems posed by the system • The choice of the DBMS has been shown to be a key element • Getting good concurrent performance on ingestion and CM was an expensive task • GIS complexity was increased by the use of wrappers • Some Test results

  12. GDAAS Phase II

  13. Objectives • The objective of the Phase II study is to provide complete confidence in the overall GAIA data processing approach, identifying interfaces with all foreseen data reduction steps, implementing and testing an agreed package of algorithms provided by the wider GAIA community, and demonstrating scalability to a final data processing system. • Aims to deliver to ESA a processing environment capable of providing the basis for the data analysis system applicable to the GAIA mission data.

  14. Tasks (I) • Selection and recommendation of the processing hardware and software environment • Recommend programming language rules for critical components • Incorporate the changes in the baseline payload • Develop suitable data performance monitoring tools • Demonstration of scalability to the final data processing system • Definition of the overall data processing structure: “what algorithms go where”

  15. Tasks (II) • Establish priorities for algorithmic development: - Representative algorithms should be targeted in all major areas - Simplified modules may be defined, but with clear goals and outputs - Selected algorithms should be representative of algorithm and interface complexities - Identified algorithms should lead to confidence in the number of flop - Identification of the necessary algorithms should help demonstrate the feasibility of the GDAAS approach

  16. Tasks (III) • Propose and implement an overall configuration control strategy • Definition of a comprehensive test plan. • Definition and procurement of the storage and processing hardware

  17. The Role of the DPWG 1. Categorise each algorithm as Core, Shell or Quick Look. 2. Validate the context within which an algorithm is being proposed to fit the GAIA specifications. 3. Assess the scientific validity of the algorithm. 4. Ensure that all the information being provided is complete. 5. Check the completeness of each test case and the expected results.

  18. The role of the Configuration Control Board • Review all algorithms proposed by the DPWG in light of their implementation into GDAAS. • Classify the algorithms according to their criticality. • Schedule the implementation of a particular algorithm. • Verify the implementation and use of the required version of the identified GAIA Simulator components. • Select and implement a Configuration Control Tool for the GDAAS. • Review the source code rules. • Manage the GDAAS Library: a set of routines, readily available for use by the community. • Review the Interface Control Document for coding practices within GDAAS. • Monitor and review on a regular basis all algorithms being implemented. • Monitor all the test procedures accordingly. • Run a Risk Assessment dealing with GDAAS implementation issues.

  19. Algorithms to be delivered by July 2003 • Improved object matching • ASM handling • Fundamental algorithms • Improved attitude modelling (preliminary) • Improved instrument model • Chromaticity calibration (preliminary) • Astrometric binary star analysis • Planets • Photometric raw data treatment • Radial velocity cross correlation • Minor planet detection • Discrete source classifier (preliminary) • Single star parametrizer (preliminary)

  20. Algorithms to be delivered by Sept. 2003 • Visual double star analysis (preliminary) • Ingestion level classifier (preliminary)

  21. Algorithms to be delivered by April 2004 • Improved attitude modelling (update) • Detailed geometrical calibration • Chromaticity calibration (update) • CCD calibration (preliminary) • Additional global parameters • Visual double star analysis (update) • Science alerts • Radial velocity source detection • Radial velocity wavelength calibration • Multiple source detector (preliminary) • Single quasar parametrizer (preliminary)

  22. All these algorithms require simulations of a matching level of detail to be tested. The algorithm providers should in principle provide the appropriate modules, but it is the responsibility of the SWG to provide the basis for these simulations (GASS & GIBIS) and to ensure its implementation

  23. Algorithm SUBMISSION • Proceeds through the following steps: • Proposal by author/reviewer • DPWG Evaluation • CCB Evaluation • Implementation and Testing

  24. • Title: Name of the algorithm • Author: Name of the author providing the algorithm • Reviewer: Name of the reviewer (normally a Working Group leader) • Method: Details of how the algorithm is to be used: - frequency of execution - impact it may have on data (large/small modifications) - requirement of temporary disk space • Function: Explanation of what the algorithm accomplishes from a scientific standpoint and a brief high-level description of how it achieves it. • Assumptions: Scientific assumptions made during the design and implementation of the algorithm, including those concerning the values of constants

  25. Input Data: For each input for each step of an algorithm: • Correspondence: From the GDAAS Data Model or the necessary pre-processing described in pseudo-code. If the input is passed from a previous step, this should be stated explicitly. • Type: The type used in the algorithm and, if necessary, the format. • Units: The units used in the algorithm. • Range: When only a certain range of values is accepted. • Conditions: Special conditions, e.g. observations must be linked to a source (i.e. cross-matched) before their values can be used. Output Data: Analogous information to that provided for inputs. Details of the meaning of any output error codes should be provided in order to assist the implementation team during testing.
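The input description fields above map naturally onto a small record type. The following is a minimal sketch, not the GDAAS submission format itself: the class name, field names and the example entry are all hypothetical, and only the list of fields (correspondence, type, units, range, conditions) is taken from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InputDescriptor:
    """One input of one algorithm step, with the fields listed on the slide."""
    name: str
    correspondence: str   # data-model item, pre-processing, or previous step
    data_type: str        # type (and format, if needed) used in the algorithm
    units: str
    value_range: Optional[Tuple[float, float]] = None  # None: any value accepted
    conditions: str = ""  # e.g. "observation must be cross-matched"

    def accepts(self, value: float) -> bool:
        """Return True if `value` lies inside the declared range (inclusive)."""
        if self.value_range is None:
            return True
        low, high = self.value_range
        return low <= value <= high


# Hypothetical example entry; every value here is invented for illustration.
transit_time = InputDescriptor(
    name="transit_time",
    correspondence="hypothetical data-model attribute of an observation",
    data_type="float64",
    units="s",
    value_range=(0.0, 5 * 365.25 * 86400),
    conditions="observation must be linked to a source (cross-matched)",
)
```

Declaring the range in the descriptor lets the implementation team reject out-of-range inputs before an algorithm runs, for instance with `transit_time.accepts(value)`.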

  26. • Source Code: The actual file(s) containing the source code. All delivered code should be compliant with the appropriate language standards. • Test Data: Input and output data to verify that the implementation was successful. • More Details: - Algorithm Interface Control Document - GDAAS Algorithm Preparation Guidelines - GATT: http://gaia.esa.int/algorithms/

  27. How to access the database • A standard set of routines takes care of retrieving and updating the DB • The GDAAS team will take care of the wrappers needed to interface to the DB • Observations can be restricted by: - time interval - sky region (HTM) - instrument or range of CCDs
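The three kinds of restriction listed above can be illustrated with a small in-memory filter. This is only a sketch and not the GDAAS access routines, which the slide says are provided by the GDAAS team as wrappers over the DB: the `Observation` record, the `select` function and the sample data are hypothetical. It assumes HTM stands for Hierarchical Triangular Mesh and that sky regions are identified by trixel names, where a finer trixel's name extends the name of the coarser trixel that contains it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Observation:
    time: float     # observation time, arbitrary units
    htm_name: str   # HTM trixel containing the observation, e.g. "N0123"
    ccd: int        # CCD index


def select(observations: Iterable[Observation],
           time_interval: Optional[Tuple[float, float]] = None,
           htm_prefix: Optional[str] = None,
           ccd_range: Optional[Tuple[int, int]] = None) -> List[Observation]:
    """Keep observations passing every restriction given (None = unrestricted)."""
    selected = []
    for obs in observations:
        if time_interval is not None and not (
                time_interval[0] <= obs.time <= time_interval[1]):
            continue
        # A trixel lies inside a coarser one when its name starts with
        # the coarser trixel's name.
        if htm_prefix is not None and not obs.htm_name.startswith(htm_prefix):
            continue
        if ccd_range is not None and not (
                ccd_range[0] <= obs.ccd <= ccd_range[1]):
            continue
        selected.append(obs)
    return selected


# Invented sample data, for illustration only.
sample = [
    Observation(time=10.0, htm_name="N0123", ccd=3),
    Observation(time=20.0, htm_name="N0130", ccd=7),
    Observation(time=30.0, htm_name="S2001", ccd=3),
]
```

For example, `select(sample, htm_prefix="N01", ccd_range=(1, 5))` combines a sky-region and a CCD restriction; in a real system the same restrictions would be pushed down into the database query rather than applied in memory.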

  28. Simulations: • The GAIA simulations are the responsibility of the GAIA scientific community. • The telemetry stream for GDAAS is provided by GASS (GAIA System Simulator). • Algorithm providers must make sure that the complexity of their algorithm is matched by the simulations. If it is not, they should provide appropriate code to the SWG.

  29. Present status of GDAAS • Tests on the GAIA 1 design + Oracle are ongoing • Fully distributed HW: 1Tb, Oracle 9i RAC • Results: - Ingestion: up to 2.5 years already done - CM: running on 6 months of telemetry - GIS: redesigned, running on 12h of telemetry

  30. Present status of GDAAS • Design of the telemetry according to the new GAIA design • New Data Model in progress, according to the GAIA design and the existing algorithm information • System: design is ongoing, in preparation for the implementation of algorithms

  31. GAIA design / DB engine / Algorithms • GDAAS 1: Old (Red Book), only Astro / Objectivity / Ing. - CM - GIS (first approx.) • GDAAS 2 Phase 1: Old (Red Book), only Astro / Oracle 9i / Ing. - CM - GIS (first approx.) • GDAAS 2 Phase 2: New / Oracle 9i / Ing. - CM - GIS (improved) + Other

  32. SIMULATOR: SATELLITE MODEL • GAIA barycentric ephemeris: satellite orbit around the L2 point (provided by F. Mignard) • GAIA scan law: nominal scan law without perturbations (provided by L. Lindegren) • GAIA parameters: compilation of satellite and instrument parameters (UB-SWG-012) • Only the Astro instrument is considered.

  33. UNIVERSE MODEL: GALAXY MODEL • Torra et al. (Universitat de Barcelona) • Single stars • Double stars (provided by F. Arenou) • Variable stars: no actual light curves or period distribution implemented • Extinction model: F. Arenou • The Java structure makes it easy to implement other galaxy models; an implementation of the more realistic Besançon Galaxy Model (A. Robin et al.) is ongoing.

  34. Algorithm Implementation deadlines: • July, 1st • July, 7th: CCB meeting • December, 03: First set of algorithms implemented • December, 04: Completion of GDAAS2
