CERA / WDCC

Hannes Thiemann
Max-Planck-Institut für Meteorologie, Modelle und Daten
hannes.thiemann@zmaw.de
NCAR, October 27th – 29th, 2008

Contents
Statistics
Requirements + Features
General architecture
Implementation (current and new)
Migration
Summary

Basic Statistics
WDCC / CERA: general statistics at 01-10-2008 00:00:10
Database size (TByte): 370
Number of BLOBs: 6663287791 (6.6 billion)
Number of experiments: 1146
Number of datasets: 142062
Data access is by fields and not by files.
Total size divided by number of BLOBs gives the average size of a data access granule: 50 - 60 kB/BLOB
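The average granule size quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the two figures from the slide; whether "TByte" means 10^12 or 2^40 bytes is not stated, so both readings are computed, and the function name is hypothetical.

```python
# Figures taken from the statistics slide.
DATABASE_SIZE_TBYTE = 370
NUMBER_OF_BLOBS = 6_663_287_791


def average_blob_size_kb(size_tbyte, n_blobs, bytes_per_tbyte=10**12):
    """Average BLOB size in kB (1 kB = 1000 bytes).

    bytes_per_tbyte is an assumption: the slide does not say whether
    a TByte is decimal (10**12) or binary (2**40).
    """
    return size_tbyte * bytes_per_tbyte / n_blobs / 1000


decimal_kb = average_blob_size_kb(DATABASE_SIZE_TBYTE, NUMBER_OF_BLOBS)
binary_kb = average_blob_size_kb(
    DATABASE_SIZE_TBYTE, NUMBER_OF_BLOBS, bytes_per_tbyte=2**40
)
print(f"decimal TByte: {decimal_kb:.1f} kB/BLOB")
print(f"binary TByte:  {binary_kb:.1f} kB/BLOB")
```

With decimal terabytes the result falls inside the 50 - 60 kB range given on the slide; with binary terabytes it lands slightly above it.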

Users by continent
[Figure] Active users, 1-Jan-2008 until 14-Oct-2008

Download destinations
[Figure] Download destinations, 1-Jan-2008 until 14-Oct-2008

Records per download

Record size

Requirements and constraints
Access is over WAN.
Downloads are typically quite small, although some are huge.
Small downloads imply that users are not willing to wait long …
We cannot scan through large files for each download.
Granularity has to be small.

Datatypes
Model data
Climatological runs (global and regional) (IPCC, …)
Weather forecasts (DPHASE, CEOP, …)
Reanalysis data
Observational data (COPS, CARIBIC, …)
Satellite data products

Formats
CERA can store data of any format. The formats currently in use are:
GRIB (60%)
NetCDF (18%)
Other (22%)

General Architecture
[Diagram] Midtier, Data

General Architecture
[Diagram] Components: Webserver, Proxy, Appl. Server, Metadata, Data
Metadata blocks: Entry, Contact, Coverage, Reference, Status, Parameter, Spatial Reference, Distribution, Local Adm., Data Org, Data Access
Processing steps: Select timestep + region, Convert format

Storage within CERA
Database table holding the data of a single variable, one row per timestep, with an index:
1    Data of timestep i
2    Data of timestep i+1
3    Data of timestep i+2
…
n    Data of timestep i+n
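The layout on this slide, one table per variable with one BLOB row per timestep, can be sketched as follows. This is an illustration only: CERA uses Oracle, while the sketch uses SQLite from the Python standard library as a stand-in, and the table name `temperature`, the column names, and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one table per variable, one row per timestep.
# The primary key plays the role of the index shown on the slide.
conn.execute(
    "CREATE TABLE temperature (timestep INTEGER PRIMARY KEY, data BLOB NOT NULL)"
)


def store_timestep(conn, timestep, payload):
    """Store the field of one timestep as a single BLOB row."""
    conn.execute(
        "INSERT INTO temperature (timestep, data) VALUES (?, ?)",
        (timestep, payload),
    )


def fetch_timesteps(conn, first, last):
    """Fetch only the requested timesteps, without touching other rows."""
    rows = conn.execute(
        "SELECT timestep, data FROM temperature "
        "WHERE timestep BETWEEN ? AND ? ORDER BY timestep",
        (first, last),
    )
    return [(timestep, bytes(data)) for timestep, data in rows]


for step in range(1, 6):
    store_timestep(conn, step, f"field-{step}".encode())

selected = fetch_timesteps(conn, 2, 3)
print(selected)
```

Because each row is addressed through the key, a request for a few timesteps reads only those rows, which is the small-granule access the requirements call for.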

Handicap
Handicap: not enough disk space available
Data stored within database: approx. 400 TB
Disks available: approx. 24 TB
Database has been coupled transparently to the HSM system.
How do we avoid frequent tape accesses?
Big cache
Store data as close as possible according to the needs of users: split into single variables
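The effect of a disk cache in front of tape can be illustrated with a small sketch. The slides do not say how the cache is managed, so the least-recently-used eviction policy, the `DiskCache` class, and the variable names used as keys are all assumptions made for illustration.

```python
from collections import OrderedDict


class DiskCache:
    """Toy disk cache in front of a tape archive (LRU eviction is an assumption)."""

    def __init__(self, capacity, fetch_from_tape):
        self.capacity = capacity                # number of items that fit on disk
        self.fetch_from_tape = fetch_from_tape  # slow path, called on a miss
        self.entries = OrderedDict()            # oldest entry first
        self.tape_requests = 0

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark as most recently used, no tape access needed.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: go to tape, then keep the result on disk.
        self.tape_requests += 1
        value = self.fetch_from_tape(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        return value


cache = DiskCache(capacity=2, fetch_from_tape=lambda key: f"data for {key}")
for key in ["tas", "pr", "tas", "tas", "psl", "pr"]:
    cache.get(key)

print(cache.tape_requests, "tape requests for 6 reads")
```

Splitting the data into single variables helps such a cache because a popular variable can stay on disk without dragging along the rest of the model output it was produced with.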

Data migration
[Diagram] Migin, Migout, dxdb; tablespaces: TBS - RW, TBS - RW, TBS - RO; table partitions: Tbl Partition 1, Tbl Partition 2, Tbl Partition 1
All tablespaces are moved “at once” to dxdb.

Inside the datafile
[Diagram] Header 128k, Table, Lob Index, Primary Key, Blob data

Frontend versus Backend
[Diagram] Filesystem frontend and HSM backend; Header 128k (shown twice); Part 1 = 512 MB; Part 2 = 512 MB

Retrieving data
[Diagram] Header 128k; steps 1 to 5; Tape; Request

Warehouse features
Compression – nothing special used within the server
Partitioning – allow parts of data to be moved to HSM
Backup
Nologging – beware of crash …
Read only – two copies on tape

New implementation
The metadata database will stay as is.
The Oracle databases holding data will be replaced by a new in-house development.
Why?
There is a certain risk that a future version of Oracle may not work with a / any HSM system.
In the long run some license costs shall be saved.

General Architecture - new
[Diagram] Webserver, Appl. Server, Metadata, Data, Oracle-DB, Blobserver

CERA-Container
Instead of keeping data within BLOBs in Oracle databases, data records will be kept within so-called CERA Container Files.
They can hold a huge number of records.
They provide fast access independent of position within the file (granular access).
They provide fault tolerance against tape damage by keeping checksums within the files.
Read/write operations against container files are enclosed in transactions.
Well-known format
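Two of the container properties listed here, position-independent access to single records and checksums that expose damage, can be sketched in a few lines. This is not the CERA Container format, whose layout the slides do not describe: the record header of length plus CRC32, the function names, and the in-memory stream are assumptions for illustration, and transactions are left out.

```python
import io
import struct
import zlib

# Hypothetical record header: payload length and CRC32, both 32-bit little-endian.
RECORD_HEADER = struct.Struct("<II")


def write_container(stream, records):
    """Append records to the stream and return the byte offset of each one."""
    offsets = []
    for payload in records:
        offsets.append(stream.tell())
        stream.write(RECORD_HEADER.pack(len(payload), zlib.crc32(payload)))
        stream.write(payload)
    return offsets


def read_record(stream, offset):
    """Read one record directly at its offset and verify its checksum."""
    stream.seek(offset)
    length, expected_crc = RECORD_HEADER.unpack(stream.read(RECORD_HEADER.size))
    payload = stream.read(length)
    if len(payload) != length or zlib.crc32(payload) != expected_crc:
        raise ValueError(f"record at offset {offset} is damaged")
    return payload


container = io.BytesIO()
offsets = write_container(container, [b"timestep 1", b"timestep 2", b"timestep 3"])

# Granular access: jump straight to the third record.
third = read_record(container, offsets[2])

# Simulated media damage: flip one payload byte of the second record.
damaged = bytearray(container.getvalue())
damaged[offsets[1] + RECORD_HEADER.size] ^= 0xFF
try:
    read_record(io.BytesIO(bytes(damaged)), offsets[1])
    damage_detected = False
except ValueError:
    damage_detected = True
```

Keeping the offsets in a separate index is what makes the read cost independent of a record's position in the file; the per-record checksum confines a damaged block to the records it actually touches.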

Migration
Concept / Team (namely Peter Drakenberg, DKRZ)
  Not yet really finished
Software
  First software ready, in order to migrate data
Convert old data
  Started last week, but will take at least a year

Dataflow: outbound
[Diagram] Webserver, Appl. Server, Processing, Metadata, Data; steps 1 to 8

Dataflow: inbound
[Diagram] Model run, GFS, Postprocessing, Metadata, Dataserver

Summary
CERA allows for the storage of data of different kinds.
Format independent
Metadata enables addressing of internal and external data.
Users typically fetch only small amounts of data.
The system allows for efficient access to small data granules
  by using warehousing functions like partitioning
  by using small Oracle database BLOBs or, in future, CERA Container files.

Thank you!
