
GSC2 Maintenance


marva


  1. GSC2 Maintenance • GSC2 Annual Meeting 2001

  2. Database administrative tasks • Database production tasks • Identification and correction of errors • Processing statistics and where do we go?

  3. Database administrative tasks • System and database upgrades • System upgrade to Windows 2000 • Hard disk storage increased to 4 TB RAID • Objectivity 6.1, the latest version, improves browsing and object manipulation and greatly improves transaction cleanup time. A new version is expected soon and will support a Python binding for Objectivity.

  4. DB Admin. (cont.) • Database file migration into the new disk storage • All 32768 files required re-registration in addition to migration into the new disk storage. • File access errors occurred during migration. The vendor-provided solution was implemented and all files were registered. • Reassessed the distribution of files within the RAID system based on previous problems with disk space. • We had implemented an ad hoc fix for the highly non-homogeneous distribution of objects with respect to the HTM (Hierarchical Triangular Mesh). • Some disks were nearly empty and others nearly full!

  5. [Figure: N0]

  6. Drive assignments by HTM trixel • J: N00 N01 • K: N02 • L: N03 • M: N10 N11 N12 N13 • N: N20 N21 N22 N23 • O: N30 N31 • P: N32 N33 • Q: S00 S01 S02 S03 • R: S10 • S: S11 • T: S12 S22 • U: S20 S21 • V: S22 • W: S23 • X: S30 • Y: S31 S32 S33 • [Figure: HTM trixel tree; N0 to N3 and S0 to S3 each split into four children, e.g. N0 into N00 N01 N02 N03] • What this means is that we now have 16 drives, each at about 4% to 8% of capacity: plenty of room to grow and efficient operations.
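The drive assignments on this slide can be written down as a lookup table and checked for coverage. The following is a minimal sketch, not part of the GSC2 tooling: the names `DRIVE_LAYOUT`, `two_digit_trixels`, `check_layout` and `drives_for_trixel` are hypothetical, and the table is copied from the slide exactly as transcribed.

```python
from collections import Counter

# Drive letter -> HTM trixels stored on it, copied from the slide as transcribed.
DRIVE_LAYOUT = {
    "J": ["N00", "N01"], "K": ["N02"], "L": ["N03"],
    "M": ["N10", "N11", "N12", "N13"],
    "N": ["N20", "N21", "N22", "N23"],
    "O": ["N30", "N31"], "P": ["N32", "N33"],
    "Q": ["S00", "S01", "S02", "S03"],
    "R": ["S10"], "S": ["S11"], "T": ["S12", "S22"],
    "U": ["S20", "S21"], "V": ["S22"], "W": ["S23"],
    "X": ["S30"], "Y": ["S31", "S32", "S33"],
}


def two_digit_trixels():
    """All 32 trixel names of the form N00..N33 and S00..S33."""
    return [f"{h}{i}{j}" for h in "NS" for i in range(4) for j in range(4)]


def check_layout(layout):
    """Return (trixels with no drive, trixels listed more than once)."""
    counts = Counter(t for trixels in layout.values() for t in trixels)
    missing = [t for t in two_digit_trixels() if counts[t] == 0]
    duplicated = sorted(t for t, n in counts.items() if n > 1)
    return missing, duplicated


def drives_for_trixel(trixel_id, layout):
    """Drives holding the two-digit ancestor of a (possibly deeper) trixel id."""
    prefix = trixel_id[:3]
    return [drive for drive, trixels in layout.items() if prefix in trixels]


if __name__ == "__main__":
    print(check_layout(DRIVE_LAYOUT))
    print(drives_for_trixel("N0123", DRIVE_LAYOUT))
```

Run on the layout as transcribed, the check reports no drive for S13 and two drives for S22, which suggests a slip in the T: or V: entry of the slide rather than a real gap in the storage layout.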

  7. Database production tasks • All tasks currently in production had to be recompiled and rebuilt. • Integration of Python scripts into day-to-day production. • Insertion of reference catalogs into the database. • Streamlining administrative tasks and integrating them into production tasks. • Porting photosol and the new classification tasks to the Windows 2000 environment.

  8. Identification and correction of errors • Complexity • Tens of thousands of lines of code (C, FORTRAN, perl, idl, dcl, C++…) • 3 operating systems • Nearly 1 billion objects • More than 3600 unique photographic plates: response, uniformity of the glass under stress, physics and chemistry of the emulsion and the manufacturing process… • A great number of factors associated with observational astronomy: seeing, atmospheric transmission, temperature, extinction, telescope tracking, image quality…

  9. Identification and correction of errors • Database errors • Corruption • Referential integrity

  10. Database errors: corruption and referential integrity • Affected a small amount of data, on the order of fractions of a percent. Corruption forced a complete rebuild of 4 or 5 databases for the GSC2.2 delivery. • Corruption is the most serious problem because we do not know how it occurs. • It is especially hard to track down in our hands-off, large-scale production, where complicated tasks run for weeks on end over large datasets. • The vendor has asked for our help so they can better understand the problem. • Referential integrity is fundamental to our project. • The likely cause is concurrent access between database applications, and possibly between applications and administrative tasks. • We do have some utilities to check for zero-reference objects and to look statistically at the ratios of objects with 1, 2, …, n references. • Interpretation is complicated by various factors (primarily the complicated nature of the plate overlap regions), so this tool is mainly used when some other indication of problems already exists. • The extent of this problem is again fairly small, with only a handful of databases requiring un-matching and re-matching.
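The kind of check described on this slide (zero-reference objects and the ratios of objects with 1, 2, …, n references) might look roughly like the sketch below. It is an illustration only and does not use the Objectivity API: the function name `reference_stats` is hypothetical, and it assumes the per-object reference counts have already been extracted into a plain list of integers.

```python
from collections import Counter


def reference_stats(ref_counts):
    """Summarise per-object reference counts.

    ref_counts: iterable of ints, one per object, giving how many
    references that object holds (assumed to be extracted beforehand).
    Returns (number of zero-reference objects, {n: fraction with n refs}).
    """
    hist = Counter(ref_counts)
    total = sum(hist.values())
    if total == 0:
        return 0, {}
    fractions = {n: hist[n] / total for n in sorted(hist)}
    return hist[0], fractions


if __name__ == "__main__":
    # Toy input, not GSC2 data: eight objects with 0 to 3 references each.
    zero_refs, fractions = reference_stats([0, 1, 1, 2, 3, 1, 2, 0])
    print(zero_refs, fractions)
```

Comparing these fractions between databases only flags candidates; as the slide notes, plate overlap regions make the expected ratios vary from region to region.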

  11. Database errors • Matching integrity • Clearly visible on the sky maps. • Can be the result of various causes. • Where it occurs only in the overlap region, it could be the result of database timing and file access problems as well as astrometric problems (the astrometry tasks are very robust and the reduction has been very uniform for the 2nd epoch surveys). • Identifying and correcting matching problems is made more difficult by the difference between the plate-based matching and the region-based database.

  12. [Sky map of fields 507, 508, 442, 443, N321 and N322; axis labels +20, +30 and 15 h, 12 h] In addition to the J and F POSS-II fields 507, 508, 442 and 443, we have the IV N fields 507 and 508, which have no magnitude limits imposed in the export task. The quick V fields N321 and N322 are loaded as well. All the matching is reasonably well done in the plate centers.

  13. Take a closer look at the north-east corner of field 507 • Bright stars: • Entry with F, J, N, V • 2nd entry with F, N • 3rd entry with V • As well as various entries that result from the lack of a magnitude limit for V and N.

  14. So? Whatever happened to create this must have been a fairly complicated sequence of events. • Clearly, for some reason, all the plates matched well at the center of the field but not around the edges. Evidence suggests that this is NOT an astrometric problem. • Completely un-match all the plates in the region. • Verify the astrometric and photometric solutions for the IV N and quick V plates (I am assuming that there is a reason to include these fields despite the fact that they really do not belong in this release). • Re-match all the regions on all the plates. • Re-export all those regions and check.
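The center-versus-edge pattern described above could be quantified per plate with a sketch like the following. This is not a task from the GSC2 pipeline: the function name `match_ratio_by_zone`, the input tuple layout (offsets from the plate center plus the number of plates an entry is matched on) and the `edge_fraction` parameter are all assumptions made for illustration.

```python
def match_ratio_by_zone(objects, half_width, edge_fraction=0.2):
    """Fraction of multi-plate matches in the plate center versus the edge.

    objects: iterable of (x, y, n_plates), where x and y are offsets from
        the plate center (same units as half_width) and n_plates is the
        number of plates contributing to that catalog entry.
    half_width: half the side length of the (square) plate footprint.
    edge_fraction: outer fraction of the half-width counted as "edge".
    """
    inner_limit = half_width * (1.0 - edge_fraction)
    tallies = {"center": [0, 0], "edge": [0, 0]}  # zone -> [matched, total]
    for x, y, n_plates in objects:
        zone = "center" if max(abs(x), abs(y)) <= inner_limit else "edge"
        tallies[zone][1] += 1
        if n_plates >= 2:  # entry is matched across at least two plates
            tallies[zone][0] += 1
    return {
        zone: (matched / total if total else None)
        for zone, (matched, total) in tallies.items()
    }


if __name__ == "__main__":
    # Toy entries, not GSC2 data.
    sample = [(0, 0, 4), (1, 1, 3), (2.9, 0, 1), (0, -2.8, 2), (0.5, 0.5, 1)]
    print(match_ratio_by_zone(sample, half_width=3.0))
```

A much lower ratio at the edge than at the center would be consistent with the symptom on this slide, but it would not by itself distinguish a matching failure from a photometric one.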

  15. What is the point? • No single method or task was able to detect and explain why this region looked so different. • Visualization tools (skycat showsky, fitsview and IDL) were all used, but care must be exercised because different problems (e.g. photometric error) could produce similar results. • Global statistics such as matching ratios and object index counts are easy to generate but gave only very qualitative indications that are hard to interpret. • The cause of the problem remains unknown. • On the order of 50 fields or 100 plates may be affected in a similar fashion and require re-matching. The difficulty is in the identification, due to the complicated plate overlaps. • The data is completely fixable! And this allows us to focus on the important issues (science, calibration updates, loading and matching new surveys…).

  16. Photometric errors • Fairly easy to identify and fix if they are large. • From the plate maps it is fairly obvious that the plate-to-plate consistency is fairly good. • Several cases were found in fields without good sequences deeper than GSPC1 (14th to 15th mag).

  17. Field 219 north • Examination of J-F showed decent agreement down to around 14th mag. • It diverged at fainter magnitudes: near 18th mag the mean J-F was about 2.5 to 3.0. • It turned out that the cause was an isolated procedural error: an acceptable calibration had been performed but had not been applied properly.
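A check of the kind described for field 219 (mean J-F as a function of magnitude) can be sketched as below. The function name `mean_color_by_mag` is hypothetical, and binning on the J magnitude is an assumption; the slide does not say which magnitude was used for the comparison.

```python
import math
from collections import defaultdict


def mean_color_by_mag(stars, bin_width=1.0):
    """Mean J-F colour in bins of J magnitude.

    stars: iterable of (j_mag, f_mag) pairs for matched objects.
    Returns {bin lower edge: mean J-F}, ordered from bright to faint.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bin edge -> [sum of J-F, count]
    for j_mag, f_mag in stars:
        bin_start = math.floor(j_mag / bin_width) * bin_width
        sums[bin_start][0] += j_mag - f_mag
        sums[bin_start][1] += 1
    return {edge: total / n for edge, (total, n) in sorted(sums.items())}


if __name__ == "__main__":
    # Toy magnitudes, not measurements from field 219.
    sample = [(13.2, 12.6), (13.8, 13.0), (18.1, 15.4), (18.5, 15.6)]
    for edge, colour in mean_color_by_mag(sample).items():
        print(f"J in [{edge:.0f}, {edge + 1:.0f}): mean J-F = {colour:.2f}")
```

A mean colour that drifts steadily with magnitude on one plate, while neighbouring plates stay flat, points at the calibration of that plate rather than at the stars themselves.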

  18. Catalog processing • Southern IVN (IS): 44% complete • Northern IVN (XI): 35% complete • POSS-I E: 100% complete • A small number have been loaded and matched. • The infrared surveys could be completed in a relatively short time if a concerted effort were made. • We may have to make some hard decisions in light of staffing and resource issues.

  19. Conclusions • Database maintenance continues to be a high-priority issue before proceeding with large-scale operations such as loading and matching new surveys. • Data quality and integrity are being addressed, but fixes may not be incorporated into the export catalog on a fix-by-fix basis. High-priority requests could be accommodated to some degree. • Estimates of the amount of data that has been compromised (primarily in matching integrity) are less than 10%. • We can identify and fix the vast majority of problems. • Data processing and future enhancements to the GSC2 are both planned and proceeding. • We are grateful and deeply indebted to those working in collaboration with us to help in the analysis and to better understand this massive dataset.
