
Description of information resources for molecular spectroscopy on the base of XML platform

Description of information resources for molecular spectroscopy on the base of XML platform. Fazliev A.Z. Institute of Atmospheric Optics SB RAS, Tomsk, Russia. The author would like to acknowledge the Russian Foundation for Basic Research for financial support (grant 02-07-90139).


Presentation Transcript


  1. Description of information resources for molecular spectroscopy on the base of XML platform. Fazliev A.Z., Institute of Atmospheric Optics SB RAS, Tomsk, Russia. The author would like to acknowledge the Russian Foundation for Basic Research for financial support (grant 02-07-90139). International Conference and Young Scientists School on Computational Information Technologies for Environmental Sciences “CITES-2005”, Novosibirsk, Russia, March 13-23, 2005.

  2. Introduction
     The subject of atmospheric spectroscopy:
     • Molecular structure parameters (energy levels, intermolecular potential parameters, …)
     • Molecular spectra (vibrational, rotational, vibration-rotational)
     • Spectral functions (absorption coefficient, transmittance function, absorption cross-section, …)
     Methods of testing: experimental measurements, quantum mechanical and semiempirical calculations.
     Data level: 50 molecules are of interest for atmospheric research. Complete data for the water molecule comprise 300 000 000 spectral lines (99.5% of them weak lines).
     Spectral data description level: some line parameters are described by uncertainty indices and bibliography.

  3. History
     USA. Work has been carried on since the late sixties. The HITRAN databank was created.
     France. Work has been carried on since the mid-seventies. The GEISA databank was created.
     Russia. Work on information resources for molecular spectroscopy has been carried on since the early eighties at the Institute of Atmospheric Optics SB RAS. Client-side information systems appeared in the early nineties. The advent of Internet technologies allowed the development of a new type of information system for the domain of molecular spectroscopy. The information resource (http://spectra.iao.ru) is based on the HITRAN and GEISA databanks.

  4. Molecular spectroscopy web sites (Babikov Yu.L., Golovko V.F., Mikhailenko S.N.) (1999-2004)
     • Spectroscopy of Atmospheric Gases (http://spectra.iao.ru)
     • Spectroscopy & Molecular Properties of Ozone (http://ozone.iao.ru)
     • Carbon Dioxide Spectroscopic Databank (http://cdsd.iao.ru)

  5. Problems solved within these web sites
     1. Spectroscopy of Atmospheric Gases
     • Survey of the content of various data sources: HITRAN and GEISA spectral databanks, original data obtained by IAO researchers in collaboration with other scientists, H2O spectra simulated by Partridge and Schwenke, etc.
     • Simulation of the intensity diagram, absorption coefficient profile, transmittance, absorption, and radiance spectra at given conditions for a selected molecule, isotopic species, and set of spectral bands, or for a selected wavenumber region and gas mixture.
     • Spectra convolution with a given apparatus function.
     • Direct problem solution (spectrum simulation from given Hamiltonian and dipole moment parameters).
     • Gas and/or isotopic species mixture preparation by the user.
     • Uploading of user spectra to the server side and comparison of them with spectra obtained with the system.
     2. Spectroscopy & Molecular Properties of Ozone
     • Molecular structure and spectroscopic constants in the ground electronic state
     • Potential function, dipole moment surface, transition moments
     • Vibration and vibration-rotation energies and wavefunctions, isotopic effects
     • Simulated and experimental spectra from microwave to infrared
     • Gas and/or isotopic species mixture preparation by the user.
     • Uploading of user spectra to the server side and comparison of them with spectra obtained with the system.
     3. Carbon Dioxide Spectroscopic Databank
     • Survey of the content of the CDSD, HITRAN/HITEMP, and GEISA spectral databanks for the CO2 molecule
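
The item "spectra convolution with a given apparatus function" can be illustrated with a minimal sketch. It assumes a uniform wavenumber grid and a Gaussian apparatus function; both are assumptions made for illustration, the function names are hypothetical, and nothing here reproduces the code behind the web sites.

```python
import math

def gaussian_apparatus(half_width_pts: int, sigma_pts: float) -> list[float]:
    """Discrete Gaussian apparatus function, normalised to unit sum."""
    raw = [math.exp(-0.5 * (i / sigma_pts) ** 2)
           for i in range(-half_width_pts, half_width_pts + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def convolve_same(spectrum: list[float], kernel: list[float]) -> list[float]:
    """Discrete convolution on a uniform grid; output has the input length.
    Grid points outside the spectrum are treated as zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(spectrum)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            k = i - (j - half)          # index of the contributing sample
            if 0 <= k < len(spectrum):
                acc += spectrum[k] * weight
        out.append(acc)
    return out

# Illustrative input: a single infinitely narrow line in the middle of the grid.
spectrum = [0.0] * 21
spectrum[10] = 1.0
kernel = gaussian_apparatus(half_width_pts=5, sigma_pts=1.5)
smoothed = convolve_same(spectrum, kernel)
```

Because the kernel is normalised, the integrated intensity of the line is preserved while its peak is lowered and broadened, which is the expected effect of finite instrument resolution.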

  6. Introduction
     Most of the Internet-accessible scientific information systems (IS) are document oriented. Features of these IS:
     • Huge information content
     • Ineffective context search
     • High level of infosmog as a result of the search
     To solve this problem, two information resource components are needed: data and metadata. Currently, most information resources (the “hidden” Web) are a “black box” from the point of view of their structure and semantics.

  7. Introduction: Information Resources of e-Science
     [Diagram. Labels: e-Science; Technology; Organization; Experiment; Calculations; Rough data; DB; Public DB; Multimedia (video, sound, animation); Metadata]

  8. Introduction: e-Science
     [Diagram. Layers: The Data-Computation Layer; The Information Layer; The Knowledge Layer]
     Source: David De Roure, Nicholas Jennings, Nigel Shadbolt, A Future e-Science Infrastructure, Report for EPSRC/DTI Core e-Science Programme, 2001.

  9. Introduction: Portal ATMOS. Middleware
     [Diagram. Levels: Application and interfaces level; Middleware level; System level; Hardware. Labels: Interface human-PC; Interface PC-PC; Data and computation; Metadata service; Software; Middleware core (authorization, applied logic, linguistic support, dialog system facility, etc.); Operating system, compilers; Database Management System; Web server]

  10. The problems of molecular spectroscopy
      • Isolated molecule structure. Problems: finding the constants of Watson’s Hamiltonian, parameters of the short- and long-range potential, wave functions and energy levels.
      • Molecular spectral properties. Problems: finding spectral line parameters (wavenumber, intensity, line width, line shift, …); identification of spectral lines in experimental spectra.
      • Spectral properties of atmospheric gases. Problems: weak line study; continuum problem.
      Web site “Atmospheric spectroscopy” (http://saga.atmos.iao.ru)

  11. Distributed IS oriented to the problems of molecular spectroscopy (RFBR project)
      [Diagram. Nodes: Tomsk (IAO SB RAS), basic host of the DIS; Tomsk (TSU); Moscow; N. Novgorod; St. Petersburg; two unnamed nodes marked “?”. Legend: data and metadata exchange; metadata exchange; client host]

  12. Web Languages
      • Existing Web languages extended to facilitate content description
        • XML → XML Schema (XMLS)
        • RDF → RDF Schema (RDFS)
      • XMLS is not an ontology language
      • RDFS is recognisable as an ontology language
        • Classes and properties
        • Sub/super-classes (and properties)
        • Range and domain (of properties)
      • OWL is an ontology vocabulary
        • Well defined semantics
        • Formal properties well understood (complexity, decidability)
        • Known reasoning algorithms
        • Implemented systems (highly optimised)
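
What RDFS adds over plain XML Schema (classes, sub-classes, and the domain and range of properties) can be sketched with a few triples and a tiny inference routine. This is a minimal illustration in plain Python rather than an RDF library; all `ex:` names are hypothetical and do not come from the talk.

```python
SUBCLASS_OF = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
TYPE = "rdf:type"

# Schema triples (hypothetical vocabulary, for illustration only).
schema = {
    ("ex:TriatomicMolecule", SUBCLASS_OF, "ex:Molecule"),
    ("ex:Molecule", SUBCLASS_OF, "ex:Substance"),
    ("ex:hasSpectralLine", DOMAIN, "ex:Molecule"),
    ("ex:hasSpectralLine", RANGE, "ex:SpectralLine"),
}

# Instance data.
data = {
    ("ex:H2O", TYPE, "ex:TriatomicMolecule"),
    ("ex:H2O", "ex:hasSpectralLine", "ex:line42"),
}

def superclasses(cls, schema):
    """All classes reachable from cls through rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in schema:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found

def infer_types(data, schema):
    """Apply the RDFS domain, range, and subclass rules to the data."""
    types = {}
    for s, p, o in data:
        if p == TYPE:
            types.setdefault(s, set()).add(o)
    for s, p, o in data:
        for prop, rule, cls in schema:
            if prop == p and rule == DOMAIN:
                types.setdefault(s, set()).add(cls)   # subject gets the domain class
            if prop == p and rule == RANGE:
                types.setdefault(o, set()).add(cls)   # object gets the range class
    for classes in types.values():
        for cls in list(classes):
            classes |= superclasses(cls, schema)
    return types

types = infer_types(data, schema)
```

Here `ex:H2O` is inferred to be a molecule and a substance, and `ex:line42` a spectral line, although neither fact is stated directly in the data.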

  13. The Data-Computation Layer

  14. Data groups in the distributed ICS “Molecular spectroscopy”
      1. Basic parameters of a molecule
      These are the characteristics that determine molecular energy. Depending on the description method, they can be either full molecular Hamiltonian parameters (potential energy, dipole moment, etc.) or effective Hamiltonian parameters (rotational, centrifugal, and resonance constants, effective dipole moment parameters, etc.). One can add quadrupole and octupole molecular moments and other parameters characterizing intermolecular interaction in gases.
      2. Spectral line parameters
      • Parameters of an isolated spectral line
      • “Local” and “global” quanta indices
      • Collision-dependent parameters
      3. Spectral functions
      Absorption coefficient, transmittance function, absorption cross-section, etc.
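
The three spectral functions named in group 3 are related in a simple way for a homogeneous path. The sketch below assumes the Beer-Lambert law and a single absorbing species; the function names, units, and numbers are illustrative assumptions, not values from the talk.

```python
import math

def transmittance(absorption_coefficient_per_cm: float, path_length_cm: float) -> float:
    """Beer-Lambert transmittance T = exp(-k * L) for a homogeneous path."""
    return math.exp(-absorption_coefficient_per_cm * path_length_cm)

def absorption_cross_section(absorption_coefficient_per_cm: float,
                             number_density_per_cm3: float) -> float:
    """Cross-section per molecule, sigma = k / n, in cm^2."""
    return absorption_coefficient_per_cm / number_density_per_cm3

# Illustrative numbers only.
k = 0.01                      # absorption coefficient, cm^-1
T = transmittance(k, 100.0)   # 100 cm path
sigma = absorption_cross_section(k, 2.5e19)
```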

  15. Data structure
      Experiment
      • Substance: absorbing gas, buffer gas
      • Thermodynamic conditions: temperature, pressure, partial pressure of buffer gas
      • Spectral parameters: resolution, path length, frequency range
      • Absorption coefficient: transition frequency, absorption coefficient
      Calculation
      • Substance: absorbing gas, buffer gas
      • Thermodynamic conditions: temperature, pressure, partial pressure of buffer gas
      • Spectral parameters: frequency range, contour type
      • Absorption coefficient: transition frequency, absorption coefficient
      • Data source: spectral line parameters, statistical sums
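
The experimental branch of this data structure could be serialised to XML roughly as follows. This is only a sketch: the element and attribute names are hypothetical (the XML Schema used by the author is not given in the transcript), and every number is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

def build_experiment_record() -> ET.Element:
    """Build one experimental absorption-coefficient record.
    Element names are hypothetical; values are placeholders, not measurements."""
    root = ET.Element("absorptionCoefficientData", source="experiment")

    substance = ET.SubElement(root, "substance")
    ET.SubElement(substance, "absorbingGas").text = "CO2"
    ET.SubElement(substance, "bufferGas").text = "N2"

    conditions = ET.SubElement(root, "thermodynamicConditions")
    ET.SubElement(conditions, "temperature", unit="K").text = "296"
    ET.SubElement(conditions, "pressure", unit="atm").text = "1.0"
    ET.SubElement(conditions, "bufferGasPartialPressure", unit="atm").text = "0.9"

    spectral = ET.SubElement(root, "spectralParameters")
    ET.SubElement(spectral, "resolution", unit="cm-1").text = "0.01"
    ET.SubElement(spectral, "pathLength", unit="cm").text = "100"
    ET.SubElement(spectral, "frequencyRange", unit="cm-1", min="2380", max="2585")

    values = ET.SubElement(root, "values")
    for frequency, coefficient in [(2400.0, 1.2e-3), (2410.0, 8.5e-4)]:
        ET.SubElement(values, "point",
                      frequency=str(frequency),
                      absorptionCoefficient=str(coefficient))
    return root

record = build_experiment_record()
xml_text = ET.tostring(record, encoding="unicode")
```

A record for the calculation branch would differ only in replacing resolution and path length with the contour type and adding a data source element.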

  16. Basic concepts

  17. Parameters of spectral lines (entity)
      Isolated molecule
      • Vacuum wavenumber
      • Intensity
      • Lower level energy
      • Statistical weight of lower level
      • Identification
      Interacting molecule (gas)
      • Line shift
      • Pressure-induced linewidth (self-broadening, buffer molecule broadening)
      • Temperature dependence of linewidth
      Other parameters
      • Reference indices
      • Uncertainty indices

  18. Parameters of spectral lines (attributes)
      • The origin of transition frequency and intensity: experiment, calculation, synthetic
      • Intensity value scale: absolute, relative
      • Uncertainty: relative error, absolute error
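
One way to read the split between the spectral line entity and its attributes is sketched below: each measured quantity carries its own origin and uncertainty kind. The class and field names are hypothetical, only a subset of the parameters is modelled, and the numbers are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Origin(Enum):
    EXPERIMENT = "experiment"
    CALCULATION = "calculation"
    SYNTHETIC = "synthetic"

class IntensityScale(Enum):
    ABSOLUTE = "absolute"
    RELATIVE = "relative"

class UncertaintyKind(Enum):
    RELATIVE_ERROR = "relative error"
    ABSOLUTE_ERROR = "absolute error"

@dataclass(frozen=True)
class Quantity:
    """A value together with the attributes attached to it."""
    value: float
    origin: Origin
    uncertainty: Optional[float] = None
    uncertainty_kind: Optional[UncertaintyKind] = None

    def absolute_uncertainty(self) -> Optional[float]:
        """Uncertainty in the units of the value, or None if unknown."""
        if self.uncertainty is None or self.uncertainty_kind is None:
            return None
        if self.uncertainty_kind is UncertaintyKind.RELATIVE_ERROR:
            return abs(self.value) * self.uncertainty
        return self.uncertainty

@dataclass(frozen=True)
class SpectralLine:
    """Subset of the isolated-molecule line parameters."""
    wavenumber: Quantity            # vacuum wavenumber, cm^-1
    intensity: Quantity
    intensity_scale: IntensityScale
    lower_level_energy: float       # cm^-1

# Placeholder values, not taken from any databank.
line = SpectralLine(
    wavenumber=Quantity(2349.0, Origin.EXPERIMENT, 1e-6, UncertaintyKind.RELATIVE_ERROR),
    intensity=Quantity(1.0e-19, Origin.CALCULATION),
    intensity_scale=IntensityScale.ABSOLUTE,
    lower_level_energy=0.0,
)
```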

  19. Last version of metadata in HITRAN

  20. Molecular structure data
      • Watson’s Hamiltonian constants
      • Energy levels and wave functions
      • Long-range part of the intermolecular potential (dipole, quadrupole, …, octupole moments)
      • Short-range part of the intermolecular potential (for example, the Lennard-Jones potential: ε is the depth of the potential well, σ is the range of the repulsive force)
      [Diagram labels: Data source; Basic physical quantities]
      Molecule classification:
      • Class for Molecule (linear triatomic molecules with large Fermi resonance, non-linear triatomic molecules, diatomic molecules, …)
      • Symmetry Group (C2v, Td, C∞v, ...)
      • Group Classification (asymmetric rotor, spherical rotor, doublet-Π ground electronic states (half-integer J, integer F), …)
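
For the short-range part, the slide names the Lennard-Jones potential and its two parameters. A minimal sketch, assuming the standard 12-6 form V(r) = 4ε[(σ/r)^12 − (σ/r)^6], with ε the well depth and σ the distance at which the potential crosses zero (the slide does not state which form is stored):

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """Standard 12-6 Lennard-Jones potential (assumed form).
    r and sigma share one length unit; the result is in the units of epsilon."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lennard_jones_minimum(sigma: float) -> float:
    """Separation at which the 12-6 potential reaches its minimum, -epsilon."""
    return 2.0 ** (1.0 / 6.0) * sigma
```

With this form the potential is zero at r = σ, repulsive for smaller r, and reaches −ε at r = 2^(1/6)·σ.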

  21. XML Notation
      [Panels: XML Schema; RDF Schema]

  22. Comparison of calculated absorption coefficient of CO2 with experimental values

  23. Task of data entry

  24. Absorption coefficient calculation
      • Substance choice
      • Setting of thermodynamic and spectral parameters
      • Data source choice
      • Setting of approximations for calculations
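
The four steps map naturally onto the inputs of a line-by-line calculation. The sketch below assumes a Lorentz contour and a line list of (position, intensity, half-width) tuples; it is a textbook illustration rather than the algorithm of the system described in the talk, and the numbers are placeholders.

```python
import math

def lorentz(nu: float, nu0: float, gamma: float) -> float:
    """Area-normalised Lorentz contour with half-width at half-maximum gamma (cm^-1)."""
    return (gamma / math.pi) / ((nu - nu0) ** 2 + gamma ** 2)

def absorption_coefficient(nu: float, lines, number_density: float) -> float:
    """k(nu) = n * sum_i S_i * f(nu - nu_i), in cm^-1.

    lines          -- data source: iterable of (nu0, S, gamma), S in cm^-1/(molecule cm^-2)
    number_density -- from substance and thermodynamic conditions, molecules/cm^3
    The contour choice (Lorentz here) is the 'approximation' step.
    """
    return number_density * sum(S * lorentz(nu, nu0, gamma) for nu0, S, gamma in lines)

# Placeholder line list: two lines, not taken from any databank.
lines = [(2349.0, 1.0e-19, 0.07), (2350.0, 5.0e-20, 0.07)]
k_centre = absorption_coefficient(2349.0, lines, number_density=1.0e16)
```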

  25. Absorption coefficient metadata

  26. Data and metadata for machine processing
      [Panels: XML document; RDF document; DC metadata]

  27. The Information Layer

  28. Metadata description methods
      • Register formation
        • CDF, RSS, Atom
      • Formatted metadata (RDF)
        • Required: DC
        • Dependent on resource type: CIMI, MARC
      • Domain metadata (RDF)
        • Based on RDF Schema and OWL
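
The "required" Dublin Core (DC) part of the formatted metadata can be written as RDF/XML. The sketch below uses the standard RDF and DC element-set namespaces; the resource URI, field values, and the helper name `dc_record` are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

def dc_record(about: str, fields: dict) -> ET.Element:
    """Wrap Dublin Core elements in an rdf:Description for one resource."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return root

# Hypothetical resource and placeholder values.
record = dc_record(
    "http://example.org/resources/co2-absorption-1",
    {
        "title": "Absorption coefficient of CO2 (illustrative record)",
        "creator": "Example Author",
        "type": "Dataset",
        "format": "text/xml",
    },
)
rdf_xml = ET.tostring(record, encoding="unicode")
```

Domain metadata would extend such a record with properties from an RDF Schema or OWL vocabulary for spectroscopy rather than from the DC element set.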

  29. Absorption coefficient metadata

  30. Absorption coefficient. What else?
      CO2-CO2:
      Band       T, K       Range, cm-1     Contour
      15 µm      296        (not given)     Moskalenko
      4.3 µm     296        2397 - 2575     Benedict
      4.3 µm     296        2380 - 2585     Boulet
      4.3 µm     296        2400 - 2580     Gal’tsev
      4.3 µm     296        2140 - 2250     Boulet, asymmetric
      4.3 µm     296        (not given)     Moskalenko
      4.3 µm     218        2380 - 2585     Boulet
      4.3 µm     190-800    2400 - 2600     Hartman
      2.7 µm     296        3750 - 4100     Gal’tsev
      2.7 µm     296        3750 - 4100     Bezard
      2.7 µm     296        (not given)     Moskalenko
      2.3 µm     296        3800 - 4700     Tonkov
      2.0 µm     296        (not given)     Moskalenko
      1.6 µm     296        (not given)     Moskalenko
      1.4 µm     296        6985 - 7100     Gal’tsev
      1.4 µm     296        (not given)     Moskalenko
      Line contour
      • Typical (Voigt, Doppler, Lorentz)
      • Line wing theory (Tvorogov) (CO2, H2O)
      • Empirical contours
        • CO2-CO2, H2O-H2O
        • CO2-N2, CO2-O2, H2O-H2O + H2O-N2
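
Of the "typical" contours, the Doppler one is fully determined by temperature and molecular mass. A minimal sketch using the textbook Doppler half-width formula, evaluated for CO2 near the centre of the 4.3 µm band at 296 K; the line position 2349 cm-1 and mass 44 amu are round illustrative inputs, and the Lorentz and Voigt contours are not shown.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
C = 299792458.0           # speed of light, m/s
AMU = 1.66053906660e-27   # atomic mass unit, kg

def doppler_hwhm(nu0: float, temperature: float, mass_amu: float) -> float:
    """Doppler half-width at half-maximum, in the units of nu0."""
    thermal = 2.0 * math.log(2.0) * K_B * temperature / (mass_amu * AMU)
    return (nu0 / C) * math.sqrt(thermal)

def doppler_profile(nu: float, nu0: float, hwhm: float) -> float:
    """Area-normalised Gaussian (Doppler) contour."""
    amplitude = math.sqrt(math.log(2.0) / math.pi) / hwhm
    return amplitude * math.exp(-math.log(2.0) * ((nu - nu0) / hwhm) ** 2)

hwhm_co2 = doppler_hwhm(nu0=2349.0, temperature=296.0, mass_amu=44.0)
```

The resulting half-width is of the order of a few thousandths of a cm-1, and the Gaussian falls off very quickly away from the line centre, which is why far-wing absorption such as that tabulated above needs other contours.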

  31. Physical models
      [Diagram: methods for broadening and shifting coefficients. Labels: Long range forces; Short range forces; Straight line trajectories; Parabolic trajectories; Exact trajectories; Linked diagrams; Perturbation theory; Method of potentials; Trajectory methods; Dynamical methods; Murphy Method; Anderson Method; Robert-Bonamy Method; FCRB Method; RBET Method]

  32. Problems with RDFS
      • RDFS is too weak to describe resources in sufficient detail
        • No localised range and domain constraints
          • Can’t say that the range of isDissociated is molecules when applied to molecules
        • No existence/cardinality constraints
          • Can’t say that all triatomic molecules have exactly 3 atoms
        • No transitive, inverse or symmetrical properties
          • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that come_into_collission_with is symmetrical
        • …
      • Difficult to provide reasoning support
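
The property characteristics that RDFS cannot state, and OWL can, amount to simple closure rules over pairs. The sketch below spells them out in plain Python to show what a reasoner would derive; it is not an OWL implementation, and the individuals are hypothetical.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing pairs (e.g. isPartOf)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def inverse(pairs):
    """Inverse relation (hasPart from isPartOf)."""
    return {(b, a) for a, b in pairs}

def symmetric_closure(pairs):
    """Smallest symmetric relation containing pairs (collisions)."""
    return set(pairs) | inverse(pairs)

def has_exactly(subject, pairs, n):
    """Cardinality check: subject is related to exactly n distinct objects."""
    return len({o for s, o in pairs if s == subject}) == n

# Hypothetical individuals.
is_part_of = {("atom:H1", "group:OH"), ("group:OH", "molecule:H2O")}
has_part = inverse(transitive_closure(is_part_of))
collides_with = symmetric_closure({("molecule:H2O", "molecule:N2")})
has_atom = {("molecule:H2O", "atom:H1"), ("molecule:H2O", "atom:H2"), ("molecule:H2O", "atom:O")}
```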

  33. The Knowledge Layer
      “The aim of the knowledge layer is to act as an infrastructure to support the management and application of scientific knowledge to achieve particular types of goal and objective. In order to achieve this, it builds upon the services offered by the data-computation and information layers. The first thing to reiterate at this layer is the problem of the sheer scale of content we are dealing with. We recognise that the amount of data that the data grid is managing will be huge. By the time that data is equipped with meaning and turned into information we can expect order of magnitude reductions in the amount. However the amount of information remaining will certainly be enough to present us with a problem – a problem recognised as infosmog – the condition of having too much information to be able to take effective action or apply it in an appropriate fashion to a specific problem. Once information is delivered that is destined for a particular purpose, we are in the realm of the knowledge grid that is fundamentally concerned with abstracted and annotated content, with the management of scientific knowledge.”
      David De Roure, Nicholas Jennings, Nigel Shadbolt, A Future e-Science Infrastructure, Report for EPSRC/DTI Core e-Science Programme, 2001.

  34. Ontologies and Ontology Representations
      • Most of the time we will just say “concept” and “ontology” but whenever anybody starts getting religious, remember…
        • It is only a representation!
        • We are doing engineering, not philosophy – although philosophy is an important guide
      • There is no one way!
        • But there are consequences to different ways
        • and there are wrong ways
        • and better or worse ways for a given purpose
      • The test of an engineering artefact is whether it is fit for purpose
        • Ontology representations are engineering artefacts

  35. Why ontology is hard
      • Clash of intuitions
        • Subject Matter Experts motivated by custom & practice
          • Prototypes & Generalities
        • Logicians motivated by logic & computational tractability
          • Definitions and Universals
      • Transparency & predictability vs Rigour & Completeness
      • Conflation of Models
        • Meaning: Correctness of Classification & retrieval
        • Retrieval: Task of discovery, search, or finding
        • Use: Task of data entry, decision support, …
        • Acquisition: Task of capturing knowledge
        • Quality assurance: Criteria for whether it is ‘correct’

  36. Top-Level Categories (John Sowa), http://www.jfsowa.com/ontology/toplevel.htm

  37. Top- and bottom-level domain ontologies
      • Quantum mechanics and electrodynamics (top level)
      • Molecular spectroscopy
      • Mathematical algorithms (bottom level)
      [Diagram labels: Data; Computation]
      Special features of the ontology for molecular spectroscopy
      • Data sources: experiment and calculation
      • Presentation levels: physical model – mathematical model – information model – program model – …
      • Resource description in spectroscopy: OWL DL
      European projects on ontologies: Esperonto, Monet, …

  38. Dialogue system design (ontology application)
      [Diagram. Labels: Portal; Web site; Web site; Administrative web site; Domain ontology]

  39. Thank you for your attention
      The author would like to acknowledge the experts in the domain of molecular spectroscopy: Corresponding Member of RAS S.D. Tvorogov, Prof. A.D. Bykov, Prof. O.B. Rodimova.
