
Metadata Workshop

Metadata Workshop. Rick St. Denis, Glasgow University, April 26-28, 2004. Format. Goal: Answer the question “What is Metadata” in our document. Method: Provocateurs. Topics list: augment now. Get acquainted, divide and study topics, present together, course of action.


Presentation Transcript


  1. Metadata Workshop • Rick St. Denis • Glasgow University • April 26-28, 2004

  2. Format • Goal: Answer the question “What is Metadata” in our document • Method: Provocateurs • Topics list: augment now • Get acquainted, divide and study topics, present together, course of action • Output: Revamped deliverables

  3. Rough Agenda • Mon: 2:00-3:00 5 min on who we are; 3:00-3:30 Decide on topics; 3:30-5:00 Get to Stepps, hotel; 5:00 Meet in 2 West Ave • Tues: Provocateur sessions and research • Wed: Final document with deliverables; plans for future: MO, CHEP abstract

  4. Topics • Metadata architecture and components • Replica catalogs, file catalogs, physics catalogs • Use cases • Query languages • Implementations and performance: technology considerations, performance requirements • Service architectures, deployment architectures • Database implementations: text/mysql/postgres/oracle/enth

  5. Informing ourselves • SAM Services (Julie) • Arda/OGSA-DAI (Gav will outline) • LHCB (Carmine) • AMI (Solveig) • POOL and Graphical Visualization (Carmine) • Spitfire (Paul) • PNPA-GGF (Rick) • Project Management (Tony) • Package services for release in SourceForge

  6. Use Cases • CDF5858: physicist use case (Rick) • HEPCAL II (Solveig, Tony) • Production • Analysis • ADA: ATLAS catalogs, David Adams (Steve) • D0: Wyatt • Schema Update Document: use cases? (Adam)

  7. Services • Compare Arda and SAM approaches: Arda architecture: Gavin • Given use cases: define services • List services from SAM: services to services • Interfaces: the SAM service with one schema; the Grid services implemented in several schemas • Interfaces: physics catalog impact from failure of lower-level services. “File content status”. • Action: outline models of access: physical/logical • Discrete or related bits of functionality: dependencies between services. Zenness of services. List of files, directive on where to use, not connection to why anymore. Performance implications on interfaces. • Wyatt, Gavin, Rick, Julie

  8. Deployment Architectures • Where do the services run? Application servers? Tiers of applications and databases • Replication for HA. At what tier? Application or DB? Oracle? Is it replication or mirroring? • What is the time constant for replication? • When do metadata become stale? Freshness date: status bits. • Centralized catalogs as a single point of failure: what are the single points of failure? • HA strategies • Federation of metadata • Julie, Gavin, Paul, Solveig
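As a rough illustration of the "freshness date: status bits" question on the deployment slide, the sketch below treats a replicated catalog entry as stale when its status flag is no longer valid or when its last replication time is older than an allowed age. The `CatalogEntry` class, its field names, and the one-hour threshold are all hypothetical; nothing here is taken from SAM, Arda, or any other service named in the workshop.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CatalogEntry:
    """Hypothetical replicated metadata record (not a real catalog schema)."""
    logical_name: str
    replicated_at: datetime   # when this copy was last synchronised
    status: str = "valid"     # stand-in for the "status bits" on the slide


def is_stale(entry: CatalogEntry, now: datetime, max_age: timedelta) -> bool:
    """Return True if the entry is flagged invalid or older than max_age."""
    if entry.status != "valid":
        return True
    return now - entry.replicated_at > max_age


# Example data: one fresh copy, one old copy, one explicitly invalidated copy.
now = datetime(2004, 4, 27, 12, 0, tzinfo=timezone.utc)
fresh = CatalogEntry("run1001.raw", now - timedelta(minutes=5))
old = CatalogEntry("run0999.raw", now - timedelta(hours=3))
flagged = CatalogEntry("run1000.raw", now, status="invalid")

for entry in (fresh, old, flagged):
    print(entry.logical_name, is_stale(entry, now, timedelta(hours=1)))
```

The acceptable `max_age` would depend on the replication time constant asked about on the slide, so it is passed in rather than fixed.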

  9. Tools • DB: jdbc, phpi, text, mysql, msql, oracle, xml, soap, python • Dbserver • Tools on top of *sql • Relation to deployment architectures: DB access directly or via application server • Replication • Data virtualization • Rick, Gavin, Solveig, Adam, Julie

  10. Query Languages and Interfaces • SQL • Chains and Links (Rick) • General Dimensions (Wyatt) • Queries against multiple databases. Related to deployment architecture (dimensions, C&L, SBIR II/enth) • POOL (Carmine)

  11. Monitoring • SAM TV (Adam) • Mining and instrumenting (Caitriana) • MonALISA • File access patterns • Stats

  12. Security • Table access in a distributed architecture • Server-to-server security • Access to the server by the user • A standard certification protocol • VOMS • Spitfire security

  13. Next Steps • Design for keyword-value • Schema evolution and self-describing schema • Use the previous two to automate the transition from keyword-value to a query-efficient schema and to determine which queries need to be satisfied • Unique dataset tool
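The transition from keyword-value storage to a query-efficient schema could look something like the following minimal sketch, using Python's built-in `sqlite3` module. The table names `dataset_kv` and `dataset_wide`, the column names, and the sample keys `run` and `stream` are invented for illustration; the workshop does not specify a schema, and a real migration would also need typing and indexing decisions that are omitted here.

```python
import sqlite3


def promote_keys(conn: sqlite3.Connection, keys: list[str]) -> None:
    """Pivot selected keywords from a key-value table into real columns.

    Assumes a hypothetical table dataset_kv(dataset, attr, val).
    """
    for key in keys:
        # Keys are spliced into SQL text, so accept plain identifiers only.
        if not key.isidentifier():
            raise ValueError(f"unsafe key name: {key!r}")

    column_defs = ", ".join(f'"{k}" TEXT' for k in keys)
    conn.execute(
        f"CREATE TABLE dataset_wide (dataset TEXT PRIMARY KEY, {column_defs})"
    )

    # One output row per dataset; MAX picks the value for each keyword
    # and yields NULL when a dataset has no such keyword.
    pivots = ", ".join(
        f"MAX(CASE WHEN attr = '{k}' THEN val END)" for k in keys
    )
    conn.execute(
        f"INSERT INTO dataset_wide "
        f"SELECT dataset, {pivots} FROM dataset_kv GROUP BY dataset"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset_kv (dataset TEXT, attr TEXT, val TEXT)")
conn.executemany(
    "INSERT INTO dataset_kv VALUES (?, ?, ?)",
    [("ds1", "run", "1001"), ("ds1", "stream", "muon"), ("ds2", "run", "1002")],
)
promote_keys(conn, ["run", "stream"])
rows = conn.execute(
    'SELECT dataset, "run", "stream" FROM dataset_wide ORDER BY dataset'
).fetchall()
print(rows)
```

Which keywords to promote is the open question the slide raises ("which queries need to be satisfied"); here the list is simply passed in by the caller.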

  14. Deliverables • Docs from next steps • Use case filtered for our group • Services: Decomposition of ER diagram into collaboration diagram • Deployment Arch: Enumerate problems • Monitoring: Stats on queries (accumulate/doc) • QueryLang/Int: Survey of QL (POOL, C&L) • Tools: Wrap CORBA with XML • Deliverables: longer term

  15. Schedules • Monthly meeting: last Tues of month at 8:30/14:30/15:30. First: May 25. H323: 8272634 • Mailing list (Paul)

  16. Metadata for the Common Physicist: A working group on metadata with representatives from ATLAS, BaBar, CDF, CMS, D0, and LHCB, in cooperation with EGEE, has identified overlapping user requirements that may be supported by common service implementations. Classes of metadata specific to each service and their relations are described. These include a set of use cases based on a compilation of various HEP documents. These documents are used to inform interfaces in existing and planned services as described in metadata schema. Emphasis is placed on the evolution of schema using keyword-value pairs that are then transformed into a normalised, performant database schema. A report is made of self-description mechanisms which, coupled with updating processes, allow the APIs to remain static as the schema evolves. A presentation is made of the way use cases drive performance. Requirements are presented for the physical and logical arrangement of service implementations, dictating the degree to which the databases containing the metadata may be distributed or centralised. A set of existing monitoring tools exposes the validity and completeness of the use cases for experiments in various stages of maturity. A survey of the query languages, web service interfaces and tools in use across the experiments is presented.
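One way to picture the self-description idea in the abstract, where the API stays static while the schema evolves, is a lookup function that asks the database for its current columns at call time instead of hard-coding them. The sketch below uses Python's `sqlite3` module and SQLite's `PRAGMA table_info`; the `describe` and `find` helpers and the `datasets` table are hypothetical and are not the mechanism the working group reported on.

```python
import sqlite3


def describe(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return the table's current column names, read from the database."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def find(conn: sqlite3.Connection, table: str, **filters) -> list[dict]:
    """Query by any column that exists right now; the signature never changes."""
    unknown = set(filters) - set(describe(conn, table))
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    where = " AND ".join(f'"{col}" = ?' for col in filters) or "1"
    cursor = conn.execute(
        f'SELECT * FROM "{table}" WHERE {where}', tuple(filters.values())
    )
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (name TEXT, run INTEGER)")
conn.execute("INSERT INTO datasets VALUES ('ds1', 1001)")
before = find(conn, "datasets", run=1001)

# The schema evolves: a new column appears, and find() needs no change.
conn.execute("ALTER TABLE datasets ADD COLUMN stream TEXT")
conn.execute("UPDATE datasets SET stream = 'muon' WHERE name = 'ds1'")
after = find(conn, "datasets", stream="muon")
print(before, after)
```

Callers written against `find` before the `ALTER TABLE` keep working afterwards, and new callers can filter on the added column without any change to the function.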

  17. Future • Work to deliverables • Meet according to deadlines • Workshops according to major deadlines
