Explore the core challenges and offer suggestions on how to better understand, improve, and manage data sharing between China and the US. Is your data ready for the 'Big Data' era?
The Fifth China - U.S. Roundtable on Scientific Data Cooperation
DAMA China
Ben Hu
November 27, 2011
Core underlying scientific, technical and managerial challenges for data (metadata) sharing - A few suggestions of how this subject area could be better understood, improved and managed
Agenda
1. ‘Data’ fundamental
2. Data, Meaning, Context – the base for sharing
3. State of Standards
4. Metadata Interoperability - Registry
5. Q/A
‘Data’ fundamental
[Figure: two Ogden Triangles, labelled A and B, each with the vertices Thought, Symbol and Referent. Numbered annotations in the figure: 1, 2.1.1, 2.1.2, 2.2, 3.1, 3.2, 4, 5, 6.]
Common and Context
• The difficulties
[Figure: Domain context 1, Domain context 2 and Domain context 3, separated by a line from a region labelled "Common, Generic and Neutral". Annotation on the line: "This line is intrinsically difficult to draw".]
1. Do we really need all these?
2. How come we have come up with so many?
3. Can these be organized a bit?
9th Open Forum for Metadata Registry, Kobe, 2006
A More Complete Picture …
[Figure: three levels, each marked with a question mark: Standards, Semantic (Conceptual), and Data and Information. Labels recoverable from the figure: XSD based Taxonomies; Risks; Credit Risk; XBRL; Data Srcs; Rules; Loan, Provision, Profit; F asset, L asset; FR1 / FR Taxonomy 1 (Asset=111); FR2 / FR Taxonomy 2 (Asset=222); FR3 / FR Taxonomy 3 (Asset=333); Process, DB, ERP, DW; Abstraction.]
Metadata Interoperability - Two Layers
[Figure: an Ontology and a Metadata Registry across two layers.
Semantic layer: Data Element Concept, Concept Domain.
Basic Metadata Layer: Data Element, Value Domain.
Also shown: Data Instance, Metadata1, Metadata2.]
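The two layers can be illustrated with a small sketch. This is not the registry presented at the roundtable; it only assumes that each Data Element pairs a Data Element Concept (semantic layer) with a Value Domain (basic metadata layer), as the slide's labels suggest. The class fields, the `translate` helper, and the country-code example for Metadata1 and Metadata2 are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConceptDomain:
    """Semantic layer: the set of meanings a concept may take."""
    name: str


@dataclass(frozen=True)
class DataElementConcept:
    """Semantic layer: what is being described, independent of encoding."""
    name: str
    concept_domain: ConceptDomain


@dataclass
class ValueDomain:
    """Basic metadata layer: local codes mapped to shared value meanings."""
    name: str
    permitted: Dict[str, str]  # local code -> value meaning


@dataclass
class DataElement:
    """Basic metadata layer: a concept bound to one concrete representation."""
    name: str
    concept: DataElementConcept
    value_domain: ValueDomain


def translate(value: str, source: DataElement, target: DataElement) -> str:
    """Map a data instance from one element to another via the shared concept."""
    if source.concept != target.concept:
        raise ValueError("elements do not share a data element concept")
    meaning = source.value_domain.permitted[value]
    for local_code, target_meaning in target.value_domain.permitted.items():
        if target_meaning == meaning:
            return local_code
    raise KeyError(f"no code for {meaning!r} in {target.value_domain.name}")


# Hypothetical example: two metadata sets encode the same concept differently.
countries = ConceptDomain("Countries")
origin = DataElementConcept("Dataset country of origin", countries)

metadata1_element = DataElement(
    "origin_country_code",
    origin,
    ValueDomain("ISO 3166-1 alpha-2", {"CN": "China", "US": "United States"}),
)
metadata2_element = DataElement(
    "src_ctry_num",
    origin,
    ValueDomain("ISO 3166-1 numeric", {"156": "China", "840": "United States"}),
)

converted = translate("CN", metadata1_element, metadata2_element)
print(converted)
```

In this sketch the two elements differ in name and in value domain, yet a data instance can still be exchanged because both point at the same concept in the semantic layer. Without that shared upper layer, the match would have to be guessed from element names alone.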
Opportunities and Challenges
• It is these boundary regions of science which offer the richest opportunities to the qualified investigator.
• These specialized fields are continually growing and invading new territory. The result is like what occurred when the Oregon country was being invaded simultaneously by the United States settlers, the British, the Mexicans, and the Russians – an inextricable tangle of exploration, nomenclature, and laws. There are fields of scientific work, as we shall see in the body of this book, which have been explored from the different sides of pure mathematics, statistics, electrical engineering, and neurophysiology; in which every single notion receives a separate name from each group, and in which important work has been triplicated or quadruplicated, while still other important work is delayed by the unavailability in one field of results that may have already become classical in the next field.
Norbert Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine