Data discovery from a digital library perspective
Greg Janée, Darren Hardy
UC Santa Barbara
Outline
• Questions
  • grappling with granularity
  • struggling with search
  • dithering over distribution
  • pondering process
• Integrating search with access
Granularity
[Diagram labels: type; organization; institution (NASA); data center (GSFC); program (MODIS); product (sea surface temperature); resolution (1 km); space; time; granule; datum]
Approaches I
• ADL
  • uniform object (metadata) representation
  • flat list of collections (=containers)
  • possible extensions:
    • collections as first-order objects
    • nested containers
• THREDDS
  • hierarchical “collection” datasets
  • “coherent” datasets (=aggregation server?)
  • “direct” datasets
Approaches II
• Granularity on the Web...
  • webpage
  • multi-page document
  • website
• ...and sidestepping it
  • uniform representation (webpage)
  • page linking
  • visible, decomposable identifiers (URLs)
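As a small illustration of "visible, decomposable identifiers", the sketch below peels a URL apart into successively coarser units (page, enclosing directories, site). The function name `ancestors` and the example URL are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def ancestors(url):
    """Return the URL followed by each enclosing 'container' URL, finest first."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    result = [url]
    # Drop one trailing path segment at a time to move up one level.
    while segments:
        segments.pop()
        path = "/" + "/".join(segments) + ("/" if segments else "")
        result.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return result

# Hypothetical identifier, used only to show the decomposition.
for level in ancestors("http://example.org/modis/sst/granule42.html"):
    print(level)
```

Because the hierarchy is readable straight from the identifier, a client can move between granularities without asking the server what contains what.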
Flattening granularity
• Use heuristics to return “best” match
[Diagram labels: inherit descriptive metadata; dataset; aggregate intrinsic metadata]
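One possible reading of this slide, sketched in Python: descriptive metadata is inherited downward from a dataset to its members, intrinsic metadata (here a year range) is aggregated upward, and every node becomes one uniform search record. The `Node` class, the field names, and the "tightest covering span" rule in `best_match` are all assumptions made for illustration; the slides do not say which heuristics are used.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    name: str
    descriptive: dict = field(default_factory=dict)  # added metadata, e.g. keywords
    years: Optional[tuple] = None                    # intrinsic (first, last) year, set on leaves
    children: list = field(default_factory=list)

def aggregate_years(node):
    """Intrinsic metadata flows up: a container spans the union of its children."""
    spans = [s for s in map(aggregate_years, node.children) if s is not None]
    if not spans:
        return node.years
    return (min(s[0] for s in spans), max(s[1] for s in spans))

def flatten(node, inherited=None):
    """Descriptive metadata flows down: yield one uniform record per node."""
    descriptive = {**(inherited or {}), **node.descriptive}
    yield {"name": node.name, "descriptive": descriptive, "years": aggregate_years(node)}
    for child in node.children:
        yield from flatten(child, descriptive)

def best_match(records, keyword, first, last):
    """Assumed heuristic: of the records covering the query, pick the tightest span."""
    hits = [r for r in records
            if keyword in r["descriptive"].get("keywords", ())
            and r["years"] is not None
            and r["years"][0] <= first and last <= r["years"][1]]
    return min(hits, key=lambda r: r["years"][1] - r["years"][0], default=None)

# Hypothetical dataset with two yearly granules.
dataset = Node("sst", {"keywords": ("sea surface temperature",)}, children=[
    Node("g2001", years=(2001, 2001)),
    Node("g2002", years=(2002, 2002)),
])
records = list(flatten(dataset))
```

With these records, a query for a single year resolves to a granule, while a query spanning both years resolves to the dataset as a whole, so the caller never has to pick a granularity level up front.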
Search
• Type
  • text, numeric, space, time, ...
• Source
  • data itself
  • intrinsic metadata
  • added (usually descriptive) metadata
  • 3rd party
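The search types listed above can be combined as independent constraints over one metadata record. The following is a minimal sketch under assumed names: the `Record` fields and the `by_*` helpers are hypothetical, and the spatial test ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Record:
    title: str
    resolution_km: float
    bbox: tuple    # (west, south, east, north) in degrees
    period: tuple  # (start, end) as datetime.date

def by_text(word):
    # Text constraint: case-insensitive substring match on the title.
    return lambda r: word.lower() in r.title.lower()

def by_resolution(max_km):
    # Numeric constraint: resolution at least as fine as max_km.
    return lambda r: r.resolution_km <= max_km

def by_space(west, south, east, north):
    # Spatial constraint: bounding boxes overlap (no antimeridian handling).
    return lambda r: (r.bbox[0] <= east and west <= r.bbox[2]
                      and r.bbox[1] <= north and south <= r.bbox[3])

def by_time(start, end):
    # Temporal constraint: the two date intervals overlap.
    return lambda r: r.period[0] <= end and start <= r.period[1]

def search(records, *constraints):
    """Return the records that satisfy every constraint."""
    return [r for r in records if all(c(r) for c in constraints)]
```

A call such as `search(records, by_text("sea surface"), by_space(-121, 33, -119, 35))` mixes types freely; where each constraint gets its values from (the data itself, intrinsic metadata, or added metadata) is a separate question, as the "Source" bullets note.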
Distribution
• Centralized system
  • e.g. Google, ECHO
  • single point of failure (SPOF); requires resources
• Peer-to-peer
  • e.g. BRICKS, built on P-GRID
  • multiple points of failure (MPOF); requires commitment
• ADL: incomplete peer-to-peer
A “textbook” search process
• Classic process (Lancaster 1979)
  • Information need
  • Stated request
  • Selection of database
  • Search strategy
  • Search in database
  • Screening of output
• Web search - about the same 25 years later
What’s the real process?
• Irrational search (Pharo & Järvelin 2006)
  • Textbook search processes insufficient
  • Disjointed incrementalism theory
  • Many smaller steps
  • Learning during a search
  • Subjective & dynamic information needs over time
• What’s the ideal for earth science data users?
  • How do you inform choices during search?
  • How do you formulate a search, and what’s the context?
  • When is enough enough?
Integrating search with access
• File menu
  • Open...
  • Search library...
  • Close
  • Quit
• Query results returned as a THREDDS catalog?
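To make the closing question concrete, here is a rough sketch of wrapping search hits in a THREDDS-style catalog document that a client's "Open..." dialog could then browse. It is a simplification, not a validated catalog: the function name `results_to_catalog` and the example paths are hypothetical, and a usable catalog would also need at least a `service` element telling clients how to resolve each `urlPath`.

```python
import xml.etree.ElementTree as ET

# Namespace of the THREDDS InvCatalog 1.0 schema.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

def results_to_catalog(query, results):
    """Wrap (name, url_path) search hits in a minimal THREDDS-style catalog."""
    ET.register_namespace("", THREDDS_NS)  # serialize with a default namespace
    catalog = ET.Element(f"{{{THREDDS_NS}}}catalog",
                         {"name": f"Search results: {query}"})
    # One container dataset for the query, one child dataset per hit.
    top = ET.SubElement(catalog, f"{{{THREDDS_NS}}}dataset", {"name": query})
    for name, url_path in results:
        ET.SubElement(top, f"{{{THREDDS_NS}}}dataset",
                      {"name": name, "urlPath": url_path})
    return ET.tostring(catalog, encoding="unicode")

# Hypothetical hits, used only to show the output shape.
print(results_to_catalog("sea surface temperature",
                         [("granule 1", "modis/sst/g1.nc"),
                          ("granule 2", "modis/sst/g2.nc")]))
```

Returning results in the same format the access layer already reads means "Search library..." and "Open..." can share one browsing interface.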