Masi-Carver UCDL82002 LA Hilton. Alexandria Digital Library
1. Masi-Carver UCDL82002 LA Hilton Alexandria Digital Library Davidson Library, UCSB
2. ADL Mission
To provide a distributed, spatially searchable digital library of geographically referenced materials.
The library's components may be distributed (spread across the Internet) or coexist within a single network or desktop.
Geographically-referenced means that all the information objects in the library will be associated with one or more regions ("footprints") on the surface of the Earth.
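A footprint can be illustrated with a minimal Python sketch. It assumes the simplest possible representation, a latitude/longitude bounding box; the class name `BoundingBox`, the coordinate values, and the overlap rule are illustrative assumptions and are not taken from the ADL software. Boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical footprint: an axis-aligned box in decimal degrees."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other: "BoundingBox") -> bool:
        # Two boxes overlap when their extents intersect on both axes.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Illustrative values only: rough boxes, not authoritative coordinates.
santa_barbara = BoundingBox(-120.0, 34.3, -119.5, 34.6)
los_angeles = BoundingBox(-118.7, 33.7, -118.1, 34.4)
query_region = BoundingBox(-120.5, 34.0, -119.0, 35.0)

print(query_region.overlaps(santa_barbara))  # True
print(query_region.overlaps(los_angeles))    # False
```

An item associated with several regions would simply carry a list of such boxes.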
5. What information do you have about here?
6. ADL Organization
The ADL project has:
An operational library run by the Davidson Library,
A research component (ADEPT) funded by NSF and others, and
A gazetteer (place name index and geocoder) run by the Davidson Library
7. UCSB Davidson Library 08.2002 Alexandria Digital Library
Operational Partners
Implementers
AUT (Auckland University of Technology) Software implementation and content builder
DLESE (Digital Library for Earth Systems Education) Software implementation and content builder
CNR (Center for National Research, Pisa Italy) Content Builders
ADEPT Educational classroom content
CASS (Center for the Analysis of Sacred Sites) Video, sound, imagery text
ESSW MODIS real-time spacecraft imagery
Scripps SIOExplorer Oceanographic Data
9. Prototypes
Rapid Prototype (CD-ROM + ArcView)
Java Application
MARC & FGDC Union Catalog
Web Version 1
Search Optimized Fields, AKA Search Buckets
Java Application
CDL Web Client
10. MARC & FGDC Web Prototype (1995)
11. Java Application Prototype (1997)
12. Webclient Interface (2002) 1/2
14. ADL - Web Gazetteer
16. Common Features of the Prototypes
Map
Place name search
Search definition frame/panel/tab
Vocabulary support where appropriate
Standardized citation & metadata display/formatting
Important: the librarians point this out.
This is what the geography network browser starts to provide.
17. ADL Architecture Goals (1/2)
Catalog separate from the data distribution
Metadata agnostic search methodology
Data center reliability
Collection level metadata
Search buckets
Strongly typed aggregated search field based on library concepts
Facilitate quick/easy ingest of collections
Abstract, searchable indexes
The catalog is separate from the data distribution services, the same approach as the ArcIMS metadata.
So we ended up with a metadata-agnostic search methodology that scales to several million records, is independent of database and data schema, and can be distributed or federated.
FGDC is now discovering that distributed search does not fully function without a good description of the data at the data center and without data center reliability:
Persistence of data/urls
Reliable servers
Backups
To speed searching, every collection needs to be described with information about what it contains, and how it can be searched. Collection level metadata is the missing piece of the puzzle!!!
Distributed search does not fully function without a good description of the data at the data center
Metadata may be heterogeneous
It's not necessary to map between metadata standards; instead, we map to the search fields.
Solution - Develop a strongly typed aggregated search field based on library concepts.
18. ADL Architecture Goals (2/2)
Digital library for georeferenced information
distributed
heterogeneous
rich services
scalable
many providers
collections, large and small
Standard components, interfaces
19. Components/services
22. What is a bucket? (1/3)
Strongly-typed aggregated search fields based on library concepts
Similar to Dublin Core, but define allowable content and search semantics, and are optimized for geospatial searching
Facilitate quick/easy ingest of collections
Abstract, searchable indexes:
Location, Time, Type, Format, Originator, Assigned terms, Subject related text and Identifiers
23. What is a bucket? (2/3)
Strongly typed, abstract metadata category with defined search semantics to which source metadata is mapped
Key properties
name
Coverage date
semantic definition
The time period to which the item is relevant.
data type (strictly observed)
calendar date or range of calendar dates
syntactic representation (strictly observed)
ISO 8601
24. What is a bucket? (3/3)
Source metadata is mapped to buckets
buckets hold not just simple values
2001-09-08
but rather, explicit descriptions of those values
(FGDC, 1.3, Time period of content, 2001-09-08)
multiple values may be mapped per bucket
Bucket definition includes search semantics
defines query terms
ISO 8601 date range
defines query operators
contains, overlaps, is-contained-in
semantics are slightly fuzzy in certain cases to accommodate multiple implementations
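The Coverage date bucket described above can be sketched in Python as follows. This is a minimal illustration, not the ADL implementation: the names `DateRange`, `BucketValue`, `OPERATORS`, and `matches` are hypothetical, and the direction of each operator (the item's range relative to the query's range) is an assumption. A bucket value keeps the explicit description of where it came from, as in the slide's `(FGDC, 1.3, Time period of content, 2001-09-08)` example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    start: date
    end: date

    @classmethod
    def parse(cls, text: str) -> "DateRange":
        # Accepts an ISO 8601 calendar date ("YYYY-MM-DD") or a
        # start/end interval ("YYYY-MM-DD/YYYY-MM-DD").
        first, _, last = text.partition("/")
        start = date.fromisoformat(first)
        end = date.fromisoformat(last) if last else start
        if end < start:
            raise ValueError(f"range ends before it starts: {text!r}")
        return cls(start, end)


@dataclass(frozen=True)
class BucketValue:
    """A value plus the description of the source field it was mapped from."""
    standard: str
    field_id: str
    field_name: str
    value: DateRange


# Assumed operator semantics: each test compares the item's range to the query's.
OPERATORS = {
    "contains": lambda item, query: item.start <= query.start and query.end <= item.end,
    "overlaps": lambda item, query: item.start <= query.end and query.start <= item.end,
    "is-contained-in": lambda item, query: query.start <= item.start and item.end <= query.end,
}


def matches(values, operator, query):
    # Multiple values may be mapped per bucket; any one matching is enough here.
    test = OPERATORS[operator]
    return any(test(v.value, query) for v in values)


coverage = [BucketValue("FGDC", "1.3", "Time period of content",
                        DateRange.parse("2001-09-08"))]
september = DateRange.parse("2001-09-01/2001-09-30")

print(matches(coverage, "is-contained-in", september))  # True
print(matches(coverage, "overlaps", september))         # True
print(matches(coverage, "contains", september))         # False
```

Rejecting anything that is not a valid date or range at parse time is one way to read "strictly observed" for the data type and syntactic representation.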
26. Bucket Motivation
Heterogeneous metadata
Uniform client services
Spatial search requires
Strongly typed search fields
Optimized for geospatial searching
27. Summary
A bucket is a strongly typed, abstract metadata category with defined search semantics to which source metadata is mapped
Supports discovery/search across distributed, heterogeneous collections that use metadata structures of their choosing
Supports high-level searching across collections and supports drill-down searching to the item-level metadata elements
28. Benefits of the Architecture
Standard Readily-Optimized Search Methodology
Simplifies Design:
Provides a client with a standard API for searching different data sources.
Provides a way to discover changed data locations.
Scalability
Scale by upgrading the database
Scale by distributing the databases
Simplicity of design does not mean that a simple design is easy to build.
29. ADL Metadata
30. Collection Ingest Procedure
49. The Ideal ADL Entry Portal
The Portal will be:
Easy to use - allows patron to search collection w/out knowing keywords or jargon
Flexible - to allow users of differing levels of geographic knowledge to find the data they seek in the minimal amount of time
Help oriented - if user does not find what s/he wants, we in MIL will find out and use that knowledge to develop the collection
Dynamic - so that the user will want to return to see the latest features, collections and tools
Educational - so that the user can learn to use the site more effectively
Interesting - uncluttered, new data, featured events
53. Stay Tuned
54. Contact Information
Larry Carver, AUL Library Technologies
carver@library.ucsb.edu 805-893-4433
Catherine Masi, ADL Coordinator
masi@library.ucsb.edu 805-893-7661
David Valentine, Senior Systems Engineer
valentine@library.ucsb.edu 805-893-4545
55. Lessons Learned
Spatial orientation is a problem for some users; help/guidance needed
Crosswalking between different metadata systems can result in loss of information during translation
Technical metadata standards are not written with information discovery in mind
Not every field in a technical metadata standard will end up being populated
For search and discovery, it is the indexing of the metadata that is important
Geographic and text queries can cause problems with query optimizers
Every prototype seems to lose a few features when moved to the next prototype
People don't know how to locate themselves on a map, so we built a gazetteer.
Developing a database system to store metadata in a relational or object-relational database is not necessary. Using a database system to store metadata in relational tables will not improve metadata searching performance; it just does not scale.
Crosswalking between different metadata systems is a problem. You lose semantics and other information in the translation.
Metadata standards are not written with information discovery in mind.
Not every field in a metadata standard will end up being populated, so it's not necessary to make every field searchable.
Geographic and text queries can cause problems with query optimizers (geographic indexing and text indexing).
Every prototype seems to lose a few features when moved to the next prototype. But this project was about building scalable geographic search.
56. Collection-level aggregation
Collection-level metadata describes
buckets supported by the collection
item-level metadata mappings
statistical overviews
item counts
spatiotemporal coverage histograms
Example (de-XML-ized)
in collection foo, the Originator bucket is supported and the following item fields are mapped to it:
(FGDC, 1.1/8.1, Citation/Originator) [973 items]
(USGS DOQ, PRODUCER, Producer) [973 items]
(DC, Creator, Creator) [1249 items]
unknown [6 items]
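As a rough illustration of how such per-bucket item counts could be derived, the following Python sketch tallies, for one bucket, how many items have each source field mapped to it. The function name `summarize_bucket`, the item layout, and the treatment of an empty field list as "unknown" are all assumptions made for this example; the sample data is tiny and invented, not the counts from the slide.

```python
from collections import Counter


def summarize_bucket(items, bucket):
    """Count, per source field, the items that map a value into `bucket`."""
    counts = Counter()
    for item in items:
        fields = item.get(bucket)
        if fields is None:
            continue  # this item does not populate the bucket at all
        if not fields:
            counts["unknown"] += 1  # value present, source field not recorded
        for source_field in set(fields):
            counts[source_field] += 1
    return counts


# Invented sample items; each maps bucket name -> list of source fields.
items = [
    {"Originator": [("FGDC", "1.1/8.1", "Citation/Originator"),
                    ("USGS DOQ", "PRODUCER", "Producer")]},
    {"Originator": [("DC", "Creator", "Creator")]},
    {"Originator": []},
    {"Title": [("DC", "Title", "Title")]},
]

summary = summarize_bucket(items, "Originator")
for source_field, count in summary.items():
    print(source_field, f"[{count} items]")
```

A collection would report a bucket as supported whenever this summary is non-empty.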
57. Searching collections
Bucket-level
uniform across all collections
example
search all collections for items whose Originator bucket contains the phrase geological survey
Field-level
collection-specific
but discovery and invocation mechanisms are uniform
functionally equivalent to searching the entire bucket plus additional constraint
example
search collection foo for items whose FGDC 1.1/8.1 field within the Originator bucket contains the phrase
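The two search levels can be contrasted in a short Python sketch. Everything here is hypothetical: the `MappedValue` class, the in-memory `LIBRARY` dictionary, the `search` function, and the sample originator names are assumptions for illustration and do not reflect the ADL middleware API. The point it shows is that a field-level search is the same bucket search with one extra constraint on the source field.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MappedValue:
    standard: str   # source metadata standard, e.g. "FGDC"
    field_id: str   # field within that standard, e.g. "1.1/8.1"
    text: str


# Invented sample data: collection -> item -> bucket -> mapped values.
LIBRARY = {
    "foo": {
        "item-1": {"Originator": [MappedValue("FGDC", "1.1/8.1", "U.S. Geological Survey")]},
        "item-2": {"Originator": [MappedValue("DC", "Creator", "California Geological Survey")]},
    },
    "bar": {
        "item-9": {"Originator": [MappedValue("DC", "Creator", "Geological Survey of Canada")]},
    },
}


def search(library, bucket: str, phrase: str,
           collection: Optional[str] = None,
           field: Optional[Tuple[str, str]] = None):
    """Bucket-level search; `collection` and `field` add optional constraints."""
    needle = phrase.lower()
    hits = []
    for coll_name, items in library.items():
        if collection is not None and coll_name != collection:
            continue
        for item_id, buckets in items.items():
            for value in buckets.get(bucket, []):
                if field is not None and (value.standard, value.field_id) != field:
                    continue
                if needle in value.text.lower():
                    hits.append((coll_name, item_id))
                    break  # one matching value is enough for this item
    return sorted(hits)


# Bucket-level: uniform across all collections.
print(search(LIBRARY, "Originator", "geological survey"))
# Field-level: collection-specific, restricted to one source field.
print(search(LIBRARY, "Originator", "geological survey",
             collection="foo", field=("FGDC", "1.1/8.1")))
```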
59. ADL Middleware Details
60. Middleware server
61. Interoperability problem
Distributed, heterogeneous collections
locally, autonomously created and managed
Minimal requirements on collection providers
allow use of native metadata
Provide uniform client services
common high-level interface across collections
structured means of discovering and exploiting (possibly collection-specific) lower-level interfaces
Assumptions
items have metadata
items have sufficient, good metadata
i.e., this is a metadata interoperability problem
62. ADL core buckets (1/6)
Subject-related text
Title
Assigned term
Originator
Geographic location
Coverage date
Object type
Feature type
Format
Identifier
63. ADL core buckets (2/6)
Subject-related text
type: textual
description: text indicative of the subject of the item, not necessarily from controlled vocabularies
superset of Title and Assigned term
multiple values: concatenated
compare: DC.Subject
Title
type: textual
description: the item's title
subset of Subject-related text
multiple values: concatenated
compare: DC.Title
64. ADL core buckets (3/6)
Assigned term
type: textual
description: subject-related terms from controlled vocabularies
subset of Subject-related text
multiple values: concatenated
compare: qualified DC.Subject
Originator
type: textual
description: names of entities related to the origination of the item
multiple values: concatenated
compare: DC.Creator + DC.Publisher
65. ADL core buckets (4/6)
Geographic location
type: spatial
description: the subset of the Earth's surface to which the item is relevant
multiple values: unioned
compare: DC.Coverage.Spatial
Coverage date
type: temporal
description: the calendar dates to which the item is relevant
multiple values: unioned
compare: DC.Coverage.Temporal
66. ADL core buckets (5/6)
Object type
type: hierarchical
vocabulary: ADL Object Type Thesaurus (image, map, thesis, sound recording, etc.)
multiple values: unioned
compare: DC.Type
Feature type
type: hierarchical
vocabulary: ADL Feature Type Thesaurus (river, mountain, park, city, etc.)
multiple values: unioned
compare: none
67. ADL core buckets (6/6)
Format
type: hierarchical
vocabulary: ADL Object Format Thesaurus (loosely based on MIME)
multiple values: unioned
compare: DC.Format
Identifier
type: qualified textual
description: names and codes that function as unique identifiers
multiple values: treated separately
compare: DC.Identifier
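The core bucket list can be summarized as a small lookup table. The Python sketch below records each bucket's type and its rule for multiple values as given on the slides; the `CORE_BUCKETS` name and the `combine` helper are hypothetical, and modelling "unioned" values as Python sets is a simplification (real spatial and temporal values would need geometric and interval unions).

```python
# Bucket name -> (type, rule for multiple values), transcribed from the slides.
CORE_BUCKETS = {
    "Subject-related text": ("textual", "concatenated"),
    "Title": ("textual", "concatenated"),
    "Assigned term": ("textual", "concatenated"),
    "Originator": ("textual", "concatenated"),
    "Geographic location": ("spatial", "unioned"),
    "Coverage date": ("temporal", "unioned"),
    "Object type": ("hierarchical", "unioned"),
    "Feature type": ("hierarchical", "unioned"),
    "Format": ("hierarchical", "unioned"),
    "Identifier": ("qualified textual", "treated separately"),
}


def combine(bucket, values):
    """Merge multiple mapped values according to the bucket's rule."""
    _, rule = CORE_BUCKETS[bucket]
    if rule == "concatenated":
        return " ".join(values)          # values are strings
    if rule == "unioned":
        return set().union(*values)      # simplification: values are sets of terms
    return list(values)                  # "treated separately": keep each value


print(combine("Title", ["Santa Barbara", "Channel Islands"]))
print(combine("Feature type", [{"river"}, {"river", "park"}]))
print(combine("Identifier", ["A1", "B2"]))
```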
69. Target Audience
Phase 1 Simplified Interface
1. UCSB students, faculty and staff
2. University of California students, faculty and staff
3. Researchers
4. Other academic institutions
5. GIS/Map producers
6. Non-UC/Casual users/Other local clients/General web users
70. ADL Does:
Provides quick and accurate answers to the question "What data is available for this geographic area?"
Provides both online spatial content and metadata of library holdings for local and distributed collections.
Internet discovery, access and delivery.
ADL may be searched using background maps, other imagery, as well as by geographic placenames.
The ADL project has two venues: an operational library run by the Davidson Library and a research component (ADEPT) funded by NSF and others.
72. Goals of the ADL project
Provide geospatial access to all classes of information
Provide access to both library and personal collections
Provide supporting information services for:
research
Learning
Part of distributed (spatial) information infrastructure
Position UCSB as a national leader in geospatial information
74. ADL Interoperability Architecture
75. Bucket mapping
77. Collection discovery
Collection registry polls known library servers
Relevance model
binary
more is better
Query language
range searching over space, time, vocabulary terms
subset of item-level query language
Limitations
no joint constraint conditions
no text statistics à la STARTS
multiple, overlapping vocabularies
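One possible reading of the relevance model is sketched below in Python: a collection is relevant if any of its items fall in the queried range (binary), and relevant collections are ordered by how many items fall there ("more is better"). The function `rank_collections`, the per-year histograms, and the collection names are assumptions for illustration; only a single temporal range is queried, in line with the stated limitation that constraints are not applied jointly.

```python
def rank_collections(histograms, first_year, last_year):
    """Rank collections by item count within [first_year, last_year]."""
    scored = []
    for name, histogram in histograms.items():
        in_range = sum(count for year, count in histogram.items()
                       if first_year <= year <= last_year)
        if in_range > 0:  # binary relevance: at least one item in range
            scored.append((name, in_range))
    # "More is better": larger counts first; ties broken by name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented temporal coverage histograms: collection -> {year: item count}.
histograms = {
    "aerial-photos": {1995: 120, 1996: 80, 2001: 10},
    "maps": {1950: 40, 1996: 5},
    "theses": {1980: 12},
}

print(rank_collections(histograms, 1995, 1997))
# [('aerial-photos', 200), ('maps', 5)]
```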
78. Architecture
Search buckets:
Abstract, searchable indexes
Similar to Dublin Core, but buckets define allowable content and search semantics, and are optimized for geospatial searching
Designed to make populating collections easy
Includes all traditional library search elements
79. Data model
Collection
name
static, dynamic metadata
set of items
functional behaviors
Item
identifier
bucket view
searchable metadata mapped to standard, typed buckets
browse view
content abstracts
Item, cont'd
access view
multiple access points
file-like
human interface
programmatic service
offline
other views
collection- and/or item-specific
FGDC, MARC, etc.
content
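The data model outline maps naturally onto a few record types. The Python sketch below is one hypothetical rendering: the class and attribute names follow the slide's terms, but the field types, the `add` method, and the example URL are assumptions rather than the ADL schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AccessPoint:
    kind: str      # "file-like", "human interface", "programmatic service", or "offline"
    location: str


@dataclass
class Item:
    identifier: str
    bucket_view: Dict[str, List[str]] = field(default_factory=dict)  # bucket -> values
    browse_view: List[str] = field(default_factory=list)             # content abstracts
    access_view: List[AccessPoint] = field(default_factory=list)     # multiple access points
    other_views: Dict[str, str] = field(default_factory=dict)        # e.g. "FGDC", "MARC"


@dataclass
class Collection:
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)  # static and dynamic metadata
    items: Dict[str, Item] = field(default_factory=dict)

    def add(self, item: Item) -> None:
        self.items[item.identifier] = item


# Hypothetical example; the URL is a placeholder.
photos = Collection(name="foo")
photos.add(Item(
    identifier="foo:0001",
    bucket_view={"Title": ["Goleta aerial photograph"]},
    access_view=[AccessPoint("file-like", "https://example.org/foo/0001.tif")],
))
print(len(photos.items), photos.items["foo:0001"].access_view[0].kind)
```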
80. What exists?
Where is it located?
Is it useful given my needs?
How do I get it?
Is it in a form that I can use?
Conditions of use?
Is it in original or altered form?
How big is the digital file?