Discover the magic of metasearching and central indexes with Mike Taylor. Explore the advantages and nuances of both approaches for efficient data retrieval and relevance ranking. Embrace the power of Integrated Search!
When worlds collide Metasearching meets central indexes Mike Taylor – mike@indexdata.com Index Data – http://indexdata.com/
Search When worlds collide: metasearching and central indexes Mike Taylor – mike@indexdata.com
[Diagram: Search → Data]
Problem solved!
[Diagram: Search → ? ? → Data, Data, Data]
Metasearch
[Diagram: a magic box searching four data sources]
Examples: 360 Search, EHIS (EBSCO), MetaLib, Pazpar2 (open source)
A.K.A. federated search
A.K.A. distributed search
A.K.A. broadcast search
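As a rough illustration of what the magic box does at search time, here is a minimal sketch of a broadcast search. The in-memory `SOURCES` dictionary and the function names are hypothetical stand-ins for remote data servers; this is not how any of the products named above is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "data sources"; real targets would be remote servers.
SOURCES = {
    "catalog_a": ["Dinosaur anatomy", "Sauropod necks"],
    "catalog_b": ["Sauropod vertebrae", "Library metasearch"],
    "catalog_c": ["Metasearch in practice"],
}

def search_source(name, query):
    """Search one source: case-insensitive substring match on titles."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]

def metasearch(query):
    """Broadcast the query to every source in parallel and pool the hits."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        hits = [hit for future in futures for hit in future.result()]
    return sorted(hits)

print(metasearch("sauropod"))
```

Nothing is stored locally: every query goes out to every source, so the results are as fresh as the sources themselves and only as fast as the slowest one.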
Back to the sad searcher
[Diagram: Search → ? ? → Data, Data, Data]
Central index
[Diagram: a fat database harvesting four data sources]
Examples: Summon, WorldCat, Primo Central, MasterKey
A.K.A. local index
A.K.A. discovery services
A.K.A. vertical search
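For contrast, a minimal sketch of the central-index approach: records are harvested ahead of time into one local inverted index, and searches never touch the original sources. The `SOURCES` data and the function names are hypothetical; real harvesting would go through whatever export or protocol each source offers.

```python
from collections import defaultdict

# Hypothetical in-memory "data sources" standing in for harvestable services.
SOURCES = {
    "catalog_a": ["Dinosaur anatomy", "Sauropod necks"],
    "catalog_b": ["Sauropod vertebrae", "Library metasearch"],
}

def harvest(sources):
    """Copy every record into one local inverted index: word -> set of hits."""
    index = defaultdict(set)
    for name, titles in sources.items():
        for title in titles:
            for word in title.lower().split():
                index[word].add((name, title))
    return index

def search(index, word):
    """Answer from the local index only; no remote source is contacted."""
    return sorted(index.get(word.lower(), set()))

index = harvest(SOURCES)  # done ahead of time, not at search time
print(search(index, "sauropod"))
```

A record added to a source after `harvest` has run will not be found until the next harvest.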
We need a controlled vocabulary!
Metasearch = Federated search = Distributed search = Broadcast search
Central index = Local index = Discovery services = Vertical search (if you ever heard anything so dumb)
Which approach is better?
Central indexing compared with metasearching:
- requires harvesting infrastructure
- requires lots of local storage
- requires co-operation from services to be harvested
- does not have access to all searchable data
- will always be somewhat out of date
- is faster at search time (or SHOULD be)
- allows data to be normalised (e.g. dates extracted)
- allows for better relevance ranking
- can provide pre-baked facets
- may have access to some data that is not searchable
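To make the "data can be normalised (e.g. dates extracted)" point concrete, here is a small sketch of the kind of clean-up a harvester could apply once, at indexing time, instead of on every search. The regular expression and the accepted year range (1500 to 2099) are assumptions chosen for illustration.

```python
import re

# A four-digit year from 1500 to 2099 that is not part of a longer number.
YEAR = re.compile(r"(?<!\d)(1[5-9]\d\d|20\d\d)(?!\d)")

def extract_year(raw):
    """Return the first plausible year in a free-text date, or None."""
    match = YEAR.search(raw)
    return int(match.group(1)) if match else None

for raw in ["c1998.", "[2004?]", "March 3rd, 1987", "n.d."]:
    print(repr(raw), "->", extract_year(raw))
```

With a normalised year stored alongside each record, sorting and faceting by date become simple lookups in the index.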
Which approach is better?
Let's do both!
“Integrated Search”
[Diagram: magic box (searching) and fat database (harvesting), with eight data sources]
Metasearch hides the complexity
Nine tenths under the surface
What you see looks beautiful
[Diagram: a magic box searching four data sources]
Problems that need solving
A. Problems with pure metasearching
B. How those problems change when you add a central index
Problems with metasearching
Examples based on Index Data's suite:
- Pazpar2 is a free metasearching engine with a stupid name: http://indexdata.com/pazpar2/
- MasterKey is a non-open suite that wraps it: http://indexdata.com/masterkey/
MasterKey is only one way to use Pazpar2, which is also integrated into other vendors' UIs.
Problems with metasearching #1: No data server at all!
Data is often only in a user-facing Web UI
Must be made available via a standard protocol
Option 1: build a gateway in Perl http://indexdata.com/simpleserver/
Option 2: MasterKey Connect (non-open) http://indexdata.com/connector-framework
Problems with metasearching #2: data server is crap^H^H^H^Hsuboptimal
Catalogs searchable using ANSI/NISO Z39.50
Support is very nominal in some cases
IRSpy probes behaviour http://irspy.indexdata.com
MasterKey target profiles describe behaviour
Problems with metasearching #3: Data servers don't support relevance
Pazpar2 does its own relevance ranking
(Part of merging/deduplication)
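To show why ranking fits naturally into merging and deduplication, here is a minimal sketch of client-side merge-and-rank. It is not Pazpar2's actual algorithm: the merge key (a normalised title) and the score (fraction of title words that match a query term) are assumptions chosen for brevity.

```python
def normalise(title):
    """Hypothetical merge key: lowercase, letters and digits only, single spaces."""
    cleaned = "".join(c for c in title.lower() if c.isalnum() or c.isspace())
    return " ".join(cleaned.split())

def merge_and_rank(hits, query):
    """hits is a list of (source, title); return merged records, best first."""
    merged = {}
    for source, title in hits:
        record = merged.setdefault(normalise(title), {"title": title, "sources": []})
        record["sources"].append(source)

    terms = query.lower().split()

    def score(key):
        words = key.split()
        return sum(words.count(term) for term in terms) / max(len(words), 1)

    # Highest score first; ties broken alphabetically by merge key.
    ranked = sorted(merged, key=lambda key: (-score(key), key))
    return [merged[key] for key in ranked]

hits = [
    ("catalog_a", "Sauropod Necks"),
    ("catalog_b", "sauropod necks."),
    ("catalog_c", "A history of sauropod research"),
]
for record in merge_and_rank(hits, "sauropod necks"):
    print(record)
```

The two spellings of the same title collapse into one record listing both sources, and that record ranks first because every word in its title matches the query.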
Problems with metasearching #4: Data servers don't return facets
Pazpar2 calculates its own facets
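Calculating facets on the client side amounts to counting field values across the records that came back. The sketch below illustrates the idea only; the record layout and field names are hypothetical, and it does not reflect Pazpar2's internals.

```python
from collections import Counter

# Hypothetical merged records, as they might look after retrieval.
records = [
    {"title": "Sauropod necks", "author": "Smith", "year": "2009"},
    {"title": "Sauropod vertebrae", "author": "Smith", "year": "2011"},
    {"title": "Library metasearch", "author": "Jones", "year": "2009"},
]

def facets(records, fields, top=3):
    """For each field, return its most common values with their counts."""
    result = {}
    for field in fields:
        counts = Counter(r[field] for r in records if field in r)
        result[field] = counts.most_common(top)
    return result

print(facets(records, ["author", "year"]))
```

The counts cover only the records actually retrieved, so facets built this way describe the retrieved set rather than each source's full result set.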
There is a lot of magic in the magic box
Searching, sorting, merging, deduplication, relevance, facet generation, time travel ...
Remember, our engine is free: http://indexdata.com/pazpar2/
[Diagram: the magic box, labelled Pazpar2, searching four data sources]
What happens when we add a central index?
[Diagram: magic box (searching) and fat database (harvesting), with eight data sources]
Problems with integrated search #1: No data server at all!
Data is often only in a user-facing Web UI
You can't harvest Google