When worlds collide Metasearching meets central indexes Mike Taylor – mike@indexdata.com Index Data – http://indexdata.com/
Search When worlds collide: metasearching and central indexes Mike Taylor – mike@indexdata.com
[Diagram: a searcher searching a single data source.] Problem solved!
[Diagram: a puzzled searcher facing three separate data sources.]
Metasearch
[Diagram: a "magic box" searching four data sources.]
Examples: 360 Search, EHIS (EBSCO), MetaLib, Pazpar2 (open source)
A.K.A. federated search, A.K.A. distributed search, A.K.A. broadcast search
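As a rough illustration of what "broadcast search" means, the sketch below sends one query to several targets at the same time and collects whatever each returns. The target names and their contents are hypothetical in-memory stand-ins; a real metasearcher would talk to remote servers over a search protocol, and nothing here reflects how any of the products named above is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "targets" (assumption for illustration only);
# a real metasearcher would query remote servers instead.
TARGETS = {
    "catalog_a": ["Dinosaurs of the Jurassic", "Search engines"],
    "catalog_b": ["Jurassic sauropods", "Metadata basics"],
    "catalog_c": ["Library metasearch"],
}

def search_target(name, query):
    """Return (target name, matching titles) for a single target."""
    q = query.lower()
    return name, [title for title in TARGETS[name] if q in title.lower()]

def broadcast_search(query):
    """Send the same query to every target concurrently and gather the hits."""
    with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
        futures = [pool.submit(search_target, name, query) for name in TARGETS]
        return dict(future.result() for future in futures)

results = broadcast_search("jurassic")
for target, titles in results.items():
    print(target, titles)
```

Everything happens at search time, so the response is only as fast as the slowest target that the box waits for.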
Back to the sad searcher
[Diagram: a puzzled searcher facing three separate data sources.]
Central index
[Diagram: a "fat database" built by harvesting four data sources.]
Examples: Summon, WorldCat, Primo Central, MasterKey
A.K.A. local index, A.K.A. discovery services, A.K.A. vertical search
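By contrast, a central index does its work ahead of time. The sketch below harvests records from some sources into one local inverted index and then answers searches from that index alone. The source names, the records and the word-level index are all assumptions made for illustration, not a description of any product listed above.

```python
from collections import defaultdict

# Hypothetical sources to harvest (assumption for illustration only).
SOURCES = {
    "catalog_a": ["Dinosaurs of the Jurassic", "Search engines"],
    "catalog_b": ["Jurassic sauropods", "Metadata basics"],
}

def harvest(sources):
    """Copy every record into one local inverted index: word -> record locations."""
    index = defaultdict(set)
    for source, records in sources.items():
        for position, title in enumerate(records):
            for word in title.lower().split():
                index[word].add((source, position))
    return index

def search(index, sources, word):
    """Answer a one-word query from the local index, without contacting any source."""
    locations = index.get(word.lower(), ())
    return sorted(sources[source][position] for source, position in locations)

index = harvest(SOURCES)          # done in advance, and repeated periodically
hits = search(index, SOURCES, "Jurassic")
print(hits)
```

The search itself never touches the original sources, which is why it can be fast and also why it can be out of date.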
We need a controlled vocabulary!
Metasearch = Federated search = Distributed search = Broadcast search
Central index = Local index = Discovery services = Vertical search (if you ever heard anything so dumb)
Which approach is better?
Central indexing, compared with metasearching:
- requires harvesting infrastructure
- requires lots of local storage
- requires co-operation from services to be harvested
- does not have access to all searchable data
- will always be somewhat out of date
- is faster at search time (or SHOULD be)
- allows data to be normalised (e.g. dates extracted)
- allows for better relevance ranking
- can provide pre-baked facets
- may have access to some data that is not searchable
Which approach is better? Let's do both!
“Integrated Search”
[Diagram: a "magic box" searching four data sources, alongside a "fat database" harvested from four more.]
Metasearch hides the complexity
Nine tenths under the surface
What you see looks beautiful
Problems that need solving
A. Problems with pure metasearching
B. How those problems change when you add a central index
Problems with metasearching
Examples based on Index Data's suite:
Pazpar2 is a free metasearching engine with a stupid name: http://indexdata.com/pazpar2/
MasterKey is a non-open suite that wraps it: http://indexdata.com/masterkey/
MasterKey is only one way to use Pazpar2
Also integrated into other vendors' UIs.
Problems with metasearching #1: No data server at all!
Data is often only in a user-facing Web UI
Must be made available via a standard protocol
Option 1: build a gateway in Perl: http://indexdata.com/simpleserver/
Option 2: MasterKey Connect (non-open): http://indexdata.com/connector-framework
Problems with metasearching #2: data server is crap^H^H^H^Hsuboptimal
Catalogs searchable using ANSI/NISO Z39.50
Support is very nominal in some cases
IRSpy probes behaviour: http://irspy.indexdata.com
MasterKey target profiles describe behaviour
Problems with metasearching #3: Data servers don't support relevance
Pazpar2 does its own relevance ranking (part of merging/deduplication)
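To make the idea of ranking inside the metasearcher concrete, here is a minimal sketch of merging, deduplicating and scoring hits that arrive from several targets. The merge key (a normalised title) and the scoring rule (query-term matches plus a bonus for each extra target holding the record) are assumptions chosen for brevity; they are not Pazpar2's actual algorithm.

```python
import re
from collections import defaultdict

# Hypothetical hits as they might come back from three targets.
HITS = [
    {"target": "a", "title": "Jurassic Park"},
    {"target": "b", "title": "jurassic park."},
    {"target": "c", "title": "The Jurassic Coast"},
    {"target": "a", "title": "Park Life"},
]

def merge_key(record):
    """Normalise a title so near-identical records fall into the same cluster."""
    return re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()

def merge_and_rank(hits, query):
    """Cluster duplicate records, score each cluster, and sort best-first."""
    clusters = defaultdict(list)
    for record in hits:
        clusters[merge_key(record)].append(record)

    terms = query.lower().split()
    ranked = []
    for key, records in clusters.items():
        words = key.split()
        term_hits = sum(words.count(term) for term in terms)
        # Assumed scoring rule: term matches, plus one per additional copy.
        score = term_hits + (len(records) - 1)
        ranked.append({
            "title": records[0]["title"],
            "targets": sorted(r["target"] for r in records),
            "score": score,
        })
    # Highest score first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda cluster: (-cluster["score"], cluster["title"]))
    return ranked

ranked = merge_and_rank(HITS, "jurassic park")
for cluster in ranked:
    print(cluster["score"], cluster["title"], cluster["targets"])
```

Because the scoring happens after the records have been fetched, it works the same way whether or not the individual targets offer any relevance ranking of their own.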
Problems with metasearching #4: Data servers don't return facets
Pazpar2 calculates its own facets
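Calculating facets on the client side can be as simple as counting field values across the records retrieved so far. The sketch below shows that idea; the record fields and the `build_facets` helper are hypothetical, and this is not a description of how Pazpar2 does it internally.

```python
from collections import Counter

# Hypothetical merged records (assumption for illustration only).
RECORDS = [
    {"title": "Jurassic sauropods", "author": "Taylor", "year": "2010"},
    {"title": "Sauropod necks", "author": "Taylor", "year": "2009"},
    {"title": "Metadata basics", "author": "Smith", "year": "2010"},
    {"title": "Untitled fragment"},  # records with missing fields are skipped
]

def build_facets(records, fields, limit=5):
    """Count the values of each field and keep the most frequent ones."""
    facets = {}
    for field in fields:
        counts = Counter(r[field] for r in records if r.get(field))
        facets[field] = counts.most_common(limit)
    return facets

facets = build_facets(RECORDS, ["author", "year"])
for field, values in facets.items():
    print(field, values)
```

The counts only cover records that have actually been fetched, so facets built this way describe the retrieved sample, not necessarily each target's full result set.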
There is a lot of magic in the magic box
Searching, sorting, merging, deduplication, relevance, facet generation, time travel ...
Remember, our engine is free: http://indexdata.com/pazpar2/
[Diagram: the magic box, now labelled Pazpar2, searching four data sources.]
What happens when we add a central index?
[Diagram: a "magic box" searching four data sources, alongside a "fat database" harvested from four more.]
Problems with integrated search #1: No data server at all!
Data is often only in a user-facing Web UI
You can't harvest Google