Implementing Local Search with Apache Solr and Lucene Grant Ingersoll
Topics • Use Cases • Concepts of Local Search • Local Search support in Apache Solr • Indexing • Filtering • Searching • Faceting • Sorting • Demo
Use Cases • Asset Management • Social Networking • Find all friends near me • Targeted, local search results and ads • “restaurants in Austin Texas” • “Starbucks, 55313” • Business Intelligence • Restrict doc set for analysis by location
Spatial Search Concepts • Spatial Data Types • Points (latitude/longitude) • Lines • Shapes • Maps and overlays • Streets, POI • Integration with unstructured text • Metadata, descriptions, user reviews, etc. http://www.openstreetmap.org/?lat=44.9744&lon=-93.2484&zoom=14&layers=B000FTFT
Application Needs • Query Parsing • Efficient distance calculations • Euclidean, Great Circle (Haversine), Vincenty’s • Filtering • Bounding Box • Sort by Distance • Relevance Enhancement • Faceting • Advanced: shape intersections, routes
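As an illustration of the great-circle option named on this slide, here is a minimal Python sketch of the Haversine formula. The function name `haversine_km` and the mean Earth radius constant are assumptions made for the example; this is not how Solr implements the calculation internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, in kilometers


def haversine_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


# Distance from a query point to a store location, both taken from the slides.
distance = haversine_km(45.15, -93.85, 45.17614, -93.87341)
```

Euclidean distance is cheaper to compute but ignores the curvature of the Earth, which is why a spherical measure like this is usually preferred for latitude/longitude data over anything but very short ranges.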
State of Solr Spatial • Native Field Types for Latitude/Longitude as well as n-dimensional Point • Native support for: • Filtering by distance • Boosting by distance • Sorting by distance • Faceting by distance (sort of) • Still needed: • Pseudo Fields • Query Parser support for geocoding • Shapes
Configuration • Schema • <fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/> • <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/> • <fieldType name="geohash" class="solr.GeoHashField"/> • Solrconfig: • None!
Indexing • Just like always: <doc> <field name="id">6H500F0</field> <field name="name">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</field> … <field name="store">45.17614,-93.87341</field> </doc>
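A document like the one above could be generated programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` to build an update message, wrapping the `<doc>` in the `<add>` element that Solr's XML update format expects. The helper names `make_doc` and `make_add_message` are hypothetical, and only the three fields visible on the slide are filled in.

```python
import xml.etree.ElementTree as ET


def make_doc(fields):
    """Build a <doc> element from a mapping of field name to value."""
    doc = ET.Element("doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return doc


def make_add_message(docs):
    """Wrap one or more documents in an <add> update message."""
    add = ET.Element("add")
    for fields in docs:
        add.append(make_doc(fields))
    return ET.tostring(add, encoding="unicode")


lat, lon = 45.17614, -93.87341
xml_payload = make_add_message([{
    "id": "6H500F0",
    "name": "Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300",
    # A LatLonType field takes "latitude,longitude" as a single string.
    "store": f"{lat},{lon}",
}])
```

The resulting string would then be posted to the update handler of whichever Solr instance is being indexed.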
Distance Functions • Most spatial operations (sorting, boosting, filtering, faceting) stem from the use of Solr’s built-in Function Query capability • http://wiki.apache.org/solr/FunctionQuery • dist(Power, pointA, pointB) – n-dimensional distance calculation • sqedist(pointA, pointB) – Squared Euclidean • hsin, ghhsin – Haversine (great circle) distance • geodist – Hides the details of other distance measures • Most people should just use geodist(), but others may want more control
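To make the n-dimensional functions concrete, here is a small Python sketch of the math behind a p-norm distance and a squared Euclidean distance. It mirrors the names `dist` and `sqedist` from the slide for readability, but it is an illustration of the formulas only, not Solr's code, and it handles just positive powers and infinity.

```python
import math


def sqedist(a, b):
    """Squared Euclidean distance: skips the square root, so it is cheaper."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dist(power, a, b):
    """p-norm distance: power=1 is Manhattan, 2 is Euclidean, inf is Chebyshev."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if power == math.inf:
        return max(diffs)
    if power <= 0:
        raise ValueError("this sketch only handles positive powers")
    return sum(d ** power for d in diffs) ** (1.0 / power)


euclidean = dist(2, (0, 0), (3, 4))
manhattan = dist(1, (0, 0), (3, 4))
```

Because squaring preserves ordering for non-negative values, the squared form is enough whenever distances are only compared or sorted rather than reported.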
Filtering • Accuracy matters! • geofilt – Radius-based filter (d is the distance in kilometers) • &q=*:*&fq={!geofilt pt=45.15,-93.85 sfield=store d=5} • ...&q=*:*&fq={!geofilt sfield=store}&pt=45.15,-93.85&d=5 • bbox – Bounding box around the same circle (faster, but less accurate: it can match points in the corners that lie beyond d) • &q=*:*&fq={!bbox}&sfield=store&pt=45.15,-93.85&d=5
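Client code usually has to assemble and URL-encode these parameters. The following Python sketch builds the filter parameters shown on the slide (`fq`, `sfield`, `pt`, `d`) and encodes them with the standard library. The helper name `spatial_filter_params` and its defaults are assumptions made for the example.

```python
from urllib.parse import urlencode


def spatial_filter_params(lat, lon, d, sfield="store", kind="geofilt", q="*:*"):
    """Return request parameters for a radius (geofilt) or bounding-box (bbox) filter."""
    if kind not in ("geofilt", "bbox"):
        raise ValueError("kind must be 'geofilt' or 'bbox'")
    return {
        "q": q,
        "fq": "{!%s}" % kind,     # local-params syntax selects the filter parser
        "sfield": sfield,         # the spatial field to filter on
        "pt": f"{lat},{lon}",     # center point as "latitude,longitude"
        "d": str(d),              # distance from the center point
    }


params = spatial_filter_params(45.15, -93.85, 5)
query_string = urlencode(params)  # percent-encodes braces, commas, and so on
```

Passing `kind="bbox"` switches to the bounding-box form with the same center and distance, which makes it easy to compare the two filters against the same data.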
Boosting and Sorting • Score each document by its distance (sort ascending so the closest come first): • &q={!func}geodist()&sfield=store&pt=45.15,-93.85&sort=score asc • Sort by distance: • &q=*:*&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=50&sort=geodist() asc
Faceting • Use the frange (function range) query parser, one facet.query per distance bucket • Not ideal, but works • http://localhost:8983/solr/select?q=*:*&sfield=store&pt=45.15,-93.85&facet.query={!frange l=0 u=5}geodist()&facet.query={!frange l=5.001 u=3000}geodist()&facet=true
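Writing one `facet.query` per bucket by hand gets tedious once there are more than two ranges. The Python sketch below generates them from a list of boundaries. The helper names are hypothetical, and instead of the slide's `l=5.001` offset it assumes frange's `incl=false` option to keep adjacent buckets from overlapping; treat that substitution as an assumption to verify against your Solr version.

```python
from urllib.parse import urlencode


def distance_facet_queries(boundaries):
    """Turn ascending boundaries such as [0, 5, 3000] into frange facet queries."""
    queries = []
    for i, (lower, upper) in enumerate(zip(boundaries, boundaries[1:])):
        # Exclude the lower bound on every bucket after the first,
        # so a document sitting exactly on a boundary is counted once.
        extra = "" if i == 0 else " incl=false"
        queries.append("{!frange l=%s u=%s%s}geodist()" % (lower, upper, extra))
    return queries


def distance_facet_params(lat, lon, boundaries, sfield="store"):
    """Build the full parameter list; a list of pairs allows repeated keys."""
    params = [
        ("q", "*:*"),
        ("sfield", sfield),
        ("pt", f"{lat},{lon}"),
        ("facet", "true"),
    ]
    params += [("facet.query", fq) for fq in distance_facet_queries(boundaries)]
    return params


facet_params = distance_facet_params(45.15, -93.85, [0, 5, 3000])
facet_query_string = urlencode(facet_params)
```

A list of tuples is used rather than a dict because `facet.query` appears more than once in the request.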
Resources • http://wiki.apache.org/solr/SpatialSearch • http://www.lucidimagination.com/search/?q=spatial • https://www.ibm.com/developerworks/java/library/j-spatial/ (outdated, but covers the concepts) • @gsingers • grant@lucidimagination.com