ESWC 2009 Research IX: Evaluation and Benchmarking
Benchmarking Fulltext Search Performance of RDF Stores
Enrico Minack, Wolf Siberski, Wolfgang Nejdl
L3S Research Center, Universität Hannover, Germany
{minack,siberski,nejdl}@L3S.de
03.06.2009
http://www.l3s.de/~minack/rdf-fulltext-benchmark/
Outline
• Motivation
• Benchmark
  • Data set and Query set
• Evaluation
  • Methodology and Results
• Conclusion
• References
1. Motivation
• Semantic applications provide fulltext search
• The underlying RDF stores therefore have to provide fulltext search
• Application developers have to choose among these stores
• Best practice for such a choice: benchmarking
• But: no fulltext search benchmark for RDF stores exists
• RDF store developers perform only ad hoc benchmarks
• ⇒ strong need for an RDF fulltext benchmark
2. Benchmark
• Extends the Lehigh University Benchmark [LUBM]
  • Synthetic data, fixed list of queries
  • Familiar but not trivial ontology: University, Faculty, Professors, Students, Courses, …
  • Realistic structural properties
  • But: artificial literal data, e.g. „Professor1“, „GraduateStudent216“, „Course7“
2.1 Data set
• Added to LUBM:
  • Person names (first name, surname) following a real-world distribution
  • Publication content following topic-mixture-based word distributions trained on a real document collection [LSA]
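To illustrate, generated instance data might look like the following Turtle sketch. The resource URIs and the ub:firstname property name are illustrative assumptions; ub:surname, ub:fullname, and ub:publicationText appear in the query set below.

    @prefix ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#> .

    # Hypothetical LUBMft instance data: person names drawn from U.S. Census
    # frequencies, publication text drawn from a trained topic model.
    <http://www.Department0.University0.edu/FullProfessor1>
        a ub:FullProfessor ;
        ub:firstname "James" ;    # property name assumed for illustration
        ub:surname   "Smith" .

    <http://www.Department0.University0.edu/Publication0>
        a ub:Publication ;
        ub:publicationAuthor <http://www.Department0.University0.edu/FullProfessor1> ;
        ub:publicationText "A scalable approach to network protocol design …" .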
2.1 Data set (Person Names)
• Name probabilities taken from the U.S. Census 1990 (http://www.census.gov/genealogy/names/)
  • 1,200 male first names
  • 4,300 female first names
  • 19,000 surnames
2.1 Data set (Publication Text)
• Probabilistic topic model with 100 topics (word probability distributions)
• Topic occurrence and topic co-occurrence probabilities determine the topics of a document
• Trained on the NIPS data set (1,740 documents)
[Figure: topics assigned to Graduate Student, Faculty, and Professor determine the content of a Publication]
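In such a topic-mixture model, the word distribution of a publication is, in the usual formulation (a sketch, since the slides do not spell out the formula), a mixture of the 100 trained topic-word distributions:

    P(w \mid d) = \sum_{t=1}^{100} P(w \mid t) \, P(t \mid d)

where P(t | d) follows from the topic occurrence and co-occurrence probabilities above, and P(w | t) are the trained per-topic word probabilities.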
2.1 Data set (Statistics)
[Table: data set statistics]
2.2 Query set
• Three sets of queries:
  • Basic IR queries
  • Semantic IR queries
  • Advanced IR queries
2.2 Query set (Basic IR Queries)
• Pure IR queries on a single property:
  • Q1: ub:publicationText matches „engineer“
  • Q2: ub:publicationText matches „network“
  • Q3: ub:publicationText matches „network“ „engineer“ (two keywords)
  • Q4: ub:publicationText matches the phrase „network engineer“
  • Q5: ub:surname matches „smith“ / „Smith“ (case variants)
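How such queries are written depends on the store; as one concrete sketch, a Q1-style query in Virtuoso's SPARQL dialect, whose built-in bif:contains predicate attaches a fulltext condition to a literal (other stores use different vocabularies):

    PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>

    # Q1-style keyword query: publications whose text contains "engineer".
    # bif: is a Virtuoso-internal prefix and needs no PREFIX declaration there.
    SELECT ?pub WHERE {
      ?pub ub:publicationText ?text .
      ?text bif:contains "engineer" .
    }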
2.2 Query set (Semantic IR Queries)
• One fulltext condition combined with structural graph patterns:
  • Q6: instances of ub:Publication whose ub:publicationText matches „engineer“
  • Q7: as Q6, additionally retrieving the publication's ub:title ?title
  • Q8: instances of ub:Publication whose ub:publicationAuthor is a ub:FullProfessor with ub:fullname matching „smith“
  • Q9: as Q8, additionally retrieving the professor's name ?name
2.2 Query set (Semantic IR Queries, continued)
• Two fulltext conditions combined in a single query:
  • Q10: instances of ub:Publication whose ub:publicationText matches „engineer“ and whose ub:publicationAuthor is a ub:FullProfessor with ub:fullname matching „smith“
  • Q11: instances of ub:Publication whose ub:publicationText matches „engineer“ and whose ub:publicationAuthor has written another ub:Publication whose ub:publicationText matches „network“
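A Q10-style query combines a structural pattern with two fulltext conditions; again sketched in Virtuoso syntax, with the same caveats as above:

    PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>

    # Publications matching "engineer" written by a full professor whose
    # full name matches "smith" (two fulltext conditions in one query).
    SELECT ?pub ?prof WHERE {
      ?pub  a ub:Publication ;
            ub:publicationText ?text ;
            ub:publicationAuthor ?prof .
      ?text bif:contains "engineer" .
      ?prof a ub:FullProfessor ;
            ub:fullname ?name .
      ?name bif:contains "smith" .
    }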
2.2 Query set (Advanced IR Queries)
• All on ub:publicationText:
  • Q12: „+network +engineer“ (boolean AND)
  • Q13: „+network –engineer“ (boolean NOT)
  • Q14: „network engineer“~10 (proximity)
  • Q15: „engineer*“ (wildcard)
  • Q16: „engineer?“ (single-character wildcard)
  • Q17: „engineer“~0.8 (fuzzy)
  • Q18: „engineer“ with relevance score
  • Q19: „engineer“ with result snippet
  • Q20: „network“, top-10 results only
  • Q21: „network“, results with score > 0.75
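Scores and snippets (Q18–Q21) go beyond plain triple patterns; the LuceneSail [LuceneSail] exposes them as virtual properties, roughly as in the following sketch (vocabulary as in the LuceneSail report; details to be checked against the store at hand):

    PREFIX ub:     <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
    PREFIX search: <http://www.openrdf.org/contrib/lucenesail#>

    # Q18/Q19-style query: fulltext match with relevance score and snippet.
    SELECT ?pub ?score ?snippet WHERE {
      ?pub search:matches [
        search:query    "engineer" ;
        search:property ub:publicationText ;
        search:score    ?score ;
        search:snippet  ?snippet
      ] .
    }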
3. Evaluation
• Hardware: 2 GHz AMD Athlon 64-bit dual-core processor, 3 GByte RAM, RAID 5 array
• Software: GNU/Linux, Java SE RE 1.6.0_10 with 2 GB memory
• Evaluated RDF stores:
  • Jena 2.5.6 + TDB
  • Sesame 2.2.1 NativeStore + LuceneSail
  • Virtuoso 5.0.9
  • YARS post beta 3
3.1 Evaluation Methodology
• Evaluated LUBMft(N) with N = {1, 5, 10, 50}
• For each store and each query:
  • Flush the file system cache
  • Start the store
  • Evaluate the query 6 times; break if evaluation time exceeds 1,000 s
  • Stop the store
• This whole procedure was performed 5 times
3.2 Evaluation Results
• Basic IR Queries
[Figure: query response times for „engineer“ and „network“]
3.2 Evaluation Results
• Semantic IR Queries
[Figure: query response times for Q6–Q9]
3.2 Evaluation Results
• Semantic IR Queries (continued)
[Figure: query response times for Q10 and Q11]
3.2 Evaluation Results
• Advanced IR Queries
  • Same relative performance as for the other query sets
  • Feature richness (number of the 10 advanced queries supported): Sesame (10), Jena (9), YARS (5), Virtuoso (1)
4. Conclusion
• Identified a strong need for a fulltext benchmark, for both semantic application and RDF store developers
• Extended LUBM towards a fulltext benchmark; other benchmarks can be extended similarly
• RDF stores provide many IR features: boolean, phrase, proximity, and fuzzy queries
• Multiple fulltext conditions in one query remain challenging
5. References
• [LSA] Landauer, T.K., McNamara, D.S., Dennis, S., Kintsch, W. (eds.): Handbook of Latent Semantic Analysis. Lawrence Erlbaum Associates, Mahwah, NJ (2007).
• [LUBM] Guo, Y., et al.: LUBM: A Benchmark for OWL Knowledge Base Systems. Journal of Web Semantics 3(2), 158-182 (2005).
• [LuceneSail] Minack, E., et al.: The Sesame LuceneSail: RDF Queries with Full-text Search. Technical Report 2008-1, NEPOMUK (February 2008).
• [Sesame] Broekstra, J., et al.: Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. In: Horrocks, I., Hendler, J. (eds.) ISWC 2002. LNCS, vol. 2342, pp. 54-68. Springer, Heidelberg (2002).
• [Jena] Carroll, J.J., et al.: Jena: Implementing the Semantic Web Recommendations. In: WWW Alternate Track Papers & Posters, pp. 74-83. ACM, New York (2004).
• [YARS] Harth, A., Decker, S.: Optimized Index Structures for Querying RDF from the Web. In: Proceedings of the 3rd Latin American Web Congress. IEEE Press, Los Alamitos (2005).