Faceted browsing for ACL Anthology Praveen Bysani
ACL Anthology • a digital archive of research papers in CL and NLP • contains over 20,100 papers • free of cost • archive for sister conferences and journals
Current browser • direct and navigational search • hard to navigate • non-customized search • non-sortable results
Faceted browsing • Combination of navigational and direct search paradigms • Facets are properties of information elements • Access to organized information • Ability to explore the collection in multiple dimensions through filters
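The idea of facets as filterable properties can be illustrated with a minimal in-memory sketch. The records and field names (`year`, `venue`, `author`) below are invented for illustration and are not taken from the Anthology; a real deployment would delegate this work to a search engine.

```python
from collections import Counter

# Hypothetical paper records; field names and values are illustrative only.
PAPERS = [
    {"title": "Paper A", "year": 2008, "venue": "ACL", "author": "Lee"},
    {"title": "Paper B", "year": 2008, "venue": "EMNLP", "author": "Kim"},
    {"title": "Paper C", "year": 2010, "venue": "ACL", "author": "Lee"},
]


def apply_filters(records, filters):
    """Keep only records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(facet) == value for facet, value in filters.items())
    ]


def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(r[facet] for r in records)


# Narrow by one dimension, then show the remaining choices in another.
selected = apply_filters(PAPERS, {"venue": "ACL"})
counts = facet_counts(selected, "year")
print([p["title"] for p in selected])  # ['Paper A', 'Paper C']
print(dict(counts))                    # {2008: 1, 2010: 1}
```

Each added filter narrows the result set, and the counts for the remaining facets tell the user which further refinements are still non-empty.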
Faceted browsing: implementation • Ruby on Rails (RoR) + Blacklight plugin • Apache Solr • Metadata from XML • Blacklight customization for XML
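As a rough sketch of the "metadata from XML" step, the snippet below flattens paper records from XML into dictionaries of the kind a search index could ingest. The XML layout, element names, and identifier scheme are assumptions made for this example and are not the Anthology's actual schema; the slides describe a Rails/Blacklight stack, and Python is used here only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real Anthology metadata may differ.
SAMPLE_XML = """
<volume id="X00">
  <paper id="1001">
    <title>An Example Paper</title>
    <author>Jane Doe</author>
    <author>John Roe</author>
    <year>2010</year>
  </paper>
</volume>
"""


def papers_to_docs(xml_text):
    """Turn each <paper> element into a flat dict suitable for indexing."""
    root = ET.fromstring(xml_text.strip())
    docs = []
    for paper in root.iter("paper"):
        year_text = paper.findtext("year")
        docs.append({
            # Assumed id scheme: volume id and paper id joined by a hyphen.
            "id": f"{root.get('id')}-{paper.get('id')}",
            "title": paper.findtext("title", default="").strip(),
            # Multi-valued field: one entry per <author> element.
            "author": [a.text.strip() for a in paper.findall("author") if a.text],
            "year": int(year_text) if year_text else None,
        })
    return docs


docs = papers_to_docs(SAMPLE_XML)
print(docs[0]["id"], docs[0]["author"])
```

Keeping `author` as a list mirrors how a multi-valued field lets one paper appear under several author facet values.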
More cookies... • User feedback: comment / share / like; suggestions for correcting the metadata • Ability to export bibliography entries in six formats • Author pages: list of publications, co-authors
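The slides do not name the six export formats. Assuming BibTeX is one of them, a minimal exporter might look like the sketch below; the entry type, citation key, and field selection are illustrative choices, not the site's actual output.

```python
def to_bibtex(key, entry):
    """Render one paper as a BibTeX @inproceedings entry (assumed format)."""
    fields = [
        ("title", entry["title"]),
        # BibTeX separates multiple authors with the word "and".
        ("author", " and ".join(entry["authors"])),
        ("booktitle", entry["booktitle"]),
        ("year", str(entry["year"])),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@inproceedings{{{key},\n{body}\n}}"


# Hypothetical record used only to demonstrate the output shape.
example = {
    "title": "An Example Paper",
    "authors": ["Jane Doe", "John Roe"],
    "booktitle": "Proceedings of an Example Conference",
    "year": 2010,
}
bibtex = to_bibtex("doe2010example", example)
print(bibtex)
```

Other export formats could reuse the same record dictionary with a different rendering function per format.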
Third-party annotations • Automatically annotate articles with new metadata • Anthology as a corpus • API to make the Anthology an object of study • OAI-compatible: allows metadata harvesting • @ http://aclanthology.heroku.com/
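Assuming "OAI compatible" refers to the OAI-PMH harvesting protocol, a harvester would issue requests like the ones built below. The slides give only the site root, so the endpoint path here is a placeholder; `ListRecords`, `metadataPrefix`, `from`, and `resumptionToken` are standard OAI-PMH request arguments. The sketch only builds URLs and makes no network calls.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the actual OAI path on the site is not given in the slides.
BASE_URL = "http://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is exclusive: no other arguments
        # besides the verb may accompany it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # harvest only records changed since this date
    return f"{base_url}?{urlencode(params)}"


first_page = list_records_url(BASE_URL)
incremental = list_records_url(BASE_URL, from_date="2010-01-01")
print(first_page)
print(incremental)
```

A full harvester would fetch the first URL, read any resumption token from the response, and repeat with that token until none is returned.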
Challenges • Normalizing the quality of Anthology metadata • SIG information: YAML files, no identifiers provided • DOI: from ACM • Changes in names of papers, authors
Similar works • ACL Author Network: bibliometrics • ACL Search Bench: semantic search
Plans for the future • A common data schema to integrate all • Indexing the full text • Range queries for the year facet • Exporting the bibliography of a whole volume • Enriching author pages
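One way the planned year range queries could be expressed against Solr is sketched below. The field name `year`, the gap of five years, and the helper function are assumptions; the parameter names (`fq`, `facet.range`, and the per-field `facet.range.start`, `facet.range.end`, `facet.range.gap`) are standard Solr request parameters. Nothing here is confirmed as the project's actual design.

```python
from urllib.parse import parse_qs, urlencode


def year_range_params(start, end, gap=5, field="year"):
    """Build Solr parameters that filter by a year range and bucket the results."""
    return [
        ("q", "*:*"),
        # Filter query: square brackets make both ends of the range inclusive.
        ("fq", f"{field}:[{start} TO {end}]"),
        ("facet", "true"),
        # Range facet: by default each bucket includes only its lower bound.
        ("facet.range", field),
        (f"f.{field}.facet.range.start", str(start)),
        (f"f.{field}.facet.range.end", str(end)),
        (f"f.{field}.facet.range.gap", str(gap)),
    ]


query_string = urlencode(year_range_params(2000, 2010))
parsed = parse_qs(query_string)
print(parsed["fq"])
print(parsed["facet.range"])
```

The resulting query string would be appended to a Solr select handler URL; the filter restricts the hits while the range facet reports how many fall into each span of years.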