Information Retrieval Lab DiSCo – University of Milan Bicocca Viale Sarca 336 U14 Head: Prof. Gabriella Pasi
IR Lab people • Gabriella Pasi Associate Professor and Head of the Laboratory • Silvia Calegari Post-doc DISCO • Stefania Marrara Post-doc UNIMI • Célia Cristina Pereira Post-doc UNIMI
IR Lab numbers • Small but active! • Two people (since January 2009) • Two external collaborators (since 2008) • Three workplaces for students and collaborators • About 60 articles in proceedings of international conferences and in international journals in the last three years • 4-5 master's students per year
The IR Lab in brief The Information Retrieval Group (IRG) was established in 2005 at DiSCo, University of Milan Bicocca. FOCUS: as the amount of information available on the Web has increased enormously in recent years, there is a need for effective systems that allow easy and flexible access to information relevant to a specific user's needs. Flexibility here means the capability of the system both to manage imperfect (vague and/or uncertain) information and to personalise its behaviour to the user's context. AIM: the research activity undertaken by the IRG aims at defining models and techniques that overcome the limitations of current information access systems, with the main goal of offering personalised and flexible solutions to the problem of locating information relevant to a specific user's needs.
Research in IR: some main issues • Improving indexing: text representation is usually based on keyword extraction and weighting • How to improve document representations? • Conceptual indexing based on the use of conceptual structures • Latent semantic indexing • Metadata and the Semantic Web • Modeling user preferences in query formulation: usually based on selection criteria specified by terms • How to formulate queries that capture real users' needs? • Modeling the user's context • Accounting for vagueness • Defining mechanisms for query reformulation and relevance feedback
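As an illustration of the keyword weighting baseline mentioned above, here is a minimal TF-IDF sketch. It is not the lab's indexing method; the function name `tf_idf`, the whitespace tokenization, and the sample documents are assumptions made for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (plain TF-IDF).

    Assumption: documents are tokenized by lowercasing and splitting
    on whitespace; a real indexer would also stem and remove stopwords.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            # Relative term frequency times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical three-document collection.
docs = ["fuzzy query languages", "fuzzy document clustering", "xml query languages"]
index = tf_idf(docs)
```

In this toy collection a term that occurs in a single document ("clustering") receives a higher weight than one shared by two documents ("fuzzy"), and a term present in every document would receive weight zero. This is the purely lexical behaviour that conceptual and latent semantic indexing try to go beyond.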
Research in IR: some main issues • Improving relevance estimation: usually based on a measure of topicality and, more recently, popularity (in search engines) • It should be based on additional criteria: • Novelty • Trust in information sources • Timeliness • Contextual information (geographic location, date, author, etc.) • It should be learnt on the basis of users' needs and behavior • Application of machine learning techniques • Query reformulation • Text classification • Text summarization
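One simple way to combine several relevance criteria is a weighted average of per-criterion scores. The sketch below only illustrates that idea; the slides do not specify an aggregation scheme, and the criterion weights, document scores, and the name `aggregate_relevance` are hypothetical.

```python
def aggregate_relevance(scores, weights):
    """Weighted average of per-criterion scores, each assumed to lie in [0, 1].

    Criteria missing from `scores` count as 0.0.
    """
    total = sum(weights.values())
    return sum(w * scores.get(c, 0.0) for c, w in weights.items()) / total

# Hypothetical importance of each criterion for one user.
weights = {"topicality": 0.5, "novelty": 0.2, "trust": 0.2, "timeliness": 0.1}

# Hypothetical per-criterion scores for two documents.
doc_a = {"topicality": 0.9, "novelty": 0.2, "trust": 0.8, "timeliness": 0.5}
doc_b = {"topicality": 0.7, "novelty": 0.9, "trust": 0.9, "timeliness": 0.9}

score_a = aggregate_relevance(doc_a, weights)
score_b = aggregate_relevance(doc_b, weights)
```

With these numbers `doc_b` ranks above `doc_a` even though it is less topical, because it does better on novelty, trust, and timeliness. Learning the weights from user behavior, as the slide suggests, would replace the hand-set dictionary.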
IR Lab Activity • Research areas: • Information Retrieval • Information Filtering • Document Clustering • Personalization • XML Retrieval • Application Domains: • Large document repositories • World Wide Web
Ongoing and future research • Definition of conceptual approaches to IR. • Definition of flexible query languages for semi-structured documents (XML). • Definition of models for multi-dimensional relevance assessment • Definition of text clustering techniques • Definition of techniques for assessment of text quality and their use for relevance assessment • Web Service Retrieval
Personalized Information Access • Personalization is the process of customizing search results according to the user’s interests and context. • Approach: generation of user-tailored ontologies • Aim: to model and learn the user context to personalize the search process at distinct levels: • Document indexing • Query formulation • Relevance assessment
XML Retrieval • In XML collections it is important to retrieve documents based on users’ constraints on both documents’ content and structure. • Approach: 1) application of fuzzy set theory to define flexible extensions of existing XML query languages. 2) definition of ad hoc indexing strategies. • Aim: to propose advanced solutions for storing, managing and retrieving semi-structured documents.
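To give a feel for what a flexible (graded) structural constraint can look like, here is a small sketch in the spirit of fuzzy set theory. It is not an extension of any actual XML query language and not the lab's model: the functions `structure_degree` and `match_degree`, the `1 / (1 + extra)` membership function, and the example paths are all assumptions chosen for illustration.

```python
def structure_degree(required_path, actual_path):
    """Degree in [0, 1] to which actual_path satisfies required_path.

    Returns 1.0 when the paths are identical, a lower degree as the
    actual path contains more extra elements, and 0.0 when the required
    elements do not all appear in the required order.
    """
    remaining = iter(actual_path)
    # Subsequence test: each required tag must occur after the previous one.
    if not all(tag in remaining for tag in required_path):
        return 0.0
    extra = len(actual_path) - len(required_path)
    return 1.0 / (1.0 + extra)

def match_degree(content_score, required_path, actual_path):
    """Combine content and structure with a fuzzy AND (minimum)."""
    return min(content_score, structure_degree(required_path, actual_path))

# Hypothetical query: a "title" element inside a "section" element.
exact = structure_degree(["section", "title"], ["section", "title"])
relaxed = structure_degree(["section", "title"], ["section", "subsection", "title"])
wrong_order = structure_degree(["section", "title"], ["title", "section"])
```

A crisp query language would reject the second path outright; the graded version still returns the element, with a lower degree, so results can be ranked by how well they satisfy both the content and the structure constraints.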
Projects • Past • STREP Project: PENG (Personalised News Content Programming) (Gabriella Pasi, Project Coordinator) (2004 – 2006) • Submitted • PRIN. Title: What, Where, When? (W3?): Recommendation of Information concerning specific topics and spatio-temporal contexts characterized by dynamicity and imprecision. • FIRB. Title: A Cloud Service Stack for Personalized Semantic Information Retrieval. • Spanish Project: High Performance processing for large data sets represented as Graphs (HIPERGRAPH) (Principal Investigator: Ricardo Baeza-Yates, Yahoo! Research) • COST Action "Combining Soft Computing Techniques and Statistical Methods to Improve Data Analysis Solutions", coordinated by ECSC (ONGOING)
Collaborations At D.I.S.Co: • Davide Ciucci • Fabio Farina • ITIS – SEQUOIAS (Information Quality; Web Service Retrieval) External Collaborations: • CNR – IDPA, Italy • European Center for Soft Computing (ECSC), Spain • IRIT – Toulouse, France • Iona College, NY, USA • Università La Coruna, Spain
Conferences and Events • Organization of: • The 2009 IEEE/WIC/ACM International Conferences on Web Intelligence (WI'09) and Intelligent Agent Technology (IAT'09), Milano, Italy, 15-18 September 2009 • International Workshop on "Managing Vagueness and Uncertainty in the Semantic Web" (VUSW'09), Milano, Italy, 15 September 2009 • Program Chair of the International Conference RIAO 2010, Paris • Poster co-chair of ACM SIGIR 2010 • Some Past Events (since 2005) • "Special Track on Information Access and Retrieval Systems" within the "ACM Symposium on Applied Computing" (Fortaleza, Ceará, Brazil, 16-20 March 2008; Dijon, France, March 2006; Santa Fe, New Mexico, 13-17 March 2005; Cyprus, 14-17 March 2004; Melbourne, Florida, 9-12 March 2003; Madrid, 10-14 March 2002). IAR2008 • International Workshop on Fuzzy Logic and Applications (WILF 2007), Hotel Portofino Kulm, Portofino Vetta, Ruta di Camogli, Genova, Italy, 7-10 July 2007 • PhD School on Web Information Retrieval, WebBar 2007, Varenna, Italy, 26 August - 1 September 2007 • Seventh International Conference on Flexible Query Answering Systems (FQAS 2006), Milano, 2-10 June 2006 • "3rd International Summer School on Aggregation Operators", Università della Svizzera Italiana (USI-Lugano), Lugano, 10-15 July 2005
Publications from 2005 …some numbers • Papers in International Journals: 22 • Special Issues in International Journals: 4 • Edited Volumes: 3 • Chapters of International Books: 10 • Proceedings for International Conferences: 40