CLAIR: Computational Linguistics And Information Retrieval
Faculty: Dragomir Radev
Students: Güneş Erkan, Arzucan Özgür, Xiaodong Shi, Zhuoran Chen, Mark Joseph, Konstantin Zak, Tony Fader, Joshua Gerrish

Text summarization
• MEAD
• NewsInEssence
• Cross-document structure
• Sentence compression
• LexRank

Political science
• Discourse dynamics
• Centrality identification

Information retrieval
• Blog databases
• Question answering
• Fact extraction

Machine learning
• Graph-based learning
• Semi-supervised learning
• Harmonic functions
• Monte Carlo methods
• Information extraction

Language modeling
• Modeling burstiness

Biomedical literature analysis
• Citation network analysis
• Recognizing protein interactions in text
• Clustering

Machine translation
• Syntax-based alignment
• Text generation
• Syntax-based features

Models of the Web
• Lexical network models

Miscellaneous
• Language reuse
• Paraphrase identification
• Lexical models of the Web
• Dependency parsing

Courses
• Information Retrieval (SI 650) – Fall 05
• Advanced NLP/IR (EECS 767/SI 767) – Winter 06
• Natural Language Processing (EECS 595/SI 661) – Fall 06
• Language and Information (EECS 597/SI 760) – Fall 06
• Database Applications Design (SI 654) – Fall 05

Write to radev@umich.edu if you have any questions.
Main areas of interest
• Graph-based methods
• Machine learning
• Text summarization
• Question answering
• Text mining in political science, blogometrics, bioinformatics
List of current funded projects
• BlogoCenter: Infrastructure for Collecting, Mining and Accessing Blogs. NSF (joint with Junghoo Cho of UCLA)
• Probabilistic and link-based Methods for Exploiting Very Large Textual Repositories. NSF
• Representing and Acquiring Knowledge of Genome Regulation. NIH (joint with Steve Abney, David States, and H.V. Jagadish)
• Collaborative research: semantic entity and relation extraction from Web-scale text document collections. NSF (joint with Michael Collins of MIT and Steve Abney)
• DHB: The dynamics of Political Representation and Political Rhetoric. NSF (joint with Kevin Quinn of Harvard, Burt Monroe of PSU)
• NCIBI: National center for integrative bioinformatics. NIH (joint with 20 other faculty)
Representative recent papers
• News to Go: Hierarchical Text Summarization for Mobile Devices (SIGIR 2006)
• Language Model Based Document Clustering Using Random Walks (HLT-NAACL 2006)
• An automated method of topic-coding legislative speech over time with application to the 105th-108th U.S. Senate (MPSA 2006 – Gosnell Award)
• Summarizing online news topics (CACM 2005)
• Using random walks for question-focused sentence retrieval (HLT-EMNLP 2005)
• Context-based generic cross-lingual retrieval of documents and automated summaries (JASIST 2005)
• Probabilistic question answering on the web (JASIST 2005)
• Centroid-based summarization of multiple documents (IPM 2004)
• A smorgasbord of features for statistical machine translation (HLT-NAACL 2004)
• Graph-based centrality as salience in text summarization (JAIR 2004)
Papers in progress or under submission
• Summarization evaluation in a cross-lingual information retrieval context. Submitted to Information Processing and Management.
• Retrieval of context-specific, dynamic information: A survey of related work. Submitted to ACM Computing Surveys.
• Single-document and multi-document summary evaluation using relative utility. Submitted to Information Retrieval.
• Exploring Fact-Focused Relevance and Novelty Detection. Submitted to Information Processing and Management.
• Hierarchical Summarization for Delivering Information to Mobile Devices. Submitted to Decision Support Systems.
• Modeling Burstiness in Discourse Using a Stochastic Stack
• A topological analysis of semisupervised graph-based learning with harmonic functions
• Protein-protein interaction with no external knowledge
• An empirical analysis of 100 lexical networks
• Hiring networks in information science and computer science
• Blind men and elephants: What do citation summaries tell us about a research article?
• Reinforcement classifiers
• Dependency parsing using random walks
• Modeling Document Dynamics: An Evolutionary Approach
• Cross-document relationship classification for text summarization
Software available
• MEAD – text summarization
• NSIR – question answering
• CLAIRLIB – generic NLP/IR

radev@umich.edu