
Center for Computational Learning Systems


Presentation Transcript


  1. Center for Computational Learning Systems
     • Independent research center within the Engineering School
     • NLP people at CCLS: Mona Diab, Nizar Habash, Martin Jansche, Rebecca Passonneau, Owen Rambow
     • We are part of “The NLP Group” but not of the CS department
     • What we do:
     • Researchers
     • Work with Kathy and Julia
     • Our own projects
     • Sometimes teach
     • Supervise students (PhD, Masters, independent studies)
     • Some of us are in CEPSR, some in the Interchurch Building
     • Some NLP Group meetings will take place in Interchurch Center

  2. CLiMB 2: Computational Linguistics for Metadata Building, phase 2
     • Becky Passonneau (with University of Maryland)
     • Interactive workbench for image cataloguers/indexers: use NLP to extract descriptive terms from scholarly text
     • Mellon Foundation
     • http://www.umiacs.umd.edu/~climb/

  3. Automated Readers Advisor, Heiskell Talking Books and Braille Library (NYPL)
     • Becky Passonneau
     • Replace some of the librarians’ tasks in the current over-the-phone borrowing system with an automated dialogue system
     • Use Wizard-of-Oz paradigm for data collection
     • Joint project with CCNY (Esther Levin)
     • http://www.cs.columbia.edu/~becky/pubs/WozVariant.ppt

  4. Tracking Emergent Narrative Skills (TENS)
     • Becky Passonneau
     • Current data set: ten-year-olds retelling silent movies
     • Develop quantitative methods to compare semantic and pragmatic content (e.g., adapt the Pyramid Method for evaluating summary content)
     • Joint project with University of Connecticut (Elena Levy)

  5. Arabic NLP
     • CADIM Group: Mona Diab, Nizar Habash, Owen Rambow
     • Focus on Standard Arabic AND the dialects
     • NLP tools for Arabic:
     • Morphological analysis (exists)
     • Morphological tagging (exists, best-performing)
     • Tokenization
     • POS tagging (best-performing)
     • Diacritization (best-performing)
     • Word-sense disambiguation (in progress)
     • Sentence-boundary detection for ASR (in progress)
     • Parsing (initial research)
     • Named-entity recognition (joint with Fair Isaacs, in progress)
     • …

  6. Machine Translation
     • Nizar Habash
     • Focus: Arabic-English MT
     • Different hybrid MT approaches explored
     • Linguistic preprocessing for Statistical MT
     • Morphological and syntactic preprocessing
     • Adding statistical resources to rule-based MT systems
     • Automatically extracted phrase tables combined with Generation-Heavy MT
     • Columbia's first participation in NIST MTEval (2006)

  7. Word Sense Modeling and Disambiguation
     • Mona Diab
     • Using corpora (including multilingual parallel and similar) for unsupervised learning
     • Arabic WordNet
     • Arabic PropBank

  8. Email Summarization: Social Networks
     • Aaron Harnly (PhD student) and Owen Rambow, with Kathy McKeown
     • Study interaction between:
     • Email-intrinsic factors
     • Language in email (lexicon, syntax, …)
     • Email genre
     • Structure of dialog
     • Threads
     • Speech acts
     • Relation among people
     • Roles in organization
     • Social networks
     • Use to predict one factor from others
     • Use in high-level summaries of large amounts of email communication

  9. Multilingual Metagrammars
     • Owen Rambow (with University of Pennsylvania)
     • Goal: high-level abstract representation of syntax of (many/all) natural languages, from which we can automatically generate grammars that can be used for NLP
     • Have: Universal Grammar component and language-specific modules for Korean, German, Yiddish
     • Next: Icelandic, Mainland Scandinavian, English, Kashmiri, …
