
DLSI Lexical Analysis



Presentation Transcript


  1. DLSI Lexical Analysis
     Prof Brook Wu and Ph.D. student Xin Chen

  2. Lexical Analysis
     • Focus on processing “text”
     • Difficulties:
       • word sense ambiguities, e.g., regular “mouse” vs. computer “mouse”
       • irregularities, e.g., datum, data
       • part-of-speech tag ambiguities, e.g., an “offer” (noun) vs. “Prof Bieber offers …” (verb)
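The irregularity problem on this slide can be illustrated with a minimal Python sketch. It is not part of the DLSI project: the exception table and both function names are hypothetical. It only shows why a plain suffix rule fails on forms such as "data" and why a lookup table is needed.

```python
# Hypothetical exception table; a real system would use a much fuller lexicon.
IRREGULAR_PLURALS = {"data": "datum", "mice": "mouse", "criteria": "criterion"}


def naive_singular(word):
    """Strip a trailing 's'; this is the rule that irregular forms defeat."""
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def singular(word):
    """Check the exception table first, then fall back to the naive rule."""
    w = word.lower()
    return IRREGULAR_PLURALS.get(w, naive_singular(w))
```

With the naive rule alone, "data" is left unchanged and never matches a glossary entry for "datum". The table lookup handles that case but does nothing for sense or part-of-speech ambiguity, which need context.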

  3. Lexical Analysis in DLSI project
     • Purpose: generate link anchors for important concepts in returned documents.
     • Work involved:
       • Find glossaries/thesauri on the web or contact DLSI partners for information.
       • Organize them into a master file.
       • Find glossary/thesaurus terms in text using lexical analysis techniques, including tokenization, part-of-speech tagging, parsing, and matching.
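The tokenization and matching steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation. The `GLOSSARY` contents, the link targets, and all function names are hypothetical. Part-of-speech tagging and parsing are left out, so the matcher relies on surface forms only.

```python
import re
from html import escape

# Hypothetical master file contents: glossary term -> link target.
GLOSSARY = {
    "lexical analysis": "glossary.html#lexical-analysis",
    "tokenization": "glossary.html#tokenization",
    "digital library": "glossary.html#digital-library",
}

# A word is a run of letters/digits, optionally joined by single hyphens.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def tokenize(text):
    """Return (lowercased token, start, end) triples for each word in text."""
    return [(m.group().lower(), m.start(), m.end()) for m in WORD_RE.finditer(text)]


def find_terms(text, glossary):
    """Greedy longest-match lookup; returns (start, end, term) triples."""
    tokens = tokenize(text)
    max_len = max((len(term.split()) for term in glossary), default=0)
    matches, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            window = tokens[i:i + n]
            # Only whitespace may separate the words of a multi-word term.
            gaps_ok = all(text[a[2]:b[1]].isspace() for a, b in zip(window, window[1:]))
            candidate = " ".join(tok for tok, _, _ in window)
            if gaps_ok and candidate in glossary:
                matches.append((window[0][1], window[-1][2], candidate))
                i += n
                break
        else:
            i += 1  # no term starts at this token
    return matches


def add_anchors(text, glossary):
    """Wrap each matched term in an HTML link, escaping all other text."""
    out, pos = [], 0
    for start, end, term in find_terms(text, glossary):
        out.append(escape(text[pos:start]))
        out.append('<a href="%s">%s</a>' % (escape(glossary[term]), escape(text[start:end])))
        pos = end
    out.append(escape(text[pos:]))
    return "".join(out)
```

Calling `add_anchors("Lexical analysis starts with tokenization.", GLOSSARY)` links both terms and leaves the rest of the sentence untouched. Trying the longest candidate first keeps a multi-word term such as "lexical analysis" from being split into shorter matches.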

  4. Qualifications and Supervision
     • You should participate because text processing and lexical analysis are growing in popularity: text holds very rich information, and industry will want people who know how to process documents effectively.
     • Qualifications:
       • Proficiency in Java or C++
     • Supervision:
       • A team of up to 3 students will be supervised by Prof Wu, but will mainly be led by Xin Chen, a Ph.D. candidate in IS.
