Research Topics: Natural Language Processing, Image Processing. CSC 3990
Natural Language Processing CSC 3990
What is NLP? • Natural Language Processing (NLP) • Computers use (analyze, understand, generate) natural language • A somewhat applied field • Computational Linguistics (CL) • Computational aspects of the human language faculty • More theoretical
Why Study NLP? • Human language interesting & challenging • NLP offers insights into language • Language is the medium of the web • Interdisciplinary: Ling, CS, psych, math • Help in communication • With computers (ASR, TTS) • With other humans (MT) • Ambitious yet practical
Goals of NLP • Scientific Goal • Identify the computational machinery needed for an agent to exhibit various forms of linguistic behavior • Engineering Goal • Design, implement, and test systems that process natural languages for practical applications
Applications • speech processing: get flight information or book a hotel over the phone • information extraction: discover names of people and events they participate in, from a document • machine translation: translate a document from one human language into another • question answering: find answers to natural language questions in a text collection or database • summarization: generate a short biography of Noam Chomsky from one or more news articles
General Themes • Ambiguity of Language • Language as a formal system • Computation with human language • Rule-based vs. Statistical Methods • The need for efficiency
Topic Ideas • Text to Speech – artificial voices • Speech Recognition - understanding • Textual Analysis – readability • Plagiarism Detection – candidate selection • Intelligent Agents – machine interaction
Text to Speech – artificial voice • Text Input • Break text into phonemes • Match phonemes to voice elements • Concatenate voice elements • Manipulate pitch and spacing • Output results • Research question: How can a human voice be used to produce an artificial voice? • Model Talker - opportunities for active, hands-on research (http://www.modeltalker.com)
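The pipeline on this slide can be sketched in a few lines of Python. This is only a toy illustration, not how Model Talker works: the tiny `LEXICON`, the `VoiceElement` record, and the default duration and pitch values are all hypothetical stand-ins for a real pronunciation dictionary and recorded voice inventory.

```python
from dataclasses import dataclass, replace

# Hypothetical pronunciation lexicon (ARPAbet-style symbols). A real system
# would use a full dictionary plus letter-to-sound rules.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

@dataclass(frozen=True)
class VoiceElement:
    """Stand-in for a recorded snippet of a human voice (assumed fields)."""
    phoneme: str
    duration_ms: float
    pitch_hz: float

def to_phonemes(text):
    """Break text into phonemes by dictionary lookup."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(LEXICON[word])
    return phonemes

def synthesize(text, rate=1.0, pitch_shift=1.0):
    """Match phonemes to voice elements, concatenate, then adjust timing and pitch."""
    # Placeholder values: every element starts at 80 ms and 120 Hz.
    units = [VoiceElement(p, 80.0, 120.0) for p in to_phonemes(text)]
    return [
        replace(u, duration_ms=u.duration_ms / rate, pitch_hz=u.pitch_hz * pitch_shift)
        for u in units
    ]
```

Calling `synthesize("Hello, world!", rate=2.0)` returns eight elements, each half as long as the default. An actual synthesizer would output audio samples rather than records.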
Speech Recognition • Spoken Input • Identify words and phonemes in speech • Generate text for recognized word parts • Concatenate text elements • Perform spelling, grammar and context checking • Output results • Research question: How can speech recognition assist a deaf student taking notes in class? • VUST – Villanova University Speech Transcriber (http://www.csc.villanova.edu/~tway/publications/wayAT08.pdf)
Textual Analysis - Readability • Text Input • Analyze text & estimate “readability” • Grade level of writing • Consistency of writing • Appropriateness for a certain educ. level • Output results • Research question: How can a computer analyze text and measure readability? • Opportunities for hands-on research
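One common way to estimate the grade level of writing is the Flesch-Kincaid grade formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The slide does not name a specific measure, so choosing this one is an assumption, and the vowel-group syllable counter below is a rough heuristic rather than a linguistic analysis.

```python
import re

def count_syllables(word):
    """Rough heuristic: count runs of vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    """Estimate the U.S. school grade level needed to read `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
```

Short sentences of one-syllable words score low (even below zero), while long sentences of polysyllabic words score high. Comparing the score across passages by the same author is one simple way to probe consistency of writing.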
Plagiarism Detection • Text Input • Analyze text & locate “candidates” • Find one or more passages that might be plagiarized • Algorithm tries to do what a teacher does • Search on Internet for candidate matches • Output results • Research question: What algorithms work like humans when finding plagiarism? • Experimental CS research
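Candidate selection can be approximated by measuring how many word sequences a passage shares with a known source. The sketch below uses word trigram containment, which is one simple technique chosen for illustration and not necessarily what the course project used. The function names and the 0.5 threshold are assumptions.

```python
import re

def shingles(text, n=3):
    """Return the set of overlapping word n-grams in `text`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def containment(passage, source, n=3):
    """Fraction of the passage's n-grams that also appear in the source."""
    passage_shingles = shingles(passage, n)
    if not passage_shingles:
        return 0.0
    return len(passage_shingles & shingles(source, n)) / len(passage_shingles)

def find_candidates(passages, source, n=3, threshold=0.5):
    """Return (passage, score) pairs whose overlap with the source meets the threshold."""
    scored = [(p, containment(p, source, n)) for p in passages]
    return [(p, score) for p, score in scored if score >= threshold]
```

A passage flagged this way is only a candidate. As on the slide, the next step would be to search for matching sources and let a human judge the result.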
Intelligent Agents • Example: ELIZA • AIML: Artificial Intelligence Markup Language • Human types something • Computer parses, “understands”, and generates response • Response is viewed by human • Research question: How can computers “understand” and “generate” human writing? • Also good area for experimentation
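An ELIZA-style agent can be sketched as a list of patterns paired with response templates. The rules and reflection table below are made up for illustration. They are not the original ELIZA script, and this is plain Python rather than AIML.

```python
import re

# Swap first and second person so the echoed fragment reads naturally.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are",
               "you": "I", "your": "my"}

# Hypothetical rules: (pattern, response template). The first match wins.
RULES = [
    (re.compile(r"i need (.*)", re.I), "Why do you need {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r".*\bmother\b.*", re.I), "Tell me more about your family."),
]

def reflect(fragment):
    """Rewrite pronouns in a captured fragment (e.g. 'my' becomes 'your')."""
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(utterance):
    """Parse the input against each rule and generate a response."""
    cleaned = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return "Please go on."  # fallback when no rule applies
```

For example, `respond("I need my coffee.")` returns `"Why do you need your coffee?"`. The program matches surface patterns only, which is why “understands” appears in quotation marks on the slide.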
Image Processing CSC 3990 Some slides from Xin Li lecture notes, West Virginia Univ.
What is Image Processing? • Digital Image Processing • Analog transmission in 1920 • Early improvements in 1920s • Required digital computer (1948) • Rapid advancement since
Historical Background • The newspaper industry used the Bartlane cable picture transmission system to send pictures by submarine cable between London and New York in the 1920s • The number of distinct gray levels coded by the Bartlane system was improved from 5 to 15 by the end of the 1920s
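The effect of moving from 5 to 15 gray levels can be illustrated by quantizing a modern 8-bit image to a fixed number of levels. This is a sketch only: the evenly spaced levels and the `quantize` helper are assumptions made for illustration, not a description of how the Bartlane system actually coded its tones.

```python
def quantize(pixels, levels):
    """Map 8-bit intensities (0-255) onto `levels` evenly spaced gray values.

    `pixels` is a list of rows, each row a list of integers.
    """
    if levels < 2:
        raise ValueError("need at least two gray levels")
    step = 255 / (levels - 1)  # spacing between adjacent output levels
    return [[round(round(p / step) * step) for p in row] for row in pixels]

# A horizontal ramp from black to white, one row of 256 pixels.
ramp = [list(range(256))]
coarse = quantize(ramp, 5)   # visible banding
finer = quantize(ramp, 15)   # smoother tonal transitions
```

With 5 levels the ramp collapses into a handful of wide bands, while 15 levels preserve noticeably more tonal detail.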
Digital Image Processing • The images in previous slides are digital (now), but they are NOT the result of DIP • Digital Image Processing is • Processing digital images by a digital computer • DIP requires a digital computer and other supporting technologies (e.g., data storage, display and transmission)
Cool Applications • The first picture of the moon by a US spacecraft, Ranger 7, on July 31, 1964 at 9:09 AM EDT • Sir Godfrey N. Hounsfield and Prof. Allan M. Cormack shared the 1979 Nobel Prize in Medicine for the invention of CT • Digitization • Compression • Error Recovery • Enhancement • Edges, Contrast, Brightness, etc.
Past 20 Years • Acquisition • Digital cameras, scanners • MRI and Ultrasound imaging • Infrared and microwave imaging • Transmission • Internet, wireless communication • Display • Printers, LCD monitor, digital TV
Remote Sensing • America at night (Nov. 27, 2000) • Hurricane Andrew taken by NOAA GOES
Thermal Images • Operate in infrared frequency • Human body disperses heat (red pixels) • Different colors indicate varying temperatures
Medical Diagnostics • Operate in X-ray frequency • chest • head
PET and Astronomy • Operate in gamma-ray frequency • Cygnus Loop in the constellation of Cygnus • Positron Emission Tomography
Synthetic Images in Gaming • Age of Empires III by Ensemble Studios
General Themes • Human vision is limited • Digital images contain more information than humans perceive • Computers can use algorithms to extract more information from digital images • Computers can acquire, manipulate, compress, transmit and modify images
Topic Ideas • Biometrics – identifying faces & retinas • Target Acquisition – see a tank from space • Computer Vision – detect microscopic flaws in manufacturing • Assistive Technology – convert visual images into tactile or textual form • Entertainment – remove red eye, morph faces, digital filmmaking, movie magic • Image Description – use 3D dictionary to describe contents of 2D image