ICON 2003: INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING. NLP-AI, IIIT-Hyderabad; CIIL, Mysore. 19-22 DECEMBER, 2003.
Computational Linguistics: HOW ONE FEEDS THE OTHER
• We can study anything about language ...
  1. Formalize some insights
  2. Study the formalism mathematically
  3. Develop & implement algorithms
  4. Test on real data
NLP: The Big Questions
• What are the right formalisms to encode linguistic knowledge?
  • Discrete knowledge: what is possible?
  • Continuous knowledge: what is likely?
• How can we compute efficiently with these formalisms?
  • Or find approximations that work pretty well?
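The contrast between discrete knowledge ("what is possible?") and continuous knowledge ("what is likely?") can be illustrated with a toy bigram model. This is only a sketch under stated assumptions: the four-sentence corpus is invented for illustration, "possible" is crudely approximated as "every adjacent word pair has been seen", and the function names `is_possible` and `probability` are hypothetical, not something from the talk.

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = [
    "the dog barks",
    "the dog barks",
    "the dog sleeps",
    "the cat sleeps",
]

def bigrams(sentence):
    """Return adjacent word pairs, with sentence boundary markers."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return list(zip(tokens, tokens[1:]))

bigram_counts = Counter(b for s in corpus for b in bigrams(s))
context_counts = Counter(b[0] for s in corpus for b in bigrams(s))

def is_possible(sentence):
    """Discrete view: accept only if every bigram was seen at least once."""
    return all(bigram_counts[b] > 0 for b in bigrams(sentence))

def probability(sentence):
    """Continuous view: product of maximum-likelihood bigram probabilities."""
    p = 1.0
    for b in bigrams(sentence):
        if context_counts[b[0]] == 0:
            return 0.0  # unseen context word
        p *= bigram_counts[b] / context_counts[b[0]]
    return p

if __name__ == "__main__":
    for s in ["the dog barks", "the cat sleeps", "the cat barks"]:
        print(s, is_possible(s), probability(s))
```

The discrete check only says yes or no, while the probability ranks the sentences it accepts. A real system would use a proper grammar for the first and a smoothed model for the second.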
Some of the Active Research
• Syntax: It’s converging, but still messy
  • “Deep/surface/shallow structure” problems of syntax
• Phonology: Formalism under hot development
• Speech:
  • Better language modeling
  • Better models of acoustics, articulatory pronunciation
  • Adaptation to particular speakers and dialects
• Translation models and algorithms
• Semantic theories and connection to AI – use stats?
  • Too many semantic phenomena. Really hard to determine and disambiguate possible meanings.
Deploying NLP
• Speech recognition and IR have finally gone commercial over the last few years.
• But not much NLP is out in the real world.
• What kind of applications should we be working toward?
• Resources:
  • Corpora, with or without annotation
  • WordNet; morphologies; maybe a few grammars
• Some languages don’t gel well with NLP or speech modules, or with statistical training modules.
• But research toolkits do exist, and they need to be made available.
Sneaking NLP in through the back door
• ADD FEATURES TO EXISTING INTERFACES
  • “Click to translate”
  • Spell correction of queries
  • Allow multiple types of queries
  • Work on document clusters and summaries
  • Machines gradually replacing humans at phone/email helpdesks – now becoming a reality elsewhere
• BACK-END PROCESSING
  • Information extraction and normalization to build databases: Assemble good text from boilerplate - wherever
• HAND-HELD DEVICES
  • Translator
  • Personal conversation recorder, with topical search
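As one illustration of the "spell correction of queries" feature, a minimal sketch can pick the closest vocabulary word by Levenshtein edit distance. The talk does not say how such correction should be done, so this is an assumed approach. The helper names `edit_distance` and `correct`, the tiny vocabulary, and the `max_distance` threshold are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def correct(word, vocabulary, max_distance=2):
    """Return the nearest vocabulary word, or the word itself if none is close.

    Ties are broken alphabetically; max_distance is an arbitrary threshold.
    """
    best = min(vocabulary, key=lambda v: (edit_distance(word, v), v))
    return best if edit_distance(word, best) <= max_distance else word

if __name__ == "__main__":
    vocab = ["deep", "surface", "shallow"]  # illustrative vocabulary
    print(correct("shaloow", vocab))
```

A deployed query corrector would normally also weight candidates by how frequent they are in query logs, which this sketch leaves out.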
Making Search applications and technology for the masses?
• Allow queries over meanings, not sentences
  • Need semantic network extraction from the web
  • Simple entities and relationships among them
  • Not complete, but linked to original text
• Allow inexact queries – Train data and learn to generalize from a few tagged examples
• Redundancy factor is important
  • Collapse for browsability or space
• Games
• Command-and-control applications
• “Practical dialogue” (computer as assistant)
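The idea of a semantic network made of simple entities and relationships, incomplete but linked back to the original text, could be represented roughly as follows. This is a sketch of one possible data structure, not a description of any existing system. The class names `Fact` and `SemanticNetwork`, the relation label, and the example sentence are illustrative assumptions, and the fact is entered by hand rather than extracted from the web.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    source: str  # the original sentence the fact is linked to

class SemanticNetwork:
    """Index facts by the entities they mention (hypothetical design)."""

    def __init__(self):
        self._by_entity = defaultdict(list)

    def add(self, fact):
        self._by_entity[fact.subject].append(fact)
        self._by_entity[fact.obj].append(fact)

    def about(self, entity):
        """All facts mentioning the entity, each still carrying its source."""
        return list(self._by_entity.get(entity, []))

if __name__ == "__main__":
    net = SemanticNetwork()
    # Hand-written example fact; a real system would extract these from text.
    net.add(Fact("Mysore", "located_in", "Karnataka",
                 source="Mysore is a city in Karnataka."))
    for fact in net.about("Karnataka"):
        print(fact.subject, fact.relation, fact.obj, "<-", fact.source)
```

Keeping the source sentence on every fact is what lets a query over meanings lead the user back to the text it came from.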
Applications in Discourse Modeling
• Following Donia Scott & Hans Kamp, I would say that this is a field that is still unable to come to terms with the semantics-pragmatics divide of texts and natural languages, and with related problems.
• How can we build a semantic representation and yet keep pragmatic information?
• Highly elliptical utterances, which are common in spoken dialogue, pose special challenges.
• There are many theories but none is complete, although some (or aspects of some) lend themselves more readily to implementation than others.
Different approaches to discourse and dialogue study
• A theory of discourse coherence (Jerry Hobbs, 1985) based on a small, limited set of coherence relations, which is part of a larger, still-developing theory of the relations between text interpretation and belief systems.
• A tripartite organization of discourse structure (cf. Grosz and Sidner 1986) according to the focus of attention of the speaker (the attentional state), the structure of the speaker's purposes (the intentional structure) and the structure of sequences of utterances.
• Rhetorical Structure Theory (RST) (Mann & Thompson), where there is a hierarchical organization of text spans linked by nucleus-satellite relations.
• Discourse Representation Theory (DRT) (cf. Kamp 1981), a semantic theory developed for the express purpose of representing and computing trans-sentential anaphora and other forms of text cohesion.
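The hierarchical nucleus-satellite organization that RST assumes can be pictured as a small tree. The sketch below is a simplified illustration only: it covers binary nucleus-satellite relations and omits multinuclear ones. The class names `Span` and `Relation`, the helper functions, and the example sentences are hypothetical, although "evidence" and "elaboration" are relation names used in RST.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Span:
    text: str  # an elementary text span

@dataclass
class Relation:
    name: str            # e.g. "evidence", "elaboration"
    nucleus: "Node"      # the more central span
    satellite: "Node"    # the supporting span

Node = Union[Span, Relation]

def nuclear_summary(node):
    """Follow nucleus links down to the most central elementary span."""
    if isinstance(node, Span):
        return node.text
    return nuclear_summary(node.nucleus)

def relations(node):
    """List relation names in pre-order (nucleus subtree before satellite)."""
    if isinstance(node, Span):
        return []
    return [node.name] + relations(node.nucleus) + relations(node.satellite)

if __name__ == "__main__":
    # Invented example text, analysed by hand for illustration.
    tree = Relation(
        "evidence",
        nucleus=Span("NLP systems depend on careful handling of meaning."),
        satellite=Relation(
            "elaboration",
            nucleus=Span("Semantic phenomena are numerous."),
            satellite=Span("Disambiguating them is hard."),
        ),
    )
    print(nuclear_summary(tree))
    print(relations(tree))
```

Following only the nuclei gives a crude summary of the text, which is one reason this kind of structure is attractive for implementation.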
Discourse Modeling in NLP: Future Directions
• Nature of Discourse Relations: textual, rhetorical, intentional, or informational?
• Number of Discourse Relations
• Level of Abstraction at which Discourse is Described
• Nature of Discourse Segments, and the psychological reality issue
• Role of Intentions in Discourse
• Mechanisms for Handling Key Linguistic Phenomena & Reasoning
SINCE MOST NLP SYSTEMS MUST DEPEND ON CAREFUL HANDLING OF MEANING, AS THIS MODEL SHOWS, THAT SHOULD BE OUR PRIORITY AREA NOW.
Thank You & Welcome Once Again