
Nicoletta Calzolari ILC - CNR - Pisa, Italy






Presentation Transcript


  1. Nicoletta Calzolari, ILC - CNR - Pisa, Italy. Language Resources & Semantic Web. COLING Workshop - 2002

  2. To make the Semantic Web a reality ... we need to tackle the twofold challenge of
     • content availability and
     • multilinguality
     Natural convergence with HLT:
     • multilingual semantic processing
     • ontologies
     • semantic-syntactic computational lexicons

  3. Computational Multilingual Lexicons: an essential component for the Semantic Web
     • Language, and lexicons, are the gateway to knowledge.
     • Semantic Web developers need repositories of words & terms, and knowledge of their relations in language use & ontological classification.
     • The cost of adding this structured and machine-understandable lexical information can be one of the factors that delay its full deployment.
     • The effort of making available millions of ‘words’ for dozens of languages is something that no small group can afford.
     • A radical shift in the lexical paradigm, whereby many participants add linguistic content descriptions in an open distributed lexical framework, is required to make the Web usable.

  4. Infrastructure of Language Resources
     ...static
     • Semantic network: Euro-/ItalWordNet
     • Lexicons: PAROLE/SIMPLE/CLIPS
     • TreeBank +sw
     International Standards
     But ... they will never be “complete”
     ...dynamic
     • Lexical acquisition systems (syntactic & semantic) from text corpora
     • Robust systems of morphosyntactic & syntactic analysis
     • Word-sense disambiguation systems

  5. Italian Semantic Network
     Italian module of EuroWordNet (http://www.hum.uva.nl/~ewn/)
     • ~50,000 lemmas organized in synonym groups (synsets), structured in hierarchies & linked by ~130,000 semantic relations:
       • ~50,000 hyperonymy/hyponymy relations
       • ~16,000 relations among different POS (role, cause, derivation, etc.)
       • ~2,000 part-whole relations
       • ~1,500 antonymy relations, etc.
     • Synsets linked to the InterLingual Index (ILI = Princeton WordNet)
     • Through the ILI, linked to all the European WordNets (de facto standard)
     • & to the common Top Ontology
     • Possibility of plug-in with domain terminological lexicons
     • Usable in IR, CLIR, IE, QA, ...
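The organization described on this slide (synsets, hyperonymy links, and an InterLingual Index that bridges languages) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Synset` and `SemanticNetwork` classes, the identifiers such as `it-cane` and `ili-dog`, and the four toy synsets. None of it is taken from the EuroWordNet or ItalWordNet data formats.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Synset:
    """A group of synonymous lemmas in one language (hypothetical layout)."""
    synset_id: str
    language: str
    lemmas: List[str]
    ili: Optional[str] = None                           # InterLingual Index record
    hypernyms: List[str] = field(default_factory=list)  # ids of parent synsets


class SemanticNetwork:
    """Toy in-memory network; not the real EuroWordNet storage format."""

    def __init__(self, synsets: List[Synset]) -> None:
        self.by_id: Dict[str, Synset] = {s.synset_id: s for s in synsets}

    def translations(self, lemma: str, source: str, target: str) -> List[str]:
        """Find target-language lemmas that share an ILI record with `lemma`."""
        ilis = {s.ili for s in self.by_id.values()
                if s.language == source and lemma in s.lemmas and s.ili}
        found = {l for s in self.by_id.values()
                 if s.language == target and s.ili in ilis
                 for l in s.lemmas}
        return sorted(found)

    def hypernym_chain(self, synset_id: str) -> List[str]:
        """Follow the first hypernym link upwards until the top is reached."""
        chain: List[str] = []
        seen = {synset_id}
        current = self.by_id[synset_id]
        while current.hypernyms:
            parent_id = current.hypernyms[0]
            if parent_id in seen:  # guard against cycles in malformed data
                break
            chain.append(parent_id)
            seen.add(parent_id)
            current = self.by_id[parent_id]
        return chain


# Illustrative data only: two Italian and two English synsets.
network = SemanticNetwork([
    Synset("it-animale", "it", ["animale"], ili="ili-animal"),
    Synset("it-cane", "it", ["cane"], ili="ili-dog", hypernyms=["it-animale"]),
    Synset("en-animal", "en", ["animal"], ili="ili-animal"),
    Synset("en-dog", "en", ["dog"], ili="ili-dog", hypernyms=["en-animal"]),
])

print(network.translations("cane", "it", "en"))
print(network.hypernym_chain("it-cane"))
```

Routing every cross-lingual lookup through the shared index means each language module needs only one set of links (to the ILI) rather than one per language pair, which is the design point the slide makes.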

  6. Domain - Semantic class: mangiare (to eat)

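  7. Domain - Semantic class
     [Figure: network of Italian lexical entries from the cooking and eating domain, linked by semantic relations and grouped by semantic class. The individual links are not recoverable from the transcript.]
     Feature: +edible
     Relations: Used_for, Object_of_the_activity, Is_the_activity_of, Created_by (labelled TELIC and AGENTIVE)
     Semantic classes: NATURAL_SUBSTANCE, FLAVOURING, VEGETAL_ENTITY, BUILDING, FURNITURE, FOOD, FRUIT, VEGETABLES, SUBSTANCE_FOOD, INSTRUMENT, CONTAINER, PROFESSION, ARTIFACT_FOOD
     Verbs: mangiare (to eat), cucinare (to cook), cuocere (to cook), arrostire (to roast), bollire (to boil), lessare (to boil), stufare (to stew), friggere (to fry), rosolare (to brown), grigliare (to grill), ...
     Nouns: zucchero (sugar), alloro (bay laurel), tartufo (truffle), mestolo (ladle), pentola (pot), friggitrice (deep fryer), carne (meat), tavola (table), forchetta (fork), ristorante (restaurant), mela (apple), posata (piece of cutlery), carota (carrot), cuoco (cook), coniglio (rabbit), bollitore (kettle), pesce (fish), arrosto (roast), pesciera (fish kettle)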
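One way to picture the typed relations named on this slide is as labelled edges between lexical entries. The sketch below is illustrative only: the relation names come from the slide, but the specific word pairings and the direction of each edge are assumptions, since the original diagram's links are not recoverable from the transcript. The helper names `sources` and `related_to` are likewise hypothetical.

```python
from typing import List, Tuple

# (source entry, relation, target entry). Pairings are assumed for
# illustration; only the relation names appear on the slide.
TRIPLES: List[Tuple[str, str, str]] = [
    ("forchetta", "Used_for", "mangiare"),
    ("pentola", "Used_for", "cucinare"),
    ("mela", "Object_of_the_activity", "mangiare"),
    ("cuoco", "Is_the_activity_of", "cucinare"),
    ("arrosto", "Created_by", "arrostire"),
]


def sources(relation: str, target: str) -> List[str]:
    """Entries linked to `target` by the given relation."""
    return sorted(s for s, r, t in TRIPLES if r == relation and t == target)


def related_to(target: str) -> List[str]:
    """Entries linked to `target` by any relation."""
    return sorted({s for s, _, t in TRIPLES if t == target})


print(sources("Used_for", "mangiare"))
print(related_to("mangiare"))
```

Collecting every entry that points at a verb such as mangiare, whatever the relation, is one simple way a semantic domain could be assembled from relation data of this kind.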

  8. machine language learning

  9. machine language learning
     • linguistic learning
     • development of conceptual networks
     • linguistic change models
     • language usage models
     • adaptive classification systems
     • information extraction
     • bootstrapping of lexical information
     • bootstrapping of grammars

  10. Beyond MILE: towards open & distributed lexicons
     • Ontology: URI = http://www.zzz…
     • Semantic Lexicon: URI = http://www.xxx…
     • Syntactic Constructions: URI = http://www.yyy…
     • Lex_object: semFeature, URI = http://www.xxx…#HUMAN
     • Lex_object: syntagmaNT, URI = http://www.zzz…#NP
     Monolingual/Multilingual Lexicon
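The idea on this slide, that a lexical entry points at shared objects by URI rather than copying them, can be sketched as follows. This is a minimal illustration under stated assumptions: the `example.org` URIs are placeholders (the slide's own URIs are elided and are not reconstructed here), the in-memory `REGISTRY` stands in for resources that would live on separate servers, and the entry for "teacher" and the functions `resolve` and `expand` are invented for the example.

```python
from typing import Any, Dict
from urllib.parse import urldefrag

# Placeholder URIs for two independently maintained resources (assumed).
SEMANTIC_LEXICON = "http://example.org/semantic-lexicon"
SYNTACTIC_CONSTRUCTIONS = "http://example.org/syntactic-constructions"

# Stand-in for remote resources: base URI -> {fragment: lexical object}.
REGISTRY: Dict[str, Dict[str, Dict[str, str]]] = {
    SEMANTIC_LEXICON: {"HUMAN": {"type": "semFeature", "label": "human"}},
    SYNTACTIC_CONSTRUCTIONS: {"NP": {"type": "syntagmaNT", "label": "noun phrase"}},
}


def resolve(uri: str) -> Dict[str, str]:
    """Split a URI into resource and fragment, then look the object up."""
    base, fragment = urldefrag(uri)
    resource = REGISTRY.get(base)
    if resource is None or fragment not in resource:
        raise KeyError(f"unresolvable lexical object: {uri}")
    return resource[fragment]


# A lexicon entry that stores references only, not copies of the objects.
entry: Dict[str, Any] = {
    "lemma": "teacher",
    "semFeature": SEMANTIC_LEXICON + "#HUMAN",
    "syntagmaNT": SYNTACTIC_CONSTRUCTIONS + "#NP",
}


def expand(lex_entry: Dict[str, Any]) -> Dict[str, Any]:
    """Replace every URI-valued field with the object it points to."""
    return {
        key: resolve(value) if isinstance(value, str) and "#" in value else value
        for key, value in lex_entry.items()
    }


print(expand(entry))
```

Because the entry holds only references, a feature such as HUMAN can be defined once by one group and reused by many lexicons, which is what makes the distributed model workable.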

  11. Target: Multilingual Knowledge Management
     Technical Feasibility:
     • Prerequisite: is a commonly agreed text/lexicon annotation protocol, also for the semantic/conceptual level, an achievable goal (to be able to automatically establish links among different languages)?
     Yes, at the lexical level.
     More complex for corpus annotation?
     EAGLES/ISLE

  12. A few issues for discussion: lexicon standards
     • Semantic Web standards and the needs of content processing technologies:
       • importance of reaching consensus on (linguistic and non-linguistic) “content”, in addition to agreement on formats and encoding issues (words convey content & knowledge)
       • short/medium-term requirements with respect to standards for multilingual lexicons & content encoding, including industrial requirements
     • Relation with the spoken language community
     • MILE & Asian languages: how to cooperate concretely?
     • Define further steps necessary to converge on common priorities
     • ...

  13. A few issues for discussion: “content”, priorities...
     • Which types of resources should be invested in, with respect to short- vs. medium-term results?
     • Is there a need for robust systems, able to acquire/tune lexical/linguistic (also multilingual) knowledge, to auto-enrich static basic resources?
     • What is the relation between lexical standards and text annotation protocols?
     • Knowledge management is critical. For “content” interoperability, is the field ‘mature’ enough to converge around agreed standards also for the semantic/conceptual level (e.g. to automatically establish links among different languages)?
     • Is the field of multilingual lexical resources ready to tackle the challenges set by the Semantic Web development?
     Towards a new paradigm?

  14. A new paradigm for LR? Where the focus is on cooperation
     New Strategic Vision? Towards a Distributed Open Lexical Infrastructure?
     • for distributed & cooperative creation, management, etc. of Lexical Resources
     • technical & organisational requirements

  15. Language Resources & Semantic Web
     “ELITE” (expression of interest for the 6th FP): “European Lexical Infrastructure and Technology”
     New proposed paradigm for lexicon development: Open & Distributed Lexical Infrastructure for content description and content interoperability, to make lexical resources usable within the emerging Semantic Web scenario
