Location, Location, Location

Presentation Transcript


  1. Why am I here? Details, Details, Details Location, Location, Location

  2. Everybody has to be somewhere! (Eccles: Goon Show circa 1950) Computing Research Laboratory

  3. Language Engineering at CRL • Information retrieval • Language learning and language teaching • Automatic translation • Summarization • Question answering • Dictionary development • Knowledge discovery

  4. Field Guide Locations
     Habitat: Mainly deciduous forests and woodlands; often seen over adjacent farmlands.
     Nesting: 2 whitish eggs, heavily marked with dark brown, placed without nest or lining in a crevice in rocks, in a hollow tree, or in a fallen hollow log.
     Range: Breeds from southern British Columbia, central Saskatchewan, Great Lakes, and New Hampshire southward. Winters in Southwest, and in East northward to southern New England.

  5. Tipster/MUC Named Entity Task
     • Scotland (CITY) Alabama (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Arkansas (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) California (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Connecticut (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Florida (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Georgia (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Indiana (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Maine (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Maryland (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Massachusetts (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Mississippi (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Missouri (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) New Hampshire (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Ohio (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Pennsylvania (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) South Dakota (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Texas (PROVINCE 1) United States (COUNTRY)
     • Scotland (CITY) Virginia (PROVINCE 1) United States (COUNTRY)
     • Scotland (PROVINCE 1) United Kingdom (COUNTRY)
     • Scotland (PROVINCE 2) Missouri (PROVINCE 1) United States (COUNTRY)
     Seven Page Task Definition
     Multi-name expressions containing conjoined modifiers (with elision of the head of one conjunct) should be marked up as separate expressions.
     "North and South America"
     <ENAMEX TYPE="LOCATION">North</ENAMEX> and <ENAMEX TYPE="LOCATION">South America</ENAMEX>
     + Gazetteer – from USGS and National Geographic
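The markup rule on this slide can be illustrated with a small sketch. It assumes the entity spans have already been found by some recognizer; the function name `tag_enamex` and the character offsets are hypothetical and only show how the two conjuncts end up as separate ENAMEX elements.

```python
def tag_enamex(text, spans):
    """Wrap each (start, end, type) span of `text` in an ENAMEX element.

    `spans` is a hypothetical recognizer output: character offsets plus
    an entity type such as "LOCATION". Spans must not overlap.
    """
    out = []
    pos = 0
    for start, end, etype in sorted(spans):
        if start < pos:
            raise ValueError("spans must not overlap")
        out.append(text[pos:start])  # untagged text before the span
        out.append(f'<ENAMEX TYPE="{etype}">{text[start:end]}</ENAMEX>')
        pos = end
    out.append(text[pos:])  # untagged tail
    return "".join(out)


phrase = "North and South America"
# Two separate expressions: "North" and "South America".
spans = [(0, 5, "LOCATION"), (10, 23, "LOCATION")]
tagged = tag_enamex(phrase, spans)
print(tagged)
```

Tagging the whole phrase as one span would be simpler, but the task definition quoted on the slide asks for the conjuncts to be marked up separately.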

  6. MINDS • User configurable summarization system based on sentence selection • Summarizes documents in Spanish, Japanese, Russian, Turkish, Korean, and English • Summaries can be biased to favor place names, or other named entities

  7. Basic Summarization Method • Document structure analysis • Keyword analysis • Part of Speech and Proper Name Recognition • Sentence selection based on weighted scores
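Sentence selection based on weighted scores can be sketched roughly as follows. This is not the MINDS implementation: the features (keyword frequency, sentence position, place-name hits), the weights, the stopword list, and the function name `summarize` are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "was", "from", "on"}


def summarize(text, place_names=(), n=1,
              w_keyword=1.0, w_position=0.5, w_place=2.0):
    """Return the n highest-scoring sentences, in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)                      # keyword analysis
    places = {p.lower() for p in place_names}  # optional bias toward place names

    scored = []
    for i, sent in enumerate(sentences):
        tokens = re.findall(r"\w+", sent.lower())
        keyword = sum(freq[t] for t in tokens if t not in STOPWORDS) / max(len(tokens), 1)
        position = 1.0 / (i + 1)               # earlier sentences score higher
        place = sum(1 for t in tokens if t in places)
        score = w_keyword * keyword + w_position * position + w_place * place
        scored.append((score, i, sent))

    top = sorted(scored, key=lambda x: -x[0])[:n]
    return [sent for _, _, sent in sorted(top, key=lambda x: x[1])]


text = ("The delegation held talks on trade. "
        "The delegation flew from London to Ankara. "
        "Talks resume next week.")
print(summarize(text))                                  # no place-name bias
print(summarize(text, place_names=["London", "Ankara"]))  # biased toward places
```

Raising `w_place` is one simple way to bias a summary toward place names; a real system would use a proper name recognizer rather than a word list.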

  8. Boas: “A Linguist in the Box” Boas is a semi-automatic knowledge elicitation system that guides a language speaker through the process of developing the static knowledge sources for a moderate-quality, broad-coverage MT system from any “low-density” language into English in about six months. One of the tasks is translating a long list of place names from English into the source language.

  9. The ethnologist and linguist Franz Boas was the founder of the American school of descriptive linguistics. In this photo, possibly from around 1900, he is posing for a model of a Kwakiutl Winter Ceremonial dancer emerging from a circular hole cut in the dancing screen.

  10. Onomastics The study of proper names

  11. Keizai - Human Assisted Query Translation for Cross-Language Retrieval

  12. Document Filtering – Using Names, Locations, Keywords

  13. Automatic Document Translation – Spanish, Arabic, Farsi…

  14. ATS – Differences in Arabic
      الاختلافات اللغوية في الأقطار العربية
      Diagram labels: Arabic Speaking Countries Document Collection; Morphological Analysis; Information Retrieval; Sub-corpus Analysis
      Explore differences in lexical usage due to:
      • Transliteration
      • Cultural background (west: French, east: English)
      • Spelling differences
      For example:
      • السيدا SIDA, الايدز AIDS
      • الأوبيب OPEP, الأوبيك OPEC
      • الأستاذ Teacher (Algeria), الاستاذ Teacher (Oman)

  15. Different Spelling Forms from AFP Arabic Newswire
      Word (AFP)    English    Occurrences
      لوس انجليس    Los Angeles    21
      لوسانجلوس    Los Angeles    23
      لوس انجيلس    Los Angeles    2
      لوس انجيليس    Los Angeles    34
      انجلترا    England    2
      انكلتر    England    1
      انكلترا    England    1
      كارولاينا    Carolina    26
      كارولينا    Carolina    14
      ويسكونسين    Wisconsin    8
      ويسكنسن    Wisconsin    2
      ويسكونسن    Wisconsin    16
      نيوهامبشير    New Hampshire    15
      نيوهامبشر    New Hampshire    9
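One way to work with the counts on this slide is to group the spelling variants by their English name, then total the occurrences and pick the most frequent form. The sketch below copies the slide's rows; the function name `summarize_variants` and the choice of "most frequent form" as a canonical spelling are assumptions for illustration.

```python
from collections import defaultdict

# (Arabic spelling, English name, occurrences), copied from the slide.
ROWS = [
    ("لوس انجليس", "Los Angeles", 21),
    ("لوسانجلوس", "Los Angeles", 23),
    ("لوس انجيلس", "Los Angeles", 2),
    ("لوس انجيليس", "Los Angeles", 34),
    ("انجلترا", "England", 2),
    ("انكلتر", "England", 1),
    ("انكلترا", "England", 1),
    ("كارولاينا", "Carolina", 26),
    ("كارولينا", "Carolina", 14),
    ("ويسكونسين", "Wisconsin", 8),
    ("ويسكنسن", "Wisconsin", 2),
    ("ويسكونسن", "Wisconsin", 16),
    ("نيوهامبشير", "New Hampshire", 15),
    ("نيوهامبشر", "New Hampshire", 9),
]


def summarize_variants(rows):
    """Map each English name to (total occurrences, most frequent spelling, variant count)."""
    groups = defaultdict(list)
    for word, english, count in rows:
        groups[english].append((word, count))

    summary = {}
    for english, variants in groups.items():
        total = sum(count for _, count in variants)
        dominant, _ = max(variants, key=lambda v: v[1])  # first maximum wins on ties
        summary[english] = (total, dominant, len(variants))
    return summary


summary = summarize_variants(ROWS)
for english, (total, dominant, n_variants) in summary.items():
    print(f"{english}: {total} occurrences, {n_variants} spellings, most frequent: {dominant}")
```

A retrieval system could expand a query for any one spelling to all variants in its group, so that documents using a rarer form are still found.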

  16. Place Names Transliterated using National Name
      Word    Transliteration    English Name
      شارلوت    Sharlote    Charlotte
      المانيا    Alemania    Germany
      اوروبا    Europa    Europe
      موسكو    Moscou    Moscow
      طرابلس    Tarabulus    Tripoli
      الكويت    Al Kuwayt    Kuwait
      باولوس    Paulos    Pauls
      بروكسل    Brussells    Brussels
      برلين    Berlien    Berlin
      فلسطين    Palestina    Palestine
      بريطانيا    Britania    Britain
      بيروت    Bayreuth    Beirut
      لاغوس    Lagus    Lagos

  17. Meaning Oriented Question Answering - MOQA
      An AQUAINT project by:
      • Computing Research Laboratory (NMSU)
      • Institute for Language and Information Technology (ILIT, UMBC)
      • CoGenTex, Inc.
      Domain: Travels, Meetings
      Languages: English, Arabic, Persian
      Method:
      • Fact Repository from Text
      • Ontology based: Search Form Results - Text Retrieval - Text Analysis - Question Analysis
      This work was supported in full by the Advanced Research and Development Activity (ARDA)’s Advanced Question Answering for Intelligence (AQUAINT) Program under contract number 2002*H167200*000.

  18. Triple Inheritance hierarchy for “Nation”

  19. FACT DATABASE: The “Asian-Nation” Instance: “Turkey”

  20. Text Meaning Representation
      • proposition_1
        • head %exit_1
        • agent human_54 “Mr. Smith”
        • source location_23 “London”
        • destination location_25 “Ankara”
        • means vehicle_65 “Boeing 757”
        • tmr-time
          • time-begin YYYYMMDD “July 2, 2000”
        • aspect
          • iteration single; phase end… “departed”
        • polarity positive
        • mood indicative
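The frame on this slide can be pictured as a nested record. The sketch below is only one possible encoding: the dictionary layout, the nesting of `time-begin` under `tmr-time`, and the helper name `role_text` are assumptions, not the project's actual data format. It also shows how the date string could be normalized to the YYYYMMDD form the slide names.

```python
from datetime import datetime

# Hypothetical dictionary encoding of the slide's proposition.
tmr = {
    "proposition_1": {
        "head": "%exit_1",
        "agent": {"instance": "human_54", "text": "Mr. Smith"},
        "source": {"instance": "location_23", "text": "London"},
        "destination": {"instance": "location_25", "text": "Ankara"},
        "means": {"instance": "vehicle_65", "text": "Boeing 757"},
        "tmr-time": {"time-begin": "July 2, 2000"},
        "aspect": {"iteration": "single", "phase": "end"},
        "polarity": "positive",
        "mood": "indicative",
    }
}


def role_text(proposition, role):
    """Return the surface text filling a case role, or None if the role is absent."""
    filler = proposition.get(role)
    return filler["text"] if isinstance(filler, dict) and "text" in filler else None


prop = tmr["proposition_1"]

# Normalize the date to YYYYMMDD (month names assume an English locale).
raw_date = prop["tmr-time"]["time-begin"]
time_begin = datetime.strptime(raw_date, "%B %d, %Y").strftime("%Y%m%d")

# "Where did Mr. Smith go, and from where?"
print(role_text(prop, "agent"), "left", role_text(prop, "source"),
      "for", role_text(prop, "destination"), "on", time_begin)
```

Storing both the ontology instance (`location_25`) and the surface string keeps the link between the text and the fact repository entry.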

  21. Resources • U.S. GEOLOGICAL SURVEY • FEDERAL GEOGRAPHIC DATA COMMITTEE • NATIONAL IMAGERY AND MAPPING AGENCY • U.S. BUREAU OF LAND MANAGEMENT • U.S. FOREST SERVICE • US CENSUS BUREAU • GETTY THESAURUS OF GEOGRAPHIC NAMES

  22. Resources from (and for) the Humanities are Multilingual!
      • Computers and the Visual Arts: Editorial ... These developments led a branch of the Comité Internationale Pour l'Histoire de L'Art (CIHA) to conceive Thesaurus Artis Universalis (TAU) which soon proved ... www.sumscorp.com/sums/articles/dahlberg.html
      • [PDF] 1 Regard sur l'informatisation des collections de musées d'art ... audacieuses qui nous promettaient, par exemple, un renouvellement complet et inéluctable de l'histoire de l'art grâce au Thesaurus Artis universalis ? ... www.kikirpa.be/www2/Site_irpa/ En/Publi/Doc/PYK/Kairis.pdf
      • Marco Lattanzi ... stata da tempo recepita dalla comunità internazionale degli studiosi che, nell’ambito del gruppo di lavoro TAU (Thesaurus Artis Universalis), ha costituito ... www.ibc.regione.emilia-romagna.it/soprintendenza/ arcaut/lattanzi.html

  23. Getty Record for Edmonton
      ID: 7013032
      Record Type: administrative
      Edmonton (inhabited place)
      Coordinates:
      • Lat: 53 34 00 N degrees minutes
      • Long: 113 25 00 W degrees minutes
      Note: Located on N Saskatchewan river; flourished as center for agricultural distribution & processing after arrival of Canadian Pacific Railway 1891; petroleum was discovered nearby at Leduc, Redwater & Pembina mid-20th cen.
      Names:
      • Edmonton (preferred, C,V,N)
      • Strathcona (H,V,N): formerly located on river's S bank; absorbed into city 1912
      • Ft. Edmonton (H,V,N): fur-trading post for Hudson's Bay Company constructed 20 miles downstream from current site 1795; abandoned 1810
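To plot or compare a record like this one, the coordinates usually need converting to signed decimal degrees. The sketch below assumes the three numbers in each field are degrees, minutes, and seconds followed by a hemisphere letter; the function names are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and West are negative, North and East positive.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def parse_dms(field):
    """Parse a field such as '53 34 00 N'; any trailing words are ignored."""
    d, m, s, hemisphere = field.split()[:4]
    return dms_to_decimal(int(d), int(m), int(s), hemisphere)


lat = parse_dms("53 34 00 N degrees minutes")
lon = parse_dms("113 25 00 W degrees minutes")
print(f"Edmonton: {lat:.4f}, {lon:.4f}")
```

Signed decimal degrees are what most mapping tools expect, which is why the hemisphere letter is folded into the sign.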

  24. Getty Record for Edmonton (Contd.)
      Hierarchical Position:
      • World (facet (hierarchical))
        • North and Central America (continent)
          • Canada (nation)
            • Alberta (province)
              • Edmonton (inhabited place)
      Place Types:
      • inhabited place (preferred, C): expanded in 19th cen.
      • city (C): incorporated 1904
      • provincial capital (C): since 1905
      • industrial center (C)
      • transportation center (C)
      • university center (C)

  25. Address Data Content Standard
      Public Review Draft
      Subcommittee on Cultural and Demographic Data
      Federal Geographic Data Committee
      April 17, 2003, Version 2
      http://www.fgdc.gov/
