CLARIN-D Showcase: Textual Emigration Analysis André Blessing, Jens Stegmann, Jonas Kuhn Institute for Natural Language Processing (IMS) University of Stuttgart, Germany
Showcase Scenario: Textual Emigration Analysis
Extract and visualize descriptions of emigration moves from texts
Erika Lust wuchs in Kasachstan auf und emigrierte 1989 nach Deutschland. ‘Erika Lust grew up in Kazakhstan and emigrated to Germany in 1989.’
1906 übersiedelte Grabmayr nach Wien. ‘In 1906 Grabmayr moved to Vienna.’
Im Jahr 1931 übersiedelte Pohl nach Freiburg im Breisgau. ‘In 1931 Pohl moved to Freiburg i. Br.’
[Figure labels: Immigration, Emigration]
Motivation: showcase text analysis for the eHumanities
• Exploit resource infrastructure & existing linguistic tools
[Figure: CLARIN webservices pipeline (Converters, Tokenizer, Tagger, Parser, Named Entity Recognizer) with a sample dependency parse using the labels SBJ, OBJ, LOC, PMOD, NMOD]
Motivation: stimulating text analysis for the eHumanities
• Exploit resource infrastructure & existing linguistic tools
• Accommodate discipline-specific concepts and relations
• Aggregate information across textual sources
• Link textual instances for critical reflection and correction
• Adapt components to target language variety and domain
Humanities: a diverse range of disciplines dealing with aspects of text(s) (among others: philology, linguistics, history, social sciences, …)
“enabling”: facilitate innovative research on larger-scale data collections
[Figure: CLARIN webservices pipeline (Converters, Tokenizer, Tagger, Parser, Named Entity Recognizer) with a sample dependency parse]
Sample task
Extract instances of the emigrate relation from text
emigrate(Erika_Lust, Kasachstan, Deutschland)
Erika Lust wuchs in Kasachstan auf und emigrierte 1989 nach Deutschland. ‘Erika Lust grew up in Kazakhstan and emigrated to Germany in 1989.’
1906 übersiedelte Grabmayr nach Wien. ‘In 1906 Grabmayr moved to Vienna.’
Im Jahr 1931 übersiedelte Pohl nach Freiburg im Breisgau. ‘In 1931 Pohl moved to Freiburg i. Br.’
Sample task
emigrate(Agostino_Novella, Genua, Frankreich)
Agostino Novella (* 28. September 1905 in Genua; † 15. September 1974 in Rom) 1932 emigrierte er nach Frankreich. ‘In 1932, he emigrated to France.’
Exploit metadata and available structured data
1906 übersiedelte Grabmayr nach Wien. ‘In 1906 Grabmayr moved to Vienna.’
Erika Lust wuchs in Kasachstan auf und emigrierte 1989 nach Deutschland. ‘Erika Lust grew up in Kazakhstan and emigrated to Germany in 1989.’
Im Jahr 1931 übersiedelte Pohl nach Freiburg im Breisgau. ‘In 1931 Pohl moved to Freiburg i. Br.’
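The Novella example combines a birthplace taken from article metadata with a destination found in the running text. A minimal sketch of that idea follows. The slot order (person, origin, destination), the class name `Emigrate`, the regular expression, and the function name are all assumptions made for illustration; the slides do not show the actual implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Emigrate:
    """One instance of the emigrate relation (slot order is an assumption)."""
    person: str
    origin: Optional[str]
    destination: str


# Hypothetical surface pattern: a movement verb followed by "nach <Place>".
MOVE_PATTERN = re.compile(
    r"(?:emigrierte|übersiedelte)\b.*?\bnach\s+([A-ZÄÖÜ][\wäöüß-]+)"
)


def extract_with_metadata(person: str, birthplace: Optional[str],
                          sentence: str) -> Optional[Emigrate]:
    """Take the origin from metadata and the destination from the sentence."""
    match = MOVE_PATTERN.search(sentence)
    if match is None:
        return None
    return Emigrate(person, birthplace, match.group(1))


relation = extract_with_metadata(
    "Agostino_Novella", "Genua", "1932 emigrierte er nach Frankreich."
)
print(relation)
```

Using the birthplace as the origin is a heuristic: it works for the Novella entry but can be wrong when a person moved several times.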
A more complicated case…
Abraham Lincoln (* 12. Februar 1809 …, heute: LaRue County, Kentucky; † 15. April 1865 in Washington, D.C.) …
Kindheit und Jugend ‘Childhood and youth’
Abraham Lincoln wurde in einer Blockhütte auf der Sinking Spring Farm nahe dem Dorf Hodgenville in Kentucky geboren. ‘Abraham Lincoln was born in a log cabin on Sinking Spring Farm near the village of Hodgenville in Kentucky.’
Seine Eltern waren der Farmer Thomas Lincoln und dessen Frau Nancy, die beide aus Virginia stammten. ‘His parents were the farmer Thomas Lincoln and his wife Nancy, both of whom came from Virginia.’
Thomas Lincolns Vorfahren waren einige Generationen zuvor aus Wales nach Amerika ausgewandert. ‘Thomas Lincoln's ancestors had emigrated several generations earlier from Wales to America.’
Spectrum of analytical approaches
(1) General-purpose text analytics
• Metadata, document structure
• Look-up of names for geo-political entities
emigrate(Abraham_Lincoln, Kentucky, Wales)
emigrate(Abraham_Lincoln, Kentucky, Amerika)
Abraham Lincoln (* 1809, Kentucky; † 1865 in Washington, D.C.)
Thomas Lincolns Vorfahren waren einige Generationen zuvor aus Wales nach Amerika ausgewandert. ‘Thomas Lincoln's ancestors had emigrated several generations earlier from Wales to America.’
Spectrum of analytical approaches
(1) General-purpose text analytics
• Metadata, document structure
• Look-up of names for geo-political entities
Goals: Exploit linguistic tools, Accommodate concepts, Aggregate information, Link textual instances, Adapt components
emigrate(Abraham_Lincoln, Wales, Amerika)
Abraham Lincoln
Thomas Lincolns Vorfahren waren einige Generationen zuvor aus Wales nach Amerika ausgewandert. ‘Thomas Lincoln's ancestors had emigrated several generations earlier from Wales to America.’
Typically works well in text collections with some degree of redundancy
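The remark about redundancy can be illustrated with a simple support count: a candidate relation is kept only if it is extracted from more than one source, so a one-off misreading is filtered out. This is a sketch under assumptions. The threshold, the function name, and the candidate list (three made-up extraction results reusing names from the slides) are illustrative and not taken from the showcase.

```python
from collections import Counter

# Hypothetical candidates as (person, origin, destination) tuples, imagined as
# coming from three different source texts. The input list is made up.
candidates = [
    ("Erika_Lust", "Kasachstan", "Deutschland"),
    ("Erika_Lust", "Kasachstan", "Deutschland"),
    ("Abraham_Lincoln", "Kentucky", "Wales"),  # noisy one-off extraction
]


def aggregate(relations, min_support=2):
    """Keep relations that were extracted at least `min_support` times."""
    counts = Counter(relations)
    return {rel: n for rel, n in counts.items() if n >= min_support}


supported = aggregate(candidates)
print(supported)
```

A count threshold trades recall for precision: rare but correct moves mentioned only once are dropped along with the noise.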
Spectrum of analytical approaches
(2) Taking advantage of language analysis tools
• Tokenization, POS tagging, Named Entity Recognition
• Identify semantic relation based on keywords
emigrate(Thomas_Lincoln, Wales, Amerika)
Abraham Lincoln
Thomas Lincolns Vorfahren waren einige Generationen zuvor aus Wales nach Amerika ausgewandert. ‘Thomas Lincoln's ancestors had emigrated several generations earlier from Wales to America.’
Spectrum of analytical approaches
(2) Taking advantage of language analysis tools
• Tokenization, POS tagging, Named Entity Recognition
• Identify semantic relation based on keywords
Goals: Exploit linguistic tools, Accommodate concepts, Aggregate information, Link textual instances, Adapt components
emigrate(Thomas_Lincoln, Wales, Amerika)
Abraham Lincoln
Thomas Lincolns Vorfahren waren einige Generationen zuvor aus Wales nach Amerika ausgewandert. ‘Thomas Lincoln's ancestors had emigrated several generations earlier from Wales to America.’
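A keyword-based extractor of the kind named in approach (2) might look roughly like the sketch below. The keyword set, the small name and place lists (stand-ins for real Named Entity Recognition output), and the function name are assumptions for illustration. Note that it attributes the move to Thomas Lincoln rather than to his ancestors, which matches the result shown on the slide.

```python
import re

# Assumed keyword list and lookup tables; a real system would use NER output.
KEYWORDS = {"ausgewandert", "emigrierte", "übersiedelte"}
TOPONYMS = {"Wales", "Amerika", "Kentucky", "Virginia"}
KNOWN_PERSONS = {"Thomas Lincoln": "Thomas_Lincoln"}


def extract_by_keyword(sentence):
    """Return (person, origin, destination) if a movement keyword occurs."""
    tokens = re.findall(r"\w+", sentence)
    if not KEYWORDS.intersection(tokens):
        return None
    # First known person name mentioned anywhere in the sentence.
    person = next(
        (pid for name, pid in KNOWN_PERSONS.items() if name in sentence), None
    )
    origin = destination = None
    # A toponym after "aus" is read as origin, after "nach" as destination.
    for previous, token in zip(tokens, tokens[1:]):
        if token in TOPONYMS:
            if previous == "aus":
                origin = token
            elif previous == "nach":
                destination = token
    return (person, origin, destination)


sentence = ("Thomas Lincolns Vorfahren waren einige Generationen zuvor "
            "aus Wales nach Amerika ausgewandert.")
print(extract_by_keyword(sentence))
```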
Spectrum of analytical approaches
(3) Adaptable eHumanities toolkit
• Trainable relation extraction
• Adaptable/retrainable parser and additional tools
emigrate(TLs Vorfahren, Wales, Amerika)
Thomas Lincolns Vorfahren waren einige Generationen zuvor aus Wales nach Amerika ausgewandert.
[Figure: dependency parse of the sentence with edge labels SB, NK, MO]
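The following sketch illustrates why a dependency parse helps here: the subject (edge label SB) is the whole phrase headed by "Vorfahren", not the embedded name. It is not the trainable relation extractor mentioned on the slide but a small rule-based stand-in. The parse is hand-written and simplified, and the tuple layout and function names are assumptions.

```python
# Hand-written, simplified parse: (token id, form, head id, edge label).
# Head id 0 marks the root. This is illustrative, not real parser output.
PARSE = [
    (1, "Thomas", 2, "NK"),
    (2, "Lincolns", 3, "NK"),
    (3, "Vorfahren", 4, "SB"),
    (4, "waren", 0, "ROOT"),
    (5, "einige", 6, "NK"),
    (6, "Generationen", 7, "NK"),
    (7, "zuvor", 12, "MO"),
    (8, "aus", 12, "MO"),
    (9, "Wales", 8, "NK"),
    (10, "nach", 12, "MO"),
    (11, "Amerika", 10, "NK"),
    (12, "ausgewandert", 4, "OC"),
]


def subtree(parse, root_id):
    """Collect the forms of a token and all its descendants, in text order."""
    ids = {root_id}
    changed = True
    while changed:
        changed = False
        for token_id, _, head, _ in parse:
            if head in ids and token_id not in ids:
                ids.add(token_id)
                changed = True
    return [form for token_id, form, _, _ in parse if token_id in ids]


def extract_from_parse(parse):
    """Read subject, origin, and destination off the dependency edges."""
    subject_id = next(tid for tid, _, _, label in parse if label == "SB")
    person = " ".join(subtree(parse, subject_id))
    places = {}
    for token_id, form, _, label in parse:
        if label == "MO" and form in ("aus", "nach"):
            # The place name is the NK dependent of the preposition.
            places[form] = next(
                f for _, f, h, lab in parse if h == token_id and lab == "NK"
            )
    return (person, places.get("aus"), places.get("nach"))


print(extract_from_parse(PARSE))
```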
Showcase Demo: Textual Emigration Analysis
Extract and visualize descriptions of emigration moves from texts
1906 übersiedelte Grabmayr nach Wien. ‘In 1906 Grabmayr moved to Vienna.’
Im Jahr 1931 übersiedelte Pohl nach Freiburg im Breisgau. ‘In 1931 Pohl moved to Freiburg i. Br.’
[Figure labels: Immigration, Emigration]
Raphaël: JavaScript library
• Different maps can easily be integrated: World 2013, Europe 1938, Germany 1949
• NLP challenge
  • Ground named entities (toponyms) to maps
• Simple approach
  • Use gazetteer
Adaptability, user interaction
Sosonko emigrierte 1972 aus der UdSSR in die Niederlande. ‘Sosonko emigrated in 1972 from the USSR to the Netherlands.’
Identified toponyms   Mapping suggestions   Human corrections
Troizk                Russia
UdSSR                 ??                    Russia
Niederlande           Netherlands
Lugano                Switzerland
Wijk aan Zee          Netherlands
Nijmegen              Netherlands
Polanica-Zdroj        Poland
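The interplay of gazetteer suggestions and human corrections can be sketched as a lookup with an override table. The entries below are copied from the slide; the dictionary layout and the function name `ground` are assumptions, and the real GeoGrounding component is described as retrainable rather than as a fixed table.

```python
# Gazetteer suggestions as shown on the slide; "UdSSR" has no suggestion.
GAZETTEER = {
    "Troizk": "Russia",
    "Niederlande": "Netherlands",
    "Lugano": "Switzerland",
    "Wijk aan Zee": "Netherlands",
    "Nijmegen": "Netherlands",
    "Polanica-Zdroj": "Poland",
}


def ground(toponym, corrections=None):
    """Map a toponym to a country; human corrections take precedence."""
    if corrections and toponym in corrections:
        return corrections[toponym]
    return GAZETTEER.get(toponym)  # None means "no suggestion"


human_corrections = {"UdSSR": "Russia"}
print(ground("UdSSR"))                     # no gazetteer entry
print(ground("UdSSR", human_corrections))  # corrected by a user
```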
Outlook: Factuality
Das Ehepaar wollte eigentlich nach Amerika auswandern, aber die Geburt ihrer … Kinder ließ sie ihre Pläne ändern. ‘The couple actually wanted to emigrate to America, but the birth of their children made them change their plans.’
Den Zweiten Weltkrieg verbrachte er in Deutschland, weil er nie in die USA auswandern wollte. ‘He spent WW II in Germany because he never wanted to emigrate to the USA.’
1968 im Prager Frühling marschierte die Rote Armee in die Tschechoslowakei ein, Kohoutek entschied sich daraufhin 1970 zur Emigration nach Deutschland. ‘In 1968, during the Prague Spring, the Red Army marched into Czechoslovakia; as a consequence, in 1970 Kohoutek decided in favour of emigration to Germany.’
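The slides present factuality as an outlook and give no method. Purely as an illustration of the problem, the sketch below flags sentences that contain modal or negation cues, so that an intended or denied emigration is not counted as an actual move. The cue list and function name are assumptions, and a cue filter of this kind is far too crude for real use.

```python
import re

# Assumed cue words signalling intention or negation rather than a real move.
NONFACTUAL_CUES = {"wollte", "wollten", "nie", "eigentlich"}


def looks_nonfactual(sentence):
    """Return True if the sentence contains at least one cue word."""
    tokens = {token.lower() for token in re.findall(r"\w+", sentence)}
    return bool(tokens & NONFACTUAL_CUES)


examples = [
    "Das Ehepaar wollte eigentlich nach Amerika auswandern, aber die Geburt "
    "ihrer … Kinder ließ sie ihre Pläne ändern.",
    "Den Zweiten Weltkrieg verbrachte er in Deutschland, weil er nie in die "
    "USA auswandern wollte.",
    "1968 im Prager Frühling marschierte die Rote Armee in die "
    "Tschechoslowakei ein, Kohoutek entschied sich daraufhin 1970 zur "
    "Emigration nach Deutschland.",
]
flags = [looks_nonfactual(sentence) for sentence in examples]
print(flags)
```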
eHumanities showcase (FCS)
• TEA application
  • Interfaces to webservices
  • User Interface
CLARIN webservices: Converters, Tokenizer, Tagger, Parser, Named Entity Recognizer (TCF exchange format)
Retrainable components: Parser, GeoGrounding, Relation extractor
Goals: Exploit linguistic tools, Accommodate concepts, Aggregate information, Link textual instances, Adapt components
Thank you! CLARIN-D Showcase: Textual Emigration Analysis André Blessing, Jens Stegmann, Jonas Kuhn Institute for Natural Language Processing (IMS) University of Stuttgart, Germany