This project applies semantic network technology to create an ontology and lexicon for a historical multimedia archive of the Rijksmuseum. It enables access to the archive and retrieval of historical information in a consistent and flexible way.
The semantics of history as a model for historical archives, applied to the Rijksmuseum collection
Submitted as Camera project
Piek Vossen
Computational Lexicology & Terminology Lab
LCC, Taal & Communicatie, Faculteit der Letteren
Seminar over Semantisch Netwerk Technologie, January 22nd, 2009, VU University Amsterdam
Semantic disclosure and retrieval
• Text to concepts, many-to-many mappings:
  • genocide, massacre, ethnic cleansing → Killing
  • ethnic cleansing → Deportation
• Concepts to metadata:
  • create structure and models from textual data, which involves inferencing: time periods, political roles, nations, alliances
  • use normalized metadata and structure to interpret text: genocide takes place in a particular period of time
• Queries to concepts:
  • handle unseen words and phrases: systematic violence to minorities
  • apply inferencing when needed or requested
  • sensitive to communicative perspective
• Model diverse and distributed multimedia data collections in a uniform way to enable consistent and flexible retrieval and structuring:
  • Berlin as the capital of Germany & pictures of Berlin when it is the capital of Germany
Seminar Semantisch Netwerk Technologie, 22-Jan-2009, VU University Amsterdam
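
The many-to-many mapping from text to concepts listed on this slide can be sketched as a small lookup table. This is a minimal illustration, not the project's implementation: the names `TERM_TO_CONCEPTS`, `concepts_for` and `terms_for` are hypothetical, and the table holds only the three terms and two concepts given on the slide.

```python
from collections import defaultdict

# Term -> concepts, taken from the examples on the slide.
# One term can map to several concepts, and one concept to several terms.
TERM_TO_CONCEPTS = {
    "genocide": {"Killing"},
    "massacre": {"Killing"},
    "ethnic cleansing": {"Killing", "Deportation"},
}


def concepts_for(term):
    """Return the set of concepts a term maps to (empty set if the term is unseen)."""
    return set(TERM_TO_CONCEPTS.get(term.lower(), set()))


def terms_for(concept):
    """Return all terms that map to the given concept."""
    inverse = defaultdict(set)
    for term, concepts in TERM_TO_CONCEPTS.items():
        for c in concepts:
            inverse[c].add(term)
    return set(inverse.get(concept, set()))


print(sorted(concepts_for("ethnic cleansing")))
print(sorted(terms_for("Killing")))
```

A plain lookup returns an empty set for an unseen phrase such as "systematic violence to minorities", which is why the slide lists inferencing as a separate requirement for queries.
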
The semantics of history
• Camera project involving one AIO (PhD candidate) from the Faculty of Arts and one AIO from the Faculty of Exact Sciences, in collaboration with the Rijksmuseum
• Goal: an ontology and lexicon for a historical multimedia archive of the Rijksmuseum
• Applied to an innovative information system for accessing the historical archive
[Diagram: project workflow. Labels as extracted from the slide:]
• Sources: Rijksmuseum, Wikipedia, Historische Canon, multimedia objects and data
• Data types: structured data, semi-structured data, free text
• Processes: data conversion, ontolization, term extraction, lexicalization, lexical mapping, alignment, smart indexing, smart retrieval, validation
• Resources: event ontology, historic ontology, ontology, terms & relations, lexicon, data model
• Indexed entities: objects, events, locations, people
• Roles: software engineer, language engineer, knowledge engineer, historians
Changing realities & perspectives
• History is a chain of realities: how do we cut up continuous time?
• All is finite, but some things are more stable than others:
  • Geographical regions
  • Nations and countries
  • Regimes
  • People live and die
  • Politicians come and go
• The interpretation of roles for the same period changes in each later period
[Figure: timeline linking news items and historical documents to events and sub-events. Labels as extracted from the slide:]
• Marked years: 1945, 1953, 1989, 1993, 1995, 1999, 2002
• Events and sub-events: WOII (World War II), Watersnoodramp (the North Sea flood), Struggle for Srebrenica, installation of DutchBat, val enclave (fall of the enclave), massagraven (mass graves), parlementaire enquête (parliamentary inquiry)
• Regions: West Germany, East Germany, Germany; Yugoslavia, Bosnia and Herzegovina
• Sources along the time axis: news, historical documents
• Historical ontology: Events, Persons, Countries, Regions, Wars (InternationalWar, CivilWar), Disasters (Floodings)
The semantics of history = semantics of change
• Represent different realities:
  • related through causal changes over time in regions
  • what remains stable and what changes
  • representing different views or perspectives on the same reality, e.g. from a different historical angle or from different geographical or social parties
• Changes are typed as events
Events as key notions
• Historical events:
  • events considered from a distance in time and with abstraction of detail
  • referenced by names (WOII, de Val van Srebrenica), nouns (war) or nominalizations (violation of human rights)
• News events:
  • report on (the same) reality, but more in the active verbal form: US soldiers shoot Iraqi citizens
  • closer to the actual event
  • lacking historical abstraction and filtering
• News becomes history over time, so we expect a smooth transition in the language used to refer to the same events, adding more and more historical perspective
News classification: International Press Telecommunications Council (IPTC, http://www.iptc.org/)
• 16000000 unrest, conflicts and war
  • Acts of socially or politically motivated protest and/or violence
• 16001000 act of terror
  • Act of violence, often deadly, designed to raise fear and anxiety in a population
• 16002000 armed conflict
  • Disputes between opposing groups involving the use of weaponry, but not formally declared a war
• 16003002 rebellions
  • Armed uprising by citizens of a nation with the intent to overthrow the government, without necessarily achieving social change
• 16006000 massacre
  • The death of a large group of people over a brief period of time
• 16006001 genocide
  • Systematic killing of one clan, tribe or ethnic type by another
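
The eight-digit codes on this slide appear to encode a hierarchy: 16006001 (genocide) sits under 16006000 (massacre), which sits under 16000000 (unrest, conflicts and war). A minimal sketch of deriving the broader codes follows. It assumes the digits split as 2 (subject) + 3 (subject matter) + 3 (subject detail); the function names are hypothetical, and `LABELS` contains only the entries shown on the slide, not a complete IPTC table.

```python
from typing import List, Optional

# Labels copied from the slide; this is not a complete IPTC table.
LABELS = {
    "16000000": "unrest, conflicts and war",
    "16001000": "act of terror",
    "16002000": "armed conflict",
    "16003002": "rebellions",
    "16006000": "massacre",
    "16006001": "genocide",
}


def parent_code(code: str) -> Optional[str]:
    """Return the next broader code, or None for a top-level subject.

    Assumes the layout SS MMM DDD (subject, subject matter, subject detail).
    """
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected an 8-digit code, got {code!r}")
    subject, matter, detail = code[:2], code[2:5], code[5:]
    if detail != "000":
        return subject + matter + "000"
    if matter != "000":
        return subject + "000000"
    return None


def ancestors(code: str) -> List[str]:
    """Return the chain of broader codes, nearest first."""
    chain = []
    parent = parent_code(code)
    while parent is not None:
        chain.append(parent)
        parent = parent_code(parent)
    return chain


for c in ancestors("16006001"):
    print(c, LABELS.get(c, "(label not on the slide)"))
```

With such a chain, a document indexed as genocide can also be retrieved by a query for massacre or for unrest, conflicts and war.
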
“Val van Srebrenica” (Fall of Srebrenica) in Wikipedia, with historical perspective
• Dutch original: “Na het uiteenvallen van de Republiek Joegoslavië en de burgeroorlog die daarop volgde werd de stad, evenals Tuzla, Sarajevo, Gorazjde en Zepa, door de Verenigde Naties tot veilige enclave voor moslims verklaard, binnen een door Bosnische Serviërs beheerst gebied.”
• English: “After the disintegration of the Republic of Yugoslavia and the civil war that followed, the town, like Tuzla, Sarajevo, Goražde and Žepa, was declared by the United Nations a safe enclave for Muslims, within an area controlled by Bosnian Serbs.”
• uiteenvallen (disintegrate) → the Republic ceases to exist
• verklaren tot enclave (declare an enclave) → role for a region
• de stad (Srebrenica), evenals Tuzla, Sarajevo, Gorazjde en Zepa → role fillers and parts that make up the region
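“Val van Srebrenica” (Fall of Srebrenica) in Wikipedia, with historical perspective
• Dutch original: “Op 11 juli 1995, (…), forceerden Bosnisch-Servische troepen onder bevel van generaal Ratko Mladic zich met tanks de stad binnen en deporteerden en vermoordden ca. 8.000 moslimmannen en -jongens. Het wordt gezien als de ergste daad van genocide in Europa sinds de Tweede Wereldoorlog.”
• English: “On 11 July 1995, (…), Bosnian Serb troops under the command of General Ratko Mladic forced their way into the town with tanks and deported and murdered approximately 8,000 Muslim men and boys. It is seen as the worst act of genocide in Europe since the Second World War.”
• forceerden de stad binnen (forced their way into the town) → take control
• wordt gezien als genocide (is seen as genocide) → role
• deporteren en vermoorden (deport and murder) → role fillers of genocide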
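News letter from the Dutch minister of defense, with little historical perspective
• Dutch original: “De afgelopen zes maanden werd de uitvoering van deze taken aanzienlijk bemoeilijkt door de Bosnisch-Servische weigering de enclave voldoende te laten bevoorraden. Door een gebrek aan brandstof moesten patrouilles te voet worden uitgevoerd. Ook blokkeerden de Bosnische Serviers sinds mei jl. de rotatie van het personeel van Dutchbat, waardoor de bezetting werd teruggebracht van 630 naar 430 blauwhelmen. De vijandelijkheden namen geleidelijk toe, waardoor op 3 juni jl. een observatiepost in het zuidoostelijke deel van de enclave moest worden opgegeven”
• English: “Over the past six months, carrying out these tasks was considerably hampered by the Bosnian Serb refusal to allow the enclave to be adequately resupplied. Owing to a lack of fuel, patrols had to be carried out on foot. Since May, the Bosnian Serbs have also blocked the rotation of Dutchbat personnel, which reduced its strength from 630 to 430 blue helmets. Hostilities gradually increased, as a result of which an observation post in the south-eastern part of the enclave had to be given up on 3 June”
• Verb phrases: bevoorraden (resupply), blokkeren (block), opgeven (give up)
• Potential historical terms: blokkade (blockade), val (fall), drama (tragedy), opgave (abandonment), overgave (surrender)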
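Lexicon versus Ontology
[Diagram: Dutch lexicon entries linked to ontology concepts. Labels as extracted from the slide:]
• Ontology: Events, Regions, Disasters, Conflicts, Genocide, Struggle, Wars (International Wars, Civil Wars)
• Actions: Deportation, Shooting, Killing
• Sub-events qualified as: Surrender, Take-over
• Role: enclave
• Lexicon: gevecht (fight), vechten (to fight), vermoorden (to murder), deporteren (to deport), wegvoeren (to take away), inname (capture), innemen (to capture), val (fall), veroveren (to conquer), verovering (conquest), bezetting (occupation), opgeven (to give up), opgave (abandonment), overgeven (to surrender), overgave (surrender), tankvuur (tank fire), onder vuur liggen (to be under fire), schieten (to shoot), beschieten (to fire on), raken (to hit)
• Example: “Bosnisch-Servische troepen deporteerden en vermoordden” (“Bosnian Serb troops deported and murdered”)
• Qualifies as: “wordt gezien als” (“is seen as”), 313,000 Google hits, October 14th, 2008
  • “Suriname wordt gezien als corrupt en crimineel land” (“Suriname is seen as a corrupt and criminal country”)
  • “Gezien als een daad van genocide” (“Seen as an act of genocide”)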
Lexical ontological structure
• Regions:
  • Yugoslavia(x), Bosnia(y), Serbia(z), hasPart(x, y), hasPart(x, z), period(t)
• Roles, e.g. enclave(y):
  • protect(x, y, z), region(y), authority(x), threat(z), inhabit(v, y), period(t)
• Events:
  • kill, deportation
• Roles, e.g. ethnic cleansing(x):
  • event(x), agent(x, a), cause(x, y), move_from(y, z, r), inhabit(z, r), region(r), kill(v, a, z), deport(w, a, z), hasPart(x, v), hasPart(x, w), period(t)
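
The role definition for ethnic cleansing(x) can be read as a check over sub-events: x has a kill part and a deport part, both carried out by the same agent a against a group z that inhabits region r. The sketch below implements only that part of the definition. The class and function names are hypothetical, the predicates cause, move_from and period are left out for brevity, and the example facts are taken from the Wikipedia passage quoted in this deck.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class SubEvent:
    kind: str    # "kill" or "deport" in this sketch
    agent: str   # a in kill(v, a, z) / deport(w, a, z)
    group: str   # z, the affected group
    region: str  # r, where the group lives


def qualifies_as_ethnic_cleansing(
    parts: Iterable[SubEvent], inhabitants: Dict[str, Set[str]]
) -> bool:
    """True if some agent both kills and deports a group that inhabits the region."""
    kinds_seen = {}
    for part in parts:
        # inhabit(z, r): ignore sub-events against groups not living in the region
        if part.group not in inhabitants.get(part.region, set()):
            continue
        key = (part.agent, part.group, part.region)
        kinds_seen.setdefault(key, set()).add(part.kind)
    # hasPart(x, v) and hasPart(x, w): both sub-event kinds must be present
    return any({"kill", "deport"} <= kinds for kinds in kinds_seen.values())


inhabitants = {"Srebrenica": {"Muslim men and boys"}}
parts = [
    SubEvent("deport", "Bosnian Serb troops", "Muslim men and boys", "Srebrenica"),
    SubEvent("kill", "Bosnian Serb troops", "Muslim men and boys", "Srebrenica"),
]
print(qualifies_as_ethnic_cleansing(parts, inhabitants))
```

Keeping the role as a test over typed sub-events, rather than as a label on the text, matches the slide's distinction between low-level events (kill, deportation) and the roles they fill.
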
“Val van Srebrenica” in Wikipedia, with historical perspective
• Headings:
  • 1992 ethnic cleansing campaign
  • The conflict in eastern Bosnia
  • Struggle for Srebrenica
• Text:
  • A fierce struggle for territorial control then ensued among the three major groups in Bosnia: Bosniak (commonly known as 'Bosnian Muslims'), Serb and Croat. In the eastern part of Bosnia, close to Serbia, conflict was particularly fierce between Serbs and Bosniaks
  • Serb military and paramilitary forces from the area and neighboring parts of eastern Bosnia and Serbia gained control of Srebrenica for several weeks in early 1992, killing and expelling Bosniak civilians. In May 1992, Bosnian government forces under the leadership of Naser Orić recaptured the town
  • thus proceeded with the ethnic cleansing of Bosniaks from Bosniak ethnic territories in Eastern Bosnia and Central Podrinje
What kind of questions can be answered?
• Definition of time frames:
  • Historical events on a time axis
  • Listings of all low-level subevents within time frames
  • Mappings of role-events to subevents
• Changes:
  • Changes within time frames, involving particular participants, in certain regions
  • Changes in roles of participants, causal associations
• Similarities:
  • Varieties of labeling and naming schemes of events
  • Matches between multimedia objects: audio fragments, movies, art, & text archives (across languages)
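
The time-frame questions can be illustrated with a small query that lists the low-level subevents of a historical event within a given period. This is a sketch under assumptions: the `Event` class and `subevents_in` function are hypothetical, the dates come from the minister's letter quoted in this deck, and attaching them as subevents of “Val van Srebrenica” is an interpretation made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    label: str
    day: date
    parent: Optional[str] = None  # label of the enclosing event, if any


# Dates from the quoted letter; the parent links are assumed for illustration.
EVENTS = [
    Event("Val van Srebrenica", date(1995, 7, 11)),
    Event("Srebrenica taken", date(1995, 7, 11), "Val van Srebrenica"),
    Event("observation post Foxtrot taken by infantry", date(1995, 7, 8), "Val van Srebrenica"),
    Event("observation post Foxtrot hit by tank fire", date(1995, 7, 7), "Val van Srebrenica"),
]


def subevents_in(events: List[Event], parent: str, start: date, end: date) -> List[Event]:
    """List subevents of `parent` that fall within [start, end], in date order."""
    hits = [e for e in events if e.parent == parent and start <= e.day <= end]
    return sorted(hits, key=lambda e: e.day)


for e in subevents_in(EVENTS, "Val van Srebrenica", date(1995, 7, 1), date(1995, 7, 9)):
    print(e.day.isoformat(), e.label)
```

Questions about changes and similarities would need participants, regions and alternative labels on each event; the same structure can be extended with those fields.
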
AIO at FdL (Faculteit der Letteren)
• Lexical framing of events in news reporting and historical descriptions
• Use a historical thesaurus to group all the words and expressions in a lexicon relative to the same events
• Differentiate the implications of lexical variation: packaging of events
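(News) Letter from the Dutch minister of defense, with little historical perspective
• Dutch original: “Op dinsdag 11 juli jl. werd Srebrenica door de Bosnische Serviers ingenomen. Hieraan ging een week vooraf waarin de gevechten in het zuidoostelijke deel van de enclave en de Bosnisch-Servische beschietingen van de enclave heviger werden. Ook Dutchbat kwam daarbij onder vuur te liggen…. Op vrijdag 7 juli werd de in het zuidoostelijke deel gelegen observatiepost Foxtrot door Bosnisch-Servisch tankvuur herhaaldelijk geraakt. Hierbij vielen geen slachtoffers. Op zaterdag 8 juli namen de gevechten verder toe en werd Foxtrot door Bosnisch-Servische infanterie ingenomen…. De regering is zeer bezorgd over het lot van de mannen die door de Bosnische Serviers naar Bratunac zijn weggevoerd”
• English: “On Tuesday 11 July, Srebrenica was taken by the Bosnian Serbs. This was preceded by a week in which the fighting in the south-eastern part of the enclave and the Bosnian Serb shelling of the enclave intensified. Dutchbat also came under fire…. On Friday 7 July, observation post Foxtrot, located in the south-eastern part, was repeatedly hit by Bosnian Serb tank fire. There were no casualties. On Saturday 8 July the fighting increased further and Foxtrot was taken by Bosnian Serb infantry…. The government is very concerned about the fate of the men who were taken away to Bratunac by the Bosnian Serbs”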
News classification: International Press Telecommunications Council (IPTC, http://www.iptc.org/)
• 03000000 disaster and accident
  • Man-made and natural events resulting in loss of life or injury to living creatures and/or damage to inanimate objects or property
• 03001000 drought
  • A severe lack of water over a period of time
• 03002000 earthquake
  • The shifting of the tectonic plates of the Earth, creating in some cases damage to structures
• 03003000 famine
  • Severe lack of food for a large population
• 03004000 fire
  • Ignition and consumption of materials through a combination of high heat and oxygen
• 03005000 flood
  • Surfeit of water, caused by heavy rains or melting snow, usually in places where it's not wanted
• 03006000 industrial accident
  • A mishap in a factory, a shop or an office, potentially harmful to humans
• 03006001 structural failures
  • When a building, bridge or other structure collapses because of unexpected forces or poor design
• 03007000 meteorological disaster
  • A weather-related disaster
• 03007001 windstorms
  • A storm of high velocity but non-hurricane force movements of air with little or no rain or hail. Often highly destructive