Digitization Challenges for [Jewish] Genealogy. Jean-Pierre Stroweis, stroweis@zahav.net.il. EVA Minerva Jerusalem, November 2006.
Culture: Ethnographic view (Edward Tylor) • Culture is that complex whole which includes knowledge, belief, art, morals, law, custom, and any other capabilities and habits acquired by man as a member of society.
Genealogy, a Cultural Heritage? • Monotheist Religions: Judaism, Christianity, Islam
Genealogy in the Torah • This is the book of the generations of Adam. In the day that G-d created man, in the likeness of G-d made He him; • male and female created He them, and blessed them, and called their name Adam, in the day when they were created. • And Adam lived a hundred and thirty years, and begot a son in his own likeness, after his image; and called his name Seth. • And the days of Adam after he begot Seth were eight hundred years; and he begot sons and daughters. • And all the days that Adam lived were nine hundred and thirty years; and he died. • And Seth lived a hundred and five years, and begot Enosh. • And Seth lived after he begot Enosh eight hundred and seven years, and begot sons and daughters. Genesis, 5:1-7
Genealogy in the New Testament • 12 After the exile to Babylon: Jeconiah was the father of Shealtiel, Shealtiel the father of Zerubbabel, • Zerubbabel the father of Abiud, Abiud the father of Eliakim, Eliakim the father of Azor, • Azor the father of Zadok, Zadok the father of Akim, Akim the father of Eliud, • Eliud the father of Eleazar, Eleazar the father of Matthan, Matthan the father of Jacob, • and Jacob the father of Joseph, the husband of Mary, of whom was born Jesus, who is called Christ. Matthew 1:12–16 + Luke 3:21-38
Genealogy in Islam: Extracanonical traditions • Muhammad bin ‘Abdullah bin ‘Abdul-Muttalib (who was called Shaiba) bin Hashim, (named ‘Amr) bin ‘Abd Munaf (called Al-Mugheera) bin Qusai (also called Zaid) bin Kilab bin Murra bin Ka‘b bin Lo’i bin Ghalib bin Fahr (who was called Quraish and whose tribe was called after him) bin Malik bin An-Nadr (so called Qais) bin Kinana bin Khuzaiman bin Mudrikah (who was called ‘Amir) bin Elias bin Mudar bin Nizar bin Ma‘ad bin ‘Adnan. References: Ibn Hisham 1/1,2; Talqeeh Fuhoom Ahl Al-Athar, p. 5-6; Rahmat-ul-lil'alameen 2/11-14,52
Genealogy, a Cultural Heritage? • Monotheist Religions: Judaism, Christianity, Islam • Governments: Ontario Ministry of Culture, French Ministry of Culture
Genealogy Heritage: Who’s in charge of Conservation? • Families? Usually not • Administrations? Only their own records • Towns? The cemetery, not the tombs • Historical Museums and Archives? Out of their scope • Genealogical Societies? Nothing systematic • Conclusion: no systematic preservation of the genealogical heritage!
Preserving Art versus Genealogy • ART: created, tangible items; selected items are collected; public sphere; value to society (museums, maintenance, academic research, intellectual rights, cost-value) • GENEALOGY: re-construction; each individual is a subject of study; private sphere; value to family (no government support, little academic research, privacy rights)
Genealogy Heritage: Conservation • No systematic preservation of the genealogical heritage • No conservation of genealogy per se, only preservation of the sources that will enable future genealogical re-construction • Genealogical sources are usually preserved by institutions for which genealogy is not the primary purpose
Genealogy Heritage: Players • LDS Church (Mormons): accessible records • Archives of former Administrations (Ellis Island, Hamburg Staatsarchiv, Red Cross ITS Arolsen): their own records • Holocaust Memorials (Yad Vashem, USHMM, Mémorial de la Shoah): collected records
Genealogy Heritage: More Players • Private Companies: Ancestry.com, FamilyDNA.com • National Archives • Genealogical Societies and SIGs • Genealogical Libraries: Genealogical Library, Germantown, Tennessee; DNA Library, Glasgow, Scotland
Genealogy Heritage: [Jewish World] Players • Individual Initiatives: Jewish Genealogical Family Finder, JewishGen, Jewish Records Indexing-Poland, Routes-to-Roots Foundation, Istanbul Rabbinate Records, One-Step Web Site… • IAJGS • Center for Jewish History (NYC) • Hevrot Kaddisha (Burial Societies)
Preservation Life Cycle for Genealogical Sources • Acquisition • Authentication • Translation/Transliteration • Accuracy Assessment • Soundexing • Storage • Cataloguing • Access rights • Query & Retrieval Tool • Publication/Distribution
Data Acquisition • Interviews (Text - Audio – Video) • Scanning Documents, Family Tree Charts, Pictures • On-site visits to Archives & Cemeteries • Manual Data Entry • Optical Character Recognition
Digital Formats of Genealogical Data • Family lore, artifacts and biographies. Format: TEXT / IMAGE / AUDIO / VIDEO / 3D • Documented events. Format: TEXT / IMAGE / SPREADSHEET / DATABASE • Physical traits. Format: IMAGE / TAGGED FORMAT TBD • Genetic profile. Format: TEXT / TAGGED FORMAT TBD • Family Trees. Format: GEDCOM
GEDCOM • GEnealogical Data COMmunication • Neutral format for the exchange of genealogical data • Specification written by the LDS Church (www.familysearch.org) • GEDCOM Version 5.5 (1996): text-based, ANSEL character encoding, widely used (www.familysearch.org/GEDCOM/GEDCOM55.exe) • GEDCOM Version 6.0 (Draft 2002): XML-based, Unicode characters, not implemented (www.familysearch.org/GEDCOM/GedXML60.pdf)
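As a rough illustration of the text-based GEDCOM 5.5 layout: each line carries a level number, an optional cross-reference ID such as `@I1@`, a tag, and an optional value. The sketch below parses such lines into tuples. The function name `parse_gedcom_line`, the `GedcomLine` type and the sample individual are invented for illustration; a real reader would also need ANSEL decoding and handling of continuation lines.

```python
import re
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    level: int
    xref: Optional[str]   # cross-reference ID such as "@I1@", if present
    tag: str
    value: Optional[str]


# level, optional @XREF@, tag, optional value
_LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s(.*))?$")


def parse_gedcom_line(line: str) -> GedcomLine:
    """Split one GEDCOM 5.5 text line into its parts (hypothetical helper)."""
    match = _LINE_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value)


# Invented sample record, for illustration only.
SAMPLE = """\
0 @I1@ INDI
1 NAME Seth /Adamson/
1 SEX M
1 FAMC @F1@
"""

records = [parse_gedcom_line(line) for line in SAMPLE.splitlines()]
for rec in records:
    print(rec)
```

The level numbers are what encode the hierarchy: every level-1 line here belongs to the level-0 individual above it.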
Standard for cataloguing • Body: Dublin Core Metadata Initiative • Goal: Development of interoperable online metadata standards • Standard: The Dublin Core Element Set • Web: www.dublincore.org
Dublin Core Element Set • Version 1.1, 2004 • Standard for cross-domain information resource description • Metadata elements: title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, rights
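To make the element set concrete, here is a minimal sketch that describes one genealogical source with a few Dublin Core elements and serializes it as XML. The namespace URI is the published one for the element set; the `record` wrapper element, the function name `dublin_core_record` and the field values are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Ask ElementTree to serialize the namespace with the usual "dc" prefix.
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> ET.Element:
    """Build an XML record from element-name/value pairs (hypothetical helper)."""
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    record = ET.Element("record")  # wrapper name is an assumption
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Illustrative values only.
record = dublin_core_record({
    "title": "1875 Census of the Jewish Population of Eretz Israel",
    "type": "Dataset",
    "format": "text/csv",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Rejecting unknown element names early keeps a catalogue from silently accumulating fields that a harvester would not recognize.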
Standard for retrieval • Body: Open Archives Initiative • Goal: Promote interoperability standards that aim to facilitate the efficient dissemination of content • Standard: The Open Archives Initiative Protocol for Metadata Harvesting • Web: www.openarchives.org
Open Archives Initiative Protocol for Metadata Harvesting • An application-independent interoperability framework based on metadata harvesting. • Data Providers: administer systems that support the OAI-PMH as a means of exposing metadata • Service Providers: use metadata harvested via the OAI-PMH as a basis for building value-added services
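A service provider harvests metadata by sending plain HTTP requests whose `verb` argument names one of the six OAI-PMH operations. The sketch below only builds such a request URL and sends nothing over the network. The base URL `https://archive.example.org/oai` and the function name `oai_request_url` are placeholders, not a real data provider or library API.

```python
from urllib.parse import urlencode

# The six request types defined by OAI-PMH.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Compose an OAI-PMH request URL (hypothetical helper, no network access)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


# Ask a (placeholder) data provider for all records as unqualified Dublin Core.
url = oai_request_url(
    "https://archive.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
```

The `oai_dc` metadata prefix is the Dublin Core format that every OAI-PMH data provider is required to support, which is what ties this retrieval standard to the cataloguing standard.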
OAI Architecture [diagram not reproduced]. Source: www.culture.gouv.fr/culture/dll/OAI-PMH.htm
Our Experience • Helkat Mehokek: Index of Gravestone Hebrew Inscriptions on Mount of Olives Cemetery • 1875 Census of the Jewish Population of Eretz Israel, Ordered by Sir Moses Montefiore • Paul Jacobi’s Index of the Names (listed in monographs) • Name Changes in the Palestine Gazette • Different types of records, similar verification process
Name Changes in the Palestine Gazette
What is Quality? • Accuracy • Integrity with Original Source • Internal Consistency • Completeness • Simplicity / Ease of Use
The Process • Source • Excel Table • Searchable database
Quality during Design • Goals • Index or Full Extract? • Team Policy • Conventions • Reference to source • Structure • Fields • Transliteration
Fields Semantics • Source entry: Rabbi Schimon III "DAYAN"-"BROD"-"KARA" MI-WINA • Searchable Fields: Title: Rabbi; First Name: Schimon; Surname: DAYAN-BROD-KARA; Known as: from Wien • Non Searchable Field: Full Name: Schimon III DAYAN WIENER-BROD-KARA
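One way to picture the split between searchable and non-searchable fields is a record type whose search routine only looks at the designated columns. This is a sketch under assumptions: the class `NameRecord`, the constant `SEARCHABLE_FIELDS` and the function `matches` are invented names, and the substring matching is a stand-in for whatever the real database does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    title: str
    first_name: str
    surname: str
    known_as: str
    full_name: str  # kept for display, deliberately not searched


# Only these columns take part in queries.
SEARCHABLE_FIELDS = ("title", "first_name", "surname", "known_as")


def matches(record: NameRecord, term: str) -> bool:
    """Case-insensitive substring match over the searchable fields only."""
    needle = term.lower()
    return any(needle in getattr(record, name).lower() for name in SEARCHABLE_FIELDS)


record = NameRecord(
    title="Rabbi",
    first_name="Schimon",
    surname="DAYAN-BROD-KARA",
    known_as="from Wien",
    full_name="Schimon III DAYAN WIENER-BROD-KARA",
)

print(matches(record, "kara"))    # found via the surname field
print(matches(record, "wiener"))  # appears only in full_name, so not found
```

Keeping the full name as free text preserves the original form for display while the normalized fields carry the search load.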
Transliteration Issues • Z like Zacharia (ז) or Z like Zadok (צ), i.e. Tzadok?
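The ambiguity can be stated as a lookup: a Latin initial "Z" may stand for either Hebrew letter, while a convention such as "Tz" pins it down. The mapping below is a deliberately tiny sketch covering only this one case; the table `INITIAL_CANDIDATES` and the function `initial_hebrew_candidates` are illustrative names, not an established transliteration scheme.

```python
ZAYIN = "\u05D6"  # ז
TSADI = "\u05E6"  # צ

# Latin prefixes and the Hebrew initials they may represent (illustrative only).
INITIAL_CANDIDATES = {
    "TZ": [TSADI],
    "Z": [ZAYIN, TSADI],
}


def initial_hebrew_candidates(name: str) -> list:
    """Return the Hebrew letters a transliterated name may begin with."""
    upper = name.upper()
    # Try longer prefixes first so "Tz" is not mistaken for a bare "T".
    for prefix in sorted(INITIAL_CANDIDATES, key=len, reverse=True):
        if upper.startswith(prefix):
            return INITIAL_CANDIDATES[prefix]
    return []


print(initial_hebrew_candidates("Zacharia"))  # two candidates: ambiguous
print(initial_hebrew_candidates("Tzadok"))    # one candidate: unambiguous
```

A team convention that always writes "Tz" for צ removes the ambiguity at data-entry time instead of leaving it to every later search.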
Quality at Verification: Two Steps • Unit Test (column-by-column) • Integration Test (correlate fields)
Unit Test: Types of Errors Detected • Unexpected characters in a field value • Variant spellings of the same name (suspect) • Letter characters embedded in a numeric field (e.g. ‘O’ instead of zero) • Invalid and out-of-range values (e.g. for dates, ages) • Inconsistent usage of acronyms • Inconsistent transliteration
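Two of the checks listed above, letters embedded in a numeric field and out-of-range values, can be sketched as a single pass over one column. The function name `check_numeric_column`, the age bounds and the sample values are assumptions for illustration; the same checks can equally be done with spreadsheet filters.

```python
import re


def check_numeric_column(values, minimum, maximum):
    """Return (row number, raw value, problem) for each suspicious cell."""
    problems = []
    for row, raw in enumerate(values, start=1):
        text = raw.strip()
        if not re.fullmatch(r"[0-9]+", text):
            # Catches a letter 'O' typed instead of zero, stray punctuation, etc.
            problems.append((row, raw, "non-digit characters"))
        elif not minimum <= int(text) <= maximum:
            problems.append((row, raw, "out of range"))
    return problems


# Invented sample of an "age" column.
ages = ["34", "4O", "7", "340"]
for problem in check_numeric_column(ages, minimum=0, maximum=120):
    print(problem)
```

Reporting row numbers rather than fixing values automatically leaves the correction to someone who can look at the original source.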
Unit Test: Derived Benefits • Maximum and Minimum Values • List of Distinct Values • Distribution of Values (Frequency)
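These by-products amount to a small profile of each column. A minimal sketch, assuming the column has already been read into a list; `column_profile` and the sample surnames are invented for illustration.

```python
from collections import Counter


def column_profile(values):
    """Summarize one column: extremes, distinct values and their frequencies."""
    counts = Counter(values)
    return {
        "min": min(values),
        "max": max(values),
        "distinct": sorted(counts),
        "frequency": counts.most_common(),  # most frequent first
    }


# Invented sample of a "surname" column.
surnames = ["Cohen", "Kohen", "Cohen", "Levi", "Cohen"]
profile = column_profile(surnames)
print(profile["distinct"])
print(profile["frequency"])
```

A value that occurs once next to a near-identical value that occurs many times, as with "Kohen" and "Cohen" here, is a typical candidate for a variant-spelling review.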
Unit Test: How to Proceed? • Sort • Auto-filter • Advanced filter • Pivot table
Integration Test • Redundancy in the source document: check that the various correlated values do not contradict each other • No redundancy in the source document: find recurring patterns and implicit rules inherent to the nature of the document, then verify that these patterns are respected
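When a source carries redundant information, the correlation check can be a simple arithmetic rule across fields. The sketch below assumes a hypothetical census extract in which each row records both a birth year and an age; the field names, the sample rows and the one-year tolerance are illustrative and not taken from any of the projects above.

```python
CENSUS_YEAR = 1875  # year the hypothetical records were taken


def inconsistent_rows(rows, tolerance=1):
    """Return rows whose birth year and age contradict the census year."""
    return [
        row for row in rows
        if abs(CENSUS_YEAR - row["birth_year"] - row["age"]) > tolerance
    ]


# Invented sample rows.
rows = [
    {"name": "A", "birth_year": 1840, "age": 35},  # 1840 + 35 = 1875, consistent
    {"name": "B", "birth_year": 1850, "age": 35},  # off by ten years
]

for row in inconsistent_rows(rows):
    print(row)
```

The tolerance allows for ages recorded before or after a birthday; a larger gap suggests a transcription error in one of the two fields.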