The Rise of Cultural Informatics • Gregory Crane • Professor of Classics • Winnick Family Chair of Technology and Entrepreneurship • Perseus Project • Tufts University
Perseus Project • DL development 1987- • Ancient Greco-Roman Culture • DLI-2: “A Digital Library for the Hum.” • Up through early 20th century • Calculatedly disparate collections • Production: www.perseus.tufts.edu • 9 million pages/month, 85% Greco-Roman • Research: what characterizes cultural DLs? • Audience / Services / Content Model Triad • Cross-over: e.g. NSDL work
Cultural Informatics • Why not “Computational Humanities,” “Humanities Computing,” “Computing and the Humanities”? • Too confining • Textual and fine arts • Associations with canonical culture, esp. western • Cultural Informatics -- very broad • Challenging but important perspective
Cultural Informatics • Object of Study: • Geo-spatially open: all cultures of the world • Temporally open: cultures as evolving process • Past, present and future • Goals: • Analysis of cultures • Communication between cultures • Fundamental to world peace and prosperity
Does culture matter? • Jerusalem • Kosovo • Baghdad • Mecca • Congo • Rwanda
Applications • Visualization: • Tracking anger against the US • (terrorism/national security) • Identifying cultural trends • (Marketing/trade) • Broad educational • Acquiring information, individual and comparative
Culture Matters! • F-Measures for Place Name Identification • Includes semantic classification and identification (Which Springfield?) • Greco-Roman Sources: 95% • European Sources: 90% • US Sources: 80%!!!
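The slide reports F-measures without defining them. As a minimal sketch, assuming the standard balanced F-measure (the harmonic mean of precision and recall) is what is meant, the score can be computed from counts of correct, spurious, and missed place-name identifications. The function name `f_measure` and the counts below are hypothetical illustrations, not figures from the talk.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """Return the F-beta score from raw counts (beta=1 gives the balanced F-measure)."""
    if true_pos == 0:
        # No correct identifications: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of system output that is correct
    recall = true_pos / (true_pos + false_neg)     # share of true place names that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 80 place names resolved correctly,
# 20 spurious or wrongly resolved, 20 missed.
score = f_measure(true_pos=80, false_pos=20, false_neg=20)
print(f"F-measure: {score:.2f}")
```

Because the slide counts a name as correct only when it is both classified and resolved to the right place, a "Springfield" tagged as a place but assigned to the wrong state would be scored as an error under this reading.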
Applications • Customized knowledge support • What info do readers A vs. B need? • Backgrounds, purposes etc. • Documents: what am I reading? • Objects: what is this thing? • Spaces: where am I moving? • Audiences • Tourists and visitors • Peace-keepers and ground forces • Business
What am I looking at? • Cambridge Civil War Monument (1870) • Linking to other data • City Directories • Regimental Histories • Period Maps • Old Photographs
Library Becomes Infrastructure • Moving through a neighborhood • When were these houses built? What is their style? Who lived here? • Moving through an ecosystem • What are the plants/animals? • What systems are in play? • Answers to every quantifiable question delivered in real time on the spot
Reading in a Democratic Society • Continuation of reading revolution • 1760-1830, before and after • Now requires a cultural informatics • Includes but transcends textual materials • What is the point of health and prosperity? • Emerson’s American Scholar in the 21st century
System Input • Quantitative data -- easiest • States self-organize into databases (“Seeing like a state”) • Linguistic data -- hard • Minimally dozens, if not hundreds • Varying level of documentation • Cultural data -- hardest • Language/Culture clusters: thousands+
Cultural Informatics begins • at the limits AND intersection of • manual analytic techniques • generic computational techniques • Cross-trained experts • Serve as connectors between specialists • Have intuitive understanding of not-yet-articulated possibilities from BOTH sides
Cultural Informatics • Aggregation and Visualization • Extraction from many examples • Quantified, targeted generalizations • Focus and customization • Start from a document/object/scene • Customized decision support • Yes/No decisions (~search) • Discursive analysis (~browsing)
How do we do it now? Or do we? • Players -- no real specialists • Faculty in higher education • Librarians • Think tanks • Intelligence Community • Broadcast media • Journalists and professional authors
How do we do it now? Or do we? • Computing and the Humanities • Focus on semi-passive analysis • Emphasis on publication • Social science & empirical data • How well do we work with heterogeneous data? • How well do we work with multiple languages? • Computer and Information Science • How far have we gone in document understanding? • Do we distinguish encyclopedic/semantic data?
How do we do it now? Or do we? • Cultural Grant Agencies: IMLS, NEA, NEH • Governmental libraries: LOC to public libs • Governmental museum/sites: SI, NPS • Intelligence agencies: CIA, NSA, etc. • NSF: SBE, experiments with DLI, ITR
What do we need to do? • Provide new kinds of training • Cultural Informatics • As self-standing discipline? • As new specialty in History/Anthropology/Classics etc. • As new specialty within Computer Science • As logical extension of Library and Information Science
What do we need to do? • Core cultural informatics experts • 50? Able to coordinate many different efforts • History/Information Science • Domain Specific experts • 100s/1000s of experts in Area Studies/Lang Tech etc. • Build up to 100? Grad students/postdocs • Research support: $50m/year?
What do we need to do? • Create technological infrastructure • Broaden/expand the evaluation forums • More TREC/ACE/DUC/CLEF/SENSEVAL etc. • Build knowledge resources • Parallel corpora, lexica, portable heuristics • Focus on broad semantic as well as encyclopedic analysis • Homo ignavus (lat.) ~ “bad man” but …
What do we need to do? • First cut: 100 languages in five years • Allow $1,000,000/language: $100m • US Knowledge Sources --> 1922 (Pub Dom) • City directories, Census • Newspapers & Periodicals • Encyclopedias, school texts, manuals • Maps, gazetteers • Allow avg $1,000,000/year @ 300 years: $300m
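The two budget figures on this slide follow from simple multiplication. The sketch below only reproduces that arithmetic. It assumes that "@ 300 years" means roughly 300 years of US source material up to 1922, costed at an average of $1,000,000 per year of coverage; that reading, and all variable names, are assumptions rather than statements from the talk.

```python
# Figures taken from the slide.
LANGUAGES = 100
COST_PER_LANGUAGE = 1_000_000       # dollars per language

# Assumed reading of "@ 300 years": years of US source material covered.
YEARS_OF_US_SOURCES = 300
COST_PER_SOURCE_YEAR = 1_000_000    # average dollars per year of coverage

language_total = LANGUAGES * COST_PER_LANGUAGE
us_sources_total = YEARS_OF_US_SOURCES * COST_PER_SOURCE_YEAR

print(f"Languages:  ${language_total / 1_000_000:.0f}m")
print(f"US sources: ${us_sources_total / 1_000_000:.0f}m")
```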
What do we need to do? • World peace and prosperity are the goal • What US agencies do what? • IMLS, NEH, NEA, LOC, SI, NPS all have roles • But much work must be situated in NSF • Cultural informatics includes scientific and engineering research • NSF should, at the least, incubate these aspects of cultural informatics
How do we know we are there? • Can dynamically plot cultural states across the globe from dozens of language/culture combinations • Can support reading/spatial exploration/object analysis customized for many different categories of user