Semantically-Rich Tools for Text Exploration
Andrew Ashton
Center for Digital Scholarship, Brown University
Center for Digital Scholarship
• The Brown University Women Writers Project (WWP) is a long-term research project devoted to early modern women's writing and electronic text encoding. WWP supports research on women's writing, text encoding, and the role of electronic texts in teaching and scholarship.
• The Brown University Scholarly Technology Group (STG) provides advanced technology consulting to Brown humanities faculty, departments, libraries, and research centers. We explore the critical new technologies that are transforming scholarly work and helping to maintain its longevity: data and metadata standards, XML publication tools, text encoding methods, database design, and accessibility standards.
Text Encoding Initiative (TEI) at Brown
• 300+ WWP texts in TEI P4
• Inscriptions, epigraphy, and other texts using TEI variants
• Encoding focuses on semantic and contextual data (e.g. genre, text structure, personal and place names)
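As a rough illustration of what this kind of semantic encoding makes possible, the sketch below reads a small TEI-like fragment and groups personal and place names by the genre of the enclosing division. The fragment, the `type` values, and the function name `names_by_genre` are assumptions invented for this example; they are not taken from the WWP encoding itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment. Element and attribute choices are
# illustrative only and do not reproduce the actual WWP markup.
SAMPLE = """
<text>
  <body>
    <div type="poem">
      <head>To <persName type="historical">Queen Elizabeth</persName></head>
      <lg>
        <l>From <placeName>London</placeName> I write of
           <persName type="mythological">Diana</persName>.</l>
      </lg>
    </div>
    <div type="letter">
      <p>Dear <persName type="historical">Lady Mary</persName>, I am
         lately come from <placeName>Bath</placeName>.</p>
    </div>
  </body>
</text>
"""


def names_by_genre(xml_text):
    """Map each division's genre to the names encoded inside it."""
    root = ET.fromstring(xml_text)
    result = {}
    for div in root.iter("div"):
        genre = div.get("type", "unknown")
        entry = result.setdefault(genre, {"persons": [], "places": []})
        # Personal names keep their encoded type (historical, mythological, ...).
        for el in div.iter("persName"):
            entry["persons"].append(("".join(el.itertext()).strip(), el.get("type")))
        for el in div.iter("placeName"):
            entry["places"].append("".join(el.itertext()).strip())
    return result


index = names_by_genre(SAMPLE)
for genre, found in index.items():
    print(genre, found)
```

Because the names are marked up explicitly, no name recognition is needed here; the code only has to follow the encoding.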
Examples
• Extract and separate a collection of texts by genre, then retrieve genre-specific structures within the text (e.g. poems, dramatic speeches, letters, recipes)
• Distill from the selected texts or text pieces the personal names, and separate these by type (references to historical figures, mythological figures, biblical figures; place names; etc.)
• Sort the subset of data chronologically.
• Pass the data through a component that tokenizes and adds morphosyntactic information to each word.
• Generate a visualization for each genre that describes changes in the association of certain adjectives with personal names, differentiated by gender.
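The example steps above can be sketched as a chain of small functions, each standing in for one component. This is a minimal sketch under stated assumptions, not SEASR code: the records, the function names, and the tiny `ADJECTIVES` lexicon (a stand-in for a real morphosyntactic tagger) are all hypothetical, and the final step returns counts that a visualization could consume rather than drawing anything.

```python
import re
from collections import Counter, defaultdict

# Hypothetical records, invented for illustration. Each carries a genre,
# a year, some text, and the gender encoded for each personal name.
RECORDS = [
    {"genre": "poem", "year": 1650,
     "text": "the virtuous Mary and the bold Henry",
     "names": {"Mary": "f", "Henry": "m"}},
    {"genre": "letter", "year": 1620,
     "text": "my dear Anne met the proud Thomas",
     "names": {"Anne": "f", "Thomas": "m"}},
    {"genre": "poem", "year": 1610,
     "text": "fair Anne praised the wise Mary",
     "names": {"Anne": "f", "Mary": "f"}},
]

# Toy lexicon standing in for a real part-of-speech tagger (assumption).
ADJECTIVES = {"virtuous", "bold", "dear", "proud", "fair", "wise"}


def select_genre(records, genre):
    """Keep only the records of one genre."""
    return [r for r in records if r["genre"] == genre]


def sort_chronologically(records):
    """Order records by year, earliest first."""
    return sorted(records, key=lambda r: r["year"])


def tokenize_and_tag(record):
    """Split text into word tokens and attach a coarse tag to each."""
    tagged = []
    for tok in re.findall(r"[A-Za-z]+", record["text"]):
        if tok in record["names"]:
            tag = "NAME"
        elif tok.lower() in ADJECTIVES:
            tag = "ADJ"
        else:
            tag = "OTHER"
        tagged.append((tok, tag))
    return tagged


def adjective_associations(records):
    """Count adjectives directly preceding a name, keyed by (year, gender)."""
    counts = defaultdict(Counter)
    for r in records:
        tagged = tokenize_and_tag(r)
        for (prev, prev_tag), (tok, tag) in zip(tagged, tagged[1:]):
            if prev_tag == "ADJ" and tag == "NAME":
                counts[(r["year"], r["names"][tok])][prev.lower()] += 1
    return counts


def run_flow(records, genre):
    """Chain the components: genre filter, sort, tag, count."""
    selected = sort_chronologically(select_genre(records, genre))
    return adjective_associations(selected)


poem_counts = run_flow(RECORDS, "poem")
for key, adjectives in poem_counts.items():
    print(key, dict(adjectives))
```

Keeping each step as its own function mirrors the idea of independent components that can be rearranged: swapping the genre filter or replacing the toy tagger would not require changes to the other steps.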
Project activities
• Identify an initial set of structural and semantic textual features that have particular significance for literary studies, and examine the ways in which these must be manipulated and processed to support analysis.
• Develop a test set of about a dozen SEASR components that operate on these features; these will be contributed to the SEASR repository for common use.
• Develop a set of SEASR “flows” using these components in combination with other SEASR modules, to produce analytical outcomes that address specific scholarly questions, using the WWP collection as a testbed.
• Distribute these new SEASR resources via an open repository, so that they can be used by other SEASR projects and users.

Thanks!
andyashton@brown.edu