The Diachronic Electronic Corpus of Tyneside English

1. What we have: The Newcastle Electronic Corpus of Tyneside English (NECTE: http://research.ncl.ac.uk/necte/)
NECTE is an AHRC-funded corpus of dialect speech from Tyneside. It is based on two existing corpora, one from the 1960s and the other from 1994. It amalgamates these legacy corpora into a single TEI-conformant, XML-encoded corpus and makes them available in a variety of formats: digitized audio, standard orthographic transcription, phonetic transcription, and part-of-speech-tagged text, all time-aligned.

2. What we’re developing: The Diachronic Electronic Corpus of Tyneside English (DECTE: http://research.ncl.ac.uk/decte/)
DECTE, also AHRC-funded, updates and extends NECTE. It will incorporate about 100 additional interviews recorded from 2007 to the present. It also incorporates thematic mark-up and associated graphical material. The aim is to make DECTE usable by the general public, the cultural industries, and all levels of the education sector from primary to higher.
3. What we would like
• Not to have wasted our time, that is, to have NECTE / DECTE used and to remain usable for the foreseeable future.
• For the language corpus community to converge on a set of formatting, archiving, and access standards, that is, to reverse the babble of Babel.
• To develop a wider range of analytical tools usable with corpora that adhere to these standards.