RESEARCH DESIGN & CORPUS COMPILATION
Corpus design is an intrinsic and fundamental part of the analysis. • It is guided by the research question (RQ) and affects the results. • Design criteria are interpretative and must be explicit (why you chose the texts you did, how and why you organised them in the way you did) • Different purposes = different corpora.
Corpus design • What? • Which? • When? • Where? • How? • Why?
What • Choosing discourse type(s) • there are epistemological considerations • and there are practical considerations • General vs. topical • epistemological considerations • practical considerations
Which • Choosing variables • You need to keep at least one variable constant or your corpora are not really comparable • e.g. same time period, different newspapers • same kind of newspaper, different time period
Comparison • You are comparing and looking for patterns • One occurrence of anything is not enough • A pattern is • a) a figure that emerges from a homogeneous background by means of differentiation and • b) the accumulation of similar things • c) recurring regularities of form
Comparative analysis • Looking at: • DIFFERENCE • SIMILARITY • across corpora • within corpora
Parameters across corpora • mode (written vs. spoken) • discourse type (e.g. factual vs. fiction) • time (diachronic studies) • variety (e.g. British English vs. American English) • geography (e.g. national vs. local newspapers) • political tendency (e.g. Democrats' vs. Republicans' speeches) • individual (e.g. George Eliot vs. Thomas Hardy) • ...
Parameters within corpora • sub-corpora • (e.g. headlines vs. articles; news vs. comment) • items • (e.g. moral vs. ethic; boy vs. girl; immigrant vs. asylum seeker vs. refugee ...)
General corpus • Integral output of a source-unit • (e.g. a whole edition of a newspaper) • A corpus of works by one author (not a single text)
Topic-based corpus • Search-term(s) based collection • You gather texts by searching a database for all the texts containing the search term(s) • Identify the list of search items so that coverage of the topic is as complete as possible.
Time-based • Historical linguistics • diachronic change/stability of language • modern diachronic analysis • See the MD-CADS edition of Corpora for examples (Partington 2010)
Research questions • All the choices we make in the corpus design and data collection phase • e.g. what to collect, how to collect it, from which platform, in which format, etc. • all depend on the RQ!
Practical considerations • availability • access • collection • speed • storage • format
The research question • All the choices we make in the corpus design and data collection phase • e.g. what to collect, how to collect it, from which platform, in which format, etc. • depend on the RQ!
RQ example 1 • 1. How are Muslims represented in the British press? • What are the appropriate search terms? • muslim*, moslem*, islam* ...? • Consider synonyms and near-synonyms, alternative spellings etc.
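A minimal sketch of how the wildcard search terms on this slide could be applied to texts already downloaded. It assumes that a trailing * means "any word ending", as in many newspaper databases; the function names are hypothetical and the term list is just the three candidates from the slide, not a complete query.

```python
import re

def wildcard_to_regex(term: str) -> str:
    """Turn a search term with an optional trailing * into a whole-word regex."""
    stem = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else ""
    return rf"\b{stem}{suffix}\b"

def build_query(terms):
    """Combine all search terms into one case-insensitive pattern."""
    return re.compile("|".join(wildcard_to_regex(t) for t in terms), re.IGNORECASE)

# The candidate terms from the slide; extend with synonyms and spellings as needed.
SEARCH_TERMS = ["muslim*", "moslem*", "islam*"]
QUERY = build_query(SEARCH_TERMS)

def matches(text: str) -> bool:
    """True if the text contains at least one search term."""
    return QUERY.search(text) is not None

def hits(text: str):
    """List every matched word form, lower-cased, in order of appearance."""
    return [m.group(0).lower() for m in QUERY.finditer(text)]
```

Listing the matched word forms with `hits` is a quick way to see which spellings and derivations a term actually catches before deciding whether the list needs more items.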
RQ example 2 • 2. How is religion represented in the British press? • How many terms do I need to add? • How many terms can I add?
RQ example 3 • 3. How much attention does the British press give to religion? • A search-term based corpus will not tell you. • How will you find out? • How will you delimit the work? (by limiting and defining the RQ a bit more, e.g. by defining a time period or the type of newspapers under consideration)
Storage • FOLDERS • folders and file names (a repository of information, a sort of level 0 of mark-up) • FILES become our definition of what a text is • unit of analysis
Best practice • Distribute information between FOLDER and FILE according to the structure of your corpus (and to your RQ) • Avoid having more than 2 or 3 levels of folders • Keep names short but dense with information
Example 1: Do newspapers use the same language at a distance of 20 years? • Which of the British broadsheets has changed the most?
Storage for example 1 • CORPUS • year 1 • Newspaper 1 (N1): y1_n1_f1, y1_n1_f2, y1_n1_f3, y1_n1_f4 ... • N2 • N3 • year 2 • N1 • N2 • N3
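The layout above can be sketched in code: two folder levels (year, newspaper) and a file name that repeats both variables plus a running number. The folder names `year1` and `N1`, the `.txt` extension and the function names are assumptions made for illustration only.

```python
from pathlib import Path

def text_path(root: str, year: int, newspaper: int, index: int) -> Path:
    """Build CORPUS/year<y>/N<n>/y<y>_n<n>_f<i>.txt for one text."""
    name = f"y{year}_n{newspaper}_f{index}.txt"  # .txt is an assumed format
    return Path(root) / f"year{year}" / f"N{newspaper}" / name

def parse_name(path) -> dict:
    """Recover the variables from a file name such as y1_n2_f3.txt."""
    year, newspaper, index = Path(path).stem.split("_")
    return {
        "year": int(year[1:]),
        "newspaper": int(newspaper[1:]),
        "file": int(index[1:]),
    }
```

Because the file name carries the same information as the folders, a text can still be identified if it is copied out of its folder, which is the sense in which names act as a level 0 of mark-up.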
Example 2: • How are science and religion represented in political discourse?
Solution 1 • Science corpus • Religion corpus • Solution 2 • Democrat corpus • Republican corpus
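The two solutions organise the same texts along different variables. A small sketch, using invented placeholder records (the file names and labels below are hypothetical, not real data), shows that the choice is only a matter of which variable defines the sub-corpora:

```python
from collections import defaultdict

# Hypothetical records: (file name, party, topic)
TEXTS = [
    ("speech_01.txt", "democrat", "science"),
    ("speech_02.txt", "democrat", "religion"),
    ("speech_03.txt", "republican", "science"),
    ("speech_04.txt", "republican", "religion"),
]

def partition(texts, key):
    """Group file names by one variable: 'party' or 'topic'."""
    position = {"party": 1, "topic": 2}[key]
    groups = defaultdict(list)
    for record in texts:
        groups[record[position]].append(record[0])
    return dict(groups)

solution_1 = partition(TEXTS, "topic")  # science corpus vs. religion corpus
solution_2 = partition(TEXTS, "party")  # Democrat corpus vs. Republican corpus
```

Which partition to store as folders depends on the comparison the RQ puts first; the other variable can stay in the file names.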
How much? • The bigger the better • BUT • the size also depends on the purpose! • I ask for a minimum of 100,000 words
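A rough way to check a corpus against the 100,000-word minimum. This sketch assumes plain-text files with a `.txt` extension stored under one root folder, and it counts whitespace-separated tokens, which is only an approximation of what a concordancer would report; the function names are hypothetical.

```python
from pathlib import Path

MIN_WORDS = 100_000  # the minimum asked for on the slide

def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an approximate word count)."""
    return len(text.split())

def corpus_size(root: str) -> int:
    """Sum the word counts of every .txt file below the root folder."""
    return sum(
        count_words(path.read_text(encoding="utf-8"))
        for path in Path(root).rglob("*.txt")
    )

def meets_minimum(n_words: int, minimum: int = MIN_WORDS) -> bool:
    """True if the corpus reaches the required size."""
    return n_words >= minimum
```

Calling `meets_minimum(corpus_size("CORPUS"))` on the corpus root gives a quick yes or no; running `corpus_size` on each sub-folder shows whether the sub-corpora are of comparable size.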
The transformation of texts into textual resources is a process of interpretation, and compilers therefore have the responsibility typically associated with an editor. • The questions we ask (and those we do not ask) affect the answers we can get. It is important to keep track of our expectations and choices and the reasons behind them.
Epistemological reflexivity requires us to engage with questions such as: • How has the research question defined and limited what can be ‘found’? • How has the design of the study and the method of analysis ‘constructed’ the data and the findings? • How could the research question have been investigated differently? • To what extent would this have given rise to a different understanding of the phenomenon under investigation?
Reflexivity is an unavoidable aspect of research: • ‘Thus, epistemological reflexivity encourages us to reflect upon the assumptions (about the world, about knowledge) that we have made in the course of the research, and it helps us to think about the implications of such assumptions for the research and its findings’ (Nightingale and Cromby, 1999: 228).
Principles of accountability • Replicability • These principles are important in research and you need to learn to ask yourself how your research follows the principles
Project work • You have three days to complete your research question and compile your corpus • (Tuesday 5, Wednesday 6, Monday 11 November) • I will be available in my office or the lab for advice and help • On Tuesday 12 and Wednesday 13 you will present your work in progress to the group • Be prepared not just to present but also to give feedback
Exam • The exam includes: • the first draft, consisting of the abstract and corpus description presented to the group • a final draft, consisting of the abstract and a copy of your corpus and its description • a presentation on the day of the exam. • Don’t forget you need to have proof of B2 competence in English to be able to register for the exam.