
Using corpora in translation studies


gray-maddox
Presentation Transcript


  1. Using corpora in translation studies

  2. What is a corpus?* A corpus is defined in terms of • form • purpose The word corpus is used to describe a collection of examples of language collected for linguistic study. It can also describe collections of texts stored and accessed electronically (Hunston 2002). Corpus planning and design serve some linguistic purpose. It is on this basis that texts are selected and stored, so that they can be studied quantitatively and qualitatively. *Ref. text: Hunston, S. Corpora in Applied Linguistics, 2002

  3. What are corpora used for? • Corpora are often used for language teaching and learning. They give information about how a language works. • They also help calculate the relative frequency of different features. • Exploring corpora can help students to observe nuances of usage and to make comparisons between languages. • Corpora are also used to investigate cultural attitudes expressed through language. • NB: a corpus will not give information about whether something is possible or not, only whether it is frequent or not!

  4. Using corpora in translation • Corpora are also used in translation. • Comparable corpora make it possible to compare the use of apparent equivalents. • Parallel corpora show how words and phrases have been translated in the past. • General corpora can be used to establish norms of frequency and usage.

  5. What can a corpus do? • Corpus access software is used to rearrange the information which has been stored so that observations of various kinds can be made. • It is not the corpus which gives new information about language. It is the software which gives new perspectives on what is already familiar. • Software packages process data showing: • frequency • phraseology • collocation.

  6. Frequency • Corpus processing allows words to be compared by means of frequency lists. • Quite obviously, grammar words are more frequent than lexical words. That explains why they are found at the top of the list. • Frequency lists can be useful for identifying differences between corpora. But comparisons can be made only if the corpora are comparable, i.e. if their length is approximately the same.
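
As a rough illustration of the frequency lists described in this slide, the following Python sketch counts word tokens in a short text and reports both raw counts and counts per 1,000 tokens. The tokenizing rule, the function names `tokenize` and `frequency_list`, and the sample sentence are assumptions made for this example; they are not taken from the presentation or from any particular corpus package.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(text):
    """Return (word, raw count, count per 1,000 tokens), most frequent first."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return [(word, count, 1000 * count / total)
            for word, count in counts.most_common()]

# Invented sample text, used only to show the shape of the output.
sample = ("The translator checked the corpus and the corpus showed "
          "the usual pattern of the phrase in the target language.")

for word, count, per_thousand in frequency_list(sample)[:3]:
    print(f"{word:10} {count:3d} {per_thousand:8.1f}")
```

Even in this tiny sample the grammar word "the" heads the list, which matches the point made in the slide. The per-1,000 figure is one common way of expressing relative frequency; the slide's caution about comparing only corpora of similar length still applies.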

  7. Concordance • The most frequent way to access a corpus is through a concordancing program. • Concordance lines bring together instances of use of words or phrases, so that regularities in use can be observed. • Concordances also help to understand how nouns or adjectives are used.
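
A concordancing program of the kind mentioned in this slide can be approximated in a few lines of Python. The sketch below prints keyword-in-context lines with the search word aligned in the middle. The function name `kwic`, the `width` parameter and the sample text are assumptions for illustration only, and real concordancers offer far more (sorting, wildcards, part-of-speech filters).

```python
import re

def kwic(text, node, width=30):
    """Return keyword-in-context lines for every whole-word occurrence of `node`."""
    pattern = r"\b" + re.escape(node) + r"\b"
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters of context on each side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines

# Invented sample text for demonstration.
sample = ("A corpus is a collection of texts. The design of a corpus "
          "depends on its purpose. Corpora are used in translation.")

for line in kwic(sample, "corpus"):
    print(line)
```

Aligning the hits in a column is what lets the eye pick out regularities to the left and right of the search word.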

  8. Collocation • Collocation is the tendency of words to co-occur. • The collocates of a given word are those words which often occur in conjunction with it. • Collocation can indicate pairs of lexical items, or the association between a lexical word and its frequent grammatical environment. In the latter case, the term used is colligation.
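
One simple way to look for collocates, sketched here under stated assumptions, is to count the words that fall within a fixed span on either side of a chosen node word. The function name `collocates`, the `span` parameter and the sample sentences are invented for this example. The sketch uses raw co-occurrence counts only; corpus software usually adds a statistical association measure, which is not shown here.

```python
import re
from collections import Counter

def collocates(text, node, span=2):
    """Count the words occurring within `span` tokens to the left or right of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            # Tokens before and after the node, excluding the node itself.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented sample text for demonstration.
sample = ("She made a strong case. He drank strong tea. "
          "They serve strong tea here. A strong wind blew.")

for word, count in collocates(sample, "strong", span=1).most_common(3):
    print(word, count)
```

Note that the window ignores sentence boundaries, and that grammar words such as "a" will score highly on raw counts alongside lexical collocates such as "tea".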

  9. Types of corpora • A corpus is designed for a particular purpose. Consequently, the type of corpus depends on its purpose: • Specialized corpus • General corpus • Comparable corpora • Parallel corpora • Learner corpus • Historical or diachronic corpus • Monitor corpus

  10. Specialized corpus: a corpus of texts of a particular type (editorials, academic articles, lectures, essays, etc.). Specialized corpora reflect the type of language a researcher wants to explore. You may also restrict the corpus to a time frame, a social setting, or a given topic. • General corpus: a corpus of texts of many types, of written or spoken language, or of both. A general corpus is usually much larger than a specialized corpus. Since it can be used to produce reference materials, it is sometimes called a reference corpus.

  11. Comparable corpora: two or more corpora in different languages, or in different varieties of a language. They are designed to contain the same proportion of texts (e.g. newspaper texts, essays, novels, conversations, etc.). They can be used by translators and learners to identify differences and equivalences in each language. • Parallel corpora: two or more corpora in different languages, containing translated texts, or texts produced simultaneously in two or more languages (e.g. EU texts). They can be used by translators and learners to find potential equivalents in each language, and to investigate differences between languages.
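
To make the use of a parallel corpus concrete, the sketch below stores a few sentence pairs and retrieves every pair whose source sentence contains a given word, so that earlier translations can be inspected side by side. The English and Italian sentences, the pair-of-strings layout and the function name `find_translations` are all invented for illustration; a real parallel corpus would be sentence-aligned by dedicated tools.

```python
import re

# Toy sentence-aligned data (source, target), invented for this example.
pairs = [
    ("The committee approved the proposal.", "Il comitato ha approvato la proposta."),
    ("The proposal was rejected.", "La proposta è stata respinta."),
    ("The committee met on Monday.", "Il comitato si è riunito lunedì."),
]

def find_translations(pairs, term):
    """Return the aligned pairs whose source sentence contains `term` as a whole word."""
    term = term.lower()
    return [(source, target) for source, target in pairs
            if term in re.findall(r"\w+", source.lower())]

for source, target in find_translations(pairs, "proposal"):
    print(source, "|", target)
```

The lookup only surfaces candidate sentences. Deciding which target word is the equivalent is still left to the translator reading the pairs.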

  12. Learner corpus: a collection of texts produced by learners of a language. It is used to identify differences among learners, the frequency and type of mistakes, etc. • Historical or diachronic corpus: a corpus of texts from different periods of time. It helps to trace the development of a language over time. • Monitor corpus: a corpus used to track current changes in a language. It rapidly increases in size, since it is added to annually, monthly, daily, etc. The proportion of text types has to remain constant, so that each year is comparable with every other.

  13. Key terms • Type • Token • Hapax • Lemma • Word-form • Tag • Parse • Annotate
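
The first three key terms can be illustrated with a short count. In the usual sense of these terms, a token is each running word in a text, a type is each distinct word form, and a hapax is a type that occurs only once. The sketch below applies those definitions to an invented sentence; the function name `describe` and the tokenizing rule are assumptions for this example, and the slide itself only lists the terms.

```python
import re
from collections import Counter

def describe(text):
    """Return token count, type count and the sorted list of hapaxes for `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),   # every running word
        "types": len(counts),    # every distinct word form
        "hapaxes": sorted(w for w, c in counts.items() if c == 1),
    }

# Invented sample sentence for demonstration.
summary = describe("The cat sat on the mat and the dog sat too")
print(summary)
```

Lemmas, tags and parses need linguistic annotation on top of simple counting, so they are left out of this sketch.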
