
Corpus linguistics for translators



  1. Corpus linguistics for translators • Amanda Saksida, University of Nova Gorica

  2. ... He cast a sideways look at Harry under his bushy eyebrows. "Be grateful if yeh didn't mention that ter anyone at Hogwarts," he said. "I'm – er – not supposed ter do magic, strictly speakin'."

  3. ... He cast a sideways look at Harry under his bushy eyebrows. "Be grateful if yeh didn't mention that ter anyone at Hogwarts," he said. "I'm – er – not supposed ter do magic, strictly speakin'." Hedwig, Harry, Hogwarts, Hagrid, Quidditch ...

  4. ... He cast a sideways look at Harry under his bushy eyebrows. "Be grateful if yeh didn't mention that ter anyone at Hogwarts," he said. "I'm – er – not supposed ter do magic, strictly speakin'." Hedwig, Harry, Hogwarts, Hagrid, Quidditch ... wart hog = Phacochoerus aethiopicus

  6. Course outline • Introduction: what corpora are, history, typology, online corpora • Areas where corpora are used • Corpus-based translation studies: interesting examples • Tools for building and using corpora

  7. What is a corpus? • A corpus is a collection of pieces of language that are selected and ordered according to explicit linguistic criteria in order to be used as a sample of the language. • Computer corpus: a corpus which is encoded in a standardised and homogeneous way for open-ended retrieval tasks. Its constituent pieces of language are documented as to their origins and provenance. • (Guidelines of the Expert Advisory Group on Language Engineering Standards, 1996) • Big collections of modern texts • Electronic form • Representative of a language/dialect • Basis for descriptive studies (not prescriptive!)

  8. Brief history of corpus linguistics • 1964: Brown Corpus (1 M words) • John Sinclair and the COBUILD revolution => Bank of English (470 M) • British National Corpus (100 M) => other languages (Czech, Hungarian, Croatian, Slovak, …) • Web as corpus: with the digital revolution, more and more texts are available online => programs that build corpora from online texts (WebBootCat, http://www.sketchengine.co.uk/auth/wbc/mycorp.cgi)

  9. Types of corpora • Medium: written texts / spoken language • Size: reference corpora / specialized corpora • Time span: synchronic / diachronic corpora • Tagging: lemmatized / POS-tagged corpora • Language: monolingual or multilingual corpora: • parallel • comparable • translational

  10. Corpus usage • Lexicography • Descriptive grammars • Translation tools and studies • Foreign language learning • Sociolinguistic studies • Language technologies

  11. Keywords • Concordance • KWIC (Keyword in Context) • Type / Token • Tag / Lemma • Collocation
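
A minimal sketch of two of these keywords, the KWIC concordance and the type/token distinction, in plain Python. The tokenizer is deliberately crude, and the function names (`tokenize`, `kwic`, `type_token_ratio`) and the sample sentence are assumptions made for illustration; they are not taken from any of the tools named in these slides.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a very crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(tokens, keyword, window=3):
    """Return one (left context, keyword, right context) triple per occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

def type_token_ratio(tokens):
    """Types are distinct word forms; tokens are running words."""
    return len(set(tokens)) / len(tokens)

# Invented sample text, used only to exercise the functions.
sample = "He cast a sideways look at Harry. Harry said he was not supposed to do magic."
tokens = tokenize(sample)

# Print the concordance with the keyword aligned in the middle column.
for left, kw, right in kwic(tokens, "harry"):
    print(f"{left:>25} | {kw} | {right}")

print("tokens:", len(tokens), "types:", len(set(tokens)))
print("type/token ratio:", type_token_ratio(tokens))
```

Real concordancers work on lemmatized and tagged text, so a query for a lemma would also return its inflected forms; this sketch matches exact word forms only.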

  12. What can a corpus tell us? • Word frequency • How frequent is a word / word form (compared to other words)? • Lexical information • Which words frequently co-occur? • Which affixes can a word have? • Syntactic information • In which syntactic structures can a word occur? • Semantic information • What are the possible meanings of a word? • Pragmatic information • In which texts can we find a word? What stylistic information does a word or its context bear? Is the usage of a word stable, or is its frequency increasing or decreasing?
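
The first two questions, word frequency and co-occurrence, can be sketched with the standard library alone. This is an illustrative sketch, not how any particular corpus tool computes collocations: the helper names and the toy text are assumptions, and raw neighbour counts stand in for the statistical association measures that real tools normally use.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a very crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(tokens):
    """Count how often each word form occurs."""
    return Counter(tokens)

def cooccurrences(tokens, node, window=2):
    """Count the words found within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts

# Invented toy text, not real corpus data.
text = ("strong tea and strong coffee are popular but "
        "powerful computers need strong passwords and strong tea helps")
tokens = tokenize(text)

freq = word_frequencies(tokens)
neighbours = cooccurrences(tokens, "strong", window=1)

print(freq.most_common(3))
print(neighbours.most_common(3))
```

On a real corpus the raw counts would be dominated by function words such as "and", which is why collocation tools rank candidates by an association score instead of by frequency alone.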

  13. What can a corpus tell us? • Translation studies: • Parallel corpus studies can reveal characteristics of translated texts, such as tendencies towards explicitness and avoidance of repetition. • Comparison between the translation part of the corpus and a corpus of texts of the same genre, written in the target language for the translation corpus, reveals a tendency towards what we might call the Eliza Doolittle phenomenon: the translated texts, more than the texts in the control corpus, tend to contain those TL phrases, structures, and so on, which, from a comparative point of view, seem particularly characteristic of the TL. (Malmkjaer 1996)
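
A comparison of this kind usually starts from normalised frequencies, because the translated corpus and the control corpus rarely have the same size. The sketch below assumes a hypothetical helper `per_thousand` and two invented one-sentence "corpora"; the choice of the optional connective "that" as an explicitness marker is an illustrative assumption and is not taken from the study cited above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a very crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def per_thousand(tokens, word):
    """Relative frequency of `word`, normalised to occurrences per 1,000 tokens."""
    return 1000 * tokens.count(word) / len(tokens)

# Invented toy data standing in for a translated corpus and a control corpus
# of texts originally written in the target language.
translated = tokenize("she said that she knew that he had left")
control = tokenize("she said she knew he had left the house")

rate_translated = per_thousand(translated, "that")
rate_control = per_thousand(control, "that")

print(f"translated: {rate_translated:.1f} per 1,000 tokens")
print(f"control:    {rate_control:.1f} per 1,000 tokens")
```

With corpora of realistic size, a difference between the two rates would still need a significance test before it could be read as a property of translated language.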

  14. Some of the online corpora • British National Corpus: http://www.natcorp.ox.ac.uk/ and http://view.byu.edu • Bank of English: http://www.collins.co.uk/Corpus/CorpusSearch.aspx • CORIS: http://corpus.cilta.unibo.it:8080/DEMOCORISCorpQuery.html • FidaPLUS: www.fidaplus.net • Good link: http://devoted.to/corpora

  15. Tools for translating • Sentence alignment: • TRADOS WinAlign • ATRIL DejaVu • Vanilla Aligner (Unix/Linux) • Concordancers: • Wordsmith Tools (www.lexically.net) • Sketch Engine (http://www.sketchengine.co.uk) • MonoConc/ParaConc (www.athel.com) • aConCorde, good for Arabic (http://www.comp.leeds.ac.uk/andyr/software/aConCorde/) • CQP (ims.uni-stuttgart.de) • Manatee / Bonito (www.textforge.cz)

  16. Corpus linguistics in Turkey • Kemal Oflazer: http://www.andrew.cmu.edu/user/ko/ • Informatics Institute corpus: http://www.ii.metu.edu.tr/~corpus/
