Collocational properties of translated language Silvia Bernardini School for Translators and Interpreters University of Bologna at Forlì 30 July 07 silvia.bernardini@unibo.it
Overview
• Collocations
  • Brief overview
  • Frequency vs. Phraseology
  • A note on statistics
• Translation studies
  • Theory
  • Methodology
  • Collocation
  • Limits
• Current study
  • Aim
  • Method
  • Results
  • Discussion
  • Limits
  • Ways forward
What is a collocation? • “[] I would like to put forward the concept of collocation which I have introduced in my own work. This is the study of key-words, pivotal words, leading words, by presenting them in the company they usually keep – that is to say, an element of their meaning is indicated when their habitual word accompaniments are shown.” (Firth 1956:106-107) • E.g.: English: the English people, English literature, English reserve, English manners, English countryside, the English and all that can be said about them, the English public schools, English Universities (!) Collocations
Frequency-oriented views • “Significant” collocation is regular collocation between items, such that they occur more often than their respective frequencies and the length of the text in which they occur would predict (Jones and Sinclair 1974:19) • A collocation is a sequence of words that occurs more than once in identical form and is grammatically well-structured (Kjellmer 1987: 133) Collocations
Phraseology-oriented views • Restricted collocations are fully institutionalised phrases, memorized as wholes and used as conventional form-meaning pairings (Howarth 1996: 37) Collocations
Frequency vs. Phraseology
Frequency-oriented view:
• Sum of many occurrences in texts
• Position important
• Number of words involved important
• Syntactic relationship can be important
• Frequency/statistics important
Phraseology-oriented view:
• An abstract entity with instantiations in texts (PERFORM + TASK)
• Position/number of constituents not central
• Different restrictions distinguished (DOG+BARK not a collocation)
• Main criterion: semantic unpredictability
Collocations
2 ways of finding collocations • Starting from a (set of) keyword(s) and looking left and right • Gledhill (2000): phraseology surrounding “keywords” in different sections of cancer research articles • Selecting all sequences of N words that recur a certain number of times • Kjellmer (1994): All two-word sequences appearing more than two times in the Brown corpus Collocations
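The two strategies above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by Gledhill or Kjellmer: the function names, the window size `span` and the threshold `min_count` are assumptions chosen for the example (`min_count=3` mirrors "more than two times").

```python
from collections import Counter


def keyword_window(tokens, keyword, span=4):
    """Strategy 1: count the words found within `span` tokens
    to the left and right of each occurrence of `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts


def recurrent_ngrams(tokens, n=2, min_count=3):
    """Strategy 2: keep every contiguous n-word sequence that
    occurs at least `min_count` times in the token list."""
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    return {gram: c for gram, c in counts.items() if c >= min_count}


tokens = "a b a b a b c".split()
print(recurrent_ngrams(tokens))            # recurring bigrams
print(keyword_window(tokens, "c", span=2))  # neighbours of "c"
```

The first function is keyword-driven and ignores sequences that do not involve the chosen word; the second is exhaustive but sensitive to corpus size, since short texts rarely repeat a sequence often enough to pass the threshold.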
A note on statistics • Frequency (Danielsson 2001) • Statistics: pointwise Mutual Information (MI) • Compares the probability of observing x and y together (the joint probability) with the probabilities of observing x and y independently (chance). (Church and Hanks 1990: 77) • Formula:
$$\mathrm{MI}(x;y) = \log_2 \frac{p(x,y)}{p(x)\,p(y)} = \log_2 \frac{f(xy) \cdot N}{f(x) \cdot f(y)}$$
where f(x), f(y) and f(xy) are the corpus frequencies of x, of y and of the pair, and N is the corpus size.
• Limits of MI Collocations
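As a worked illustration of the MI score, the sketch below computes it from raw corpus counts. It assumes that probabilities are estimated as relative frequencies (count divided by corpus size N), which is the usual simplification; the function name and the toy counts are hypothetical.

```python
import math


def pointwise_mi(f_xy, f_x, f_y, n):
    """Pointwise Mutual Information (log base 2) from raw counts.

    f_xy: frequency of the pair, f_x / f_y: frequencies of the
    two words, n: corpus size. Probabilities are estimated as
    relative frequencies, so p(x,y) / (p(x) p(y)) reduces to
    f_xy * n / (f_x * f_y).
    """
    if min(f_xy, f_x, f_y, n) <= 0:
        raise ValueError("all counts must be positive")
    return math.log2((f_xy * n) / (f_x * f_y))


# Toy counts: the pair occurs 16 times more often than chance predicts.
print(pointwise_mi(f_xy=8, f_x=16, f_y=32, n=1024))
# A pair occurring exactly as often as chance predicts scores 0.
print(pointwise_mi(f_xy=4, f_x=64, f_y=64, n=1024))
```

Because the pair frequency sits in the numerator and the word frequencies in the denominator, pairs of rare words can reach high MI values on very few occurrences, which is one reason a frequency threshold is commonly applied alongside the score.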
Corpus-based TS • Theoretical background • Methodological background • Studies of collocation within TS • Limits
Theoretical background 1 Baker (1993: 243) The most important task that awaits the application of corpus techniques in translation studies […] is the elucidation of the nature of translated text as a mediated communicative event. Corpus-based Translation Studies
Theoretical background 2 Toury (1995) Translation as norm-governed behaviour: ‘translatorship’ amounts first and foremost to being able to play a social role, i.e. to fulfil a function allotted by a community […] in a way which is deemed appropriate in its own terms of reference (ibid.: 53) Corpus-based Translation Studies
Operationalising it • Studies should be carried out focusing on the nature of translational norms as compared to those governing non-translational kinds of text production (Toury 1995: 61). • Corpus research in TS should focus on the identification of universal features of translation, that is features which typically occur in translated text rather than original utterances and which are not the result of interference from specific linguistic systems. (Baker 1993:243). Corpus-based Translation Studies
“Universals” • Explicitation/explicitness • Simplification • Disambiguation • Levelling out (homogeneity) • Preference for conventional grammar • Avoidance of repetition • Exaggeration of features of the target language • Normalisation/sanitisation • Absence of TL-specific “unique items” • “Shining-through” Corpus-based Translation Studies
Methodological background • Monolingual comparable corpora • Originals in Language A and comparable translations into Language A • They should make visible “patterning which is specific to translated texts, irrespective of the source or target languages involved” (Baker 1995: 234). • Parallel corpora • Originals in Language A and their translations into Language B, usually combined with reference corpora Corpus-based Translation Studies
TS: research on collocation Olohan (2004): Collocation and moderation • Quite, rather, pretty and fairly in translated vs. original English fiction • Pretty and rather, and more marginally quite, “are used a lot less in [TEC-Fiction] but, when they are, there is usually more variation in usage than in [BNC-fiction] and less repetition of common collocates” Corpus-based Translation Studies
TS: research on collocation Øverås (1998): Collocation and explicitation • First 50 sentences of 40 novel extracts (English + Norwegian) • Additions enriching the text with a common target language collocation ST: Det var en blanding av vill dristighet og en frøkenaktig, fornem finhet i hans slekt. (a mixture) TT: There was a strange mixture of wild boldness and dignified gentility in the family. • A collocational clash in the ST is rendered with a conventional TL combination ST: the cook's fat son would play plump tunes on his accordion. TT: kokkens fete sønn spille trivelige melodier på trekkspillet sitt. (pleasant tunes) Corpus-based Translation Studies
TS: research on collocation Kenny (2001): Collocation and sanitisation • Three-way comparison: a parallel corpus (English/German) and reference corpora of SL/TL • Treatment of lexical creativity in translation • Starting points: collocation hapaxes and clusters that are repeated in the work of a single author but not attested in any other texts Augen ~ trinken ich trinke mit gierigen Augen (literally: I drink with greedy eyes) translated as: “my avid eyes drank in…” Corpus-based Translation Studies
TS: research on collocation Baroni and Bernardini (2003): Collocation in MCC • Monolingual comparable corpus of Italian original and translated articles from a single geopolitics journal. • All bigrams from the translated sub-corpus and from the original sub-corpus • Ranked according to their log-likelihood ratio value • “Translated language is repetitive, possibly more repetitive than original language. Yet the two differ in what they tend to repeat: translations show a tendency to repeat structural patterns and strongly topic-dependent sequences, whereas originals show a higher incidence of topic-independent sequences, i.e. the more usual lexicalised collocations in the language” Corpus-based Translation Studies
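Ranking bigrams by log-likelihood ratio can be sketched as follows. The slide does not say which formulation was used, so this assumes the common G² statistic over a 2×2 contingency table of observed versus expected frequencies; the function name and counts are illustrative only.

```python
import math


def log_likelihood(f_xy, f_x, f_y, n):
    """G2 log-likelihood ratio for a bigram (x, y).

    f_xy: bigram frequency, f_x: frequency of x as first word,
    f_y: frequency of y as second word, n: number of bigrams.
    """
    # Observed cells of the 2x2 table.
    observed = [
        f_xy,                    # x followed by y
        f_x - f_xy,              # x followed by something else
        f_y - f_xy,              # something else followed by y
        n - f_x - f_y + f_xy,    # neither
    ]
    # Row and column totals matching each cell.
    rows = [f_x, f_x, n - f_x, n - f_x]
    cols = [f_y, n - f_y, f_y, n - f_y]
    g2 = 0.0
    for o, r, c in zip(observed, rows, cols):
        expected = r * c / n
        if o > 0:  # an empty cell contributes nothing to the sum
            g2 += o * math.log(o / expected)
    return 2 * g2


candidates = {("a", "b"): (30, 64, 64), ("c", "d"): (4, 64, 64)}
ranked = sorted(candidates,
                key=lambda k: log_likelihood(*candidates[k], 1024),
                reverse=True)
print(ranked)
```

Unlike MI, this statistic grows with the amount of evidence, so it tends to rank frequent, well-attested pairs above rare ones.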
TS: research on collocation Danielsson (2001): Collocation: monolingual & translational • Units of meaning in two large corpora of English and Swedish • Words occurring ≥200 times • Collocates (≥5): plugs sockets (6 occurrences); headphone sockets (7 occurrences); sunken sockets (6 occurrences); bulging their sockets (5 occurrences) • Data-sparseness problem: only 2 units of meaning (of the 12,099 previously identified) occur five times or more in the ST component of the parallel fiction corpus (Swedish into English, ~400,000 words per component) Corpus-based Translation Studies
Limits • General limits of MCC • Variables • Tools and methods: too crude? • Excessive downplay of the source text • Over-generalisation of translation universals • Specific difficulty of collocational studies • Data-sparseness in relatively small corpora Corpus-based Translation Studies
Collocations: a new approach • Aim and method • Results (monolingual and parallel) • Discussion, limits, ways forward
Research questions • Are translated texts more/less collocational than original texts in the same language? • i.e., are their collocation types overall more/less frequently attested and significant? • If so, is this a consequence of the translation process? • i.e., can we identify shifts that could account for the observed overall differences? Aim and method
Intuition • The point is not to look for collocations that repeat themselves frequently within small and hardly comparable “translation-driven” corpora, but to identify those collocations that are frequent and/or significant in the language as a whole. Aim and method
2 sets of corpus resources • Study corpora • Small monolingual comparable corpora of fiction texts (English => Italian; Italian => English) • Reference corpora • The British National Corpus • (100 million words from a variety of sources) • The Repubblica Corpus • (340 million words from a single Italian newspaper) • The English and Italian Web via Google/Yahoo automatic API queries Aim and method
Study corpora (fiction)
Translations into Italian (author/translator, title):
• M. Atwood/C. Penati, Il racconto dell’ancella
• M. Atwood/M. Papi, Occhio di gatto
• M. Cruz Smith/P. F. Paolini, Gorky Park
• C. Fowler/S. Bini, Nozze di sangue
• N. Gordimer/F. Cavagnoli, Storia di mio figlio
• G. Greene/B. Oddera, Il decimo uomo
• D. Leavitt/A. Cossiga, Un luogo dove non sono mai stato
• R. Rendell/H. Brinis, Oltre il cancello
Italian originals (author, title):
• F. Camon, La malattia chiamata uomo
• G. Celati, I narratori delle pianure
• C. Comencini, Le pagine strappate
• L. Blissett, Q
• D. Maraini, Donna in guerra
• G. Pontiggia, Il giocatore invisibile
• G. Tomasi di Lampedusa, Il Gattopardo
Aim and method
Corpus preparation • Scanning in • Tokenisation • Tagging (part-of-speech) • Lemmatisation • treetagger • Metadata annotation • Alignment (easyalign) • Indexing (CorpusWorkBench, CWB) Aim and method
Extraction of candidates 1 • Target sequences • Lexical collocations • Made of two words • Contiguous • Pos-based extraction from study corpora • Based on literature, e.g. • JN, NN, VN, V * N, N * * N • Collection of frequencies from reference corpora Aim and method
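The POS-based extraction step can be sketched as a sliding window over a tagged, lemmatised text. This is a simplified illustration: it assumes one-letter coarse tags (J, N, V, ...) rather than the actual TreeTagger tagset, treats `*` as "any single token", and uses hypothetical names throughout.

```python
# Patterns from the slide; "*" matches any one intervening token.
PATTERNS = ["J N", "N N", "V N", "V * N", "N * * N"]


def extract_candidates(tagged, patterns=PATTERNS):
    """Return (pattern, first_lemma, last_lemma) for every match.

    tagged: list of (lemma, coarse_tag) pairs in text order.
    Only the two lexical end points are kept as the candidate pair.
    """
    found = []
    for pattern in patterns:
        slots = pattern.split()
        width = len(slots)
        for i in range(len(tagged) - width + 1):
            window = tagged[i:i + width]
            if all(slot == "*" or slot == tag
                   for slot, (_, tag) in zip(slots, window)):
                found.append((pattern, window[0][0], window[-1][0]))
    return found


sentence = [("play", "V"), ("plump", "J"), ("tune", "N"),
            ("on", "P"), ("his", "D"), ("accordion", "N")]
for candidate in extract_candidates(sentence):
    print(candidate)
```

Each candidate pair would then be looked up in the reference corpus to collect the frequencies needed for scoring; that lookup is outside the scope of this sketch.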
Extraction of candidates 2 • Calculate MI • UCS (Evert 2004-2006) • Rank sequences • Take top • Arbitrary cut-off point: MI>2 and fq2 • Calculate significance of difference between original and translated • Mann-Whitney significance tests Aim and method
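The final comparison step can be illustrated with a small, standard-library version of the Mann-Whitney U test. This is a sketch under stated assumptions, not the implementation used in the study: it uses the normal approximation for the two-sided p-value, without tie or continuity correction, so it is only indicative for small samples. The score lists are invented.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical MI scores for one POS pattern in each sub-corpus.
mi_original = [2.1, 2.4, 3.0, 3.2, 2.2, 2.8]
mi_translated = [3.5, 4.1, 3.9, 2.9, 4.4, 3.7]
print(mann_whitney_u(mi_original, mi_translated))
```

A rank-based test suits this setting because association scores such as MI are far from normally distributed, and only the relative ordering of the two sets of scores is compared.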
Results (MCC, Mann-Whitney) • J N lit eng (MI; higher in original, p=.08) • N V lit ita (MI; p=.008) • N V lit eng (FQ; p=.05) • V N lit ita (MI; p=.01) • J * J lit ita (MI; p=.06) • N prep/conj N lit ita (MI; p=.007) • N * N lit eng (FQ; p=.06) • N * * N lit ita (FQ; p=.07) Results
Results (MCC, quantitative) Results
Results (parallel, summary) Shifts leading to increased “collocativeness” Results
Creative => collocational (7) TT: Ricordo l’odore della terra smossa, il <senso di pienezza> che davano le forme tonde dei bulbi chiusi nella mano LIT: I remember the smell of the turned earth, the sense of fullness that gave the round shapes of the bulbs held in the hand ST: I can remember the smell of the turned earth, the plump shapes of bulbs held in the hands, fullness The handmaid’s tale TT: Il <rumore dei tacchi> risuonò sulle piastrelle del corridoio. LIT: the noise of the heels resounded on the tiles of the corridor ST: Her heels clicked on the hall tiles. Red bride Results
Different meaning (7) TT: Fa collezione di <cartine di sigarette> con disegni di aeroplani, e ne conosce tutti i nomi. ST: He collects cigarette cards with pictures of airplanes on them, and knows the names of all the planes. Cat’s eye
Free => collocational (11) ST: handpainted by Alex with purple garlic bulbs, she sees that A place I’ve never been TT: decorazioni di <spicchi d' aglio>, si rende conto che Web data Results
Explicitation (86) - general TT: All'apertura nel basso <muro di cinta> l'autista esitò, poi accelerò LIT: At the opening in the low perimeter wall the driver hesitated, then accelerated ST: He hesitated at the gap in the low wall, then accelerated and went ahead A place I’ve never been TT: schiacciato sotto il <tacco della scarpa>, seppellito LIT: ground away under the heel of the shoe, buried ST: ground away under my heel, buried My son’s story Results
Explicitation (86) - partitives TT: Non riuscivo a prendere sonno, così sono sceso a bere un <sorso d'acqua> LIT: I couldn’t sleep, so I came down to drink a gulp of water ST: I couldn't sleep, so I came down for water The tenth man TT: i <raggi del sole> filtrano dalla lunetta sulla porta LIT: the rays of the sun filter through the fanlight ST: Sun comes through the fanlight The handmaid’s tale Results
Explicitation (86) - head nouns TT: manifesti di Bon Jovi e dei Guns' n Roses attaccati con le <puntine da disegno> sul grande mare della parete ST: Bon Jovi and Guns' n Roses posters thumbtacked into the great sea wall A place I’ve never been TT: Osserviamo il <cerume delle orecchie>, il muco del naso e lo sporco tra le dita dei piedi ST: We look at ear-wax, or snot, or dirt from our toes Cat’s eye Results
More formal/more exact (16) TT: Spostando col piede i <capi di vestiario> sul pavimento, non trovò traccia della prova incriminante. LIT: items of clothing ST: Kicking around among the clothes on the floor, he found no trace of the incriminating article. Red bride TT: Si stava frugando tra le <pieghe dell'abito>, per prendere il lasciapassare LIT: folds of the robe ST: She was fumbling in her robe, for her pass The handmaid’s tale Results
Other cases (9) • Adverbs TT: Dal <punto di vista> domestico, si adattarono l' uno all' altra ST: Domestically they adjusted to one another My son’s story • Domestication TT: Il cadavere era stato fatto a fettine da una lama larga e pesante, non trovata sul <luogo del delitto> ST: The corpse had been slashed to ribbons by a large, heavy blade, not found on the premises. Red bride • Gratuitous changes TT: del greco c'era anche qualche tavolino con sudici <vasetti di fiori> artificiali e bottiglie di ketchup ST: the Greek had a few tables set out with flyspotted artificial flowers and tomato sauce bottles My son’s story Results
Discussion - MCC • Are Italian translated texts more/less collocational than originals? • Translated texts would seem to be more collocational than originals • A single exception: JN into Eng • Translated less collocational than original, why? • Probable shining-through • Over-representation of collocations with common words Discussion, limits, ways forward
JN in Eng: shining-through? Delicate fingers
TT: I put some soft golden apricots as big as eggs on his plate, and watch him split them open, hardly moving his long, <delicate fingers>.
ST: Gli ho messo nel piatto delle albicocche grandi come uova, morbide, dorate, e l'ho osservato mentre le spaccava, muovendo appena le dita lunghe e delicate.
Donna in guerra

Collocation        fq1    fq2    fq1-2   MI       LL
delicate fingers   1646   5346   5       2.7545   53.4624
gentle fingers     2477   5346   12      2.9572   139.5338
slender fingers    701    5346   15      3.6023   219.2139
nimble fingers     101    5346   15      4.4437   279.3528

Discussion, limits, ways forward
JN in Eng: common words Overall frequency of few: translated 133, original 39
Discussion - parallel • Is higher collocativeness a consequence of the translation process? • Probably… • NB: shifts towards higher collocativeness would appear to be • partly independent • free=> collocational, different meaning (normalisation) • partly related to other strategies/procedures • explicitation, shining-through Discussion, limits, ways forward
Limits • Just how certain are we that these shifts are the cause of the observed differences? • Shifts are no doubt observable also in non-significant rankings… • (To what extent) could single author or translator preferences account for these differences? • The corpora are very small… Discussion, limits, ways forward
Further work • Bottom-up search for regularities • Other genres? • Source-oriented approach • Starting from ST collocations • Role of reference corpora • BNC, WWW, ukwac / Repubblica, WWW, itwac • Collocation extraction • Evaluation of method: no hands! • Creative exploitation of collocations • Can it be automatised? Discussion, limits, ways forward
Thank you silvia.bernardini@unibo.it
References
Baker, M. 1993. “Corpus linguistics and translation studies”. In Baker et al. (eds) Text and Technology. Benjamins.
Baker, M. 1995. “Corpora in translation studies: An overview and some suggestions for future research”. Target 7, 2.
Baroni, M. and S. Bernardini. 2003. “A preliminary analysis of collocational differences in monolingual comparable corpora”. In Archer et al. (eds) Proceedings of CL 2003. UCREL.
Danielsson, P. 2001. The Automatic identification of meaningful units in language. PhD Thesis. Göteborg University.
Evert, S. 2004-2006. The UCS Toolkit [http://www.collocations.de/software.html]
Firth, J.R. 1956 (1968). “Descriptive linguistics and the study of English”. In Palmer (ed) Selected papers of J.R. Firth 1952-1959. Longman.
Gledhill, C. 2000. Collocations in science writing. Gunter Narr.
Howarth, P. 1996. Phraseology in English academic writing. Max Niemeyer.
Kenny, D. 2001. Lexis and creativity in translation. St. Jerome.
Kjellmer, G. 1987. “Aspects of English collocations”. In Meijs (ed) Corpus Linguistics and Beyond. Rodopi.
Kjellmer, G. 1994. A Dictionary of English collocations. Clarendon Press.
Olohan, M. 2004. Introducing corpora in translation studies. Routledge.
Øverås, L. 1998. “In search of the third code: An investigation of norms in literary translation”. Meta 43, 4.
Sinclair, J. McH. 1991. Corpus, concordance, collocation. Oxford University Press.
Sinclair, J. McH. and S. Jones. 1974. “English lexical collocations”. Cahiers de Lexicologie 24.
Toury, G. 1995. Descriptive translation studies and beyond. Benjamins.