Tracing idiomaticity in learner language: the case of BE
CL2001, Lancaster University, 30 March - 02 April 2001
Przemysław Kaszubski
School of English, Adam Mickiewicz University, Poznań, Poland
Premises (1)
• EFL learners’ overuse of high-frequency words: what does it mean?
• Intensive collocability of core lexical items
• Multi-word extensions (compounds, coinages, idioms, expressions, phrasals)
• Confrontation
• Available corpus-driven extraction methods vs.
• pedagogical usefulness: L1 perspective (the role of transfer)
CL2001, Lancaster University
Premises (2)
• Methodological assumptions
  • multi-corpus scheme with Polish advanced EFL learner data as hub data
  • variables: a) genre / text-type; b) L1; c) proficiency level; d) age / maturity level
• Lemma-based approach (as opposed to wordform- or family-oriented approaches)
• Lexical BE: non-idiomatic or ignored because troublesome?
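To make the lemma-based approach concrete, here is a minimal sketch that groups the wordforms of BE under a single lemma count. The tokenizer, the form list and the name `count_be` are assumptions made for illustration, not the procedure used in the study. The ambiguous contraction ‘'s’ is left out, and the sketch does not separate verbal ‘being’ from the count noun ‘a being’.

```python
import re
from collections import Counter

# Wordforms grouped under the lemma BE. Including the contractions 'm and 're
# is an assumption of this sketch; ambiguous 's is deliberately left out.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being", "'m", "'re"}

# Crude tokenizer: contracted BE forms first, then plain alphabetic words.
TOKEN_RE = re.compile(r"'(?:m|re)\b|[a-z]+", re.IGNORECASE)

def count_be(text):
    """Return (per-wordform counts, lemma total) for BE in `text`."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    forms = Counter(t for t in tokens if t in BE_FORMS)
    return forms, sum(forms.values())

sample = "I am sure they were late, and we're being told it is fine."
forms, total = count_be(sample)
print(forms, total)
```

A wordform-oriented count would report each key of `forms` separately; the lemma-based count is the single `total`.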
The hypotheses
• negative correlation between proficiency level and frequencies of non-idiomatic uses
• positive correlation between proficiency level and frequencies of idiomatic BE, except EFL learners’ ‘favourite expressions’
• traceability of (at least) some ‘favourite expressions’ to L1
The challenge of idiomatic BE (1): extraction of ‘verbal’ BE
• lexical (semantic BE) = readily translatable lexically: existential BE; copular BE
• main-verb function
• non-finite forms: infinitives and participles (the latter if not adjectival)
• non-finite forms: non-count gerunds (NOT ‘a being’)
• passive auxiliary: central passives vs semi- and pseudo-passives (cf. Quirk et al. 1985: 167-171)
The challenge of idiomatic BE (2): semi-lexical MWUs
• modal idiom ‘BE to <do sth>’
• semi-auxiliaries (BE = linking BE)
  • BE able to <do sth>; BE about to <do sth>; BE apt to <do sth>; BE bound to <do sth>
• ‘Polish-style’ semi-auxiliary: BE allowed to <do sth>
The ‘extended’ tripartite idiomaticity model: the criteria
• lexical fixedness
• syntactic fixedness and / or anomaly
• semantic opacity
• lexicalisation / institutionalisation / specialisation / conventionality = frequency + distribution
• implementation of the fourth criterion via external sources: BBI2 & LDOCE3
The ‘extended’ tripartite idiomaticity model: the levels
• frozen expressions
• restricted uses
  • restricted collocations
  • discourse formulae
• free combinations
The challenge of idiomatic BE (3): implementation of the frozen level
• frozen uses: BE frozen lexically and formally in a particular wordform
  • ‘that is (to say)’; ‘to be sure’ (= certainly); ‘for the time being’ (= currently)
• phrasal idioms: ‘to have been around’
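Because frozen uses fix BE in one particular wordform, they can be approximated with plain string patterns. The sketch below is only an illustration: the pattern table, the name `count_frozen` and the filtering heuristics (a comma after ‘that is’, no ‘that / of / about’ after ‘to be sure’) are assumptions, and every hit would still need manual checking.

```python
import re

# Hypothetical pattern table for the frozen expressions listed above.
# The comma and lookahead filters are crude heuristics, not tested rules.
FROZEN_PATTERNS = {
    "that is (to say)": r"\bthat is(?: to say)?\s*,",
    "to be sure": r"\bto be sure\b(?! (?:that|of|about)\b)",
    "for the time being": r"\bfor the time being\b",
    "have been around": r"\b(?:have|has|had) been around\b",
}

def count_frozen(text):
    """Count case-insensitive matches of each frozen pattern in `text`."""
    return {
        label: len(re.findall(pattern, text, flags=re.IGNORECASE))
        for label, pattern in FROZEN_PATTERNS.items()
    }

text = ("That is to say, we stay here for the time being. "
        "To be sure, she has been around. I want to be sure that it works.")
print(count_frozen(text))
```

The second ‘to be sure’ in the sample is followed by ‘that’ and is therefore skipped, which mirrors the distinction between the frozen adverbial (= certainly) and the compositional use.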
Lexical BE: restricted uses (1)
• phrasal-prepositional uses of BE (e.g. ‘BE into <sth>’, ‘BE on’)
• super-pattern BE + idiom: A survey of the complementation patterns of lexical BE (based on the evidence of Quirk et al. 1985) has shown that the verb tends to be followed by complements that:
  a) either constitute idiomatic phrases
  b) or restrict BE’s realm of reference by influencing its subject collocates
  c) or else form simple, ad-hoc, fully compositional phrases (BE <noun>; BE <adj>; BE <adjunct>)
Lexical BE: restricted uses (2): BE + idiom (1): prototypical types
• a) BE <adj>:
  • BE + adjectival idioms or collocations - predicatively unified and often substitutable by a single verb (‘BE conditional upon <sth>’ - cf. depend on <sth>; ‘BE alive’ - cf. live)
  • predicative pseudo-passives and semi-passives (‘BE composed of <sth>’, ‘BE connected with <sth>’, ‘BE situated <somewhere>’)
  • BE + adjectival / participial predicate + to-clause (‘BE liable to <do sth>’, ‘BE reluctant to <do sth>’)
• b) BE <noun>:
  • BE + nominal idiom (‘BE a bitter pill for <sb> (to swallow)’; ‘it BE high time’)
Lexical BE: restricted uses (3): BE + idiom (2): non-prototypical
• c) BE <adjunct>:
  • means adjuncts: conventionalised / lexically fixed (‘Transport is by ferry’) or replacing a longer predicate or a central passive (‘such contracts are with people who...’ = ‘are signed with’)
  • stimulus adjuncts: rare & restricted by BE’s subject (‘his main interest was in sport’)
  • agent adjuncts: usually restricted to authorship (‘The book is by an unknown writer’)
  • measure adjuncts: non-prototypical though probably salient (‘The jacket was 10 pounds’ - cf. ‘The prize was 10 pounds’)
BE: restricted uses (4): discourse-conditioned phrases
Lexical BE: free combinations
• non-idiomatic complementation of the prototypical types (‘the young are reckless’; ‘he was a man in his late forties’)
• non-idiomatic cases of obligatory but fully semantically compositional adverbial complementation:
  • BE <adjunct: time / space / metaphorical space> (‘It was 10 years ago’; ‘Pure fire (= the stars) are in the heavens’)
  • BE <adjunct: purpose / accompaniment / measure etc.> (BE with <sb>, BE for <sth>, BE about <sth>)
• ‘there BE’ and BE after anticipatory ‘it’: unless lexicalised or specialised, as in ‘it BE high time’, ‘there BE every reason that’ etc.
Lexical BE: other cases
• in cleft and pseudo-cleft sentences (‘It is marriage that constitutes the basic part of every nation’; ‘All his people ask for is no more war’)
• subject-to-subject raising after copulas (SEEM (to be), TURN OUT <to be>) or when complementing mental verbs (BE found/thought etc. (to be))
Idiomatic BE: automatic extraction? (1)
• Problem 1: collocation vs co-occurrence
• word clusters
  • Many genuine collocations and MWUs are not contiguous (Kennedy 1998: 114) and may spill outside the typical 4:4 window
• co-occurrence statistics (WordSmith; TACT, CUE)
  • MI - identifies ‘idiosyncratic collocations’ (Oakes 1998: 90) & fails to associate many lemmas
  • z-score & t-score - better suited to frequent words but also mutual and leaving much ‘noise’
• stop-listing not quite possible
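The co-occurrence statistics named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of WordSmith, TACT or CUE: the expected frequency is taken as f(node) × f(collocate) × span / N with a 4:4 window (span = 8), MI as log2(O / E), and the t-score as (O − E) / √O. Tools differ in how they compute the expected value, and the function names are hypothetical.

```python
import math
from collections import Counter

def window_counts(tokens, node, window=4):
    """Count every token found within `window` positions left or right of each node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts

def association(tokens, node, collocate, window=4):
    """Return observed/expected co-occurrence plus MI and t-score (None if undefined)."""
    n = len(tokens)
    freq = Counter(tokens)
    observed = window_counts(tokens, node, window)[collocate]
    # Assumed expected-frequency formula: chance co-occurrence within the span.
    expected = freq[node] * freq[collocate] * 2 * window / n
    if observed == 0 or expected == 0:
        return {"observed": observed, "expected": expected, "mi": None, "t": None}
    return {
        "observed": observed,
        "expected": expected,
        "mi": math.log2(observed / expected),
        "t": (observed - expected) / math.sqrt(observed),
    }

# Toy input; in so short a text the windows of the two nodes overlap.
tokens = "it is high time that it is done".split()
print(association(tokens, "is", "high"))
```

Because MI divides by the expected frequency, rare collocates receive high scores on thin evidence, whereas the t-score rewards sheer frequency; this is one way to read the contrast drawn in the bullets above.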
Idiomatic BE: automatic extraction? (2)
• Problem 2: part-of-speech tagging
  • the passive bottleneck - the need for sampling
• Problem 3: semantic disambiguation and associations
  • sometimes only grouping data uncovers a meaningful type of association (Stubbs 1998: 4)
• Problem 4: learner data
The corpus base: full specification
Do Polish EFL writers overuse BE? (1)
• Non-lexical BE:
  • underuse: central passives (especially at lower proficiency levels)
  • overuse: ‘BE going to <do sth>’ (diminishing with proficiency)
  • overuse: ‘BE able to <do sth>’ (especially advanced-level learners)
• Lexical BE:
  • frozen BE: scarce
  • lower-proficiency: fewer collocational idioms & many more free combinations
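Overuse and underuse are comparisons of frequencies across corpora of different sizes, so raw counts have to be normalised first. The slides do not state which measure was used; the sketch below assumes a simple ratio of frequencies per 10,000 words, and the function names as well as the counts in the usage line are invented purely for illustration.

```python
def per_10k(count, corpus_size):
    """Normalise a raw count to occurrences per 10,000 running words."""
    return 10_000 * count / corpus_size

def usage_ratio(learner_count, learner_size, reference_count, reference_size):
    """Ratio of normalised frequencies: above 1 suggests overuse, below 1 underuse."""
    learner = per_10k(learner_count, learner_size)
    reference = per_10k(reference_count, reference_size)
    if reference == 0:
        # No occurrences in the reference corpus: the ratio is undefined or unbounded.
        return float("inf") if learner > 0 else 1.0
    return learner / reference

# Made-up counts, NOT results from the study: 30 hits in a 100,000-word learner
# corpus against 10 hits in a 100,000-word reference corpus.
print(usage_ratio(30, 100_000, 10, 100_000))
```

A ratio alone says nothing about significance; a real comparison would add a test statistic and, as the sampling remarks in the slides suggest, manual inspection of the concordance lines.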
Lexical BE: summary of 500-line concordance samples
Do Polish EFL writers overuse BE? (2): some specific findings
• frozen BE:
  • overuse: ‘what is more’ (cf. Polish ‘co więcej’)
• Restricted collocations (BE + idiom):
  • intermediate overuse: ‘BE full of <sth>’ (cf. Polish ‘być pełnym czegoś’)
  • overuse: ‘BE connected / associated {etc.} with <sth>’ (cf. Polish ‘być związanym z czymś’)
  • underuse: ‘BE concerned with <sth>’ as opposed to the overused formula ‘as far as <sth> BE concerned’ (cf. Polish ‘jeśli chodzi o ...’)
Do Polish EFL writers overuse BE? (3): some specific findings
• Restricted level: discourse formulae:
  • heavy overuse: ‘that/this BE why ...’ (also ‘that’s why’) (cf. Polish ‘Dlatego (właśnie)’)
  • (sentence-initial) overuse: ‘what is (more) <adj: important etc.>’ (cf. Polish ‘co ważne’)
• OVERALL IMPRESSION:
  • Many more phrases are overused than underused
  • Overused expressions are likely underpinned either by equivalent or associated L1-based options, or by the (spoken) familiarity of a phrase (‘BE able to’; ‘BE going to’), or both.
Conclusions
• EFL learners do use BE more frequently, but lexical uses and free combinations are not the only contributors to that impression
• Sadly, analyses of this kind are hardly possible automatically or even semi-automatically, but they may serve as benchmarks for developing tools that will more successfully tackle BE in larger corpora
• Such quantitative and contrastive accounts of EFL learner language are needed, especially at higher proficiency levels, despite caveats about idealised native corpus ‘norms’ (cf. Leech 1998)
This show will shortly be available from: http://main.amu.edu.pl/~przemka/rsearch.html