BACKGROUND TO STUDY 2

Combining intuition with corpus linguistic analysis: A study of lexical chunks in four Chinese undergraduate students’ writing. Maria Leedham, FLaRN 2010, m.e.leedham@open.ac.uk

Presentation Transcript


  1. Combining intuition with corpus linguistic analysis: A study of lexical chunks in four Chinese undergraduate students’ writing. Maria Leedham, FLaRN 2010, m.e.leedham@open.ac.uk

  2. BACKGROUND TO STUDY 2

  3. Chunking through intuition: Study 1
     RQ:
     • To what extent can NSs and NNSs chunk NNS speech?
     Data:
     • transcripts of 2 intermediate-level Japanese students’ speech
     • students were recorded 3 times with a 2-month gap between each
     • total of approx. 1,500 words across the 6 transcripts
     Method:
     • Step 1: 3 NS linguists asked to underline chunks in the 6 transcripts (training, examples and practice given first)
     • Step 2: Japanese students asked to identify chunks in their own transcripts
     • Step 3: author chunks transcripts with assistance from WordSmith Tools (Leedham, 2006)

  4. Example of chunked transcript from Study 1
     Key: italics = words classified by the NNS as a chunk; underline = words that 2 or 3 of the 3 NSs classified as a chunk
      1  ahh…first err I, I learned, learnt? (mmhmm I learnt) err (2.0) I should err.. I
      2  should be more positive? (right) positive… in UK because ahh…when, when I
      3  went to London err… last Sunday (mhmm) ahh (2.0) some, some of the
      4  underground line (mm) line was no service (oh dear) ((speaker laughs)) I was
      5  really surprised and, because it can, cannot be (mm) in Japan (mm) you know,
      6  sun- in, in Sunday, on? (mm) on Sunday many, many people (mm) come to
      7  London (mm) and go around some place (mm)... so everyone need to, need a
      8  train (mm) so, but maybe four or five lines… was not, no service (mm) so…
      9  I… I have to think err what I should do ((speaker laughs)) and no, I’ve never, I
     10  have never been to London that, so, this was the first time I’ve been to London
     11  (mm) so…
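
The "2 or 3 out of the 3 NSs" rule in the key above amounts to a majority vote per word. A minimal Python sketch of that rule follows, assuming each rater's judgement is stored as a 0/1 flag per word; the function name and the markings are hypothetical and are not data from Study 1.

```python
def majority_chunk_mask(rater_marks, min_raters=2):
    """Return True for each word marked as part of a chunk by at least min_raters raters."""
    n_words = len(rater_marks[0])
    if any(len(marks) != n_words for marks in rater_marks):
        raise ValueError("every rater must mark the same number of words")
    # zip(*...) walks the raters' flags word by word
    return [sum(flags) >= min_raters for flags in zip(*rater_marks)]


words = "I have never been to London".split()
# Hypothetical markings (1 = inside a chunk), invented for illustration only.
rater_a = [1, 1, 1, 1, 0, 0]
rater_b = [0, 1, 1, 1, 1, 0]
rater_c = [0, 0, 1, 1, 1, 1]

mask = majority_chunk_mask([rater_a, rater_b, rater_c])
underlined = [w for w, keep in zip(words, mask) if keep]
print(underlined)
```

Setting `min_raters=3` would keep only words on which all three raters agree.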

  5. Findings from Study 1
     Findings:
     • little inter- or intra-rater reliability
     • many ‘missing’ chunks (e.g. ‘of course’, ‘you know’) both across and within raters
     • frustrating and time-consuming task for NSs
     • BUT… the Japanese students could do this task AND also offered insights into when/why… (e.g. student M: “I used to say that but now I know it’s not usual”.)
     • the more time spent looking for chunks, the more will be found
     Coda:
     • a further recording, transcribing & awareness-raising cycle suggests that this resulted in uptake
     • both students found it highly motivating to record and analyse transcripts of their talk

  6. Chunking through intuition: Study 1
     Method:
     • Step 1: 3 NS linguists asked to underline chunks in the 6 transcripts (training, examples and practice given first)
     • Step 2: Japanese students asked to identify chunks in their own transcripts
     • Step 3: author chunks transcripts with assistance from WordSmith Tools v.5

  7. STUDY 2

  8. Outline
     1. Research questions
     2. The students and the texts
     3. The two methods
     4. Findings
        4.1 Method 1
        4.2 Method 2
     5. Conclusions and Implications

  9. Research Questions
     • What can a study of lexical chunks reveal about these Chinese students’ writing?
     • What does each method contribute?

  10. The Students
      Criteria:
      - L1 Chinese (Mandarin or Cantonese)
      - All secondary education in home country
      - Contributions from years 1 & 2 and year 3 of undergraduate study
      Wei: male, BSc Engineering
      Feng: female, BSc Food Science with Business
      Ping: female, BA Hospitality, Leisure & Tourism Management (HLTM)
      Hong: male, BA HLTM

  11. The texts. Reference corpora

  12. Combining intuition and corpus searches
      Method 1: Manual analysis
      • Read all 4 Chinese students’ texts
      • Read twice, with 6 months between readings
      • Read equivalent, randomly-selected English students’ texts
      • Noted ‘salient’ features, then searched four corpora: the individual’s texts, the discipline, all Chinese students’ writing, and all English students’ writing
      Method 2: Key n-gram searches
      • Used WordSmith Tools, v.5 (Scott, 2008)
      • Searched for key n-grams in the corpus of texts from each student, using the relevant discipline corpus from L1 English students as reference
      • Set p = 0.00001; deleted short n-grams contained within longer n-grams
      • Compiled key n-gram lists
      • Looked at concordance lines and texts for more context
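
To make the idea of a "key" n-gram concrete, here is a minimal Python sketch of a keyness calculation. It is not WordSmith's implementation. The slides do not say which statistic was used, so the log-likelihood (G2) statistic is assumed here. The cut-off of 19.51 is the approximate chi-square critical value (1 degree of freedom) for p = 0.00001. All function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-word sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_likelihood(a, b, c, d):
    """G2 statistic: a, b = frequency in study / reference corpus; c, d = corpus sizes."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def key_ngrams(study_tokens, ref_tokens, n=3, threshold=19.51):
    """List n-grams that are unusually frequent in the study corpus (assumed cut-off)."""
    study = Counter(ngrams(study_tokens, n))
    ref = Counter(ngrams(ref_tokens, n))
    c, d = sum(study.values()), sum(ref.values())
    if c == 0 or d == 0:
        return []
    keys = []
    for gram, a in study.items():
        b = ref[gram]  # Counter returns 0 for unseen n-grams
        if a / c <= b / d:
            continue  # keep only positively key n-grams
        g2 = log_likelihood(a, b, c, d)
        if g2 >= threshold:
            keys.append((gram, g2))
    return sorted(keys, key=lambda item: -item[1])
```

Here the study corpus would be one student's texts and the reference corpus the L1 English texts from the same discipline. Because keyness only measures difference, an n-gram that is frequent in both corpora would not appear on the list.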

  13. Formulaic sequences in sample of Wei’s writing (Engineering)
      Introduction
      A design methodology for a gearbox is presented in this report. The input horse power, the input speed and net reductions in the gearbox are the parameters to be specified. A gearbox takes an input shaft rotating and converts it via a gear train into up to three outputs, the process of designing a gearbox is to figure out which ratios are needed and to implement those ratios in the form of positioning various sizes of connected gears. The specification of the gearbox depends on its area of application.
      In this report, a gearbox is designed for a commercial meat slicer which has its final shaft rotating at between 80 and 100 rev/min. The input of the meat slicer is a constant speed AC motor running at 1800 rev/min and delivering 1.2 kW. A few points have to be considered on this system, the size of the gearbox is severe restricted, since it has to go onto a work surface where there is severe competition for space. And the motor may be in-line or at right angles to the grinder. Furthermore, the duty is expected to be up to 6 hours per day.

  14. Outline
      1. Research questions
      2. The students and the texts
      3. The two methods
      4. Findings
         4.1 Method 1
         4.2 Method 2
      5. Conclusions and Implications

  15. Idiosyncratic language
      In one word computer based tools contribute an…
      In one word the overall system can be described… (Wei, years 2 & 3)
      In light of this, it is suggested that buying IHG…
      In light of this, it can be suggested that…
      In light of this, it is recommended that buying IHG… (Ping, year 3, in 1 text)
      … but simply writing a responsible tourism policy is no longer enough. It is a must to show practical action,… (Hong, Year 1)
      a winning city, the authorities of Liverpool have to rebuild its image to get rid of the negative picture. (Hong, Year 2)
      …and boost its marketing campaigns in order to catch the world’s eyes on Scotland. (Hong, year 3)

  16. Vague language
      • In catering services, restaurants in Oxford and Bath are more or less the same. (Hong, Year 1)
      • From those tables, the same thing as section 3.1 could be found … (Wei, Year 1).
      • …a measurement system for measuring low-lever force, a kind of cantilever rig which is called…
      • A kind of variable inductance sensor has been chosen…
      • …Furthermore, with processing data, a kind of filter is always needed to separate certain… (Wei, year 2, same assignment)
      • At that time, I found that this hotel is a little bit out of my expectation. (Hong, Year 2)

  17. Vague language
      • L1 English students use: ‘a bit of a’ + N
      • e.g. ‘a bit of a problem’, ‘a bit of a shock’, ‘a bit of a dog’s breakfast’
      • Often this is from reflective writing
      • ‘The conclusion was also a bit of a victim in my editings, bringing it down to one small sentence for each of the areas of discussion’. (6101c Cybernetics Year 3 essay)

  18. Chunks with – and without – ‘I’ & ‘we’
      • From the experiment, it was known that the mechanical properties of carbon steel AN and carbon steel N….
      • It was found out the mechanical properties of carbon steel AN was incorrect in this experiment,… (Wei, Year 1)
      • Meanwhile, if we clipped the current probe round one of the motor supply leads, and connected it to Ch1 of the oscilloscope, we could get two copies of the transient starting current of the motor from the oscilloscope. From these two copies, we could calculated…

  19. Chunks with – and without – ‘I’ & ‘we’: L1 English students

  20. Linkers
      • This can create a positive image for Scotland, on the other hand, (Ping, Year 3)
      • …In other words, people are buying expectations... (Hong, year 3)
      • As a consequence, it can attract many travelers… (Hong, Year 2)
      • On the contrary, the predominance of SMEs... (Ping, Year 2)
      • First of all, the dimension of the brake disc is decided. (Wei, Year 3)
      • What is more, Bath is served by a large number of local bus services… (Hong, Year 1)
      References to data
      • ‘as shown in table’ (Wei x 2, Ping x 2)
      • ‘according to’ (Wei x 4)
      • ‘as illustrated in table + NUMBER’ (Ping x 2)

  21. Summary of method 1 findings
      Salient chunks in the Chinese students’ writing were:
      • Idiosyncratic chunks (‘in light of the’)
      • Vague language (‘a bit of’) – though note English students’ use of ‘a little bit of’
      • High use of chunks with ‘we’ and low use of chunks with ‘I’ – partly due to English students’ reflective writing
      • Use of favoured linkers (‘on the other hand’)
      • Reference to data in tables and figures (‘according to the equation’)
      • BUT… very difficult to intuit chunks in unfamiliar disciplines

  22. Outline
      1. Research questions
      2. The students and the texts
      3. The two methods
      4. Findings
         4.1 Method 1
         4.2 Method 2
      5. Conclusions and Implications

  23. Method 2: Key n-gram searches
      • Used WordSmith Tools, version 5 (Scott, 2008)
      • Searched for key n-grams (= ‘key clusters’) in the corpus of texts from each of the 4 students
      • Relevant discipline corpus from L1 English students used as reference corpus
      • Set p = 0.00001; deleted short n-grams contained within longer n-grams
      • Compiled a key n-gram list for each student
      • Grouped these key n-grams into themes
      • Looked at concordance lines for more context
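
The step of deleting short n-grams that sit inside longer ones can be sketched in Python as below. This is an illustration under assumptions, not the procedure WordSmith Tools or the author followed. In particular, it assumes any shorter n-gram found word for word inside a longer listed n-gram is dropped, whatever its frequency. The function name is hypothetical.

```python
def drop_subsumed(ngram_list):
    """Keep only n-grams that do not occur, word for word, inside a longer listed n-gram."""
    token_lists = [gram.split() for gram in ngram_list]

    def contained(short, long):
        # True if `short` appears as a contiguous run of words in `long`
        k = len(short)
        return any(long[i:i + k] == short for i in range(len(long) - k + 1))

    kept = []
    for i, short in enumerate(token_lists):
        subsumed = any(
            len(long) > len(short) and contained(short, long)
            for j, long in enumerate(token_lists)
            if j != i
        )
        if not subsumed:
            kept.append(ngram_list[i])
    return kept


candidates = [
    "the hospitality industry",
    "in the hospitality industry",
    "recruitment and selection",
]
print(drop_subsumed(candidates))
```

With these three candidates, ‘the hospitality industry’ is removed because it is contained in ‘in the hospitality industry’. A stricter variant might drop the shorter n-gram only when its frequency is close to that of the longer one.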

  24. N-grams

  25. Idiosyncratic language. Ping's year 2 proposal ‘aim of the’ ‘of the assignment is to design’ ‘to develop an understanding of’ (Wei)

  26. Discipline-specific n-grams
      • ‘Marriott Liverpool city centre’, ‘the Liverpool tourism industry’, ‘the tourism industry’ (Hong)
      • ‘the hospitality industry’, ‘recruitment and selection’, ‘in the hospitality industry’ (Ping)
      Passive voice
      • ‘be worked out’, ‘can be calculated’ (Wei)
      • ‘there will be’, ‘it is believed that’ (Ping)
      References to data
      • ‘with reference to appendix’, ‘please see appendix’ (Ping)
      • ‘in the appendix’, ‘briefing sheet in appendix’, ‘is shown as’, ‘tables of data’, ‘were recorded as below’, ‘was calculated with eq.’ (Wei)

  27. Favoured linkers decrease over time

  28. Summary of method 2 findings
      • Many of the same findings from method 1:
        • idiosyncratic chunks
        • some linkers, esp. ‘on the other hand’
        • low use of chunks with ‘I’
        • references to data
      • Also… discipline-specific chunks
      • Easy to compare one student’s texts with the discipline reference corpus & each L1 reference corpus
      • Similar findings occur within the Chinese students overall
      • NB Keyness measures difference

  29. Outline
      1. Research questions
      2. The students and the texts
      3. The two methods
      4. Findings
         4.1 Method 1
         4.2 Method 2
      5. Conclusions and Implications

  30. Key n-grams analysis: finds frequent chunks (n-grams)
      Plus:
      • Large quantities of data can be analysed quickly
      • Accurate
      • Easily replicable
      Minus:
      • Single chunks are missed
      • Arbitrary parameters
      • Conflation of writing from lots of individuals
      • Sense of text as complete document is lost
      Intuitive reading: finds semantically whole units (formulaic sequences)
      Plus:
      • A person can recognise single instances that a computer would miss
      • The text is read as a complete document, as intended by the writer
      Minus:
      • Time-consuming and tiring
      • Problem of inter-rater reliability
      • Problem of intra-rater consistency
      • Hard to replicate

  31. Combining methods…
      • Combine the two methods through a recursive process of reading texts and checking the sequences in a corpus, also searching for key n-grams for less intuitive sequences.
      “ultimately, the most revealing insights… will be gained from a closer look at the texts, the speakers, and the situational variables; quantitative analysis alone can never provide a satisfactory picture” (Simpson, 2004:41).
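
Checking an intuited sequence in a corpus usually means comparing how often it occurs in corpora of different sizes. The Python sketch below normalises the count to occurrences per million words. The simple lowercase tokeniser and the function name are assumptions made for illustration, and the sample text is invented rather than taken from the students' writing.

```python
import re


def per_million(sequence, text):
    """Occurrences of a word sequence per million tokens of text (simple tokeniser assumed)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    target = sequence.lower().split()
    n = len(target)
    if not tokens or n == 0:
        return 0.0
    hits = sum(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))
    return hits / len(tokens) * 1_000_000


# Invented sample text, used only to show the call.
sample = "On the other hand, it works. On the other hand it fails."
print(round(per_million("on the other hand", sample)))
```

Running the same function over one student's texts, the discipline corpus, and the L1 English reference corpus gives directly comparable rates for a sequence first noticed while reading.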

  33. References
      • Foster, P. (2001). “Rules and routines: A consideration of their role in the task-based language production of native and non-native speakers”. In M. Bygate, P. Skehan and M. Swain (eds.), Task-Based Learning: Language Teaching, Learning and Assessment. London: Longman.
      • Heuboeck, A., Holmes, J. & Nesi, H. (2007). The BAWE Corpus Manual. Retrieved from http://www.coventry.ac.uk/researchnet/d/505/a/5160.
      • Leedham, M. (2006). “Do I speak better? – A longitudinal study of lexical chunking in the spoken language of two Japanese students”. In The East Asian Learner.
      • Scott, M. (2008). WordSmith Tools v.5. Oxford University Press.
      • Wray, A. (2002). Formulaic Language and the Lexicon. Cambridge University Press.
      • BAWE corpus: ESRC project number RES-000-23-0800
