Corpus linguistics: What is corpus linguistics? Corpus linguistics uses large collections of spoken or written natural texts that are stored on computers. One of its major contributions is in exploring patterns of language use.
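One common way to explore patterns of language use is a keyword-in-context (KWIC) listing, which shows each occurrence of a word together with its neighbours. The following is only a minimal sketch: the function name `kwic`, the simple regular-expression tokenizer, and the sample sentences are all assumptions made for illustration, not part of the original slides.

```python
import re

def kwic(text, keyword, width=3):
    """Return keyword-in-context lines with `width` words on each side of every hit."""
    # Very rough tokenizer (an assumption): lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Invented sample text showing two different senses of "bank".
sample = "The bank raised interest rates. We walked along the river bank at dusk."
hits = kwic(sample, "bank")
for line in hits:
    print(line)
```

Reading the hits side by side is what lets a researcher see that the same word form occurs in different contexts with different meanings.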
Corpus design and compilation: A corpus is a large and principled collection of texts stored in electronic format. There is no minimum size for a text collection to be considered a corpus; a standard size, set by the creators of the Brown Corpus, was one million words.
Types of corpora: There are many corpora, such as: 1: the LOB corpus 2: the COCA corpus 3: the BNC corpus
Issues in corpus design: One of the most important factors in corpus linguistics is the design of the corpus. The composition of a corpus reflects the anticipated research goals. A corpus used for exploring lexical questions needs to be very large to allow accurate representation of a large number of words and of the different senses or meanings that a word might have.
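A word-frequency count gives a rough feel for why lexical research needs a large corpus: in a small sample, many words occur only once, so there is little evidence about how they are used. The sketch below is an illustration under stated assumptions; the function name `word_frequencies`, the tokenizer, and the sample sentence are invented for this example.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word form occurs in the text."""
    # Rough tokenizer (an assumption): lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Invented sample; a real corpus would contain millions of words.
sample = "the cat sat on the mat and the dog sat on the log"
freq = word_frequencies(sample)

# Words that occur exactly once offer almost no evidence of their usage.
hapaxes = sorted(word for word, count in freq.items() if count == 1)

print(freq.most_common(3))
print(hapaxes)
```

Even in this tiny sample, most of the distinct words appear only once, which is the situation a larger corpus is meant to reduce.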
Corpus compilation: When creating a corpus, data collection involves obtaining or creating electronic versions of the target texts, and storing and organizing them. Written corpora are far less labour-intensive to collect than spoken corpora. Data collection for a written corpus can mean using a scanner and optical character recognition software to scan paper documents into electronic text files.
Markup and annotation: The simplest corpus consists of raw text, with no additional information about the origin, authors, speakers, structure or content of the texts themselves. Encoding some of this information in markup makes the corpus much richer and more useful, especially to researchers who were not involved in its compilation. Structural markup refers to the use of codes in the text to identify structural features of the text.
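As an illustration of structural markup, the sketch below parses a small XML-style text in which codes mark the heading, paragraphs, and sentences, and attributes record information about the text's origin. The element and attribute names (`text`, `head`, `p`, `s`, `id`, `mode`, `author`) are hypothetical choices made for this example and do not follow any particular corpus encoding scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up text: the tag and attribute names are assumptions.
marked_up = """
<text id="sample01" mode="written" author="unknown">
  <head>A short title</head>
  <p><s>This is the first sentence.</s> <s>This is the second.</s></p>
  <p><s>A new paragraph starts here.</s></p>
</text>
"""

root = ET.fromstring(marked_up.strip())

# Information about the text's origin, stored as attributes on the root element.
metadata = dict(root.attrib)

# Structural features recovered from the markup.
paragraphs = root.findall("p")
sentences = [s.text for s in root.iter("s")]

print(metadata)
print(len(paragraphs), "paragraphs,", len(sentences), "sentences")
```

Because the structure is encoded explicitly, a researcher who did not compile the corpus can still count or select paragraphs and sentences without guessing where they begin and end.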