The European Union case law corpus (EUCLCORP)
Aleksandar Trklja
University of Birmingham
What is EUCLCORP?
• The European Union case law corpus (EUCLCORP) is a standardised, multidimensional and multilingual corpus of the case law of the Court of Justice of the European Union (CJEU) and of the constitutional/supreme courts of eight EU member states.
Project development
• The project has been developed in the following phases:
  • Phase one: project application
  • Phase two: data compilation
  • Phase three: data annotation
  • Phase four: web interface
• Supported by a European Research Council (ERC) Proof of Concept grant
• Based at the University of Birmingham (July 2016 - December 2017)
Not just another legal database
• Unlike conventional legal databases, EUCLCORP contains the following corpus tools:
  • monolingual concordance lines
  • parallel concordance lines
  • collocations
  • frequency lists
  • n-grams
  • simple search
  • CQP-based search
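To illustrate what two of the listed tools compute, here is a minimal Python sketch of a frequency list and of monolingual concordance (KWIC) lines. It is not EUCLCORP's implementation: the function names `frequency_list` and `concordance` and the sample sentence are invented for illustration, and tokenisation is reduced to whitespace splitting.

```python
from collections import Counter


def frequency_list(tokens):
    """Return (token, count) pairs, most frequent first (case-insensitive)."""
    return Counter(t.lower() for t in tokens).most_common()


def concordance(tokens, node, width=3):
    """Return one (left context, node, right context) triple per hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, not taken from the corpus.
SAMPLE_TOKENS = (
    "the measure is capable of restricting trade "
    "and the court held that the measure is lawful"
).split()

if __name__ == "__main__":
    print(frequency_list(SAMPLE_TOKENS)[:3])
    for left, node, right in concordance(SAMPLE_TOKENS, "capable", width=2):
        print(f"{left:>20} [{node}] {right}")
```

A concordance line keeps the search term in a fixed column with its immediate context on either side, which is what makes recurring patterns around a legal term visible at a glance.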
Annotation
• The corpus has been annotated with linguistic information and with non-linguistic (external) metadata.
• Linguistic information: tokenization, lemmatization, part-of-speech tags, sentence and paragraph boundaries, and sentence and paragraph numbering.
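The annotation layers listed above can be pictured as one record per token, grouped into numbered sentences. The sketch below assumes a tab-separated "vertical" layout (word, tag, lemma on each line, with `<s>` and `</s>` marking sentence boundaries), which is a common input format for CQP-style corpora; whether EUCLCORP stores its data this way is an assumption, and the function name `parse_vertical`, the tag labels and the sample lines are all illustrative.

```python
def parse_vertical(text):
    """Parse word<TAB>tag<TAB>lemma lines, grouped by <s>...</s>, into numbered sentences."""
    sentences, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "<s>":
            current = []                      # a new sentence starts
        elif line == "</s>":
            if current is not None:
                sentences.append(current)     # the sentence is complete
            current = None
        elif line and current is not None:
            word, tag, lemma = line.split("\t")
            current.append({"word": word, "tag": tag, "lemma": lemma})
    # Sentence numbering starts at 1.
    return {number: sent for number, sent in enumerate(sentences, start=1)}


# Invented sample; the tags are only illustrative labels.
SAMPLE_VERTICAL = "\n".join([
    "<s>",
    "The\tDT\tthe",
    "Court\tNNP\tcourt",
    "ruled\tVBD\trule",
    "</s>",
    "<s>",
    "Costs\tNNS\tcost",
    "follow\tVBP\tfollow",
    "</s>",
])

if __name__ == "__main__":
    for number, sentence in parse_vertical(SAMPLE_VERTICAL).items():
        print(number, [(t["word"], t["tag"], t["lemma"]) for t in sentence])
```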
Annotation
• Non-linguistic metadata for the CJEU subcorpus: text sections (Summary, Parties, Grounds, Costs, Operative Part and Subject), language of the case, case name, case number, date and cellar number.
• Non-linguistic metadata for national judgments: language of the case, name of the court, date, case name and names of judges.
• Sentences from ECJ judgments are aligned at the sentence level to enable searches over parallel concordance lines.
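Sentence-level alignment is what makes a parallel concordance possible: a hit in one language can be displayed next to its aligned counterpart in another. The following Python sketch shows the idea under stated assumptions; the `AlignedSentence` record, the `parallel_concordance` function, the placeholder case number and the sample sentences are all hypothetical and do not come from the corpus.

```python
import re
from dataclasses import dataclass


@dataclass
class AlignedSentence:
    case_number: str   # placeholder identifier, not a real case
    section: str       # e.g. "Grounds"
    sentence_id: int   # position of the sentence within the section
    text: dict         # language code -> sentence text


def parallel_concordance(aligned, query, source_lang, target_lang):
    """Return aligned sentence pairs whose source-language side contains `query`."""
    hits = []
    for sent in aligned:
        source = sent.text.get(source_lang, "")
        words = re.findall(r"\w+", source.lower())
        if query.lower() in words:
            target = sent.text.get(target_lang, "")
            hits.append((sent.case_number, sent.sentence_id, source, target))
    return hits


# Invented English/French sample pairs.
SAMPLE_ALIGNED = [
    AlignedSentence("C-000/00", "Grounds", 1, {
        "en": "The measure is capable of restricting trade.",
        "fr": "La mesure est susceptible de restreindre les échanges.",
    }),
    AlignedSentence("C-000/00", "Grounds", 2, {
        "en": "The action is dismissed.",
        "fr": "Le recours est rejeté.",
    }),
]

if __name__ == "__main__":
    for hit in parallel_concordance(SAMPLE_ALIGNED, "capable", "en", "fr"):
        print(hit)
```

Keeping the metadata (case number, section, sentence number) on each aligned unit means a parallel hit can always be traced back to its place in the judgment.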
Web interface and corpus tools
• User-friendly interface for search queries, for example:
[lemma="increase" & tag="V.*"] []{0,2} [tag="N.*"] :: match.meta_date="1980.*" within grounds
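The query on this slide can be read as: a verb form of the lemma "increase", followed by at most two arbitrary tokens, followed by a noun, all inside a Grounds section of a document whose date starts with 1980. The Python sketch below re-expresses that reading over plain dictionaries to make the logic explicit. It is not how CQP evaluates queries; the data layout, the helper names (`tok`, `match_query`, `search`), the tag labels and the sample documents are assumptions made for illustration, and the sketch returns only the nearest noun for each verb, which may differ from CQP's own matching strategy.

```python
import re


def tok(word, lemma, tag):
    """Build one annotated token (hypothetical layout)."""
    return {"word": word, "lemma": lemma, "tag": tag}


def match_query(tokens, max_gap=2):
    """Find (start, end) spans: verb 'increase', up to max_gap tokens, then a noun."""
    spans = []
    for i, t in enumerate(tokens):
        if t["lemma"] == "increase" and re.fullmatch(r"V.*", t["tag"]):
            for gap in range(max_gap + 1):
                j = i + 1 + gap
                if j < len(tokens) and re.fullmatch(r"N.*", tokens[j]["tag"]):
                    spans.append((i, j))
                    break  # keep only the nearest noun for this verb
    return spans


def search(documents, date_prefix="1980"):
    """Apply the pattern to the 'grounds' section of documents dated date_prefix*."""
    results = []
    for doc in documents:
        if not doc["date"].startswith(date_prefix):
            continue  # metadata constraint on the document date
        tokens = doc["sections"].get("grounds", [])  # 'within grounds'
        for start, end in match_query(tokens):
            results.append(" ".join(t["word"] for t in tokens[start:end + 1]))
    return results


# Invented sample documents; dates and wording are not from the corpus.
GROUNDS = [
    tok("The", "the", "DT"), tok("Commission", "commission", "NNP"),
    tok("increased", "increase", "VBD"), tok("the", "the", "DT"),
    tok("annual", "annual", "JJ"), tok("quota", "quota", "NN"),
]
SAMPLE_DOCS = [
    {"date": "1980-03-12", "sections": {"grounds": GROUNDS}},
    {"date": "1995-06-01", "sections": {"grounds": GROUNDS}},
]

if __name__ == "__main__":
    print(search(SAMPLE_DOCS))
```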
Web interface and corpus tools
• N-grams associated with the token ‘capable’
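As a rough illustration of what an n-gram listing for a given token involves, the Python sketch below counts every n-gram that contains the token and ranks the results by frequency. The function name `ngrams_with` and the sample text are invented; the interface's actual counting and ranking method is not described in the slides.

```python
from collections import Counter


def ngrams_with(tokens, node, n=3):
    """Count all n-grams of length n that contain `node`, most frequent first."""
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if node in gram:
            counts[gram] += 1
    return counts.most_common()


# Invented sample text, not taken from the corpus.
SAMPLE_NGRAM_TOKENS = (
    "is capable of affecting trade and is capable of restricting trade"
).split()

if __name__ == "__main__":
    for gram, count in ngrams_with(SAMPLE_NGRAM_TOKENS, "capable", n=3):
        print(count, " ".join(gram))
```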
Contribution
• EUCLCORP was created to foster the development of empirical studies in legal linguistics.
Contribution
• EUCLCORP allows users to investigate systematically:
  • the history of the meaning(s) of a particular legal term;
  • features that distinguish legal language from the language used in other registers;
  • for ambiguous terms, the senses in which they are most frequently and most typically used;
  • the influence of national legal languages on EU case law (and vice versa);
  • the impact of translation on the development of EU case law;
  • discourse relations and argumentation patterns.