TXT, HTML, XML, RDF, OWL, DLEC: European legal ontologies – What should be the next step to do? • Erich Schweighofer • University of Vienna, Austria
Outline (1) • State of the art: legal research and legal retrieval systems (TXT, HTML) • + hypertext, + some meta information • N-Lex: Standard for exchange of legal information • Good start, but improvements necessary • Legal Semantic Web, Legal Social Web • XML, RDF, RDF schema, OWL • Knowledge management in legal units • Known applications: knowledge representation, conceptual information retrieval, advanced lexical thesauri, exchange standards (MetaLex)
Outline (2) • Other uses; more support for European legal work? • Status quo of legal searching insufficient • Exchange of electronic legal meta data a big problem • Need for a legal „Dublin Meta Core“ • Future: Dynamic Electronic Commentary • Support tools for European legal work • Next steps • Conclusions
Text archive & retrieval (1) • Standard service • Easy access to, and efficient handling of, the now very large number of legal documents • Retrieval: the discrimination task is becoming more and more difficult (e.g. finding the Boolean combination that selects only the documents I am interested in, such as 1 to 10 documents in a collection of 1 to x million documents)
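The discrimination task above can be illustrated with a minimal Boolean retrieval sketch. The three documents, their identifiers and the query terms are invented for illustration; a real legal database works on millions of documents and a far richer query language.

```python
from collections import defaultdict

# Hypothetical mini-collection (invented texts, not from a real database).
DOCS = {
    "doc1": "liability of the seller for defective goods",
    "doc2": "liability of the carrier for delayed goods",
    "doc3": "seller may withdraw from the contract",
}

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Ids of documents containing every given term (Boolean AND)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def boolean_not(index, all_ids, term):
    """Ids of documents that do not contain the term (Boolean NOT)."""
    return set(all_ids) - index.get(term.lower(), set())

INDEX = build_index(DOCS)
# Query: liability AND goods AND NOT carrier
hits = boolean_and(INDEX, "liability", "goods") & boolean_not(INDEX, DOCS, "carrier")
print(sorted(hits))
```

Each added operator narrows the result set; finding the combination that leaves only the few relevant documents is the hard part as the collection grows.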
Text archive & retrieval (2) • Legal retrieval ≠ “To Google” • Exact legal provision (or paragraph in a legal judgement); not just some information available in a redundant way • No Social Web (lawyers as a community do not yet link sufficiently to important legal documents) • Only possible in law firms with efficient knowledge management • Semantic Web?
Semantic Web • Tim Berners-Lee: • The Semantic Web is “not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. • Standards for semantic information on the web • Tagged and linked using the technologies of Resource Description Framework (RDF), XML and URIs • Web Ontology Language (OWL) • Next layer: may be a logical one, an inference machine • Remains largely unrealised
Legal web, legal text corpora and beyond • Legal web = huge text corpora • Legal information systems • Web sites • Boolean logic + hypertext • Some mark-up (text structure) • Good coverage, easy handling of documents • Problem: semantic meaning and searching are insufficiently developed • Same situation as semantic web • To do: adaptation of standards for mark-up + implementation
XML (eXtensible Markup Language) • General specification for creating markup languages • Subset of the Standard Generalized Markup Language (SGML) but human-legible • Free standard of the W3C; versions 1.0 and 1.1 • Recommendation XML 1.1 (Second Edition): http://www.w3.org/TR/2006/REC-xml11-20060816/ • Structure has to be represented like a tree • Document type definition (DTD) • Mark-up tags are freely extensible • Allows semantic mark-up • Law: definition of semantic document structure, e.g.: <!ELEMENT judgement (title, summary, grounds, operational_part, citations*)> • Attribute values • Automatic verification
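A minimal sketch of the "automatic verification" idea for the judgement structure above. Python's standard library does not validate against a DTD, so the content model is checked by hand here. The element name `operational_part` is an assumption (XML names cannot contain a space), and the sample documents are invented.

```python
import xml.etree.ElementTree as ET

# Content model loosely following the slide's DTD line:
# title, summary, grounds, operational part, then zero or more citations.
REQUIRED = ["title", "summary", "grounds", "operational_part"]  # assumed names
REPEATABLE = "citations"

def conforms(xml_text):
    """True if the root is <judgement> with the expected child sequence."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed XML
    if root.tag != "judgement":
        return False
    tags = [child.tag for child in root]
    head, tail = tags[:len(REQUIRED)], tags[len(REQUIRED):]
    return head == REQUIRED and all(t == REPEATABLE for t in tail)

# Invented sample documents.
GOOD = """<judgement>
  <title>Case A v B</title>
  <summary>Short summary</summary>
  <grounds>Reasoning of the court</grounds>
  <operational_part>The appeal is dismissed</operational_part>
  <citations>Case C v D</citations>
</judgement>"""

BAD = "<judgement><summary>Summary without a title</summary></judgement>"

print(conforms(GOOD), conforms(BAD))
```

A validating parser with the real DTD or an XSD would replace the hand-written check in practice; the point is only that a declared structure makes such checks mechanical.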
XML (2) • Valid documents conform with a particular DTD/schema • XML schema definition (XSD) • Successor of DTDs • XML Style Sheet • Extensible Style Sheet Language (XSL) • Client-side XSLT • XML-based document transformation language • Extensible Linking and Pointer Languages • XLink: simple and multiple links • XPointer: links to other document parts • Browser: Internet Explorer from version 5.0 • File format: OpenOffice, Word 2007 • DTD for legal documents for Electronic Data Interchange (EDI)
Why XML? • Advantages • Semantic meaning for syntactic data, e.g. <name>schweighofer</name> • Reuse + recycling of information • Change of layout • Improved searching of documents • Unicode • Open document format • Disadvantages • Hierarchical model for representation has its limits • Redundancy of data • Main uses in law • Interchange of documents • Interchange of knowledge
RDF (Resource Description Framework) • RDF syntax • Description of meta data in web documents • Each data item can be linked with a file that describes its type • Recommendation: http://www.w3.org/TR/2004/REC-rdf-primer-20040210/ • RDF statement: subject-predicate-object expression (a triple in RDF terminology) • Subject (described web resource = URL) – predicate (attribute, e.g. author) – object (value, e.g. name) • Query language for RDF graphs: SPARQL • Semantic web • Automated storage, exchange and use of machine-readable information on the web • Applications: exchange and common use of web data, improved implementation of search engines, classification of a website (also with software agents) etc.
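The subject-predicate-object model can be sketched with plain tuples. This is an illustration only, not RDF syntax or SPARQL: the `example.org` URLs, the second author and the `match` helper are invented, and a real system would use an RDF library and a SPARQL endpoint.

```python
# Each statement is a (subject, predicate, object) triple.
# URLs and the second author are hypothetical placeholders.
TRIPLES = [
    ("http://example.org/doc/1", "author", "Erich Schweighofer"),
    ("http://example.org/doc/1", "title", "European legal ontologies"),
    ("http://example.org/doc/2", "author", "Jane Doe"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# "Who are the authors?" as a pattern with two wildcards.
authors = [o for (_, _, o) in match(TRIPLES, predicate="author")]
print(authors)
```

A pattern with wildcards is the basic building block that a query language for RDF graphs combines into larger queries.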
RDF Schema • Extensible knowledge representation language • Language for description of the structure, the content and the semantics of XML documents • Basic elements for the description of ontologies (RDF vocabularies) • Recommendation of 10 February 2004: http://www.w3.org/TR/2004/REC-rdf-schema-20040210/ • An RDF schema describes not only the predicates of a web resource (e.g. title, author, etc.), but also the kind of resource described (e.g. books) • Development of user-oriented RDF vocabularies • Object-oriented description of data structures with multiple inheritance • Classes, predicates, constraints • Ontology for exchange of data on the Web • Important initiatives: Dublin Core Metadata Initiative, PICS labels, P2P
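Class hierarchies with multiple inheritance can be sketched as a transitive closure over subclass links. The class names below are invented for illustration, and the dictionary stands in for what an RDF vocabulary would declare with subclass statements.

```python
# Class -> direct superclasses (multiple inheritance allowed).
# All class names are hypothetical examples.
SUBCLASS_OF = {
    "Commentary": ["Book", "LegalDocument"],
    "Book": ["Resource"],
    "LegalDocument": ["Resource"],
}

def superclasses(cls, hierarchy):
    """All direct and indirect superclasses of cls (transitive closure)."""
    seen = set()
    stack = list(hierarchy.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy.get(parent, []))
    return seen

def is_kind_of(cls, candidate, hierarchy):
    """True if cls is the candidate class or one of its descendants."""
    return cls == candidate or candidate in superclasses(cls, hierarchy)

print(sorted(superclasses("Commentary", SUBCLASS_OF)))
print(is_kind_of("Commentary", "Resource", SUBCLASS_OF))
```

A search for "LegalDocument" can then also return commentaries, because the schema states what kind of resource each one is.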
Web Ontology Language (OWL) • Family of knowledge representation languages for developing ontologies • Revision of the DAML+OIL web ontology language • W3C standard 10 February 2004 • http://www.w3.org/2004/OWL/ • Two semantics based on Description Logics • OWL-DL • All OWL language constructs • OWL-Lite • Classification hierarchy and simple constraints (not widely used) • RDF/XML syntax • LKIF Core Ontology (Leibniz Center for Law, Amsterdam)
Semantic Web & ontologies • Semantic Web: highly developed description languages exist • Merger between web & ontologies envisaged • Quantity of mark-up so far insufficient • Legal semantic web • Semantic mark-up of legal information systems should be re-used • Field structures • Thesauri • Citations • AI & law (legal logic, conceptual information retrieval etc.) • Incorporation of world ontologies
Legal thesauri • Legal thesauri • ISO 2788 standard • Definition • Precompiled list of important words in a given domain of knowledge (controlled vocabulary) • Concepts are linked with relations • Synonyms (polysems), antonyms, broader term, narrower term, homonyms • Dictionary: definitions • Information science + legal information systems • Documentation and retrieval • Nucleus of a lexical ontology
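How thesaurus relations support retrieval can be sketched as query expansion. The entries and relations below are invented examples, not taken from an actual legal thesaurus, and `Entry` and `expand` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    """One controlled-vocabulary term with its thesaurus relations."""
    term: str
    synonyms: set = field(default_factory=set)
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)

# Invented sample entries.
THESAURUS = {
    "contract": Entry("contract", synonyms={"agreement"},
                      narrower={"sales contract", "lease"}),
    "lease": Entry("lease", synonyms={"tenancy"}, broader={"contract"}),
}

def expand(term, thesaurus, include_narrower=True):
    """Expand a query term with its synonyms and, optionally, narrower terms."""
    entry = thesaurus.get(term)
    if entry is None:
        return {term}  # unknown term: search for it unchanged
    terms = {term} | entry.synonyms
    if include_narrower:
        terms |= entry.narrower
    return terms

print(sorted(expand("contract", THESAURUS)))
```

Searching with the expanded set finds documents that use a synonym or a more specific term instead of the query word itself.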
Legal ontologies • Explicit formulation of a legal domain • Thesauri + definitions + more relations + formalisation for IT applications • Conceptual model • Abstract, simplified, computable • New form of abstraction and formalisation of law • Theory of formalisation (?) • Advantages • Computable • Links with world ontologies • Re-use of existing ontologies • Important tool for automation of law • Problems • High effort required for knowledge acquisition • Scaling-up (well-known problem in AI & law)
Related work • Earlier formalisation attempts • Hohfeld, Allen, McCarty, Stamper etc. • 1990s • FOLaw (Valente), FBO (van Kralingen, Visser) • Workshop on legal ontologies 1997 • LOAIT Workshop on Legal Ontologies and Artificial Intelligence Techniques 2005 and 2007 • ICAIL International Conference on Artificial Intelligence and Law • Sessions on ontologies since 1997 • LEX Legal XML Workshop Florence 2007 • Major research: Leibniz Center for Law, Amsterdam; ITTIG, Florence; University of Turin; Autonomous University of Barcelona; University of Vienna etc.
Types of applications • Representation of legal knowledge • e.g. FBO, LRI Core, LKIF • Conceptual information retrieval • Juriservice, LOIS • Advanced lexical ontologies • Multilingual thesauri • e.g. LOIS, Legal Taxonomy Syllabus • Interchange of documents and knowledge • e.g. MetaLex, eLaw
Knowledge representation (1) • Language for Legal Discourse LLD / McCarty (1989) • NORMA / Stamper (1991) • Frames-based ontology (FBO), van Kralingen and Visser • Common legal ontology; re-useable, 3 classes of model primitives, for each class a frame structure has been defined with all relevant attributes • Functional ontology (FOLaw), Valente • Aim: organisation and linking of legal knowledge, in particular in respect to conceptual information retrieval • 6 basic categories of legal knowledge • Normative knowledge, meta-legal knowledge, world knowledge, responsibility knowledge, reactive knowledge, creative knowledge
Knowledge representation (2) • ON-LINE (architecture of legal case-solving) • PROSA (training system for legal case-solving) • E-Court, LRI-Core, University of Amsterdam • Goal: semi-automated multi-lingual information management for various sources (audio, video, text); application area: penal law • LRI-Core: broad concept structure with typical legal main concepts • About 200 concepts, in development anchors • Links between foundational (upper) ontology (= world knowledge) and legal core ontology (legal concepts) • Supports legal subsumption
Knowledge representation (3) • Select/direct from various acts or agents to the legally relevant ones • E-Power, project of the Dutch Tax and Customs Administration • Application-oriented knowledge system; formalisation of laws and regulations as conceptual models • Automated tasks (e.g. subsumption, calculation, document assembly); comprehensive support from legislation to application • LKIF Core Ontology (Legal Knowledge Interchange Format)(Estrella project), University of Amsterdam • Standard OWL ontology • OWL-DL (description logic) • Description logic programs (DLP)
Knowledge representation (4) • Obligation, permission, roles, rights, duties, privileges, liabilities etc. • Top level clusters • Mereological relations • Location • Time • Changes (processes) • Agents + actions + roles • Propositions • Legal agents + actions, rights, powers • Norms • LKIF rules – more expressive than OWL • Application: traffic domain • Impressive standard
Multilingual thesauri (1) • LOIS (Lexical Ontologies for Legal Information Sharing) • Multilingual access to European legal databases • Formal representation of legal concepts in all languages on the basis of the WordNet technology; similar concepts • 6 languages, 5000 synsets • ILI inter-lingual index + legal definitions • 10 partners; leader: ITTIG, Florence • Legal Taxonomy Syllabus (University of Turin) • Tool to annotate and recover multilingual legal information (EU Directives) • Legal dictionaries • Taxonomies of legal concepts
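The role of an inter-lingual index can be sketched as a shared identifier that links synsets across languages. The identifiers, terms and the `translate` helper below are invented for illustration and do not reproduce actual LOIS data or its data format.

```python
# Inter-lingual index id -> language code -> terms of that synset.
# Ids and entries are hypothetical examples.
SYNSETS = {
    "ili:0001": {"en": ["contract"], "de": ["Vertrag"], "it": ["contratto"]},
    "ili:0002": {"en": ["consumer"], "de": ["Verbraucher"], "it": ["consumatore"]},
}

def translate(term, source_lang, target_lang, synsets):
    """Find target-language terms that share an index id with the source term."""
    results = []
    for by_lang in synsets.values():
        if term in by_lang.get(source_lang, []):
            results.extend(by_lang.get(target_lang, []))
    return results

print(translate("Vertrag", "de", "it", SYNSETS))
```

A query entered in one language can be mapped through the index and run against databases in the other languages.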
Multilingual thesauri (2) • DALOS (ITTIG, Florence) • Ontological-linguistic resource for multilingual drafting process (EU) • Basis: LOIS • Ontological layer: conceptual modelling at a language-independent level • Lexical layer: lexical manifestations in different languages • Term extraction using NLP tools
Advanced lexical ontologies • LOIS • Legal Taxonomy Syllabus • Juriservice • DALOS • Comprehensive legal ontology (University of Vienna) • Real world (world knowledge) • Legal system as an order of norms: socio-economic governance by law with the goal of risk reduction • Frames • Material rules • Procedural rules • Concepts • Concept frames • Starting point: legal thesauri, e.g. LOIS thesauri • Links: world knowledge, rules, top legal ontology • Hard core of a legal ontology
Interchange of documents and knowledge • Interchange standards for documents • Many international and European applications • e.g. EU, eLaw (Austria), MetaLex • Interchangeability of legal knowledge representation • MetaLex (University of Amsterdam) • Generic and extensible framework for XML-encoding of legal resources
Dynamic Electronic Legal Commentary (1) • Abstract representation of law in a conceptual & logical-systematic structure; like a printed commentary but in a machine-useable format • Description of the world ([possible] facts) • The core: links between possible facts (situations) and legal consequences • Problem: world ontologies still have some way to go before they are sufficient; legal formalisation has to move from small environments to the real big world • It's time to move on
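The core idea, links between possible facts and legal consequences, can be sketched as simple condition matching. The two rules and all fact names are invented toy examples, not actual legal provisions, and real subsumption is far harder than a subset test.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """Links a set of required facts to one legal consequence."""
    name: str
    conditions: frozenset  # facts that must all be present
    consequence: str

# Hypothetical toy rules, loosely themed on a traffic domain.
RULES = [
    Rule("speeding", frozenset({"is_driver", "exceeds_speed_limit"}), "fine"),
    Rule("no_licence", frozenset({"is_driver", "has_no_licence"}), "driving_ban"),
]

def consequences(facts, rules):
    """Consequences of every rule whose conditions are all among the facts."""
    facts = set(facts)
    return [rule.consequence for rule in rules if rule.conditions <= facts]

print(consequences({"is_driver", "exceeds_speed_limit"}, RULES))
```

The hard problems named on the slide sit outside this sketch: describing the facts in the first place (world ontologies) and scaling the rule base beyond small environments.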
Dynamic Electronic Legal Commentary (2) • For legal information systems: • Not the very, very big step, but: • Tools like a navigator [time and document types, layers of the legal order, consolidated texts] (e.g. PreLex), citator or terminologist are possible and would be highly desirable … • Thus well-paying services • In the near future • The real thing … some automated support for legal subsumption, e.g. helping in the real game of applying legal provisions (could that also be called legal reasoning or a legal expert system … maybe?)
European standard of legal ontologies • Motivators • Comparative legal research, harmonisation of EU law (e.g. Services Directive), European E-Government • Ontologies are standards, thus an obvious thing to do • Meta information in Dublin Core Metadata language • Citations (standard, URI) • Ontological structures • Rules • Haley Ltd. (formerly: Softlaw) • Concepts • E.g. lexical ontologies projects
Next steps (1) • Interchange of documents and knowledge • Legal documents: ongoing and improving • Legislative documents: many applications, standards like MetaLex may improve formalisations due to inclusion of knowledge representation aspects • Improvement and enlargement of legal thesauri • Up to 10,000 concepts • URI formalisation of citations on an EU level • Multilingual information retrieval • Conceptual information retrieval using legal thesauri • Improved searching, classification and summarisation of documents • Word sense disambiguation for easier coupling with legal information systems
Next steps (2) • Text analysis and text categorisation • Start: information system (text archive) • Classification • Concept analysis (e.g. DALOS, KONTERM) • (Semi)automatic text analysis • Summaries (e.g. SALOMON, KONTERM, FLEXICON) • Result: semantic description of the legal order; some “primitive” anchors to legal system and world knowledge • Inclusion of results of text analysis in an advanced lexical ontology
Next steps (3) • Concept frame • Header, definition, relations • + More relations, better definitions, links to legal rules + world knowledge • “Raw” conceptual legal ontology • “Raw” dynamic electronic commentary • Conceptual description of legal order with links to legal rules and world knowledge • Formalisation of dynamic electronic commentary in LKIF or other ontology languages • Big step, resources not available • More research required
Conclusions • Ontologies are the key to a computer-useable formalisation of knowledge about the world and the legal system • XML: standard for mark-up of legal documents • XML/ontologies: emerging standard for knowledge representation • New form of a legal commentary: dynamic, electronic, computer-useable • Big support for European legal work • Legal search, exchange of data, exchange of knowledge • Next steps • Exchange standards • Multilingual information retrieval • Improvement of legal thesauri • Some (semi)automatic text analysis and categorisation, advanced lexical ontologies • Later: formalisation in LKIF • Big potential for easier, better European legal work
Contacts Erich Schweighofer Universität Wien Arbeitsgruppe Rechtsinformatik Wiener Zentrum für Rechtsinformatik erich.schweighofer@univie.ac.at http://rechtsinformatik.www.univie.ac.at IRIS2009 Internationales Rechtsinformatik Symposion, Salzburg http://www.univie.ac.at/RI/IRIS2009