Legal Information Retrieval on the Web: The Experience of the NiR Portal. Costantino Ciampi. CONSIGLIO NAZIONALE DELLE RICERCHE, Istituto di Teoria e Tecniche dell’Informazione Giuridica. http://www.ittig.cnr.it. Rome, 26 April 2004.
Legal Information Retrieval on the Web: The Experience of the NiR Portal (http://www.nir.it)
Costantino Ciampi, e-mail: c.ciampi@ittig.cnr.it
Contents
• Normeinrete (NIR), “Access to Law on the Net”: an e-Government project
  • Project description (goals, technology, results)
• Standardization in the legal domain:
  • XML representation of Italian norms
  • URN adoption to automate hyperlinking among norms in a distributed environment
NiR Project "Access to Law on the Net": Project goals
• Improving accessibility to legislation by providing a single point of access to Italian and EU legal documents published on different web sites
  • ICT to support the fulfilment of rights
• Supporting the Public Administration (PA) in managing the life cycle of legislative documentation and law consolidation by providing standards, software tools and methodologies
  • ICT to improve PA efficiency
• A system prototype (third version) is available at the URL: http://www.normeinrete.it
NiR Actors
• Main actors:
  • Ministry of Justice (initiator) (www.giustizia.it)
  • AIPA, now CNIPA (formerly the Authority, now the National Center, for Information Technology in the Public Administration): founder and technical coordinator (www.cnipa.it)
• Scientific and technical partners:
  • Institute of Legal Information Theory and Technologies of the CNR, Florence (www.ittig.cnr.it)
  • CINECA Consortium, Bologna (www.cineca.it)
• Public Administrations participating in the Project
Steps and Resources of the NiR Project
• Phase I (May 1999 - May 2000): first feasibility study and implementation of the Portal prototype
• Phase II (December 2000 - November 2001): second feasibility study, extension of the document base and qualitative evolution of the Portal prototype
• Phase III (2002/2003): definition of the standards (URN and XML) and preparation of the software for disseminating them (reference parser, structure parser, NIREditor XML)
• Phase IV (2004/2005): entrusting of operations to external managers and full operation of the NIR Portal (funded by the e-Government programme and Italian financial laws)
NiR Project Strategy
• Implementation of a specialized portal delivering search and retrieval functions for legislative documents published on the web sites of various Public Administrations;
• Definition of standards, consistent with Internet technologies, to represent data and metadata meaningful in the legal domain;
• Development and distribution of open source software to support legislative document management and publishing;
• Training and knowledge sharing among Public Administrations.
Present Results
• www.normeinrete.it provides unified access to Italian and European Union legislation published on different institutional web sites. So far:
  • more than 50 public institutions have taken part in the Project;
  • more than 140,000 documents have been indexed;
  • about 160,000 search sessions are held monthly on the site;
  • the NiR Legal Database ("Norm Catalogue"), including metadata, has been created and is kept up to date;
  • the NiR Standards have been defined.
• Two standards issued by AIPA/CNIPA as technical norms:
  • DTD definitions for Italian legislation;
  • URN definition for any kind of legal document.
• Editors and other software tools developed and distributed to the PA to support implementation of the standards.
NiR Features
• The system is based on a co-operative technological architecture, resulting in a federation of legislative databases developed on different platforms.
• Co-operation is achieved by means of suitable application gateways, which provide "loose" integration by adopting two standards:
  • one for identifying legal resources (URNs), and
  • one for representing document structures and metadata in XML mark-up, according to ad hoc DTDs.
Searching Tools and Architecture of the NIR System (1)
The NIR System consists of:
• NiR nodes: components belonging to administration domains, containing legal database systems and the related application gateways. Documents can be stored in the file system or within database/full-text management systems; they are all accessible through the Internet.
• Central registries: components in the co-operative layer that publish the information needed to allow effective co-operation.
Searching Tools and Architecture of the NIR System (2)
Central registries include:
• Standards repository (XML DTD and URN grammar definitions and tools);
• Registry of official Authority names, needed to standardise URN adoption;
• Registry of NiR nodes, containing the information needed to allow interaction between NiR agents and domain application gateways;
• Norm Catalogue, containing, for each norm: title, basic classification, URN and the list of known physical addresses (URLs) where it is published.
The Norm Catalogue (> 45,000 documents)
• The Norm Catalogue is a relational database containing, for each norm: title, basic classification, URN and the list of known physical addresses (URLs) where it is published.
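The catalogue record described above (title, basic classification, URN, and a list of URLs) maps naturally onto two related tables. The following is only a minimal sketch using an in-memory SQLite database; the table names, column names, and helper functions are hypothetical and are not taken from the actual NiR database.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the real NiR ones.
SCHEMA = """
CREATE TABLE norm (
    urn            TEXT PRIMARY KEY,   -- location-independent identifier
    title          TEXT NOT NULL,
    classification TEXT                -- basic subject classification
);
CREATE TABLE norm_location (
    urn TEXT NOT NULL REFERENCES norm(urn),
    url TEXT NOT NULL,                 -- one known physical address
    PRIMARY KEY (urn, url)
);
"""


def open_catalogue() -> sqlite3.Connection:
    """Create an empty in-memory catalogue."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_norm(conn, urn, title, classification, urls):
    """Insert one norm together with every URL where it is published."""
    conn.execute("INSERT INTO norm VALUES (?, ?, ?)", (urn, title, classification))
    conn.executemany(
        "INSERT INTO norm_location VALUES (?, ?)", [(urn, u) for u in urls]
    )


def locations(conn, urn):
    """Return the known URLs for a URN, sorted; empty list if none."""
    rows = conn.execute(
        "SELECT url FROM norm_location WHERE urn = ? ORDER BY url", (urn,)
    )
    return [row[0] for row in rows]
```

Keeping the URLs in a separate table reflects the one-to-many relation between a norm and the sites that publish it.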
NiR Standards
• Uniform Resource Name (URN) definition (based on IETF specifications) to:
  • identify each document regardless of its physical address (URL)
  • allow automatic hyperlinking through a resolution system (like DNS)
• Document Type Definition (DTD) for Italian legislative and regulatory acts (based on the W3C XML meta-language) to represent document structure, semantics and metadata
(*) The standards have been issued as AIPA/CNIPA technical standards and published as regulations in the Italian Official Journal
URNs (1/3)
• Each law contains several references to other laws: the whole legislative corpus can be seen as a net, with laws as nodes connected through references;
• Building a hypertext of laws through URLs requires manual work;
• The URN is a persistent, location-independent resource identification mechanism;
• URNs are defined as a combination of elements, according to a specific grammar: basically the name of the enacting Authority, the type of norm, the date, the number and, when needed, some more detailed specifications;
• URNs can be built regardless of whether the corresponding documents are available on-line.
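The composition of a URN from enacting Authority, type of norm, date, and number can be sketched as follows. The precise grammar is fixed by the NiR URN standard, which these slides do not reproduce; the layout `urn:nir:<authority>:<type>:<date>;<number>` used here, and the `NormReference` and `build_urn` names, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NormReference:
    """The basic elements listed on the slide (hypothetical class name)."""
    authority: str   # name of the enacting Authority
    norm_type: str   # type of norm
    enacted: date    # date of the act
    number: str      # number of the act


def _token(value: str) -> str:
    # Assumed normalisation: lower case, spaces replaced by dots.
    return value.strip().lower().replace(" ", ".")


def build_urn(ref: NormReference) -> str:
    """Combine the elements into one identifier.

    Assumed layout: urn:nir:<authority>:<type>:<yyyy-mm-dd>;<number>
    """
    return (
        f"urn:nir:{_token(ref.authority)}:{_token(ref.norm_type)}:"
        f"{ref.enacted.isoformat()};{ref.number}"
    )
```

Because the identifier is computed from the citation data alone, it can be produced before any on-line copy of the document exists.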
URNs (2/3)
• Adopting a URN-based scheme makes it possible to build an automated distributed hypertext, according to a model similar to the DNS (Domain Name System), which resolves self-explanatory web site names into numerical IP addresses.
• This opportunity relies on the following considerations:
  • the natural language expressions used in law references usually follow repetitive patterns, which can therefore be detected automatically;
  • the URN is built by combining data that are (almost) always included in the reference;
  • the cross-references between each URN and the list of corresponding URLs, needed for the resolution service, can be built automatically.
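The idea that citations follow repetitive patterns, and that the URN can be assembled from data in the citation, can be illustrated with a regular expression. This is a deliberately narrow sketch, not the NiR reference parser: it recognises only one citation form ("legge 31 dicembre 1996, n. 675"), assumes the enacting Authority is the State, and assumes a URN layout that is not shown in these slides.

```python
import re

# Italian month names mapped to month numbers.
MONTHS = {
    "gennaio": 1, "febbraio": 2, "marzo": 3, "aprile": 4,
    "maggio": 5, "giugno": 6, "luglio": 7, "agosto": 8,
    "settembre": 9, "ottobre": 10, "novembre": 11, "dicembre": 12,
}

# One assumed citation pattern: "<type> <day> <month> <year>, n. <number>".
REFERENCE = re.compile(
    r"\b(?P<type>legge|decreto legislativo)\s+"
    r"(?P<day>\d{1,2})\s+(?P<month>[a-z]+)\s+(?P<year>\d{4}),\s*"
    r"n\.\s*(?P<number>\d+)",
    re.IGNORECASE,
)


def find_references(text: str) -> list:
    """Return an (assumed-format) URN for every citation found in text."""
    urns = []
    for match in REFERENCE.finditer(text):
        month = MONTHS.get(match.group("month").lower())
        if month is None:
            continue  # the word after the day is not a month name
        norm_type = match.group("type").lower().replace(" ", ".")
        day = int(match.group("day"))
        # Assumption: these act types are enacted by the State ("stato").
        urns.append(
            f"urn:nir:stato:{norm_type}:"
            f"{match.group('year')}-{month:02d}-{day:02d};{match.group('number')}"
        )
    return urns
```

A production parser would need many more patterns (abbreviations, partial citations, references to single articles), which is where the "almost always" on the slide comes in.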
URNs: tools and examples (3/3)
• Parser
  • Available on-line; automatically detects references within laws.
• Resolution service
  • Resolves URNs into URLs (when known).
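A resolution service of the kind described above is, at its core, a lookup from one URN to zero or more URLs. The class below is a hypothetical in-memory sketch of that behaviour; the real service, its interface, and its storage are not described in these slides.

```python
from typing import Dict, List


class UrnResolver:
    """Toy resolver: maps each URN to the URLs where the act is published."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def register(self, urn: str, url: str) -> None:
        """Record one more known location for a URN (duplicates ignored)."""
        urls = self._locations.setdefault(urn.lower(), [])
        if url not in urls:
            urls.append(url)

    def resolve(self, urn: str) -> List[str]:
        """Return all known URLs, or an empty list when none is known."""
        return list(self._locations.get(urn.lower(), []))
```

Returning an empty list rather than raising an error mirrors the "when known" caveat: a valid URN may simply have no published copy yet.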
XML Representation of Italian Legislative and Regulatory Acts (1/5)
Three categories:
• Documents with a well-defined structure
  • laws, constitutional laws, regional laws
• Partially structured documents
  • regulations, decrees
• Generic documents
  • any kind of non-structured acts, enclosures, ...
DTD definition approach (2/5)
Three DTDs:
• Basic DTD: well-structured simple documents
• Strict DTD: well-structured complex documents
• Loose DTD: documents with irregular structure or exceptions (suitable for historical documents)
Each DTD can represent several document types.
Mark-up must be carried out using only the relevant elements.
XML Elements (categories) (3/5)
• Structural elements
  • heading, preamble, sections, articles, paragraphs, ...
• Special elements
  • references to other laws, formatted representation of relevant entities embedded in the text (institutions, dates, places)
• Elements containing metadata
  • subject-matter classification, publication data, preparatory proceedings (iter)
• Semantic elements
  • obligations, prohibitions, penalties, exceptions, modifications, abrogations, ...
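To show how these element categories could be used once a text is marked up, here is a small sketch that reads a toy document with Python's standard `xml.etree.ElementTree`. The element and attribute names (`act`, `meta`, `article`, `ref`, `urn`) are invented for the example and are not those of the NiR DTDs.

```python
import xml.etree.ElementTree as ET

# Toy document: element names are hypothetical, not the NiR DTD vocabulary.
SAMPLE = """<act>
  <meta><subject>data protection</subject></meta>
  <heading>Example act</heading>
  <article id="art1">
    <paragraph>See <ref urn="urn:example:act-1">Act no. 1</ref>.</paragraph>
  </article>
</act>"""


def outgoing_references(xml_text: str) -> list:
    """List (urn, cited text) pairs for every reference element."""
    root = ET.fromstring(xml_text)
    return [(ref.get("urn"), "".join(ref.itertext())) for ref in root.iter("ref")]


def article_ids(xml_text: str) -> list:
    """List the identifiers of the structural 'article' elements."""
    root = ET.fromstring(xml_text)
    return [article.get("id") for article in root.iter("article")]
```

With references carried as explicit elements, hyperlinks can be generated from the `urn` attribute instead of being hard-coded as URLs.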
Examples of Legal Texts in XML (4/5)
• Example of an Italian Act, tagged with the Basic DTD
• Examples of fragments of legal texts in different formats (XML vs HTML)
• Navigating the document structure with a visual XML editor
Training on XML and Development of an XML NirEditor (5/5)
Considering the relevance of XML to NIR:
• an intense training activity has been carried out, also with the aid of a multimedia e-learning product developed by ITTIG-CNR;
• an XML editor, which will be distributed as open source software, has been developed by ITTIG-CNR and enriched with parsing functions.
Opportunities deriving from NIR standards
• Advanced search functions
• Supporting the life cycle of legislative documents (law enactment workflow, "law in force" at any given date)
• Moving from a totally “free” approach to a more formally defined organizational model, in order to achieve completeness and to improve precision
Concluding Remarks: Current Developments and Future Initiatives
• Software tools to support Administrations in the adoption of NiR standards
• XML Schema definition
• Parsing services
• New metadata
• Implementation of distributed URN resolution
• Certification of the authenticity of acts through digital signature technology