Semi-automatic Entity-Relationship Modelling Through Natural Language Processing
Nazlia Omar
Supervisors: Dr. Paul Hanna, Prof. Paul Mc Kevitt
Faculty of Engineering, University of Ulster, Jordanstown
Objectives of research
• Design and implement ER-Converter to transform natural language specifications of database problems into Entity-Relationship (ER) models
• Develop new heuristics to assist the transformation
• Evaluate the approach against human performance and compare it with other work in the area
Previous work
• E-R Generator (Gomez et al., 1999)
• Dialogue Tool (RADD) (Buchholz et al., 1995)
• DMG (Tjoa and Berger, 1993)
• ANNAPURNA (Eick and Lockemann, 1985)
• CM-Builder (Harmain and Gaizauskas, 2003)
Heuristics in Database Design
• “Heuristics are simple procedures, often guided by common sense, that are used to provide, easily and quickly, good but not necessarily optimal solutions to difficult problems.” (Zanakis and Evans, 1981, p. 84)
• Used to determine entities, attributes, relationships and cardinalities
• Gathered from past work and newly formed
Example Heuristics
• HE1: A common noun may indicate an entity type.
• HE2: A proper noun may indicate an entity.
• HE3: A gerund (a noun converted from a verb, also known as a verbal noun) may indicate an entity type which is converted from a relationship type.
• HE4: If two consecutive nouns are present, check the second noun. If it is not one of these words (number, no, id, address and name), it is most likely an entity. Otherwise it may indicate an attribute.
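The noun-based heuristics above can be illustrated with a minimal sketch. It assumes the input is already part-of-speech tagged with Penn Treebank style tags (NN, NNS, NNP, NNPS); the tag set, the function name `apply_entity_heuristics` and the output triples are assumptions made for illustration, not ER-Converter's actual code. HE3 is left out because a plain tag check cannot reliably separate gerunds from present participles.

```python
# Hypothetical sketch of HE1, HE2 and HE4 over (word, tag) pairs.
# ASSUMPTION: Penn Treebank style tags; the slides do not name the tag set.

# Word list taken from HE4 on the slide.
ATTRIBUTE_WORDS = {"number", "no", "id", "address", "name"}


def apply_entity_heuristics(tagged):
    """Return (word, heuristic_id, suggestion) triples for one tagged sentence."""
    hits = []
    for i, (word, tag) in enumerate(tagged):
        if tag in ("NN", "NNS"):        # HE1: common noun
            hits.append((word, "HE1", "entity type"))
        elif tag in ("NNP", "NNPS"):    # HE2: proper noun
            hits.append((word, "HE2", "entity"))

        # HE4: two consecutive nouns, so inspect the second one.
        previous_is_noun = i > 0 and tagged[i - 1][1].startswith("NN")
        if previous_is_noun and tag.startswith("NN"):
            if word.lower() in ATTRIBUTE_WORDS:
                hits.append((word, "HE4", "attribute"))
            else:
                hits.append((word, "HE4", "entity"))
    return hits


if __name__ == "__main__":
    sentence = [("Each", "DT"), ("employee", "NN"), ("number", "NN"),
                ("is", "VBZ"), ("unique", "JJ")]
    for hit in apply_entity_heuristics(sentence):
        print(hit)
```

Several heuristics can fire on the same word, as with "number" here (HE1 and HE4), which is why a later step is needed to weigh the conflicting suggestions.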
Heuristics’ Weights
• Each heuristic is assigned a weight that reflects the level of confidence in it
• Example:
  HE6 => { weight => "0.60", element => "Entity", status => "New" },
  HA2 => { weight => "-0.50", element => "Attribute", status => "New" },
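One way such a weight table might be used is sketched below. The two entries are copied from the slide. The slides do not say how weights from several heuristics are combined, so summing the signed weights per candidate and element type is purely an assumption, as is the function name `score_candidates`.

```python
from collections import defaultdict

# The only two entries shown on the slide, stored as numbers.
HEURISTICS = {
    "HE6": {"weight": 0.60, "element": "Entity", "status": "New"},
    "HA2": {"weight": -0.50, "element": "Attribute", "status": "New"},
}


def score_candidates(fired):
    """Accumulate evidence for each (candidate, element) pair.

    fired: iterable of (candidate, heuristic_id) pairs.
    ASSUMPTION: evidence is combined by summing signed weights;
    the slides do not specify the combination rule.
    """
    scores = defaultdict(float)
    for candidate, heuristic_id in fired:
        rule = HEURISTICS[heuristic_id]
        scores[(candidate, rule["element"])] += rule["weight"]
    return dict(scores)


if __name__ == "__main__":
    print(score_candidates([("department", "HE6"), ("department", "HA2")]))
```

Under this assumed rule, a negative weight such as that of HA2 lowers the score for the named element type rather than raising it.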
Architecture of ER-Converter
[Figure: natural language requirements specification → memory-based shallow parser → heuristics-based ER analysis, with user assistance → entity types, attribute types, relationship types, cardinalities]
ER-Converter
• Step 1: Read natural language input text into the Memory-based Shallow Parser
• Step 2: Part-of-speech tagging
• Step 3: Remove plurals
• Step 4: Apply heuristics
• Step 5: Assign weights
• Step 6: Human intervention
• Step 7: Produce final ER model
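The ordering of the seven steps can be shown as a small pipeline skeleton. Every stage is passed in as a function, because the slides do not describe how each stage is implemented. The name `er_pipeline` and all parameter names are hypothetical.

```python
def er_pipeline(text, tag, singularize, apply_heuristics, assign_weights, review):
    """Chain the seven steps listed above; every stage is a caller-supplied function.

    ASSUMPTION: the interfaces below are illustrative only.
      tag(text)                 -> list of (word, pos_tag)       steps 1-2
      singularize(word, tag)    -> word with plural removed      step 3
      apply_heuristics(tagged)  -> heuristic hits                step 4
      assign_weights(hits)      -> weighted candidates           step 5
      review(candidates)        -> final ER model after the      steps 6-7
                                   user has resolved open cases
    """
    tagged = tag(text)
    tagged = [(singularize(word, pos), pos) for word, pos in tagged]
    hits = apply_heuristics(tagged)
    candidates = assign_weights(hits)
    return review(candidates)
```

Keeping human intervention as the last function in the chain mirrors the semi-automatic design: the automatic stages propose candidates and the user makes the final decision.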
Evaluation Measures
\[ \text{Recall} = \frac{N_{\text{correct}}}{N_{\text{correct}} + N_{\text{missing}}} \]
\[ \text{Precision} = \frac{N_{\text{correct}}}{N_{\text{correct}} + N_{\text{incorrect}}} \]
• Other measures: overgenerated, undergenerated, ask user, unattached and wrongly attached
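The two main measures are simple ratios: recall is the number of correct items divided by correct plus missing items, and precision is the number of correct items divided by correct plus incorrect items. A short sketch follows; the counts in the usage lines are made up for illustration and are not results from the study, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def recall(n_correct, n_missing):
    """Share of the expected items that the system found."""
    total = n_correct + n_missing
    return n_correct / total if total else 0.0


def precision(n_correct, n_incorrect):
    """Share of the system's answers that were right."""
    total = n_correct + n_incorrect
    return n_correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(recall(n_correct=8, n_missing=2))
    print(precision(n_correct=8, n_incorrect=2))
```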
Experimental Results
• Based on 30 natural language specifications
• Evaluation results: [table not recoverable from the slide text]
Conclusion and Future Work
• The formation of new heuristics is a contribution, as supported by the evaluation results
• Integration of WordNet
• Semantic analysis