Language Resources and Tools For Supporting The System Engineering Process
Onditi V. O. et al.
Computing Department, Lancaster University
Overview
• System Engineering is a collaborative process.
• The process is characterised by decisions:
  • about the product
  • about the process
• The decisions are used for estimating individuals’ contributions, system maintenance and training
NLDB 2004
Challenges/Solutions
• Challenges
  • Decisions aren’t adequately recorded
    • Use different strategies for recording decisions, e.g. use both minutes and audio
  • Decisions are unstructured and therefore difficult to retrieve and use
    • Use formal representations during capture
    • Formalism introduces cognitive overload on decision makers
Challenges/Solutions (Cont’d)
• Decisions are implicitly recorded
  • Discover decisions through actions
• Solution
  • Use unstructured representation to create and share a structured representation
The architecture
[Figure: architecture diagram. Labels: Document Stream; Tokenization; Structure/Style Rules; POS/Semantic Tagging; POS/Semantic Rules; Indicator Words; Extract Actions Sentences; Syntactic Pattern; Store Action & Context on DB; Html derived document]
Document tokenisation
• Use the document’s style and structure to break it into paragraphs.
• Use string patterns to tokenise the paragraphs into:
  • sentences, multi-word expressions, and words
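The two tokenisation steps could be sketched roughly as follows. This is a minimal illustration, not the authors' tool: it assumes a blank line marks a paragraph boundary (standing in for the style and structure rules), and the regular expressions and the multi-word-expression list `MWES` are hypothetical.

```python
import re

# Hypothetical multi-word expressions; the slides do not list the ones used.
MWES = ["action item", "next meeting"]

def split_paragraphs(text):
    # Assumption: a blank line marks a paragraph boundary.
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    # Naive string pattern: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]

def split_words(sentence):
    # Try the known multi-word expressions first, then fall back to single words.
    mwe = "|".join(re.escape(m) for m in MWES)
    pattern = rf"(?:{mwe})|\w+(?:'\w+)?"
    return re.findall(pattern, sentence, flags=re.IGNORECASE)
```

Calling `split_paragraphs`, then `split_sentences` on each paragraph, then `split_words` on each sentence yields the three token levels named on the slide.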
Analysing a document’s content
• Analysis is done at two levels:
  • surface and
  • deep (syntactic/semantic)
• In surface analysis:
  • choose and constrain indicator words.
  • use indicator words for identifying agenda items and minute items.
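Surface analysis with constrained indicator words might look like the sketch below. The indicator words in `INDICATORS` and the constraint (the indicator must open the paragraph) are assumptions made for illustration; the slides do not say which words or constraints were chosen.

```python
import re

# Hypothetical indicator words; the slides do not say which were chosen.
INDICATORS = {
    "agenda_item": ["agenda", "item"],
    "minute_item": ["minute", "matters arising"],
}

def classify_paragraph(paragraph):
    """Return the item type whose indicator word opens the paragraph, else None."""
    for item_type, words in INDICATORS.items():
        alternatives = "|".join(re.escape(w) for w in words)
        # Assumed constraint: the indicator must be the first word(s) of the paragraph.
        if re.match(rf"\s*(?:{alternatives})\b", paragraph, flags=re.IGNORECASE):
            return item_type
    return None
```

Constraining the position keeps a sentence such as "The agenda was long" from being mistaken for an agenda item.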
Analysing a document’s content (cont’d)
• In deep analysis:
  • use the part-of-speech (pos) attribute for selecting content words (nouns, pronouns, verbs, adjectives)
  • use a pos pattern for extracting action statements (actions)
  • use the semantic (sem) attribute to associate action sentences
Template for action sentences
• An action sentence comprises:
  • an object (the action): a verb or verb phrase
  • a subject (the agent): a noun or pronoun
• Nouns and verb phrases are syntactically arranged:
  • the subject appears at the head
  • the object appears at the tail
Template for action sentences (cont’d)
• a modal verb or the function word ‘to’ ties the subject and object
• An action sentence template is defined thus:
  • subject + modal verb/function word ‘to’ + object
  • the template = NP* + VM/TO + V* in the CLAWS (Constituent Likelihood Automatic Word-tagging System) tag set.
Template for action sentences (cont’d)
• NP* matches all proper nouns (subject)
• VM/TO matches all modal verbs or the function word ‘to’
• V* matches all verbs (object)
• There are other ways to arrange a subject and an object in a sentence.
  • the subject can be at the tail instead of the head
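The template NP* + VM/TO + V* can be read as a three-tag window slid over a sentence's POS sequence. The sketch below treats the `*` wildcard as a simple prefix match on the tag name; that reading, and the function names, are assumptions rather than the authors' implementation.

```python
def is_subject(tag):
    # NP*: any tag beginning with NP (proper nouns).
    return tag.startswith("NP")

def is_link(tag):
    # VM/TO: a modal verb or the function word 'to'.
    return tag in ("VM", "TO")

def is_verb(tag):
    # V*: any tag beginning with V (verbs).
    return tag.startswith("V")

def find_action(tags):
    """Return the index where NP* + VM/TO + V* first starts, or -1 if absent."""
    for i in range(len(tags) - 2):
        if is_subject(tags[i]) and is_link(tags[i + 1]) and is_verb(tags[i + 2]):
            return i
    return -1
```

For example, `find_action(["NP1", "TO", "VVI"])` returns 0, while a sequence with no proper noun followed by a modal or ‘to’ returns -1. Sentences in which the subject follows the object would need a second pattern.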
Action template: An example
Action template: An example (cont’d)
• In the example, the elements are:
  • <s> = sentence, <w> = word
• Element <w> has attributes:
  • id (identity): identifies the ordinal number of a sentence in a document and the ordinal number of a word in a sentence
  • pos = part-of-speech
  • sem = semantic category
Action template: An example (cont’d)
• The pos sequence from id 37.5 to 37.7:
  • is NP1, TO, VVI
  • matches the action template NP* + VM/TO + V*
• The sentence is marked as an action
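The match described above can be reproduced on a small annotated sentence. In the sketch below only the ids 37.5 to 37.7 and the tags NP1, TO, VVI come from the slides; the words, the surrounding tokens and the `sem` values are invented placeholders, and the regular expression is one possible encoding of the template.

```python
import re
import xml.etree.ElementTree as ET

# Invented sentence: only ids 37.5-37.7 and the tags NP1, TO, VVI are from the slides.
SENTENCE = """
<s>
  <w id="37.4" pos="NN1" sem="x">Action</w>
  <w id="37.5" pos="NP1" sem="x">John</w>
  <w id="37.6" pos="TO" sem="x">to</w>
  <w id="37.7" pos="VVI" sem="x">circulate</w>
  <w id="37.8" pos="AT" sem="x">the</w>
  <w id="37.9" pos="NN2" sem="x">minutes</w>
</s>
"""

# NP* + VM/TO + V* written over a space-separated tag string.
TEMPLATE = re.compile(r"\bNP\w* (?:VM|TO) V\w*")

def action_span(sentence_xml):
    """Return the ids of the three words matching the template, or None."""
    words = ET.fromstring(sentence_xml.strip()).findall("w")
    tags = " ".join(w.get("pos") for w in words)
    match = TEMPLATE.search(tags)
    if match is None:
        return None
    # Convert the character offset into a word index by counting spaces before it.
    start = tags[:match.start()].count(" ")
    return [w.get("id") for w in words[start:start + 3]]
```

A non-`None` result is what would cause the sentence to be marked as an action.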
Action template: Results
Action template: Results (cont’d)
• Three sets (1, 2, 3) of minutes from four organisations (A, B, C, D) were processed
• Rel = relevant actions, Ret = actions retrieved by the tool, RelRet = relevant actions retrieved
• Recall = RelRet/Rel, Precision = RelRet/Ret
• Overall precision = 78%, overall recall = 62%
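The two measures can be computed directly from the three counts. The counts in the usage note are made up purely to show the arithmetic; they are not the figures from the results table.

```python
def recall(rel_ret, rel):
    # Recall = RelRet / Rel: share of the relevant actions that the tool found.
    return rel_ret / rel if rel else 0.0

def precision(rel_ret, ret):
    # Precision = RelRet / Ret: share of the retrieved actions that are relevant.
    return rel_ret / ret if ret else 0.0
```

With hypothetical counts Rel = 50, Ret = 40 and RelRet = 30, recall is 30/50 = 0.60 and precision is 30/40 = 0.75.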
Representing extracted information
• Extracted information is represented in a structured format:
  • agenda items, minute items and actions are represented as database objects
  • associations between the objects are captured
  • associations between the objects and the minute documents are captured
Retrieving actions
• Actions can be retrieved through:
  • browsing
  • query
• The context of the actions can be retrieved by jumping into the minute document
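One way to picture the structured representation and a query over it is the relational sketch below. The table names, column names, sample rows and the `anchor` column (a sentence id used to jump into the HTML document) are all hypothetical; the slides only state that the objects and their associations are stored in a database.

```python
import sqlite3

# Hypothetical schema: table and column names are not from the slides.
SCHEMA = """
CREATE TABLE documents    (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
CREATE TABLE agenda_items (id INTEGER PRIMARY KEY,
                           document_id INTEGER NOT NULL REFERENCES documents(id),
                           title TEXT NOT NULL);
CREATE TABLE minute_items (id INTEGER PRIMARY KEY,
                           agenda_item_id INTEGER NOT NULL REFERENCES agenda_items(id),
                           body TEXT NOT NULL);
CREATE TABLE actions      (id INTEGER PRIMARY KEY,
                           minute_item_id INTEGER NOT NULL REFERENCES minute_items(id),
                           agent TEXT NOT NULL,
                           task TEXT NOT NULL,
                           anchor TEXT NOT NULL);
"""

def build_demo_db():
    # In-memory database with one invented row per table.
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO documents VALUES (1, 'minutes-01.html')")
    conn.execute("INSERT INTO agenda_items VALUES (1, 1, 'Project plan')")
    conn.execute("INSERT INTO minute_items VALUES (1, 1, 'The plan was reviewed.')")
    conn.execute("INSERT INTO actions VALUES (1, 1, 'John', 'circulate the plan', 's37')")
    return conn

def actions_for(conn, agent):
    """Return (task, agenda title, link into the minute document) for one agent."""
    query = """
        SELECT a.task, g.title, d.path || '#' || a.anchor
        FROM actions a
        JOIN minute_items m ON m.id = a.minute_item_id
        JOIN agenda_items g ON g.id = m.agenda_item_id
        JOIN documents d    ON d.id = g.document_id
        WHERE a.agent = ?
    """
    return conn.execute(query, (agent,)).fetchall()
```

The joins follow the associations between objects, and the returned link is what would let a reader jump from an action to its context in the minute document.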
Retrieving actions (cont’d)
Conclusion
• Minute documents can be automatically structured and efficiently shared.
• Action sentences can be automatically extracted from minute documents.
• Process decisions can be tracked through actions.