Program Systems Institute, RAS. Artificial Intelligence Research Center. Pereslavl-Zalessky, Russia
Lines of research • Knowledge-based Dynamic Systems • Computer Linguistics: Information Extraction, Information Retrieval, Text Categorization • Image Analysis of Data • Nested Petri Nets
Miracle PS: a software toolkit for designing intelligent systems
Control over docking of a space vehicle with the orbital station. Control System Model: • docking parameters (restrictions); • analytical description of control zones; • ship conditions database; • ship model; • station model; • a set of goals; • a system of rules; • planned trajectory.
Control over docking of a space vehicle with the orbital station. Main control fields and the boundaries between them
Control over docking of a space vehicle with the orbital station. Main Goals: • Approaching • Divergence • Contact with the station with minimal damage. Subgoals: • Finding the station • Approaching • Hovering • Flyby
Control over docking of a space vehicle with the orbital station. Interface Visualization Module Research Prototype
SIRIUS Intelligent Meta-Search System
Intelligent Meta-Search System Sirius: a meta-search system with a multi-agent environment for distributed computation and a powerful linguistic module for text analysis
Features of the Sirius system • Extends standard keyword search mechanisms • Accepts queries in natural language • Uses semantic text processing methods • Includes new information sources automatically • Increases search accuracy • Uses parallel computation
Example of a search query. Query = “The President has arrived in Bruxelles”. The semantic relation DIR(X, Y) states that Y is the direction of movement of X (the role of X is «subject», the role of Y is «directive»): DIR(President, Bruxelles)
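A minimal sketch of how a relation such as DIR(President, Bruxelles) could be held as data and compared against relations found in a document. The `Relation` class, its field names, and the case-insensitive matching are assumptions made for illustration; the slides do not describe Sirius's internal representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A binary semantic relation such as DIR(X, Y). Hypothetical structure."""
    name: str       # relation label, e.g. "DIR"
    subject: str    # X, the entity that moves (role «subject»)
    directive: str  # Y, the direction of movement (role «directive»)


def matches(query: Relation, candidate: Relation) -> bool:
    """Case-insensitive comparison of relation label and both arguments."""
    return (
        query.name == candidate.name
        and query.subject.lower() == candidate.subject.lower()
        and query.directive.lower() == candidate.directive.lower()
    )


# Relation extracted from the query "The President has arrived in Bruxelles".
query = Relation("DIR", "President", "Bruxelles")

# Relations assumed to have been extracted from candidate documents.
doc_relations = [
    Relation("DIR", "president", "bruxelles"),
    Relation("DIR", "Minister", "Paris"),
]

hits = [r for r in doc_relations if matches(query, r)]
```

Only the first document relation survives the comparison, which is the behavior that distinguishes a relation-based match from a plain keyword match on "President" or "Bruxelles" alone.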
Calculation of relevance. Relevance is calculated from: • Semantic roles • Semantic connections • Keywords
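The slides name the three inputs to relevance but not how they are combined. One plausible reading is a weighted sum of three overlap scores, sketched below. The weights, the `overlap` measure, and the dictionary keys are all hypothetical and are not taken from Sirius.

```python
# Hypothetical weights: the slides do not say how the three signals are combined.
WEIGHTS = {"roles": 0.5, "connections": 0.3, "keywords": 0.2}


def overlap(query_items, doc_items):
    """Fraction of the query's items that also occur in the document."""
    q = set(query_items)
    if not q:
        return 0.0
    return len(q & set(doc_items)) / len(q)


def relevance(query, doc):
    """Weighted sum of role, connection, and keyword overlap."""
    return sum(w * overlap(query[k], doc[k]) for k, w in WEIGHTS.items())


query = {
    "roles": [("President", "subject"), ("Bruxelles", "directive")],
    "connections": [("DIR", "President", "Bruxelles")],
    "keywords": ["president", "arrive", "bruxelles"],
}
doc = {
    "roles": [("President", "subject")],
    "connections": [],
    "keywords": ["president", "bruxelles", "summit"],
}

score = relevance(query, doc)
```

In this example the document shares one of two roles, no connections, and two of three keywords with the query, so it is ranked below a document that also contains the DIR connection.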
INEX: Tools for Information Extraction. Artificial Intelligence Research Centre, Program Systems Institute, Russian Academy of Sciences, 152020 Pereslavl-Zalessky, Russia. +7 08535 98065 inex@epk.botik.ru
Information extraction Objective: • extract meaningful information of a pre-specified type from (typically large amounts of) texts for further analytical purposes Output: • data structures of a pre-specified format (filled scenario templates)
Possible IE application scenarios: • inference of new information (knowledge acquisition) • query formulation and answering in human-computer systems • automatic generation of abstracts and summaries • visualization of document content, etc.
Named entity recognizer • identifies proper names • assigns semantic features to certain items
Information extraction rules • a domain knowledge representation formalism (scenario templates) • a set of patterns to identify template elements in a text (covering the many possible ways to talk about the target event elements)
IE pattern includes: • a set of rules that define how to retrieve this pattern in a text • a set of constraints imposed on textual elements to fit into a particular slot of the target
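As an illustration of a pattern that pairs a retrieval rule with a slot constraint, the sketch below uses a regular expression as the rule and a predicate as the constraint. The `SlotPattern` class, the slot names, the regular expressions, and the `KNOWN_CITIES` list are hypothetical; INEX's actual rule formalism is not shown in the slides.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SlotPattern:
    slot: str                          # template slot this pattern fills
    regex: "re.Pattern[str]"           # rule: how to find a candidate in text
    constraint: Callable[[str], bool]  # constraint the candidate must satisfy


# Hypothetical gazetteer used as a constraint on the "destination" slot.
KNOWN_CITIES = {"Bruxelles", "Paris"}

PATTERNS = [
    SlotPattern(
        "person",
        re.compile(r"\b(?:The )?(President|Minister)\b"),
        lambda s: True,
    ),
    SlotPattern(
        "destination",
        re.compile(r"\barrived (?:in|at|to) ([A-Z][a-z]+)"),
        lambda s: s in KNOWN_CITIES,
    ),
]


def fill_template(text: str) -> Dict[str, Optional[str]]:
    """Return a template with one value (or None) per slot."""
    template: Dict[str, Optional[str]] = {p.slot: None for p in PATTERNS}
    for p in PATTERNS:
        for m in p.regex.finditer(text):
            value = m.group(1)
            if p.constraint(value):
                template[p.slot] = value
                break  # keep the first candidate that passes the constraint
    return template


filled = fill_template("The President has arrived in Bruxelles")
partial = fill_template("The Minister arrived in Moscow")
```

The second sentence yields a partially filled template: the rule finds "Moscow", but the constraint rejects it, so the destination slot stays empty.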
Coreference Resolver • recognizes different occurrences of the same entity in a text
Merging partial results • merging partially filled templates to produce a final, maximally filled template
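A minimal sketch of the merging step, assuming templates are dictionaries with `None` for unfilled slots and that the partial templates are already known to describe the same event. The first-non-empty-value policy is an assumption; the slides do not say how conflicting values are resolved.

```python
def merge_templates(partials):
    """Merge partially filled templates into one, keeping the first
    non-empty value seen for each slot (hypothetical conflict policy)."""
    merged = {}
    for template in partials:
        for slot, value in template.items():
            if value is not None and merged.get(slot) is None:
                merged[slot] = value
            else:
                # Record the slot even when it is still unfilled.
                merged.setdefault(slot, None)
    return merged


merged = merge_templates([
    {"person": "President", "destination": None},
    {"person": None, "destination": "Bruxelles", "date": None},
])
```

Here the merged template has `person` and `destination` filled from different partial results, while `date` stays empty because no partial template supplied it.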
Text categorization system • The goal of text categorization is to classify documents into a certain number of predefined categories, or classes. Each document may fall into one category, more than one, or none at all. When machine learning is used for text categorization, the goal is to train classifiers on a training set (a set of category-labeled documents).
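The sketch below shows one simple way to get the "one, several, or no category" behavior: one centroid per category built from the labeled training set, and an independent threshold test per category. The centroid method, the cosine measure, the threshold value, and the toy training data are assumptions chosen for illustration, not the classifier used by the system described here.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (whitespace tokenization, lowercased)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_docs):
    """Build one term-count centroid per category from (text, labels) pairs."""
    centroids = {}
    for text, labels in labeled_docs:
        for label in labels:
            centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids


def classify(text, centroids, threshold=0.3):
    """Return every category whose centroid is similar enough; may be empty."""
    v = vectorize(text)
    return {label for label, c in centroids.items() if cosine(v, c) >= threshold}


# Toy category-labeled training set (hypothetical).
centroids = train([
    ("rocket docking orbital station", {"space"}),
    ("search engine query ranking", {"search"}),
    ("orbital station search query", {"space", "search"}),
])

one = classify("docking with the station", centroids)
both = classify("station search query", centroids)
none = classify("cooking pasta recipe", centroids)
```

Because each category is tested independently, a document is not forced into its single best category, which matches the multi-label setting described above.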
Features • Both one-word and multi-word terms are used for text categorization. • Extraction of multi-word terms is based on partial syntactic analysis of texts. • Conventional statistics-based term weighting is enhanced by taking into account different types of term occurrence in a document.
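The slides do not say which types of term occurrence are distinguished. As one possible interpretation, the sketch below treats the occurrence type as the document zone (title or body) and scales each occurrence's contribution to the term weight accordingly. The zone names and the `ZONE_WEIGHTS` values are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights per occurrence type; the source does not give values.
ZONE_WEIGHTS = {"title": 3.0, "body": 1.0}


def weighted_term_frequency(occurrences):
    """Sum zone-dependent weights over (term, zone) occurrences.

    Terms may be one-word or multi-word; unknown zones count as 1.0.
    """
    weights = defaultdict(float)
    for term, zone in occurrences:
        weights[term] += ZONE_WEIGHTS.get(zone, 1.0)
    return dict(weights)


occurrences = [
    ("petri net", "title"),
    ("petri net", "body"),
    ("token", "body"),
    ("token", "body"),
]

term_weights = weighted_term_frequency(occurrences)
```

Both terms occur twice, but the multi-word term "petri net" ends up with the higher weight because one of its occurrences is in the title.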