CS621 Seminar on: “Intelligent Database Systems”
By:
- Dhaval Bonde (MTech1-08305910)
- Prasad Gawde (MTech2-07305908)
Under the guidance of Prof. Pushpak Bhattacharyya.
Outline:
• Introduction
• Current types of Databases
• Features of Ideal Intelligent DB
• Evolutionary approach
• Expert System
• Case Study: Petrographer
• Comparison: Traditional DB and IDB
• Conclusion
• References
Introduction: Going beyond Relational Concepts
• ‘Database’ means a ‘repository of data’.
• It is not just tables and relationships.
• It may also contain documents and articles.
Introduction: Case 1: “Information Anxiety”
• David queries the Library DB.
• “Information-Management”: no results.
• “Information”: a huge list of documents.
• Categories (diagram label)
This example is inspired by D. Sleight, “Intelligent Databases: Easing Access to Information,” Michigan State University, Spring 1993.
Introduction:
• “Information”: 5153 results from the Library DB.
• “Anxiety”: 3996 results from the Library DB.
• Combined queries tried: Information AND Anxiety; Information ADJ Anxiety.
• Synonyms for “Information Anxiety”?
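The difference between the two combined queries can be sketched in a few lines of Python. This is only an illustration: it assumes ADJ means "the first word is immediately followed by the second", as in many bibliographic search systems, and the three documents are invented.

```python
import re

def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def matches_and(text, first, second):
    """AND: both words occur somewhere in the text, in any order."""
    words = tokenize(text)
    return first in words and second in words

def matches_adj(text, first, second):
    """ADJ (assumed meaning): `first` is immediately followed by `second`."""
    words = tokenize(text)
    return any(a == first and b == second for a, b in zip(words, words[1:]))

# Invented documents, for illustration only.
docs = [
    "Information anxiety grows with every unread report.",
    "Anxiety about exams; information on counselling is available.",
    "Managing information in large libraries.",
]

and_hits = [d for d in docs if matches_and(d, "information", "anxiety")]
adj_hits = [d for d in docs if matches_adj(d, "information", "anxiety")]
```

AND accepts the second document even though it is not about information anxiety; ADJ keeps only the first.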
Introduction: Case 2: “Information Anxiety”
• Leah queries the Library DB for “Information Anxiety”.
• Search terms: “Information” + “Anxiety” + “Information Anxiety”.
• Results: some appropriate items, plus synonyms for “Information Anxiety”.
• Synonyms not useful.
• Working from an example document, the system:
  • took the 100 most common important words in the example document and ranked them in order of occurrence,
  • compared those words with all documents in the DB,
  • displayed the documents with a high number of matches.
• Justify selections: rules, words, weightings.
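The three steps applied to the example document can be sketched as follows. This is a minimal illustration, not the method from Sleight's article: the stop-word list stands in for whatever decides that a word is "important", and the documents and function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: "important" words are simply those not in this small stop list.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "by", "at", "with"}

def words_of(text):
    """Lowercase alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())

def top_words(text, n=100):
    """Step 1: the n most frequent non-stop words, ranked by occurrence."""
    counts = Counter(w for w in words_of(text) if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

def rank_by_example(example, docs, n=100):
    """Steps 2 and 3: count matches per document, highest count first."""
    profile = set(top_words(example, n))
    scored = [(len(profile & set(words_of(text))), name)
              for name, text in docs.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

example_doc = "Information anxiety is the stress caused by information overload."
library = {  # hypothetical library contents
    "overload": "Coping with stress and information overload at work",
    "gardening": "Planting roses in spring",
    "retrieval": "Information retrieval systems",
}
ranking = rank_by_example(example_doc, library)
```

`ranking` lists each document with its match count, so the system could show the best matches first and report which words matched.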
Introduction:
• Identifies negative examples (rules, weights).
• Increases weightings of phrases that define “Information Anxiety”.
• Search again: relevant results!
• Database stores the history of her search.
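One simple way to picture this feedback loop is an additive weight update: terms seen in relevant examples gain weight, terms seen in negative examples lose it. The update rule, the step size and all names below are assumptions made for illustration; the slides do not say how the weightings are actually computed.

```python
def apply_feedback(weights, relevant_docs, irrelevant_docs, step=0.5):
    """Return a new weight table after positive and negative examples."""
    updated = dict(weights)
    for doc in relevant_docs:
        for term in set(doc.lower().split()):
            if term in updated:
                updated[term] += step          # reward terms from good hits
    for doc in irrelevant_docs:
        for term in set(doc.lower().split()):
            if term in updated:
                updated[term] = max(0.0, updated[term] - step)  # penalise
    return updated

def score(doc, weights):
    """Score a document as the sum of the weights of its distinct terms."""
    return sum(weights.get(term, 0.0) for term in set(doc.lower().split()))

# Hypothetical starting weights and feedback from one search round.
weights = {"information": 1.0, "anxiety": 1.0, "overload": 1.0, "management": 1.0}
updated = apply_feedback(
    weights,
    relevant_docs=["information overload causes anxiety"],
    irrelevant_docs=["information management software"],
)
search_history = [weights, updated]  # the database keeps the search history
```

After one round, "anxiety" and "overload" outweigh "management", so the next search ranks documents about information anxiety higher.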
Current Types of Databases:
• Full Text Databases:
  • Character string search.
  • Will return no documents if the exact string does not match.
  • Many variations have to be tried.
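A minimal sketch of why exact string search forces the user to try variations. The two documents are invented and the search is a plain, case-sensitive substring test.

```python
def full_text_search(query, docs):
    """Return the names of documents that contain the exact query string."""
    return [name for name, text in docs.items() if query in text]

# Invented documents, for illustration only.
docs = {
    "d1": "Information management in libraries",
    "d2": "Managing information overload",
}

no_hits = full_text_search("Information-Management", docs)   # hyphenated form
one_hit = full_text_search("Information management", docs)   # exact wording
other_hit = full_text_search("information", docs)            # lowercase form
```

The hyphenated query finds nothing, and even the capitalisation of the query changes which document comes back.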
Current Types of Databases:
• Indexed Keywords:
  • Each item is identified by keywords.
  • Keywords are assigned by the author or database manager.
  • The user must know the exact keyword to get the correct documents.
  • May involve a lot of guessing.
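A keyword index can be sketched as a mapping from each assigned keyword to the items that carry it. The catalog entries and keywords below are hypothetical.

```python
from collections import defaultdict

def build_keyword_index(catalog):
    """Map each assigned keyword to the set of items tagged with it."""
    index = defaultdict(set)
    for item, keywords in catalog.items():
        for keyword in keywords:
            index[keyword].add(item)
    return index

def lookup(index, keyword):
    """Exact-match lookup: an unknown keyword yields an empty list."""
    return sorted(index.get(keyword, set()))

# Hypothetical catalog: keywords chosen by the author or database manager.
catalog = {
    "doc-1": {"intelligent databases", "information retrieval"},
    "doc-2": {"knowledge graphs", "information retrieval"},
}
index = build_keyword_index(catalog)
```

`lookup(index, "knowledge graphs")` finds doc-2, while the near miss `lookup(index, "knowledge graph")` finds nothing, which is where the guessing comes from.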
Current Types of Databases:
• Hypertext Links:
  • Links between non-hierarchical but related types of information.
  • Nebulous searches become easier.
  • Links followed may or may not match the user’s thoughts.
  • E.g. a user may click the name of a person mentioned in an economics article to learn about his contribution to economics, but may end up getting the person’s biography instead!
So what is an Intelligent Database?
• An intelligent database is a full-text database that employs Artificial Intelligence (AI), interacting with users to ensure that returned items (hits) contain the most relevant information possible.
• Source: John Hopkins Institute online glossary / Whatis.com
Features of an Ideal Intelligent DB
• Feedback
  • Examples of what the user is searching for and what the user is NOT searching for.
• Interface
  • Windows showing previous searches.
• Help
  • Guiding the user to formulate queries.
• Selecting search terms
  • Alternate search terms from an online thesaurus.
• Displaying results of a search
  • Relevance of each hit, query words highlighted.
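Two of these features, alternate search terms from a thesaurus and highlighted query words, can be sketched as below. The thesaurus entries, the bracket-style highlighting and the function names are assumptions for illustration only.

```python
import re

# Hypothetical thesaurus entries.
THESAURUS = {"anxiety": ["stress", "worry"]}

def expand_terms(terms, thesaurus):
    """Add thesaurus alternatives after each term, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

def highlight(text, terms):
    """Wrap every whole-word occurrence of a term in square brackets."""
    if not terms:
        return text
    alternatives = "|".join(re.escape(term) for term in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: "[" + match.group(0) + "]", text)

terms = expand_terms(["information", "anxiety"], THESAURUS)
shown = highlight("Stress and information overload", terms)
```

Here a hit that never mentions "anxiety" is still found through the alternate term "stress", and the user can see which words caused the match.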
“Evolutionary” approach:
• The amalgamation or extension of existing technologies into hybrid forms.
• There are 3 possible configurations of DB-AI integration:
• Extending existing AI systems (loose coupling)
  • The DB system and the AI system are kept separate, with an additional component relaying information between the two.
  • These expert systems automatically formulate the queries and move data between the databases.
  • Keeping the DB and the knowledge base (KB) separate incurs a transformation processing cost.
  • These AI extensions are application specific.
“Evolutionary” approach:
• Extending existing DBMSs (tight coupling)
  • Deductive databases are the best example of tight coupling.
  • A deductive database system is a database system which can make deductions (i.e. conclude additional facts) based on rules and facts stored in the (deductive) database.
  • That is, it can define relations extensionally (as facts) as well as intensionally (as rules) in the database.
  • Example: consider an extensional relation “parent”, written parent(X,Y), meaning that ‘X’ is a parent of ‘Y’. Then we can define a relation intensionally as:
    rule 1: ancestor(X,Y) :- parent(X,Y).
    rule 2: ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
  • Here rule 1 says that if ‘X’ is a parent of ‘Y’ then ‘X’ is also an ancestor of ‘Y’, and rule 2 says that if ‘X’ is a parent of ‘Z’ and ‘Z’ is an ancestor of ‘Y’ then ‘X’ is an ancestor of ‘Y’.
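The two ancestor rules can be evaluated bottom-up: start from the stored parent facts and keep applying the rules until no new fact appears. The sketch below does this in Python rather than in a deductive database language. It follows the reading used in the rule explanations, where parent(X,Y) means X is a parent of Y; the names in the facts are invented.

```python
def derive_ancestors(parent_facts):
    """Compute the ancestor relation from parent facts by fixpoint iteration."""
    ancestor = set(parent_facts)           # rule 1: every parent is an ancestor
    changed = True
    while changed:                          # repeat until nothing new is derived
        changed = False
        for x, z in parent_facts:           # rule 2: parent(X,Z), ancestor(Z,Y)
            for z2, y in list(ancestor):
                if z == z2 and (x, y) not in ancestor:
                    ancestor.add((x, y))
                    changed = True
    return ancestor

# Extensional relation (stored facts): (X, Y) means X is a parent of Y.
parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}

# Intensional relation (derived by the rules).
ancestor = derive_ancestors(parent)
```

Only three facts are stored, yet six ancestor pairs can be concluded, including ("alice", "dave"), which appears nowhere in the stored data.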
“Evolutionary” approach:
  • Here the relation “ancestor” is defined by rules which say that every parent is an ancestor and that a parent of an ancestor is also an ancestor.
  • Deductive databases are an extension of relational databases which support more complex data modeling. They are more expressive than relational databases but less expressive than logic programming systems.
• The strict approach:
  • The system is built from scratch using total integration of DB technologies and AI technologies.
  • Very complex and time consuming.
Knowledge Graphs:
• Knowledge graphs belong to the category of semantic networks.
• One essential difference between knowledge graphs and other semantic networks is the explicit choice of only a few types of relations. For example, a knowledge graph uses relations such as:
  • CAU: cause-effect relation (e.g. unstable market positions cause polarization)
  • AKO: is a kind of (e.g. a married man is a kind of man)
  • PAR: is part of (e.g. having relations with high status is a part of social capital)
• Inverse relations are also allowed, such as:
  • CBY: is caused by
  • HAP: has as part
  • HAK: has a kind
• In principle, a knowledge graph is composed of concepts (tokens and types) and relationships (binary and multivariate relations). Types are labeled points representing generic concepts, and tokens represent arbitrary instantiations of types. As an example, “Pluto” is a token and “dog” is a type.
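A knowledge graph with this small, fixed set of relation types can be sketched as a set of labeled edges, where adding an edge also records its inverse. The class and method names are hypothetical; the relation codes and the two example edges come from the list above.

```python
# Each relation type paired with its inverse, as listed on the slide.
INVERSE = {"CAU": "CBY", "PAR": "HAP", "AKO": "HAK"}

class KnowledgeGraph:
    """Concepts joined by a restricted set of typed, invertible relations."""

    def __init__(self):
        self.edges = set()   # (source, relation, target) triples

    def add(self, source, relation, target):
        """Store an edge and its inverse; reject unknown relation types."""
        if relation not in INVERSE:
            raise ValueError(f"unsupported relation type: {relation}")
        self.edges.add((source, relation, target))
        self.edges.add((target, INVERSE[relation], source))

    def related(self, source, relation):
        """Concepts reachable from `source` through one `relation` edge."""
        return sorted(t for s, r, t in self.edges
                      if s == source and r == relation)

kg = KnowledgeGraph()
kg.add("unstable market positions", "CAU", "polarization")
kg.add("married man", "AKO", "man")
```

Asking `kg.related("polarization", "CBY")` follows the inverse edge back to "unstable market positions" without it ever being entered explicitly.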
Knowledge Graphs: Example
• The figure shows a knowledge graph with the CAU relation and different concepts in rectangular boxes.
Partial reproduction of a figure in “Popping, R. (2003). Knowledge graphs and Network Text Analysis. Social Science Information 42(1): 91-106.”
Expert Systems:
• Commonly an expert system consists of two components:
  • a “knowledge base” (a database of facts and rules), and
  • an “inference engine” (a program that can apply those rules and facts and come up with an “expert” solution to the question of a novice).
• An example of an expert system is the Himalayan club, which we discussed in class.
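The split between knowledge base and inference engine can be sketched as data plus a small loop. The engine below uses forward chaining, which is one common strategy; the facts and rules are invented and are not taken from any system mentioned in the seminar.

```python
def forward_chain(facts, rules):
    """Inference engine: fire rules until no new conclusion can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)   # all premises hold, so conclude
                changed = True
    return known

# Knowledge base: hypothetical facts and rules (premises, conclusion).
facts = {"has_fever", "has_cough"}
rules = [
    (("may_have_flu",), "recommend_rest"),
    (("has_fever", "has_cough"), "may_have_flu"),
]

conclusions = forward_chain(facts, rules)
```

The engine knows nothing about the domain; replacing `facts` and `rules` changes the expertise without touching `forward_chain`.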
Case Study: Petrographer
• Petrographer was developed to combine the symbolic knowledge representation and inferencing strengths of an expert system with the large-scale data handling abilities of a relational database (i.e. storage, management and consultation).
• The system:
  • Offers a sophisticated interface to support rock description (interface)
  • Allows the user to consult a large amount of petrographic data via an SQL interface (help)
  • Provides a geological interpretation of the described petrographic features according to the knowledge extracted (display of results)
  • Feedback
  • Selecting search terms
Petrographer : Knowledge Modelling
• The knowledge is elicited from an expert in the form of “cases”.
• Knowledge is modeled across 3 levels of abstraction: first from the actual domain to the knowledge model, and then from the knowledge model to the relational model.
• Domain knowledge is represented in an Object-Attribute-Value (OAV) structure to preserve most of the semantic content.
• These objects are mapped to a “frame” in the symbolic system and also to entities in the E-R model.
• These entities are then implemented in the database.
The figure is taken from the paper “Querying Petrographic Descriptions in an Intelligent Database System” by Cristina Paludo Santos, URI University.
Petrographer : Knowledge Modelling
• Roughly speaking, each frame in the knowledge model is mapped to an entity in the E-R model and each attribute is mapped to a field in the E-R model.
• The inferential aspect of knowledge is represented by knowledge graphs.
• Knowledge graphs capture associations between petrographic features and geological interpretations.
• Knowledge graphs are also mapped to and stored in the relational database as a special consultation table.
The figure is taken from the paper “Querying Petrographic Descriptions in an Intelligent Database System” by Cristina Paludo Santos, URI University.
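The mapping from frame to entity, and from knowledge graph to consultation table, can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption for illustration: the frame layout, the table and column names, and the sample values are invented and do not reproduce Petrographer's actual schema.

```python
import sqlite3

# Hypothetical frame: one object with typed attributes (OAV style).
frame = {"name": "rock_sample",
         "attributes": {"grain_size": "TEXT", "sorting": "TEXT"}}

def frame_to_ddl(frame):
    """Map a frame to a table and each attribute to a column."""
    columns = ", ".join(f"{attr} {sql_type}"
                        for attr, sql_type in frame["attributes"].items())
    return f"CREATE TABLE {frame['name']} (id INTEGER PRIMARY KEY, {columns})"

conn = sqlite3.connect(":memory:")
conn.execute(frame_to_ddl(frame))

# Knowledge graph edges stored as rows: feature -> interpretation.
conn.execute("CREATE TABLE consultation "
             "(attribute TEXT, value TEXT, interpretation TEXT, influence INTEGER)")

conn.execute("INSERT INTO rock_sample (grain_size, sorting) VALUES (?, ?)",
             ("fine", "well sorted"))
conn.execute("INSERT INTO consultation VALUES (?, ?, ?, ?)",
             ("sorting", "well sorted", "hypothetical interpretation A", 8))

columns = [row[1] for row in conn.execute("PRAGMA table_info(rock_sample)")]
suggested = conn.execute(
    "SELECT interpretation FROM consultation "
    "WHERE attribute = 'sorting' "
    "AND value = (SELECT sorting FROM rock_sample WHERE id = 1)"
).fetchall()
```

Because both the descriptions and the graph live in ordinary tables, a plain SQL query can connect a described feature to its candidate interpretation.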
Petrographer : System Architecture
Figure modified from the paper “Querying Petrographic Descriptions in an Intelligent Database System” by Cristina Paludo Santos, URI University.
Petrographer : System Architecture
• The “visual interface” allows the user to interact effectively with the system.
• Users can perform quantitative analysis as well as visual evaluation through the various interfaces provided by this module.
Petrographer : System Architecture
• The “symbolic system” is at the heart of the architecture.
• Its functions are shared by 3 subsystems:
  • Frame module
  • Inference module
  • Other subsystems
• The frame module keeps the schema structure of the entire domain. It also maintains the knowledge graphs.
• The inference module tries to match several parts of the case submitted by the user against the nodes in the knowledge graph.
Petrographer : System Architecture
• The inference module uses forward chaining for inferencing.
• The best match is calculated from the number of user features that match the graph and the influence of those features on the associated conclusion.
• The remaining subsystems support many aspects of the petrographic description task, such as statistics and quantitative analysis.
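The slides do not give the scoring formula, so the sketch below makes an assumption: each user feature that matches the graph contributes its influence value, and the interpretation with the highest total wins. This reflects both the number of matches and their influence, but it is not Petrographer's actual calculation; the features, interpretations and influence values are invented.

```python
def best_interpretation(case_features, graph):
    """Score each interpretation by the summed influence of matched features."""
    scores = {}
    for interpretation, evidence in graph.items():
        matched = [feature for feature in evidence if feature in case_features]
        scores[interpretation] = sum(evidence[feature] for feature in matched)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical knowledge graph: interpretation -> {feature: influence}.
graph = {
    "interpretation_A": {("sorting", "well sorted"): 6, ("grain_size", "fine"): 3},
    "interpretation_B": {("grain_size", "fine"): 2, ("cement", "calcite"): 7},
}

# The case described by the user, as (attribute, value) features.
case = {("sorting", "well sorted"), ("grain_size", "fine")}

best, scores = best_interpretation(case, graph)
```

Interpretation B has one heavily weighted feature, but it is absent from the case, so A wins on two matched features.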
Petrographer : System Architecture
• The database module is a standard RDBMS.
• It stores all descriptions managed by Petrographer.
• It manages the complete domain knowledge.
• The symbolic frame system is connected to the RDBMS module by a standard SQL interface such as ODBC.
Case Study: Petrographer
• As in the example of Leah discussed in the introduction, Petrographer helps users formulate queries by accepting query expressions (e.g. attribute=val) from them instead of SQL-like queries.
• Hence it hides the internal schema from users.
• During the search, if the user is not satisfied with the results returned by Petrographer, he/she can modify the query expressions to get better results.
• Petrographer thus learns new facts while searching is being performed, and these modified query expressions can be stored in the database for future reference.
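Translating an attribute=val expression into SQL can be sketched as below, again with sqlite3. The comma-separated syntax for combining conditions, the table and column names, and the saved_expression table are all assumptions; the source only states that expressions of the form attribute=val are accepted and can be stored.

```python
import sqlite3

# Hypothetical attributes the user may query; anything else is rejected.
ALLOWED_ATTRIBUTES = {"grain_size", "sorting"}

def expression_to_sql(expression, table="rock_sample"):
    """Turn 'attr=val, attr=val' into a parameterised SELECT statement."""
    conditions, params = [], []
    for part in expression.split(","):
        attribute, value = (piece.strip() for piece in part.split("=", 1))
        if attribute not in ALLOWED_ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute}")
        conditions.append(f"{attribute} = ?")   # value is passed separately
        params.append(value)
    sql = f"SELECT id FROM {table} WHERE " + " AND ".join(conditions)
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rock_sample "
             "(id INTEGER PRIMARY KEY, grain_size TEXT, sorting TEXT)")
conn.execute("CREATE TABLE saved_expression (expression TEXT)")
conn.executemany("INSERT INTO rock_sample (grain_size, sorting) VALUES (?, ?)",
                 [("fine", "well sorted"), ("coarse", "poorly sorted")])

expression = "grain_size=fine, sorting=well sorted"
sql, params = expression_to_sql(expression)
matching_ids = [row[0] for row in conn.execute(sql, params)]

# Keep the expression so it can be reused in a later session.
conn.execute("INSERT INTO saved_expression VALUES (?)", (expression,))
```

The user types only attribute names and values; table names, joins and SQL syntax stay hidden behind `expression_to_sql`.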
Case Study: Petrographer
• Advantages
  • Allows users to write a “query expression” instead of an SQL query.
  • Users need not know the complete schema of the database.
  • The query expressions written by the users can be stored in the database for later use.
• Disadvantages
  • Large overheads in processing speed during periods of high interaction between the various subsystems, most noticeable during the interpretation process, where inferencing is performed.
  • Performance degrades under multi-user access.
Traditional Databases vs. IDBs:
Traditional DB:
• Users need to know the complete schema for querying.
• Users must write SQL queries.
• The database system does not help in searching.
• Lacks semantic value.
• Stores only facts.
Intelligent DB:
• Users need not know the complete database schema.
• Users can simply use query expressions.
• Provides help to make searching effective.
• Semantic information is stored via knowledge graphs and other data structures in the database itself.
• Stores facts and rules.
Conclusion:
• An intelligent database is basically an “expert system” added as a layer on top of a “traditional database”. This integration leads to an efficient solution where expert systems are required to deal with larger databases.
• Intelligent databases are more expressive than traditional databases.
• Although IDBs seem to have an edge over traditional databases, we need some mechanism to reduce their processing overheads.
References:
• D. Sleight, “Intelligent Databases: Easing Access to Information,” Michigan State University, Spring 1993.
• Lee Jorgensen, “Artificial Intelligence: Intelligent Databases,” akri.org, Jan 2003. [Online]. Available: http://www.akri.org/ai/intdata.htm
• Mara Abel et al., “PetroGrapher: a solution in intelligent database system for petrographic analysis.” [Online]. Available: http://www.inf.ufrgs.br/pos/SemanaAcademica/Semana98/marabel.html
• Pamela A. Taylor and Dana L. Wyatt, “Database and artificial intelligence integration: a challenge to academia,” ACM SIGCSE Bull., 1992.
• Cristina Paludo Santos, URI University, “Querying Petrographic Descriptions in an Intelligent Database System,” ICAIS, 2002.
• R. Popping, “Knowledge graphs and Network Text Analysis,” Social Science Information 42(1): 91-106, 2003.