Intelligent Databases: An Overview of Ideas and Developments. Jenny Carter, De Montfort University, UK. Web ref: www.jennycarter.com
Introduction • Overview of developments & research w.r.t. Database/AI integration • Active databases • Overview of KBS (Knowledge Based Systems) • Deductive Databases • Coupling of KBS and standard DBMS • State of the Art • Other developments
1. Active Databases • Traditional databases are passive, i.e. queries, updates and transactions are executed only when requested. • Certain applications, e.g. inventory control, factory automation, etc., are not well supported by a passive DBMS. • Capabilities such as automatic monitoring of conditions & the ability to take actions (e.g. re: timing) require an ACTIVE DBMS. • Uses the idea of TRIGGERS.
1. Active Databases • Two initial approaches: • specific code in application programs to perform these tasks (problems: maintenance can be difficult, since conditions/actions might be spread over several application programs; such code fragments can also be hard to understand) • building special application software that periodically polls the DB to detect relevant events (generally all coded in one application program; the frequency of polling is an issue though) • Due to problems with both of these methods, many systems have been extended with a built-in centralised sub-system to provide active capabilities (i.e. active rules, or Triggers)
Basic Concepts of Active Rules • Active rules take the ECA form: ON Event IF Condition DO Action, e.g. ON update of an employee's salary IF the new salary < 10000 DO roll back the update. • Example events: insertions/deletions/updates on columns; temporal events, i.e. a time at which the rule should be activated; application-defined events, which can be external to the database, e.g. a temperature measured by a sensor (the application needs to tell the DBMS). • Example conditions: like the WHERE clause in an SQL statement, or even a complete query; or the result of a procedure written in the host language, possibly with embedded database queries. Can be related to special system variables, e.g. Current User. • Actions: data updates, further queries, other DB operations (commit, abort, etc.), calls to application procedures.
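The ECA form can be sketched outside any particular DBMS. The following is a minimal illustration in Python, not how a real active DBMS is implemented: the names Rule, RuleEngine, RollbackUpdate and update_salary are all hypothetical, and the "database" is just a dictionary. It encodes the salary example above: on an update of salary, if the new salary is below 10000, the update is rolled back.

```python
from dataclasses import dataclass
from typing import Callable


class RollbackUpdate(Exception):
    """Raised by an action to veto the triggering update (hypothetical name)."""


@dataclass
class Rule:
    """One ECA rule: run `action` when `event` occurs and `condition` holds."""
    event: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


class RuleEngine:
    """Central place where all active rules are registered and evaluated."""

    def __init__(self):
        self.rules = []

    def register(self, rule):
        self.rules.append(rule)

    def signal(self, event, context):
        """Evaluate every rule registered for `event`; return how many fired."""
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(context):
                rule.action(context)
                fired += 1
        return fired


# Toy "table": employee name -> salary (illustrative data only).
employees = {"smith": 12000}


def reject_low_salary(ctx):
    raise RollbackUpdate(f"salary {ctx['new']} is below 10000")


engine = RuleEngine()
engine.register(
    Rule("update_salary", lambda ctx: ctx["new"] < 10000, reject_low_salary)
)


def update_salary(name, new_salary):
    """Apply the update unless a rule's action rolls it back."""
    context = {"name": name, "old": employees[name], "new": new_salary}
    try:
        engine.signal("update_salary", context)
    except RollbackUpdate:
        return False  # the stored value is left untouched
    employees[name] = new_salary
    return True
```

Because the rule lives in one engine rather than in each application program, every caller of update_salary is subject to the same check, which is the maintenance argument made for centralised active rules.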
Active Rules • The format for writing a trigger in Oracle is:
    create trigger trigger-name
    {before | after} {insert | delete | update [of list-of-column-names]}
    on table-name
    [referencing references]
    [for each row]
    [when condition]
    PL/SQL block;
Example of Trigger in Oracle
    create trigger salary_check
    before insert or update of salary, job on employee
    for each row
    when (new.job <> 'PRESIDENT')
    declare /* start of PL/SQL block */
      minsal number;
      maxsal number;
    begin
      /* fetch the permitted range for the new job into the local variables */
      select minsal, maxsal into minsal, maxsal
        from sal_guide where job = :new.job;
      if (:new.salary < minsal or :new.salary > maxsal) then
        raise_application_error(-20601, 'salary ' || :new.salary || ' out of range for job ' || :new.job || ' for employee ' || :new.name);
      end if;
    end; /* end of PL/SQL block */
Oracle provides commands for trigger management, e.g. alter trigger, drop trigger, enable, disable, etc.
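For readers without an Oracle installation, the same salary-range idea can be tried with SQLite through Python's standard sqlite3 module. This is a sketch under stated assumptions, not a port of the Oracle trigger: SQLite has no PL/SQL, so the range check is written as a SELECT RAISE(...) guarded by a subquery; SQLite needs one trigger per event type, so only INSERT is covered; and the table contents and the helper try_insert are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sal_guide (job TEXT PRIMARY KEY, minsal INTEGER, maxsal INTEGER);
CREATE TABLE employee (name TEXT, job TEXT, salary INTEGER);
INSERT INTO sal_guide VALUES ('CLERK', 10000, 20000);

-- Event: insert on employee. Condition: job is not PRESIDENT and the
-- salary falls outside the guide range. Action: abort the statement.
CREATE TRIGGER salary_check_insert
BEFORE INSERT ON employee
FOR EACH ROW
WHEN NEW.job <> 'PRESIDENT'
BEGIN
    SELECT RAISE(ABORT, 'salary out of range for job')
    WHERE EXISTS (
        SELECT 1 FROM sal_guide
        WHERE job = NEW.job
          AND (NEW.salary < minsal OR NEW.salary > maxsal)
    );
END;
""")


def try_insert(name, job, salary):
    """Return True if the row was stored, False if the trigger rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee VALUES (?, ?, ?)", (name, job, salary)
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

Calling try_insert("adams", "CLERK", 15000) stores the row, while a CLERK salary outside the 10000 to 20000 range is rejected by the trigger rather than by application code.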
Active Databases • Most work on active DBs is associated with RDBMSs rather than OODBs. • Partly because OODBs already incorporate methods as well as data. • Also because of the complexity that including rules would cause: scope issues due to inheritance/overriding features, etc. • Many attempts have taken the approach that rules apply to a whole class. • There are a number of research prototypes, & work in this area is ongoing.
2. Brief Overview of Knowledge Based Systems (KBS) • KBSs differ substantially from traditional DBs. They contain rules (as well as simple facts) and they have an inference engine. • Two main types of representation for KBS are: • Rule based: supports inference by resolution (could be in Expert System form, or logic programming form) • Frame based: supports inference by inheritance.
KBS – Rule Based • The MYCIN system is probably the best-known (& historically important) example of an Expert System. Built in the mid-1970s & containing about 500 rules, it was designed to perform medical diagnosis in the field of bacterial infections. • An example rule from MYCIN is: IF the infection type is primary-bacteremia, AND the site of the culture is one of the sterile sites, AND the suspected portal of entry of the organism is the gastrointestinal tract, THEN there is suggestive evidence (0.7) that the identity of the organism is bacteroides.
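The 0.7 in the rule above is a certainty factor (CF). A minimal sketch of how such a rule might be applied is given below, assuming the commonly described MYCIN scheme in which the conclusion's CF is the weakest premise CF multiplied by the rule's CF, and two positive CFs for the same hypothesis are combined incrementally. The premise values are invented for illustration, and the function names are hypothetical.

```python
def apply_rule(premise_cfs, rule_cf):
    """CF of the conclusion: the weakest premise, scaled by the rule's own CF."""
    return min(premise_cfs) * rule_cf


def combine_positive(cf1, cf2):
    """Combine two positive CFs that support the same hypothesis."""
    return cf1 + cf2 * (1 - cf1)


# Illustrative certainties for the three premises of the rule above:
# infection type, culture site, portal of entry.
premises = [1.0, 0.8, 0.9]

# Certainty that the organism is bacteroides, from this rule alone.
bacteroides_cf = apply_rule(premises, 0.7)

# If a second (hypothetical) rule also suggested bacteroides with CF 0.5,
# the two pieces of evidence reinforce each other.
combined_cf = combine_positive(bacteroides_cf, 0.5)
```

The point of the sketch is that a rule's conclusion is only as certain as its least certain premise, and that independent supporting rules raise the certainty without ever exceeding 1.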
KBS – Frame Based • Inheritance is one of the most powerful and popular concepts used in AI. • Allows grouping of similar notions into classes, economising on descriptions of some of the attributes; • allows deductions to be made about properties of lower-level entities; • allows definition of new classes as variants of existing ones.
A simple inheritance hierarchy: need to incorporate possibilities for overriding where necessary (e.g. all elephants are grey, except for a particular known instance, Nellie, who is pink).
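Inference by inheritance with overriding can be sketched in a few lines. The Frame class below is a hypothetical minimal structure, not taken from any particular frame system; the frames animal and clyde and the slots legs and alive are invented so that the elephant example has something to inherit.

```python
class Frame:
    """A frame with named slots and an optional parent to inherit from."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = slots

    def get(self, slot):
        """Look the slot up locally first, then walk up the parent chain."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


animal = Frame("animal", alive=True)
elephant = Frame("elephant", parent=animal, colour="grey", legs=4)

# An ordinary elephant stores nothing itself and inherits everything.
clyde = Frame("clyde", parent=elephant)

# Nellie overrides only the slot in which she differs from the class.
nellie = Frame("nellie", parent=elephant, colour="pink")
```

Here clyde.get("colour") is answered by the elephant frame, while nellie.get("colour") stops at Nellie's own slot; both still inherit legs from elephant and alive from animal.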
3. Deductive Database Systems • DBs need to store & manage data from which users can extract relevant information • Difficult where there are large amounts of complex data • More difficult when information must be derived according to complex rules • One approach might be to code the rules into application programs
3. Deductive Databases • Deductive databases attempt to solve the problem by storing explicit data and deductive rules that enable inferences to be made from stored data. • Data obtained via action of deductive rules on stored data is known as Derived Data • Deductive databases are therefore the result of combining logic programming with traditional databases. • Characterised by handling large amounts of data as well as performing reasoning based on that information.
Basic Concepts of Deductive DBs • Includes a set of data: FACTS (sometimes known as the extensional database) • Includes a set of inference rules: RULES (sometimes known as the intensional database) • The DATALOG language offers an approach to this. • It is a combination of a database and the logic language Prolog. • It allows definition of both tables & rules. • Includes facilities for defining integrity constraints etc. • Easier to store facts than with a logic programming language.
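The split between stored facts and derived data can be illustrated with a small bottom-up evaluation. The sketch below is written in Python rather than DATALOG, uses an invented parent relation, and applies the simplest (naive) fixpoint strategy; real deductive systems use more refined evaluation methods. The two rules being evaluated are shown in DATALOG-like notation in the docstring.

```python
def derive_ancestors(parent_facts):
    """Naive bottom-up evaluation of the rules

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

    Rules are re-applied until no new tuple appears (a fixpoint).
    """
    ancestor = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        new = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in ancestor
            if y == y2
        }
        if new <= ancestor:  # nothing new was derived
            return ancestor
        ancestor |= new


# Extensional database: explicitly stored facts (illustrative data).
parent = {("ann", "bob"), ("bob", "cal"), ("cal", "dee")}

# Derived data: tuples that exist only because of the rules.
derived = derive_ancestors(parent) - parent
```

Only the three parent tuples are stored; the ancestor tuples for ann/cal, bob/dee and ann/dee are derived on demand, which is the sense of Derived Data used in this section.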
Prolog database with facts and rules. A possible query might be: ?- weather(X).
Deductive DB Architectures • An example of a heterogeneous system is NAIL (Not Another Implementation of Logic), developed at Stanford University. It links DATALOG to a conventional SQL DB system. • DDBs are especially useful for problems involving temporal and/or spatial aspects. • Also see ProDBI: www.sics.se/isl/quintus/prodbi/db.htm
4. Coupling of KBS and ‘standard’ DBMSs • These types of system often use a KBS as the front end, with a DBMS as the back end. • Some people say this leads to a fundamental mismatch due to: • Knowledge representation (KR): flat DB tables are not compatible with some of the advanced KR techniques used. • ESs often have a fact base developed in an ad hoc way, which can result in performance problems that a traditional DB system would not have. • Redundant data descriptions often end up being used in order to make data exchange possible.
4. Coupling of KBS and ‘standard’ DBMSs • Use of static inference processing in AI versus dynamic queries in DBs: • A DBMS takes its operational knowledge from application programs, e.g. embedded SQL, stored procedures, etc. The operational part of a KBS is represented by declarative knowledge in the rule base. • Granularity mismatch: a KBS cannot exploit the set-oriented optimisation that is a benefit of a DBMS: • a KBS works with one row at a time instead of sets of rows, and hence loses the effect of optimisation on sets of data. • Implementations already exist that suit the purpose & are not seriously affected by these problems.
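The granularity mismatch can be made concrete with Python's sqlite3 module. In the sketch below (table, data and function names are all invented), the first function tests a rule condition one row at a time in the client, as a tuple-oriented KBS front end would, while the second hands the whole condition to the DBMS as a single set-oriented query that the engine is free to optimise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("adams", 9000), ("baker", 15000), ("clark", 8000)],
)


def low_paid_row_at_a_time(threshold):
    """KBS style: fetch every row and test the condition in the client."""
    matches = []
    for name, salary in conn.execute("SELECT name, salary FROM employee"):
        if salary < threshold:
            matches.append(name)
    return sorted(matches)


def low_paid_set_at_a_time(threshold):
    """DBMS style: one declarative query over the whole set of rows."""
    rows = conn.execute(
        "SELECT name FROM employee WHERE salary < ? ORDER BY name",
        (threshold,),
    )
    return [name for (name,) in rows]
```

Both functions return the same names; the difference is where the work happens. With the first, every row crosses the interface and no index on salary could ever help; with the second, only qualifying rows are returned.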
Coupling of DBMS & KBS • Different levels of coupling between the two types of system can be adopted. • Communication channel between the two subsystems • Extract data from the DBMS, then store & use the snapshot in the KBS • (problems here: snapshots soon become obsolete as DBs are updated frequently; snapshots may be needed from several sources at once; slow.)
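The obsolescence problem with snapshots is easy to reproduce. The following sketch, again using sqlite3 with an invented stock table and a hypothetical reorder rule, copies the table into a KBS-side fact base, lets the database change, and then evaluates the same rule against the stale and the fresh facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (part TEXT PRIMARY KEY, qty INTEGER)")
conn.executemany("INSERT INTO stock VALUES (?, ?)", [("bolt", 40), ("nut", 5)])


def take_snapshot():
    """Copy the table into an in-memory fact base for the KBS side."""
    return dict(conn.execute("SELECT part, qty FROM stock"))


def needs_reorder(facts, part, minimum=10):
    """A rule evaluated purely against the KBS fact base."""
    return facts[part] < minimum


snapshot = take_snapshot()

# The database moves on after the snapshot was taken.
conn.execute("UPDATE stock SET qty = 3 WHERE part = 'bolt'")

stale_answer = needs_reorder(snapshot, "bolt")          # uses the old 40
fresh_answer = needs_reorder(take_snapshot(), "bolt")   # sees the new 3
```

The rule gives different answers for the same part depending on when the snapshot was taken, which is why tighter coupling or frequent refreshes are needed when the underlying data changes often.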
Architectural solutions for KBS/DBMS integration The first architecture in the diagram was implemented by Trinity College, Dublin - system known as DIFEAD (Dictionary interface for ES and DB). One of the first systems to do this and also first to base the interface functionalities on the data dictionary concept. An earlier similar example is KADBASE. Uses a network data access manager to provide central interface between different components. (KESE = Knowledge Engineering Software Environment)
Extending KBS with DB components • This is the solution adopted by ES tool vendors, so that their systems can use information extracted from a database. • A well known product that operates in this way is KBMS. • Written in C • Uses idea of forward & backward chaining • Incorporates NL facility by allowing developers to write rules in English • Includes its own relational DB storage facility. • Uses If-Then type rules
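Forward and backward chaining over If-Then rules can be sketched generically. The code below is not how KBMS works internally; it is a minimal propositional illustration with invented inventory-control rules, and the backward chainer assumes the rule set contains no cycles.

```python
# Each rule is (set of premises, conclusion); the rules are illustrative only.
RULES = [
    ({"stock_low", "supplier_available"}, "reorder"),
    ({"reorder"}, "notify_buyer"),
]


def forward_chain(facts, rules=RULES):
    """Data-driven: keep firing rules until no new fact is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules=RULES):
    """Goal-driven: a goal holds if it is a known fact, or if some rule
    concludes it and all of that rule's premises can be proved in turn."""
    if goal in facts:
        return True
    return any(
        conclusion == goal
        and all(backward_chain(p, facts, rules) for p in premises)
        for premises, conclusion in rules
    )
```

Forward chaining starts from the data and derives everything that follows (here, both reorder and notify_buyer); backward chaining starts from one question, such as whether the buyer should be notified, and examines only the rules relevant to it.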
5. State of the Art: KRDB (Knowledge Representation meets Databases) Workshops • A series of annual workshops aims to go beyond the classical KBS/DB connection. • These are international; the first one was held in 1994. • The proceedings can be found at: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-54/
CYC Project • Launched in late 1984 as an MCC (Microelectronics and Computer Technology Corporation, Texas) project. • A very large KB built with a huge amount of common-sense knowledge. Includes ideas to do with time, space, substances, contradiction, causality, emotions, beliefs, etc. • Contains more than a million hand-inserted assertions, made up of facts and rules. • Includes interface tools and runs on various platforms. Another interface is currently being developed so that the general public can insert facts and rules. See the website at: http://www.CYC.com
CYC Project Attempts to redress ‘narrowness’ problem of domains addressed by KBSs. It is being used in concrete applications now.
6. Other Developments • Temporal databases • Ontologies • Semi-structured & unstructured data • Internet indexing & retrieval • Data Mining
Temporal DB example - Temibase • Integrates AI rules with temporal database • Can handle incomplete temporal information • Supports temporal reasoning • Supports learning through derivation performed on data and rules • Supports both active and passive rules depending on purpose of system • Currently developing NL interface See web page for links
7. Personal interest – music representation • Symbolic approaches have proved useful • A vocabulary of symbols is used to represent concepts or objects • The programmer uses the vocabulary to say, in its terms, how programs can achieve results • Level of abstraction for music? • No right answer for everyone • Jackendoff’s idea of the “musical surface”: the “lowest level of representation which has musical significance”
Music representation • Wiggins & Smaill propose 2 dimensions on which to compare music representations: • Expressive completeness • Structural generality
Music representation • They aim for • “a representation with an explicit but not too restrictive musical surface, within which the widest possible range of data can be represented” • Enables sharing of data between researchers • Better means of expressing and exchanging new and difficult ideas • Propose the CHARM system • Language independent, but most implementations have been in Prolog, which is very good for symbol manipulation
Summary • Various approaches to bringing AI and database technology together • Many applications for which these are already being used • Many potential applications • Especially useful for unusual problems