Contract Driven Business Applications. Talk slides. Talk Outline. Introduction Research problem – Tenets – EDEE / CamPACE Architecture RDF / Semantic Web – Event Calculus – FLBC Projects and prototyping Multithreaded Prolog servers and other forms of madness.
Contract Driven Business Applications Talk slides
Talk Outline • Introduction • Research problem – Tenets – EDEE / CamPACE • Architecture • RDF / Semantic Web – Event Calculus – FLBC • Projects and prototyping • Multithreaded Prolog servers and other forms of madness. • Fair and logical trading project • The UPenn Bloodbank project • Future research directions • Conclusion
Research Problem • Businesses interact under terms and conditions. • Set up in written contracts or agreed protocols. • Sometimes norms are established implicitly. • Computer systems in business are way too brittle. • Lack of understanding and ambiguity is an error condition, • Yet in real-world business it is commonplace. • Function of technology-driven optimisation (e.g. RDBMS). • Striving for a better human/computer separation. • Leave computers doing storage and querying. • Add more human flexibility in terms of semantics. • We are trying to find an “always-on” solution.
Two tenets • Business agents react to their environment. • We approximate such agents’ world view as being over a finite set of states of affairs. • The state of the world changes over time. • We approximate time as an integer measure, • thus the state changing can be viewed as an effect of events occurring at points in time. • Some events may refer to derivatives.
Fundamental data representation? • Given those tenets, how do we represent data?! • E.g. a relational schema for purchase orders is already way too specific. • (although useful for caching currently static data) • We want to be able to enumerate items (or concepts) and the various relationships between them. • Enter the Resource Description Framework (RDF). • (and exit performance)
Where do contracts fit in? • Brittle operating system software code is desirable. • (Assuming it is mostly correctly programmed.) • This is because such code repeats deterministic jobs, operating in a clean, digital world. • Successful real world interactions require flexibility. • Parties may appear to be violating their obligations to you, but who is to say your perspective is correct? • Need to support agents maintaining different world views. • Appropriate knowledge representation will allow us to handle evidence without fully understanding it. • All obligations are proposed on the basis of evidence.
Alan’s Occurrence Store • Alan Abrahams designed a knowledge representation framework with RDF-similar expressiveness. • ‘Occurrence’ triples unify storage of: • Business events / state • Queries related to outstanding obligations, etc. • ‘Factories’ that generate new occurrences • Basis of the CamPACE implementation • Cambridge Policy Analysis and Checking Environment. • ASP.NET web application with Prolog conflict checker. • Tag natural language into occurrence forms • Runs safety and liveness checks. • Follows on from his EDEE implementation.
Deontic semantics • Alan’s Ph.D. thesis also examined a comprehensive set of deontic terms. • Types of obligation, prohibition, power, authority, permission, voiding, fulfilment, etc. • These terms form the building blocks of electronic contracts. • Conflicts are accepted to be commonplace, and not an exception. • For example, safety conflicts include being obliged but prohibited, or being liable but immune. • Liveness problems include being desired but not obliged, or being obliged but not able.
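The safety and liveness checks named above can be sketched as a small lookup. This is a minimal Python illustration only: CamPACE uses Prolog conflict-checking rules, and the `check` function, the position names, and the pairings below are assumptions drawn from the four examples on this slide.

```python
# Hypothetical deontic positions an agent may hold towards one action.
# Safety conflict: both positions hold at once.
SAFETY_CONFLICTS = [("obliged", "prohibited"), ("liable", "immune")]
# Liveness problem: the first position holds but the second is missing.
LIVENESS_PROBLEMS = [("desired", "obliged"), ("obliged", "able")]

def check(positions):
    """Return the safety and liveness problems found in a set of positions."""
    found = []
    for a, b in SAFETY_CONFLICTS:
        if a in positions and b in positions:
            found.append(("safety", a, b))
    for a, b in LIVENESS_PROBLEMS:
        if a in positions and b not in positions:
            found.append(("liveness", a, b))
    return found
```

Because conflicts are expected rather than exceptional, the sketch reports them as data instead of raising an error.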
CamPACE environment • [Diagram: CamPACE Contract Analysis Environment.] • Input: regulations and contracts, e.g. CFR 610.40 (Blood): “Each donation shall be tested for HBsAg by an FDA-licensed test of third-generation sensitivity. If the initial test is non-reactive…” • Steps: 1. Identify structured terms. 2. Store terms. 3. Check for conflicts. 4. Monitor for compliance. • Components shown: Merriam-Webster Online, UPenn VerbNet, Berkeley FrameNet, CamPACE linguistic indicator rules, DB of provisions & occurrences, Prolog conflict-checking rules, actual event traces.
Define Clauses • [Screenshot callouts:] • Syntax highlighting • Uninflected-verb lookup (Merriam-Webster Online) • Semantic database consultation • Indicator rules • Exceptions to rules • Tool tips
CDBA Architecture: E4 • Our CDBA architecture evolves Alan’s projects using: • Event-based middleware • Event logging. • Event Calculus. • Event semantics. • (there are subtle differences in all senses of the word!) • In terms of knowledge representation, RDF replaces the triple-oriented occurrence store. • Shift in focus from deriving software from natural language documents, to just correlating semantics.
CDBA Architecture: Behaviour • We have also increased the emphasis on analysing and effecting dynamic behaviour. • Log times are connected to all events. • By the publish/subscribe emulator at the moment. • The Event Calculus is employed to define reactive behaviour and monitoring. • Symbolic programming language code is used to exchange software functions. • In our case a security-filtered form of Prolog. • Content-based messaging is used to exchange messages between interested parties.
CDBA Architecture • [Diagram. Labels shown: agents; operator; inference engine (Event Calculus); caches (RDBMS); event store (RDF + time); policy; contracts; history; fluent queries; SQL; TEQL; external events; FLBC + domain-specific event encodings; event transport (secure pub/sub).]
Two broad levels of CDBA operation • It is not clear from the preceding diagram that there are at least two broad levels of operation. • The confusion stems from us using Prolog for both! • The core level (0) is the event-based infrastructure. • Here inconsistency is an exception situation. • Event log records being added with times long past. • Malformed RDF graphs. • Above the core is the application infrastructure. • At this level inconsistency is likely to occur as it does in the real world. • The beliefs of the agent about such situations will be recorded like all other events are.
Architecture: Event Store (1) • RDF is a highly referential structure. • How on earth to perform updates? • If possible, just… don’t. • Intuition: Log everything. (slightly refined later) • Business decision-making needs history. • Obviously beneficial – the constraint is one of granularity. • Domain-specific events should be as close as possible to human-level decision granularity. • Hence our focus on business environments. • Humans are the decision bottleneck in at least some areas.
Architecture: Event Store (2) • Robust RDBMSes use journals (event logs?). • Why isn’t it a first-class member of more systems? • Over time the event log can be migrated to cheaper, slower media. • Storage is cheap anyway… • Also, parallelism is now commonplace. • Cf. Google’s approach. • Even better – our log is WORM. • Necessary redundancy • Agents store their view of events for subsequent compliance checking. • Multi-level: fold away extraneous detail over time…
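A write-once, read-many log can be sketched in a few lines. This is a minimal Python illustration under assumptions: the `EventLog` class and its methods are hypothetical, and rejecting records older than the latest one is one possible reading of the core-level rule that log records with times long past are an exception situation.

```python
class EventLog:
    """Append-only (write once, read many) log of timestamped events."""

    def __init__(self):
        self._records = []

    def append(self, time, event):
        # Assumed core-level rule: a record may not predate the end of the log.
        if self._records and time < self._records[-1][0]:
            raise ValueError("event time precedes the end of the log")
        self._records.append((time, event))

    def records(self):
        # Hand out an immutable copy so callers cannot alter history.
        return tuple(self._records)
```

There is deliberately no update or delete method: corrections are themselves logged as later events.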
Architecture: Caches • We compile programs into faster forms of code. • However, non-JIT compilers discard information. • We see relational tables as representing compilation of data rather than software code. • This is what RDBMSes do with their journals. • The more journal data the better in terms of optimisation. • Beyond compiling the event log into relational tables, we also can aggregate event log data over time. • This will be necessary for DPA compliance in the UK. • To implement event log folding efficiently, extra time fields will be needed. • Queries over the old data should gracefully degrade to relationships over the aggregation checkpoints.
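Folding old log data into an aggregate can be sketched as below. This is a minimal Python illustration, assuming a single checkpoint time and per-type counts as the aggregate; the `fold` function and the event names are hypothetical, and a real design would choose aggregates to suit the queries that must keep working.

```python
from collections import Counter

def fold(log, cutoff):
    """Summarise events before `cutoff` as per-type counts; keep the rest in full."""
    summary = Counter(event for time, event in log if time < cutoff)
    recent = [(time, event) for time, event in log if time >= cutoff]
    return dict(summary), recent

log = [(1, "order"), (2, "order"), (3, "payment"), (10, "order")]
summary, recent = fold(log, cutoff=5)
```

Queries about the folded period can then only be answered at the granularity of the summary, which is the graceful degradation described above.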
Architecture: RDF (primer) • RDF has numerous representations. Basic idea: • <subject> <predicate> <object> . • Effectively defines graphs with named edges. • Any part of this RDF ‘statement’ can be a URI. • Otherwise it is a literal value (possibly typed). • Can do reification: statements about statements. • RDF is really a storage-level solution. • Missing useful notions of structured objects • RDF Schema (not XML Schema) adds classes, etc. • OWL describes structural relationships across SW. • E.g. two different RDF descriptions might be unifiable. • OWL glues the semantic web together (when possible).
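The triple model and reification can be sketched with plain tuples. This is a minimal Python illustration rather than a real RDF library: the `add`, `match` and `reify` helpers and the `ex:` names are hypothetical, while `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` are the standard RDF reification vocabulary.

```python
# A graph is a set of (subject, predicate, object) triples.
graph = set()

def add(s, p, o):
    graph.add((s, p, o))

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(t for t in graph
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

def reify(stmt_id, s, p, o):
    """Describe a statement as a resource so other statements can refer to it."""
    add(stmt_id, "rdf:type", "rdf:Statement")
    add(stmt_id, "rdf:subject", s)
    add(stmt_id, "rdf:predicate", p)
    add(stmt_id, "rdf:object", o)

# A statement about a statement: Carol believes that Bob ordered a widget.
reify("ex:stmt1", "ex:bob", "ex:ordered", "ex:widget")
add("ex:carol", "ex:believes", "ex:stmt1")
```

Note that reifying a statement does not assert it: the graph records Carol's belief without itself containing the triple about Bob.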
Architecture: RDF messages • We also use RDF as a messaging format. • Ideal for agent-centric systems, since the RDF atom namespace is based on agents’ DNS names. • We need support for transmitting message graphs: • E.g. by reference (semantic web), or • by instance (the receiving agent will copy the graph). • More importantly, we must checkpoint RDF graphs. • In the semantic web, at any time further triples can be linked to a given source Id. • This is not acceptable if it implies modifying event history!
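Checkpointing a graph can be sketched as taking an immutable copy plus a digest of it. This is a minimal Python illustration under assumptions: the `checkpoint` function is hypothetical, triples are tuples of strings, and sorting is used as a stand-in for proper RDF canonicalisation (which is harder, for example when blank nodes are present).

```python
import hashlib

def checkpoint(triples):
    """Return a frozen copy of the graph and a digest that identifies it."""
    frozen = frozenset(triples)
    # Naive canonical form: one sorted, space-separated triple per line.
    canonical = "\n".join(" ".join(t) for t in sorted(frozen))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return frozen, digest

graph = {("ex:bob", "ex:ordered", "ex:widget")}
snapshot, digest = checkpoint(graph)
# Triples linked later change the live graph but not the logged snapshot.
graph.add(("ex:bob", "ex:ordered", "ex:gadget"))
```

Logging the digest with the event lets a receiver detect that a graph sent by reference has since grown.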
Architecture: Event Calculus • Beautifully simple formalism that permits deductive, inductive and abductive reasoning. • Events occur at a point in time. • Fluents are states of affairs that hold over time. • Half open intervals. • Fluents can be multi-valued, and can represent derivatives. • Driven by a set of domain-specific rules – which events initiate and terminate which fluents. • Frame problem is solved by circumscription. • If something isn’t said to change, it doesn’t change. • Countless extensions: modal operators, continuous change, non-instantaneous events, etc.
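The core rules above can be sketched as follows. This is a minimal Python illustration of a simplified Event Calculus, not the project's Prolog code; the event and fluent names are hypothetical, and the convention that a fluent holds strictly after its initiating event, up to and including the time of its terminating event, is an assumption about which end of the half-open interval is closed.

```python
# Event log: (time, event) pairs, with time as an integer.
happens = [(1, "order_placed"), (5, "goods_delivered"), (9, "invoice_paid")]

# Domain-specific rules: which events initiate and terminate which fluents.
initiates = {"order_placed": ["delivery_owed"],
             "goods_delivered": ["payment_owed"]}
terminates = {"goods_delivered": ["delivery_owed"],
              "invoice_paid": ["payment_owed"]}

def clipped(t1, fluent, t2):
    """True if an event strictly between t1 and t2 terminates the fluent."""
    return any(t1 < t < t2 and fluent in terminates.get(e, [])
               for t, e in happens)

def holds_at(fluent, t):
    """True if an earlier event initiated the fluent and nothing clipped it since."""
    return any(t1 < t
               and fluent in initiates.get(e, [])
               and not clipped(t1, fluent, t)
               for t1, e in happens)
```

With this log, `delivery_owed` holds at times 2 to 5 and `payment_owed` at times 6 to 9. Nothing needs to say that a fluent persists between events, which is the "if it isn't said to change, it doesn't change" default.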
Architecture: Event Calculus (2) • We are using EC to program the run-time reactive behaviour of each agent. • For this use we do not need a parallelisable process calculus. • At present we are aiming to allow agents to reason backwards in time from the present. • Reasoning forwards in time efficiently might require a more complex formalism. • This is interesting to us, since often businesses will want to make prospective decisions. • Full parallel process calculi might allow us to prove properties for the overall behaviour of our system • Not likely to be useful above the core infrastructure.
Architecture: FLBC • Formal Language for Business Communication. • Provides for compositionally encoding messages. • Much EDI is not compositional at all – description of “Bob and Carol” would require creating a new ID for the pair. • Steve Kimbrough’s theory of disquotation encodes easily into RDF. • A request should prototype the expected fulfilment events. • Data not structurally equivalent may be decided to “count as” being so from the agent’s perspective. • Sentence meaning versus Speech meaning • Logging agents’ speech acts can usefully hint at their interpretations (e.g. they do not believe an assurance).
Architecture: TEQL • Mark Spiteri’s Ph.D. thesis: An Architecture for the Notification, Storage and Retrieval of Events. • He introduces the TEQL query language • (although he didn’t provide the formal grammar!) • TEQL covers a large set of desirable time query constructs that would be fiddly in, say, SQL. • Defines instantaneous events, and time intervals • A good fit with the Event Calculus • I have implemented a very small subset of the language for proof-of-concept decision support.
Projects and ongoing prototyping • FLBC based projects. • Fair and logical trade. • Salutory agents. • SWIFT simulation. • Other regulatory notions. • US taxation conflict resolution. • FDA blood-bank example.
Initial F&LT prototyping • Aiming to assess the viability of Steve’s FLBC (Formal Language for Business Communication). • Shift from EDI message focus, to defining compositional terms within any given communication. • Semantics can be introduced gradually. • Uses speech acts as the main representation of evidence in our interactions. • Parties do not need to agree on the truth: • X states she believes Y has satisfied requirement Z. • Record that the speech act occurred (via the TTP) regardless of its actual validity.
Initial F&LT prototyping (2) • Note that there is embedded propositional content in the prior statement. • Steve Kimbrough’s ‘Disquotation theory’ provides the tools to package and unpackage such constructs. • Explored representation at a workshop last year. • Will return to it with actual agent behaviour. • Aim to decrease the first trade problem in SME electronic commerce. • Computer supported semantics will be limited: • E.g. flag new promises related to completed past events. • Aim to provide human-level business decision support.
Agent-based simulation • Early prototype implementation of our architecture. • Written in multi-threaded Prolog (of course) • Uses RDF-equivalent Prolog terms for data representation • Soon to migrate to the SWI Prolog Semantic Web library. • Currently uses TCP/IP (!) to send messages to a simple publish/subscribe system emulator. • Subscription coverage based on Prolog unification • RDF allows us to easily handle structured data. • We want the same features managing code. • The Prolog code/data integration is superbly convenient. • Although it had some interesting security implications! • Naturally we could use other languages, but the implementation would be far messier.
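Subscription coverage by unification can be approximated as below. This is a minimal Python sketch, not the prototype's Prolog: it performs one-way matching of a subscription pattern against a ground message rather than full unification, and the `unify` function, the `?`-prefixed variable convention, and the message shapes are all assumptions.

```python
def unify(pattern, term, bindings=None):
    """Match a pattern against a ground term.

    Strings starting with '?' are variables. Returns a dict of bindings,
    or None if the pattern does not cover the term.
    """
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        # A repeated variable must match the same value each time.
        if pattern in bindings and bindings[pattern] != term:
            return None
        bindings[pattern] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            bindings = unify(p, t, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None

subscription = ("order", "?who", ("item", "?what"))
message = ("order", "bob", ("item", "widget"))
delivered = unify(subscription, message)
```

An emulator along these lines would forward a message to every subscriber whose pattern returns a non-None result.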
Salutory agents • Insults / greetings exchanged between two agents. • Five agents in total: Bob, Bob’s boss, Carol, Carol’s boss, and the publish/subscribe manager. • These agents are completely separate Prolog processes. • The four players sleep randomly after each “day”. • Although the actual code is of course run-time. • Bob and Carol both have a queue of instructions from their respective bosses as well as event logging. • This example serves to demonstrate: • Agent behaviour influenced by their own event log. • That the communication and application framework works. • That Netscape 2’s HTTP push extensions live on in Firefox.
SWIFT simulation • Steve has had experience with SWIFT. • They own a messaging standard to deal with international monetary transfers. • It is replete with obscure codes and keywords. • Our aim is to show that the FLBC can contain simple SWIFT transactions and add useful flexibility. • Uses complex RDF structures representing FLBC messages with embedded propositional content. • The primary goal of our prototype is to demonstrate programmable “counts-as” functionality. • Allow the parties to agree dynamically on how a rounding error within a transaction is settled.
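A programmable "counts-as" rule for rounding errors might look like the sketch below. This is a minimal Python illustration under assumptions: the `counts_as_paid` function, the one-cent default tolerance, and the idea of passing the agreed tolerance as a parameter are hypothetical and are not taken from the SWIFT standard or the prototype.

```python
from decimal import Decimal

def counts_as_paid(amount_due, amount_received, tolerance=Decimal("0.01")):
    """True if the received amount counts as full payment within the agreed tolerance."""
    difference = abs(Decimal(amount_due) - Decimal(amount_received))
    return difference <= tolerance
```

Using `Decimal` on string amounts keeps the comparison exact, so the rule itself does not introduce further rounding error.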
Event Calculus implementation • Started with the Simplified Event Calculus. • Negation as failure. • Soon discovered that extensions were necessary. • Implemented deadlines – a form of continuous change. • Deadline-surpassing fluents causing events required rewriting of the SEC predicates. • That rewrite happened to suit indexing far better! • Get for free a hash lookup on event types. • Queries over time ranges now use AVL trees. • Balanced binary trees for O(log n) operations. • Prolog might be O(n) slow, but I’ve not had to care yet…
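The two indexes described above can be sketched together. This is a minimal Python illustration that swaps in a sorted list with binary search for the AVL trees: range lookups are O(log n) as with a balanced tree, but insertion into a Python list is O(n), so it is not equivalent for write-heavy logs. The `EventIndex` class and its method names are hypothetical.

```python
import bisect
from collections import defaultdict

class EventIndex:
    """Hash lookup on event type, then binary search over sorted times."""

    def __init__(self):
        self._times = defaultdict(list)  # event type -> sorted list of times

    def add(self, time, event_type):
        # Keep each per-type list sorted as events arrive.
        bisect.insort(self._times[event_type], time)

    def between(self, event_type, start, end):
        """Times of events of this type with start <= time < end."""
        times = self._times.get(event_type, [])
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_left(times, end)
        return times[lo:hi]

index = EventIndex()
for t in (5, 1, 9, 3):
    index.add(t, "payment")
```

The half-open range in `between` mirrors the half-open intervals used for fluents.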
Ontologies pending • To date my RDF terms have been somewhat ad hoc. • I want to specify numerous RDF ontologies: • Event Calculus • There is an XML encoding (“so what?” indeed…) • FLBC • Define a sufficiently comprehensive set of compositional, encapsulating business messages to fit common daily use. • Deontic terms • Working with Alan to determine a spanning set of terms for obligations, prohibitions, fulfilment, dispensation, etc. • Publishing our ontologies allows sharing. • E.g. may reference parts of the Rei policy ontology.
Documenting ontologies • RDF terms will need to be described using natural language text. • Leaves humans to understand the specifics. • Unlike current EDI, can support incremental augmentation. • I have developed a proof-of-concept approach to tagging natural language text using DHTML and SVG. • This will allow convenient browsing of the relationship between our EC+Prolog policy data and natural language descriptions of them. • Again Firefox 1.5 comes out a winner in terms of DHTML+CSS and its SVG implementation.
Bloodbank Project (I) • UPenn Bloodbank project examines a section of the FDA CFR relating to the management of blood donations. • Insup Lee and friends in UPenn CS tried to use NLP techniques to convert the policy text into EFSMs. • I am sceptical that EFSMs are sufficiently expressive. • Prescriptive rather than restrictive: suitable for creating good traces more than showing any given trace is not bad. • I propose to use an Event Calculus formulation instead, using terms from our deontic ontology. • An example of their EFSM graphs follows…
Bloodbank Project (II) • The workflow from the note to the FDA CFR:
Future challenges • Moving out of RAM… • Referential redundancy caused by graph-based structures being inserted into a relational database table. • Event log indexing • Access control • Distributed time and clocks • Trust management • Ontology publication. • Composite events • Distributed simulation • Parties agree to simulate situations, whilst preserving true anonymity of secret business information.
Conclusion • Provided background of how we got to this point. • Integrating the four E’s looks promising. • Event-based middleware • Event logging. • Event Calculus. • Event semantics. • Initial prototyping is getting somewhere. • Many really interesting research challenges remain. • Much resonance with other Opera Group research • All collaboration welcome! • Any questions?
Recent voyage • Taken me through: • Multithreaded Prolog • NetLogo • Slidey • SlideMaker • Xgl • Multidimensional indexing • Wink • Dataflow PD / jMax • Functional programming • Subtext (crazy) • Also: • Philadelphia • New York City • Princeton