
Constructing Complex Queries in Pathway Tools using Emacs, Lisp, and Perl

Constructing Complex Queries in Pathway Tools using Emacs, Lisp, and Perl. Randy Gobbel, Ph.D. May 14, 2003 gobbel@ai.sri.com. Overview. Why would you need to write complex queries? Emacs Lisp perlcyc The GFP API, and Pathway Tools-specific functions Examples and exercises.


Presentation Transcript


  1. Constructing Complex Queries in Pathway Tools using Emacs, Lisp, and Perl Randy Gobbel, Ph.D. May 14, 2003 gobbel@ai.sri.com

  2. Overview • Why would you need to write complex queries? • Emacs • Lisp • perlcyc • The GFP API, and Pathway Tools-specific functions • Examples and exercises

  3. When do you need complex queries? • Many common queries are accessible from the command menu • By name • By substring • By class • Others are specialized by the type of the object being displayed • Other queries of arbitrary complexity can be created by writing a (simple) program • Example: find all reactions with more than 5 citations

  4. Programmatic Access to PGDBs • LISP and PERL languages used for programmatic queries and updates to PGDBs • Generic Frame Protocol (GFP) is API for PGDBs

  5. Emacs • “The extensible, self-documenting editor” • (Most of the time) typing a printing character simply inserts it • Just like most Windows and MacOS programs • Control and Meta keys in combination with other keys run commands • Again, just like keyboard shortcuts in most programs • Control-H: Help • T -> tutorial, A -> apropos, W -> “where is <command>” • K -> “what does this key combination do?” • Many commands are now available from pulldown menus

  6. Emacs • Three ways to run Pathway Tools from within Emacs • Use the Emacs/Lisp interface provided with Allegro Common Lisp (fi) • Use the free ILisp package (written in Emacs Lisp) • Run Pathway Tools from a shell within Emacs • Windows users: lowest-common-denominator • Cut and paste still works • Advantages of using Emacs with Lisp • Syntax highlighting • Automatic indentation • One-keystroke evaluation of Lisp forms in fi and ilisp

  7. Lisp • An idea that keeps reinventing itself • Function, arguments • What is a list? • Unit of syntax: (a b c) • Unit of data: (a b c) • Unit of execution: (get-slot-value 'arca 'citations) • Most languages: function(arg1, arg2, …) • Fine for writing • Lisp: (function arg1 arg2 arg3 …) • Much easier to deal with in a computer

  8. Lisp Data Types • Numbers • 1 • 1.325 • Strings • "hello" • Symbols • E.g.: ARCA (or, arcA) • Make a literal symbol by quoting it: 'ARCA • Case-sensitive symbols require vertical bars: '|Genes| • Special symbols: T and NIL • Used to mean True and False • NIL is also the empty list: ()

  9. Lisp Expressions and Evaluation • (+ 3 4 5) • '+' is a function • (+ 3 4 5) is a function call with 3 arguments • Arguments are evaluated: • Numbers evaluate to themselves • If any of the args are themselves expressions, they are also evaluated • (+ 1 (+ 3 4)) → 8 • The values of the args are passed to the function • Some functions allow variable numbers of arguments • (+) → 0 • (+ 1) → 1 • (+ 2 3 1 3 4 5 6) → 24 • (+ (* 3 4) 6) → 18
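
The evaluation rules on this slide can be tried in any Common Lisp listener. The sketch below uses only standard functions; the variable names are made up for illustration and are not part of Pathway Tools.

```lisp
;; Arguments are evaluated first, then their values are passed to the function.
(defparameter *nested* (+ 1 (+ 3 4)))   ; inner (+ 3 4) is evaluated before the outer sum
(defparameter *empty-sum* (+))          ; no arguments: + returns its identity, 0
(defparameter *mixed* (+ (* 3 4) 6))    ; (* 3 4) is evaluated before the sum

(format t "~a ~a ~a~%" *nested* *empty-sum* *mixed*)   ; prints: 8 0 18
```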

  10. Lisp Expressions and Evaluation • Also called “top level” and “read-eval-print loop” • Uses a three-step process • Read • Reader converts elements outside "" and || to uppercase • Evaluate • Print • Anything you type in is evaluated • 1 → 1 • "hello" → "hello" • (+ 2 3) → 5 • Quoting prevents evaluation • '(+ 2 3) → (+ 2 3) • Setting a symbol to a value creates a variable: • (setq foo '(a b c)) → (A B C) • foo → (A B C) • No declarations required!
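
A small sketch of quoting and of the reader's case conversion, in plain Common Lisp. It uses defparameter rather than setq only to avoid compiler warnings when loaded from a file; the variable names are illustrative, and the uppercase results assume the default readtable.

```lisp
(defparameter *foo* '(a b c))          ; quoted: the list is data, not a function call
(defparameter *form* '(+ 2 3))         ; quoted: stays the three-element list (+ 2 3)
(defparameter *value* (eval *form*))   ; evaluating that list explicitly gives 5

;; The reader upcases unescaped symbol names; vertical bars preserve case.
(defparameter *plain-name* (symbol-name 'genes))     ; "GENES"
(defparameter *barred-name* (symbol-name '|Genes|))  ; "Genes"

(format t "~s ~s ~s ~s~%" *foo* *value* *plain-name* *barred-name*)
```

This is why a class name such as |Genes| has to be written with vertical bars: without them the reader would look for a symbol named GENES.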

  11. The Lisp Listener • Useful forms in listener: • Previous Results: *, **, *** • But: not in programs • (+ 1 2) → 3 • (+ 3 *) → 6 • ** → 3

  12. Dealing with the Lisp debugger • Error conditions result in a call to the Lisp debugger: • :continue continues, a numeric argument selects between possible options • Lower-numbered options generally take less drastic actions • :reset unwinds to the top level • WARNING: may exit the Pathway Tools window! • :zoom displays the stack EC(4): (xxx) *debugger-hook* called. Error: Attempt to take the value of the unbound variable `X'. [condition type: UNBOUND-VARIABLE] Restart actions (select using :continue): 0: Try evaluating X again. 1: Use :X instead. 2: Set the symbol-value of X and use its value. 3: Use a value without setting X. 4: Return to Top Level (an "abort" restart). 5: Abort entirely from this process. [1] EC(5): :res

  13. Lisp Variables • Global variable values can be set and used during a session • Declarations not needed • (setq x 5) → 5 • x → 5 • (+ 3 x) → 8 • (setq y "atgc") → "atgc"

  14. Equality in LISP • Internally LISP refers to objects via pointers • Fundamental equality operation is EQ • True if the two arguments point to the same object • Very efficient • Other comparison operators: • = for numbers: (= x 4) • EQUAL for list structures or exact string matching: (equal x "abc") • STRING-EQUAL for case-insensitive string matching: (string-equal x "AbC") • EQL for characters: (eql x #\A) • EQ for list structures or symbols (compares pointers): (eq x 'ABC) • FEQUAL for frames: (fequal x 'trp) • Simple rule: Use EQUAL for everything except frames
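
The standard comparison operators above can be compared side by side in plain Common Lisp. FEQUAL is specific to Pathway Tools and is left out of this sketch; the variable names are illustrative.

```lisp
(defparameter *a* (list 1 2 3))
(defparameter *b* (list 1 2 3))   ; same contents as *a*, but a separate object

(defparameter *results*
  (list (eq *a* *a*)                 ; T   - the very same object
        (eq *a* *b*)                 ; NIL - different objects, even though they look alike
        (equal *a* *b*)              ; T   - same list structure
        (= 4 4.0)                    ; T   - numeric equality
        (eql #\A #\A)                ; T   - characters
        (equal "abc" "ABC")          ; NIL - EQUAL on strings is case-sensitive
        (string-equal "abc" "AbC")   ; T   - case-insensitive string match
        (eq 'abc 'ABC)))             ; T   - the reader upcases both to the same symbol

(print *results*)   ; prints (T NIL T T T NIL T T)
```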

  15. Functions for Operating on Lists • length • (length x) • Returns the number of elements • first • (first x) • Returns the first element • nth • (nth j x) • Returns the Jth element of list X (element 0 is the first element)
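
A quick sketch of these list functions on a made-up list of reaction names (the names are placeholders, not real frame IDs):

```lisp
(defparameter *rxns* '(rxn-a rxn-b rxn-c))

(defparameter *count*  (length *rxns*))   ; 3
(defparameter *head*   (first *rxns*))    ; RXN-A
(defparameter *second* (nth 1 *rxns*))    ; RXN-B, because indexing starts at 0
(defparameter *beyond* (nth 5 *rxns*))    ; NIL when the index is past the end

(format t "~a ~a ~a ~a~%" *count* *head* *second* *beyond*)   ; prints: 3 RXN-A RXN-B NIL
```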

  16. loop • Loop allows you to iterate • Through a series of numbers • for i from 1 to 10 • Through a list • for rxn in rxns • Conditionals control whether the following clause is executed • when (> (length (get-slot-values rxn 'citations)) 5) • do lets you do something • do (incf total i) • collect lets you gather up values • collect (get-frame-name rxn)

  17. loop • You can combine as many loop clauses as you need: (loop for i from 1 to 10 for j from 10 downto 1 do (print (+ i j)) collect (* i j)) → (10 18 24 28 30 30 28 24 18 10)
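
A few more loop forms in plain Common Lisp, showing the numeric, list, conditional, and accumulating clauses separately. The gene names are sample strings chosen for illustration.

```lisp
;; Numeric iteration with a conditional and collect.
(defparameter *evens*
  (loop for i from 1 to 10
        when (evenp i)
          collect i))                    ; (2 4 6 8 10)

;; Accumulating a total with the sum clause.
(defparameter *total*
  (loop for i from 1 to 10
        sum i))                          ; 55

;; Iterating over a list and keeping only some elements.
(defparameter *long-names*
  (loop for name in '("trpA" "arcA" "lacZYA")
        when (> (length name) 4)
          collect name))                 ; ("lacZYA")

(format t "~s ~s ~s~%" *evens* *total* *long-names*)
```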

  18. Defining Functions • Put function definitions in a file • Reload the file when definitions change • EC(1): :ld my-queries.lisp • (defun <name> (<arguments>) … code for function …) • Creates a new operation called <name> • Examples: • (defun square (x) (* x x)) • (defun message () (print "Hello")) • (defun test-fn () 1 2 3 4)
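
The definitions below could be saved in a file and reloaded as described above. count-longer-than is a hypothetical helper written for this sketch; square and test-fn follow the slide.

```lisp
(defun square (x)
  "Return X multiplied by itself."
  (* x x))

(defun test-fn ()
  "A function returns the value of its last form, here 4."
  1 2 3 4)

(defun count-longer-than (items n)
  "Count the elements of ITEMS (a list of lists) whose length exceeds N."
  (loop for item in items
        count (> (length item) n)))

(format t "~a ~a ~a~%"
        (square 7)
        (test-fn)
        (count-longer-than '((a b) (a b c) ()) 1))   ; prints: 49 4 2
```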

  19. Accessing Lisp from Pathway Tools • Starting Pathway Tools for Lisp work: > pathway-tools -lisp EC(1): (select-organism :org-id 'XXX) Windows: pathway-tools-lisp.exe • Lisp expressions can be typed at any time to the Pathway Tools listener Command: (get-slot-value 'trp 'common-name) → "L-tryptophan" • Invoking the Navigator from Lisp: EC(2): (eco)

  20. The perlcyc API • Written by Lukas Mueller at TAIR • Downloadable from the TAIR Web site • Installs as a standard CPAN module • From within Pathway Tools, start the server by hand: • (start-external-access-daemon) • (start-external-access-daemon :verbose? t) for tracing output • Function names are the same as Lisp, with hyphens replaced by underscores, question marks by _p • get-class-all-instances → get_class_all_instances • coercible-to-frame? → coercible_to_frame_p • Pathway Tools functions are callable as standard Perl functions • Frame names are symbols which can be passed back to Lisp • Control structures are standard Perl

  21. javacyc • Uses the same Unix domain socket interface as perlcyc • Function names use Java conventions • get-slot-values → getSlotValues • Includes a C library for Unix domain sockets

  22. Lisp vs. Perl • Task: find all reactions with fewer than 5 citations • Perl: use perlcyc; my $cyc = perlcyc->new("ECOLI"); my @found; foreach my $r ($cyc->all_rxns()) { my @citations = $cyc->get_slot_values($r, "citations"); if (scalar(@citations) < 5) { push @found, $r; } } • Lisp: (loop for r in (all-rxns) when (< (length (get-slot-values r 'citations)) 5) collect r)
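
The Lisp query can be tried without a PGDB by substituting a small in-memory table. This is a minimal sketch: *toy-citation-table*, toy-all-rxns, and toy-citations are stand-ins invented here; in a real Pathway Tools session the query would call all-rxns and get-slot-values as on the slide.

```lisp
;; Toy data: each entry is (reaction-name citation ...). Not real PGDB content.
(defparameter *toy-citation-table*
  '((rxn-1 "a" "b" "c" "d" "e" "f")
    (rxn-2 "a")
    (rxn-3)))

(defun toy-all-rxns ()
  "Stand-in for all-rxns: return the reaction names in the toy table."
  (mapcar #'first *toy-citation-table*))

(defun toy-citations (rxn)
  "Stand-in for (get-slot-values rxn 'citations)."
  (rest (assoc rxn *toy-citation-table*)))

(defun rxns-with-fewer-citations (limit)
  "Collect the toy reactions that have fewer than LIMIT citations."
  (loop for r in (toy-all-rxns)
        when (< (length (toy-citations r)) limit)
          collect r))

(print (rxns-with-fewer-citations 5))   ; prints (RXN-2 RXN-3)
```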

  23. Pathway Tools User Accessible Functions • Internal Pathway Tools functions that users can call • Includes: • Generic Frame Protocol (GFP), the Ocelot object database API • Additional functions specific to Pathway Tools • For more information see • http://bioinformatics.ai.sri.com/ptools/ptools-resources.html

  24. Generic Frame Protocol (GFP) • A library of Lisp functions for accessing Ocelot DBs • GFP specification: • http://www.ai.sri.com/~gfp/spec/paper/paper.html • A small number of GFP functions are sufficient for most complex queries

  25. Generic Frame Protocol • (get-class-all-instances Class) • Returns the instances of Class • Key Pathway Tools classes: • Genetic-Elements • Genes • Proteins • Polypeptides (a subclass of Proteins) • Protein-Complexes (a subclass of Proteins) • Pathways • Reactions • Compounds-And-Elements • Enzymatic-Reactions • Transcription-Units • Promoters • DNA-Binding-Sites

  26. Generic Frame Protocol • Note: Frame.Slot means a specified slot of a specified frame • Frame and Slot must be symbols! • (get-slot-value Frame Slot) • Returns first value of Frame.Slot • (get-slot-values Frame Slot) • Returns all values of Frame.Slot as a list • (slot-has-value-p Frame Slot) • Returns T if Frame.Slot has at least one value • (member-slot-value-p Frame Slot Value) • Returns T if Value is one of the values of Frame.Slot • (print-frame Frame) • Prints out the contents of Frame
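
To make the Frame.Slot idea concrete, here is a toy model of the read operations in plain Common Lisp. It is not the GFP implementation: the toy- functions and the *toy-kb* data are assumptions made for this sketch, and they only imitate the behaviour described on the slide.

```lisp
;; Toy knowledge base: (frame (slot (value ...)) ...). Sample data only.
(defparameter *toy-kb*
  '((trp (common-name ("L-tryptophan")) (citations ("c1" "c2")))
    (arg (common-name ("L-arginine")))))

(defun toy-get-slot-values (frame slot)
  "Return all values of FRAME.SLOT as a list (NIL if there are none)."
  (second (assoc slot (rest (assoc frame *toy-kb*)))))

(defun toy-get-slot-value (frame slot)
  "Return the first value of FRAME.SLOT."
  (first (toy-get-slot-values frame slot)))

(defun toy-slot-has-value-p (frame slot)
  "Return T if FRAME.SLOT has at least one value."
  (not (null (toy-get-slot-values frame slot))))

(defun toy-member-slot-value-p (frame slot value)
  "Return T if VALUE is one of the values of FRAME.SLOT."
  (not (null (member value (toy-get-slot-values frame slot) :test #'equal))))

(format t "~s ~s ~s ~s~%"
        (toy-get-slot-value 'trp 'common-name)      ; "L-tryptophan"
        (toy-get-slot-values 'trp 'citations)       ; ("c1" "c2")
        (toy-slot-has-value-p 'arg 'citations)      ; NIL
        (toy-member-slot-value-p 'trp 'citations "c2"))   ; T
```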

  27. More useful functions • (coercible-to-frame-p Thing) • Returns T if Thing is the name of a frame, or a frame object • (save-kb) • Saves the current KB • (replace-answer-list <list of frames>) • Makes the specified frames browseable via the Pathway Tools GUI

  28. Generic Frame Protocol: Update Operations • (put-slot-value Frame Slot Value) • Replace the current value(s) of Frame.Slot with Value • (put-slot-values Frame Slot Value-List) • Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values • (add-slot-value Frame Slot Value) • Add Value to the current value(s) of Frame.Slot, if any • (remove-slot-value Frame Slot Value) • Remove Value from the current value(s) of Frame.Slot • (replace-slot-value Frame Slot Old-Value New-Value) • In Frame.Slot, replace Old-Value with New-Value • (remove-local-slot-values Frame Slot) • Remove all of the values of Frame.Slot
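
The update operations can be pictured with a toy slot table in plain Common Lisp. Everything prefixed toy- is invented for this sketch and only mimics the descriptions above; details such as value ordering or duplicate handling in the real GFP functions are not modelled.

```lisp
;; Toy storage: key is (frame slot), value is the list of slot values.
(defparameter *toy-slots* (make-hash-table :test #'equal))

(defun toy-get-slot-values (frame slot)
  "Return the current values of FRAME.SLOT in the toy table."
  (values (gethash (list frame slot) *toy-slots*)))

(defun toy-put-slot-values (frame slot value-list)
  "Replace the current values of FRAME.SLOT with VALUE-LIST."
  (setf (gethash (list frame slot) *toy-slots*) (copy-list value-list)))

(defun toy-add-slot-value (frame slot value)
  "Add VALUE after the current values of FRAME.SLOT."
  (toy-put-slot-values frame slot
                       (append (toy-get-slot-values frame slot) (list value))))

(defun toy-remove-slot-value (frame slot value)
  "Remove VALUE from the current values of FRAME.SLOT."
  (toy-put-slot-values frame slot
                       (remove value (toy-get-slot-values frame slot)
                               :test #'equal)))

;; A short update session on sample data.
(toy-put-slot-values 'trp 'synonyms '("Trp" "W"))
(toy-add-slot-value 'trp 'synonyms "tryptophan")
(toy-remove-slot-value 'trp 'synonyms "W")
(print (toy-get-slot-values 'trp 'synonyms))   ; prints ("Trp" "tryptophan")
```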

  29. Additional Pathway Tools Functions: Semantic Inference Layer • Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB • http://bioinformatics.ai.sri.com/ptools/ptools-fns.html

  30. GKB editor • GUI for browsing the frame hierarchy • Command: Special → Taxonomy Viewer • View → Browse Class Hierarchy (ctrl-B) • Allows viewing of classes, slots, and instances • You can’t write a query unless you know the exact class and slot names • Class names are usually case-sensitive symbols • |Genes|, |Proteins|, …

  31. LISP and GFP References • Common LISP, the Language -- The standard reference • Paper edition by Guy Steele • Online version • http://www.lispworks.com/reference/HyperSpec/Front/index.htm • Information on writing Pathway Tools queries: • http://bioinformatics.ai.sri.com/ptools/ptools-resources.html • http://www.ai.sri.com/pkarp/loop.html • http://bioinformatics.ai.sri.com/ptools/debugger.html

  32. Pathway Tools information Web site • Top-level page • http://www.biocyc.org/ • General Pathway Tools information • http://bioinformatics.ai.sri.com/ptools/ • How to submit a bug report • http://bioinformatics.ai.sri.com/ptools/bug.html • Writing queries, introductions to Lisp, etc. • http://bioinformatics.ai.sri.com/ptools/ptools-resources.html

  33. Examples • (select-organism :org-id 'ecoli) → ECOLI • (setq genes (get-class-all-instances '|Genes|)) → (……………) • (setq monomers (get-class-all-instances '|Polypeptides|)) → (…………….) • (setq genes2 genes) → (…………….)

  34. Problems • all-substrates • enzymes-of-reaction • genes-of-reaction • genes-of-pathway • monomers-of-protein • genes-of-enzyme

  35. Example Session • (setq x 'trp) → TRP • (get-slot-value x 'common-name) → "L-tryptophan" • (setq aas (get-class-all-instances '|Amino-Acids|)) → (……..) • (loop for x in aas count x) → 20

  36. Example Session • (loop for x in genes for name = (get-slot-value x 'common-name) when (and name (search "trp" name)) collect x) → (…) • (setq rxns (get-class-all-instances '|Reactions|)) → (…) • (loop for x in rxns when (member-slot-value-p x 'substrates 'trp) collect x) → (…) • (replace-answer-list *)
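
The name search in this session can be tried on sample data in plain Common Lisp. The gene IDs and the *toy-gene-names* table are invented for this sketch; in Pathway Tools the names would come from (get-slot-value x 'common-name).

```lisp
;; Toy data: (gene-id . common-name); GENE-3 has no common name.
(defparameter *toy-gene-names*
  '((gene-1 . "trpA") (gene-2 . "lacZ") (gene-3 . nil) (gene-4 . "trpB")))

(defun genes-matching (substring)
  "Collect the toy gene IDs whose common name contains SUBSTRING."
  (loop for (gene . name) in *toy-gene-names*
        when (and name (search substring name))   ; skip genes without a name
          collect gene))

(print (genes-matching "trp"))                   ; prints (GENE-1 GENE-4)

;; SEARCH is case-sensitive by default; pass a :test for case-insensitive matching.
(print (search "trp" "TRPA"))                    ; prints NIL
(print (search "trp" "TRPA" :test #'char-equal)) ; prints 0 (match at position 0)
```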

  37. Example Session • (setq x '(trp arg)) → (TRP ARG) • (replace-answer-list x) → (TRP ARG) • (eco)

  38. How to write a good bug report • Use dribble-bug • (excl:dribble-bug "bug.txt") to start dribbling • (excl:dribble-bug) to stop • How to get out of the debugger • :bt - short backtrace of what functions are being called • :zoom - more detailed trace • :cont <n> - continue. Lower numbers are less drastic • Be specific, and as detailed as you can stand • What button/key did you push? • Which screen/editor were you using at the time? • What object were you viewing/editing? • Try to find a reproducible test case if at all possible!

  39. How to use autopatch • Patches load automatically on startup, or can be installed by hand: • Special → Install Patches • Download and install • Or simply install • Goes to our Web server, gets patches, and installs them • Restarting is usually not required • Functions are redefined on the fly • But: if the patch involved initialization, you might need to restart
