
An Introduction to SPITBOL

An Introduction to SPITBOL. Programming Languages, Robert Dewar. SPITBOL Background: a series of string processing languages developed at Bell Labs (Griswold et al.): SNOBOL, SNOBOL-3, SNOBOL-4. Later Griswold developed ICON, based on SNOBOL-4.


Presentation Transcript


  1. An Introduction to SPITBOL
     Programming Languages
     Robert Dewar

  2. SPITBOL Background
     • A series of string processing languages developed at Bell Labs (Griswold et al.)
         • SNOBOL
         • SNOBOL-3
         • SNOBOL-4
     • Later Griswold developed ICON
         • Based on SNOBOL-4
     • SPITBOL is a fast compiled implementation of SNOBOL-4 (Dewar et al.)

  3. Silly Acronyms
     • StriNg Oriented symBOlic Language
     • SPeedy ImplemenTation of snoBOL4
     • Strictly speaking, SPITBOL is a dialect
         • Removes a few very marginal features
         • Adds a number of extensions

  4. Dynamic Typing
     • The type of a variable follows the value assigned to it:
               A = 123        ;* A has an integer
               A = "BCD"      ;* A has a string
               A = array(10)  ;* A has an array
     • Full typing information available
     • Full type checking done
     • But types can vary dynamically
     • No static declarations

  5. Datatypes (partial list)
     • INTEGER (typically a 32-bit signed integer)
     • REAL
     • STRING
         • Varying length string as first class type
         • Not in any sense an array of characters
     • ARRAY
     • TABLE
     • PATTERN
     • CODE

  6. Basic Syntactic form
     • Line oriented
     • Labels in column 1
     • Rest of line free format (keep to 80 cols)
     • Continuation lines have . (period) in col 1
     • Comment line starts with *
     • Multiple statements on a line using ;
     • But no ; normally after a statement
     • The combination ;* makes a line comment

  7. More on basic syntax
     • Assignment uses =
     • Must have spaces around =
     • Must have spaces around binary operators
     • Must not have a space after a unary operator
     • Null operator (i.e. space) is concatenation

  8. Simple Arithmetic
     • Normal arithmetic operators
               A = 123
               A = A + 2
               B = 126
               A = (A + B) / (A * B)
     • Note: precedence of / is lower than * so we could have written the last line as:
               A = (A + B) / A * B

  9. Real arithmetic
     • Same set of operators
               A = 123.45
               B = 27.55
               C = A / B
     • Automatic widening of integers
               C = C + 1      ;* 1 treated as 1.0 here

  10. Strings
     • Strings can be any length
     • String literals have two forms
         • Surrounded by " can contain embedded '
         • Surrounded by ' can contain embedded "
     • Examples:
               A = "123'ABC"
               N = 'b"c'
               C = A N A      ;* concatenation
         * C has value 123'ABCb"c123'ABC

  11. Strings and Integers
     • Can auto-convert between string/integer
               X = 123
               K = X "abc"    ;* K = string 123abc
               K = X ""       ;* K = integer 123
         * concatenating with null is special as above
               X = "123"      ;* X = string "123"
               M = X + 1      ;* M = integer 124
               M = X + "a"    ;* run-time error
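The conversion rules on this slide can be modelled outside SPITBOL. The following is a minimal Python sketch, not SPITBOL code: the helper names `to_int`, `concat` and `add` are invented for illustration, and Python's `int()` only approximates SPITBOL's own string-to-integer conversion.

```python
def to_int(value):
    """Convert an operand of + to an integer, roughly as SPITBOL would."""
    if isinstance(value, int):
        return value
    if value == "":
        return 0                # the null string converts to zero
    return int(value)           # raises ValueError for text such as "a"


def concat(left, right):
    """Concatenation: a null operand returns the other operand unchanged."""
    if left == "":
        return right            # type of the other operand is preserved
    if right == "":
        return left
    return str(left) + str(right)


def add(left, right):
    """Addition converts numeric strings to integers first."""
    return to_int(left) + to_int(right)
```

With these definitions `concat(123, "")` stays the integer 123, while `concat(123, "abc")` becomes the string `"123abc"`, mirroring the two assignments to K on the slide.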

  12. Predicates
     • Predicates are functions that either return the null string (on true) or “fail” (on false)
     • Integer predicates: eq le lt ne gt ge
               eq(1,2)        fails
               ne(1,2)        succeeds, returns null
     • Note: no space between function name and left parenthesis (rule applies to all functions)

  13. Gotos and labels
     • A label is an identifier in column one
     • At the end of any statement there can be a goto field in one of five forms:
               :(Label)       unconditional goto Label
               :S(B1)         on success goto B1, on failure fall through
               :F(B2)         on success fall through, on failure goto B2
               :S(F1)F(X)     on success goto F1, on failure goto X
               :F(F1)S(X)     on failure goto F1, on success goto X

  14. Example of use of Labels
     • A simple loop (add numbers from 1 to 10)
               N = 1
               S = 0
         SUM   S = S + N
               N = LT(N, 10) N + 1        :S(SUM)
     • Note that if LT(N,10) succeeds it returns null
     • The null is concatenated with the value of N + 1
     • Now you see why there is the special rule that concatenating null does nothing at all!
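The interplay of a failing predicate, null concatenation and the :S goto can be traced in a small Python model. This is only a sketch under assumptions: failure is modelled as a Python exception, and the names `Failure`, `lt`, `concat` and `sum_to_ten` are hypothetical, not part of SPITBOL.

```python
class Failure(Exception):
    """Signals that a predicate, and so the whole statement, failed."""


def lt(a, b):
    """Model of LT: return the null string on success, fail otherwise."""
    if a < b:
        return ""
    raise Failure


def concat(left, right):
    """Concatenating with null returns the other operand unchanged."""
    if left == "":
        return right
    if right == "":
        return left
    return str(left) + str(right)


def sum_to_ten():
    n, s = 1, 0
    while True:                       # label SUM
        s = s + n
        try:
            # N = LT(N, 10) N + 1 : the assignment happens only on success
            n = concat(lt(n, 10), n + 1)
        except Failure:
            break                     # statement failed: fall through
        # :S(SUM) : on success, loop back to SUM
    return n, s
```

Because the failing statement never performs its assignment, N is still 10 when the loop exits and S holds 55.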

  15. Comparing Strings
     • Cannot compare using eq, ne
         • Since these work only for Integer, Real
         • For example EQ("123","00123") succeeds
         • But EQ("ABC","ABC") is a run-time error
     • So to compare two strings
         • Use IDENT(A,B) or DIFFER(A,B) to compare
         • Missing args are null, so IDENT(A) or DIFFER(A) checks for being equal to null or not equal to null

  16. Input-Output
     • To write to standard output:
               OUTPUT = string
     • To write to standard error:
               TERMINAL = string
     • To read from standard input:
               LINE = TERMINAL
         • fails if no more input (end of file)

  17. To Read/Write Files
     • Dynamically associate variables with the files; subsequent assignments to the variable write to the file, and subsequent references read from it.
     • Here is a file copy program
               INPUT('IN',1,"filename1")
               OUTPUT('OUT',2,"filename2")
         CL    OUT = IN                   :S(CL)
         END
     • The END label always ends the program
     • 1 and 2 are unit numbers, which must be unique

  18. Pattern Matching
     • General format is
               subject ? pattern
               subject ? pattern = value
     • The ? can be omitted
     • Match may fail
     • If the match succeeds in the second form, value replaces the matched part of subject
     • Pattern can contain strings or special pattern primitives

  19. Pattern Matching Examples
     • Example:
               X = "123AABCTHECAT"
               X ? "A" ARB "THE" = "HELLO"
     • Here ARB matches anything (special primitive)
     • Match is to the leftmost occurrence
     • So ARB matches "ABC"
     • Resulting value in X is "123HELLOCAT"
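For readers who know regular expressions, the example above behaves much like a leftmost search with a lazy wildcard. The Python sketch below is an analogy only, and the function name `replace_first` is made up; SPITBOL patterns are considerably more general than regular expressions.

```python
import re


def replace_first(subject):
    """Rough analogue of:  subject ? "A" ARB "THE" = "HELLO"."""
    # ARB tries the shortest extension first, like the lazy ".*?";
    # DOTALL lets the wildcard cross line breaks, as ARB matches any character.
    pattern = re.compile(r"A.*?THE", re.DOTALL)
    # count=1 replaces only the leftmost match; with no match the
    # subject comes back unchanged.
    return pattern.sub("HELLO", subject, count=1)
```

Applied to `"123AABCTHECAT"`, the leftmost `A` is the first of the two, the wildcard covers `ABC`, and the result is `"123HELLOCAT"` as on the slide.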

  20. Other primitives
     • These can be used as pattern components
         • LEN(int) matches int characters
         • ANY("AB") matches A or B
         • SPAN(" ") matches the longest run of spaces
         • BREAK("A") matches up to but not including 'A'
         • REM matches the rest of the string
         • BAL matches a parenthesis-balanced string

  21. Pattern Constructors
     • Alternation
               P1 | P2
         • Matches either P1 or P2, trying P1 first
     • Concatenation
               P1 P2
         • Matches P1 then P2

  22. Pattern Output
     • The use of the dot operator
               STM = "label x = terminal"
               STM ? BREAK(' ') . L SPAN(' ') REM . S
     • If (and only if) the match succeeds, the period assigns the matched part to the given variable
     • After the above match
               L = "label"
               S = "x = terminal"
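The dot operator is close in spirit to a named capture group whose value is only available after a successful match. A hedged Python analogue, with the hypothetical helper name `split_statement`, might look like this:

```python
import re


def split_statement(stm):
    """Rough analogue of:  stm ? BREAK(' ') . L SPAN(' ') REM . S."""
    # [^ ]*  ~ BREAK(' ')  (up to, not including, a space)
    # " +"   ~ SPAN(' ')   (one or more spaces)
    # .*     ~ REM         (rest of the string)
    match = re.search(r"(?P<L>[^ ]*) +(?P<S>.*)", stm, re.DOTALL)
    if match is None:
        return None               # match failed: nothing is assigned
    return match.group("L"), match.group("S")
```

Calling `split_statement("label x = terminal")` yields the pair `("label", "x = terminal")`, the same values the slide gives for L and S.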

  23. Pattern Output
     • The $ operator is like the dot operator, but assignment is immediate
               "ABC" ? ARB $ TERMINAL 'x'
         END
     • Output is ten lines:
               (blank line)   (ARB matches null string before A)
               A
               AB
               ABC
               (blank line)   (ARB matches null string between A and B)
               B
               BC
               (blank line)   (ARB matches null string between B and C)
               C
               (blank line)   (ARB matches null string after C)
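The ten lines come from the scanner trying every start position and, at each one, letting ARB grow one character at a time. The Python sketch below imitates that search order; `arb_trace` is an invented name, and the model ignores any shortcuts a real implementation might apply.

```python
def arb_trace(subject, tail="x"):
    """Imitate  subject ? ARB $ TERMINAL tail  and record each assignment."""
    printed = []
    for start in range(len(subject) + 1):            # unanchored: each start
        for end in range(start, len(subject) + 1):   # ARB grows by one char
            printed.append(subject[start:end])       # $ assigns immediately
            if subject.startswith(tail, end):        # then the literal is tried
                return printed, True                 # overall match succeeded
    return printed, False                            # every alternative failed
```

For the subject `"ABC"` the literal `x` is never found, so all ten candidate substrings are recorded before the match finally fails.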

  24. Patterns as Values
     • Patterns can be assigned etc.
               Vowel = 'oe' | 'ae' | 'a' | 'e' | 'i' | 'o' | 'u'
               Cons = Notany("aeiou");
     • Now can use Vowel in a pattern
     • So a big pattern can be built up
         • Using a series of assignments to build it from component parts
               Vowelseq = Arbno(Vowel)
               Isolatedcons = Vowelseq Cons Vowelseq
     • etc.

  25. Fancy Recursive Patterns
     • Here is a BNF grammar for simple expressions
               EXPR   ::= TERM | EXPR + TERM
               TERM   ::= PRIM | PRIM * TERM
               PRIM   ::= LETTER | ( EXPR )
               LETTER ::= 'a' | 'b' | 'c' … 'z'
     • Generates strings like
               a+b*(c+d)

  26. First attempt at pattern
     • Here is a pattern matching that grammar
               EXPR = TERM | EXPR '+' TERM
               TERM = PRIM | PRIM '*' TERM
               PRIM = LETTER | '(' EXPR ')'
               LETTER = ANY("abc .. xyz")
     • Neat, but wrong
     • Why? Because when you execute the assignment to EXPR, both TERM and EXPR are still null, so the assignment is equivalent to
               EXPR = '' | '' '+' ''
     • That's not what you want

  27. Second attempt at pattern
     • Here is a pattern that works
               EXPR = *TERM | *EXPR '+' *TERM
               TERM = *PRIM | *PRIM '*' *TERM
               PRIM = *LETTER | '(' *EXPR ')'
               LETTER = ANY("abc .. xyz")
     • This works because unary * means don't look in the variable until pattern matching time.
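The same "look the name up later" idea can be shown with a small backtracking matcher in Python. This is a sketch under stated assumptions rather than a model of SPITBOL's matcher: the combinators `lit`, `any_of`, `seq`, `alt` and `defer` and the `grammar` dictionary are all invented here, and the EXPR rule is rewritten right-recursively (TERM '+' EXPR) so that this naive matcher always terminates.

```python
def lit(text):
    def match(s, pos):
        if s.startswith(text, pos):
            yield pos + len(text)
    return match


def any_of(chars):
    def match(s, pos):
        if pos < len(s) and s[pos] in chars:
            yield pos + 1
    return match


def seq(*parts):
    def match(s, pos):
        if not parts:
            yield pos
            return
        rest = seq(*parts[1:])
        for mid in parts[0](s, pos):      # backtrack over every way to
            yield from rest(s, mid)       # match the first component
    return match


def alt(*choices):
    def match(s, pos):
        for choice in choices:            # alternatives are tried in order
            yield from choice(s, pos)
    return match


def defer(name):
    """Like unary *: look the pattern up only when matching happens."""
    def match(s, pos):
        yield from grammar[name](s, pos)
    return match


grammar = {}
# TERM, PRIM and LETTER do not exist yet when EXPR is built; defer() makes
# that harmless because the lookup is postponed until match time.
grammar["EXPR"] = alt(seq(defer("TERM"), lit("+"), defer("EXPR")), defer("TERM"))
grammar["TERM"] = alt(seq(defer("PRIM"), lit("*"), defer("TERM")), defer("PRIM"))
grammar["PRIM"] = alt(defer("LETTER"), seq(lit("("), defer("EXPR"), lit(")")))
grammar["LETTER"] = any_of("abcdefghijklmnopqrstuvwxyz")


def full_match(s):
    """True if the whole string is an EXPR."""
    return any(end == len(s) for end in grammar["EXPR"](s, 0))
```

Replacing `defer("TERM")` with `grammar["TERM"]` in the EXPR line would raise a `KeyError` at definition time, which is the Python counterpart of the null-valued variables in the first attempt.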

  28. More neat patterns
     • Match palindromes
               PAL = POS(0) ARB $ STR *REVERSE(STR) RPOS(0)
     • POS(0) matches the null string at the start
     • RPOS(0) matches the null string at the end
     • The unary * actually means don't evaluate the expression until pattern matching time, so REVERSE is called during the pattern match.
     • As written, the subject must be STR followed by its reverse, so the palindromes matched are those of even length

  29. Arrays
     • Array created by a call to the array function
               AR = ARRAY(50)
     • To index, we use <>, which fails if out of range
     • To fill AR with integers 1 .. 50
               N = 0
         LP    AR<N = N + 1> = N          :S(LP)
     • Multidimensional arrays allowed etc.

  30. Tables
     • Like arrays but the subscript can be anything
     • Implemented typically by hash tables
               R = TABLE(100)
         LP    S = TERMINAL                                  :F(END)
               TERMINAL = NE(R<S>) S "given " R<S> "times"
               R<S> = R<S> + 1                               :(LP)
         END
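A table behaves much like a Python dictionary whose missing entries read as null. The sketch below mirrors the loop on the slide; the function name `tally` is hypothetical, input lines are passed in as a list instead of being read from the terminal, and the message spacing is illustrative rather than a copy of the SPITBOL output.

```python
def tally(lines):
    """Count repeated input lines, reporting each line already seen."""
    counts = {}
    messages = []
    for s in lines:
        seen = counts.get(s, 0)       # an unset entry is null, treated as 0
        if seen != 0:                 # NE(R<S>) fails for 0: no output
            messages.append(f"{s} given {seen} times")
        counts[s] = seen + 1          # R<S> = R<S> + 1
    return counts, messages
```

As in the original, the first occurrence of a line produces no message because the NE predicate fails and the whole output statement is skipped.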

  31. Functions
     • Functions are defined dynamically
     • Everything in SNOBOL4 is dynamic
     • Factorial function
               DEFINE("FACT(X)")
               TERMINAL = FACT(6)                 :(END)
         FACT  FACT = EQ(X,1) 1                   :S(RETURN)
               FACT = X * FACT(X - 1)             :(RETURN)
         END
     • RETURN is a special label to return from a function

  32. More on functions
     • Wrong modification of previous program
               DEFINE("FACT(X)")
         FACT  FACT = EQ(X,1) 1                   :S(RETURN)
               FACT = X * FACT(X - 1)             :(RETURN)
               TERMINAL = FACT(6)
         END
     • That's because execution “falls into” the definition of the function. If you run the above program you get a message like “RETURN from outer level”

  33. More on functions
     • Correct modification of previous program
               DEFINE("FACT(X)")                  :(FACT_END)
         FACT  FACT = EQ(X,1) 1                   :S(RETURN)
               FACT = X * FACT(X - 1)             :(RETURN)
         FACT_END
               TERMINAL = FACT(6)
         END
     • That's a very standard style for defining functions
     • Similar to jumping past data in assembler

  34. More on Functions
     • Can have multiple arguments
               DEFINE("ACKERMAN(X,Y)")
     • Can have local variables
               DEFINE("MYFUNC(A,B,C)L1,L2");
     • No static scoping
     • The way both arguments and locals work
         • On entry, save old values, set arguments, set locals to all null values
         • On return, restore saved values
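The save-and-restore rule for arguments and locals can be simulated with a single global dictionary. This Python sketch is an illustration only: `variables`, `call` and `fact_body` are invented names, and the model leaves out details such as the variable that carries the function result.

```python
variables = {}                        # one global namespace, no static scoping


def call(body, params, local_names, args):
    """Call `body` with SNOBOL4-style saving of arguments and locals."""
    names = list(params) + list(local_names)
    # On entry: save the old values of every argument and local name.
    saved = {name: variables.get(name, "") for name in names}
    for index, name in enumerate(params):
        # Set arguments; a missing argument is the null string.
        variables[name] = args[index] if index < len(args) else ""
    for name in local_names:
        variables[name] = ""          # locals start out null
    try:
        return body()
    finally:
        variables.update(saved)       # On return: restore the saved values.


def fact_body():
    """Factorial that reads its argument from the shared namespace."""
    if variables["X"] == 1:
        return 1
    return variables["X"] * call(fact_body, ["X"], [], [variables["X"] - 1])


variables["X"] = "outer value"
result = call(fact_body, ["X"], [], [6])
```

After the call, `result` is 720 and the global X is back to its previous value, even though every recursive level temporarily overwrote it.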

  35. The EVAL function
     • The function EVAL takes a string and evaluates it as a SNOBOL-4 expression
     • Here is a simple calculator program
         LP    TERMINAL = EVAL(TERMINAL)          :S(LP)
         END
     • Note that since we are within a single program, variables etc. stick around, so this is more powerful than it looks
     • Also assignments are expressions in SPITBOL!

  36. Running the Calculator Program
     • Each input line is followed by the value printed for it
               b = 12
               12
               a = 32
               32
               b + a
               44
               c = "str"
               str
               c ? arb . q 'r'
               str
               q
               st

  37. The CODE function
     • Even more fun and games
     • The function CODE(str) takes a string, treats it as a sequence of SNOBOL-4 statements and compiles them.
     • The result is an object of type CODE
     • The special goto form :<obj> will jump to the compiled code.

  38. A More Powerful Calculator
     • Here is a more interesting calculator
         LP    C = CODE(TERMINAL "; :(LP)")       :S<C>
         END
     • Here we take the input from the terminal and concatenate a goto LP so that control will return to the loop. If end of file has not been reached and the code compiled successfully, we execute the code.

  39. Calculator 2 at Work
               a = 6
               b = 5
               terminal = a + b
               11
               define("f(x)") :(e);f f = eq(x,1) 1 :s(return); f = x * f(x - 1) :(return) ;e
               terminal = f(6)
               720

  40. Use of Predicates in Patterns
     • Here is a pattern that matches a^n b^n c^n
     • That is: a string of a's, b's and c's with an equal number of each
               abc = Pos(0)
         .           Span('a') $ a
         .           Span('b') $ b *eq(size(a),size(b))
         .           Span('c') $ c *eq(size(a),size(c))
         .           Rpos(0)
     • The calls to eq are made at pattern matching time and either fail or return the null string.
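The effect of embedding predicates in the pattern can be approximated in Python by matching the three runs first and then comparing their lengths. This is a loose analogue with an invented function name, `is_anbncn`; in SPITBOL the size checks run inside the match itself.

```python
import re


def is_anbncn(text):
    """True for strings such as 'aabbcc': equal runs of a, b and c."""
    # (a+)(b+)(c+) plays the role of the three SPAN calls, and fullmatch
    # supplies the POS(0) ... RPOS(0) anchoring.
    match = re.fullmatch(r"(a+)(b+)(c+)", text)
    if match is None:
        return False
    a, b, c = match.groups()
    # These comparisons stand in for *eq(size(a),size(b)) and
    # *eq(size(a),size(c)).
    return len(a) == len(b) == len(c)
```

Because SPAN always takes the longest run, checking the lengths after the regular expression has matched gives the same answers as checking them during the match.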

  41. Using Your Own Predicates
     • Here is a pattern that matches only strings of digits where the value is prime
               prime = span('0123456789') $ n
         .             *Is_Prime(n)
     • You now write an Is_Prime function that returns the null string on true and fails on false.
     • A function fails by branching to FRETURN
               eq(a,b)        :s(return)f(freturn)

  42. Let's end with a fun pattern
     • Find the longest numeric string
               DIG = '0123456789'
               LI = NULL $ W FENCE
         .          BREAKX(DIG)
         .          (SPAN(DIG) $ N *GT(SIZE(N),SIZE(W))) $ W
         .          FAIL
               T = 'abc123def1234789xyz99!'
               T LI
               TERMINAL = W
         END
     • Output of running this program is
               1234789
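The pattern keeps a running "best so far" in W and uses FAIL to force the scanner through every digit run. A plain Python loop gives the same result; `longest_digit_run` is a hypothetical name, and this version visits only maximal digit runs rather than every candidate the SPITBOL scanner would try.

```python
import re


def longest_digit_run(text):
    """Return the first longest run of digits in text, or '' if none."""
    longest = ""                              # NULL $ W: start with null
    for run in re.findall(r"[0-9]+", text):   # each SPAN(DIG) candidate
        if len(run) > len(longest):           # GT(SIZE(N),SIZE(W))
            longest = run                     # $ W: remember the new best
    return longest
```

Calling `longest_digit_run('abc123def1234789xyz99!')` returns `'1234789'`, the value the slide reports for W.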
