An Introduction to SPITBOL
Programming Languages
Robert Dewar

SPITBOL Background
• A series of string processing languages developed at Bell Labs (Griswold et al.)
  • SNOBOL
  • SNOBOL-3
  • SNOBOL-4
• Later Griswold developed ICON
  • Based on SNOBOL-4
• SPITBOL is a fast compiled implementation of SNOBOL-4 (Dewar et al.)

Silly Acronyms
• StriNg Oriented symBOlic Language
• SPeedy ImplemenTation of snoBOL4
• Strictly speaking, SPITBOL is a dialect
  • Removes a few very marginal features
  • Adds a number of extensions

Dynamic Typing
          A = 123          ;* A has an integer
          A = "BCD"        ;* A has a string
          A = array(10)    ;* A has an array
• Full typing information available
• Full type checking done
• But types can vary dynamically
• No static declarations

Datatypes (partial list)
• INTEGER (typically a 32-bit signed integer)
• REAL
• STRING
  • Varying length string as first class type
  • Not in any sense an array of characters
• ARRAY
• TABLE
• PATTERN
• CODE

Basic Syntactic form
• Line oriented
• Labels in column 1
• Rest of line free format (keep to 80 cols)
• Continuation lines have . (period) in col 1
• Comment line starts with *
• Multiple statements on line using ;
• But no ; normally after a statement
• The combination ;* makes a line comment

More on basic syntax
• Assignment uses =
• Must have spaces around =
• Must have spaces around binary operators
• Must not have space after unary operator
• Null operator (i.e. space) is concatenation

Simple Arithmetic
• Normal arithmetic operators
          A = 123
          A = A + 2
          B = 126
          A = (A + B) / (A * B)
• Note: precedence of / is lower than * so we could have written the last line as:
          A = (A + B) / A * B

Real arithmetic
• Same set of operators
          A = 123.45
          B = 27.55
          C = A / B
• Automatic widening of integers
          C = C + 1        ;* 1 treated as 1.0 here

Strings
• Strings can be any length
• String literals have two forms
  • Surrounded by " can contain embedded '
  • Surrounded by ' can contain embedded "
• Examples:
          A = "123'ABC"
          N = 'b"c'
          C = A N A        ;* concatenation
    * C has value 123'ABCb"c123'ABC

Strings and Integers
• Can auto-convert between string/integer
          X = 123
          K = X "abc"      ;* K = string 123abc
          K = X ""         ;* K = integer 123
    * concatenating with null is special as above
          X = "123"        ;* X = string "123"
          M = X + 1        ;* M = integer 124
          M = X + "a"      ;* run-time error

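The conversion rules on this slide can be mimicked in Python. This is only a rough analogue under stated assumptions: `concat` and `add` are hypothetical helper names, not SPITBOL functions, and only integers are modelled.

```python
def concat(left, right):
    """Model concatenation: a null operand returns the other operand untouched."""
    if left == "":
        return right
    if right == "":
        return left
    return str(left) + str(right)


def add(left, right):
    """Model '+': numeric strings convert to integers; other strings are an error."""
    return int(left) + int(right)


k_string = concat(123, "abc")   # the string "123abc"
k_integer = concat(123, "")     # still the integer 123, not the string "123"
m = add("123", 1)               # the integer 124
```

Calling `add("123", "a")` raises `ValueError`, which stands in for the run-time error in the last statement.
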
Predicates
• Predicates are functions that either return the null string (on true) or "fail" (on false)
• Integer predicates: eq le lt ne gt ge
          eq(1,2)          fails
          ne(1,2)          succeeds, returns null
• Note: no space between function name and left parenthesis (rule applies to all functions)

Gotos and labels
• A label is an identifier in column one
• At the end of any statement there can be a goto field in one of five forms:
    :(Label)         unconditional goto Label
    :S(B1)           on success goto B1, on failure fall through
    :F(B2)           on success fall through, on failure goto B2
    :S(F1)F(X)       on success goto F1, on failure goto X
    :F(F1)S(X)       on failure goto F1, on success goto X

Example of use of Labels
• A simple loop (add numbers from 1 to 10)
          N = 1
          S = 0
    SUM   S = S + N
          N = LT(N, 10) N + 1        :S(SUM)
• Note that if LT(N,10) succeeds it returns null
• The null is concatenated with the value of N + 1
• Now you see the reason for the special rule that concatenating with null does nothing at all!

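The same loop can be traced in Python to see how success and failure drive the control flow. This is a hedged sketch: `lt` is a hypothetical stand-in that returns the null string on success and `None` to signal failure.

```python
def lt(a, b):
    """Model LT: the null string on success, None on failure."""
    return "" if a < b else None


n, s = 1, 0
while True:
    s = s + n              # SUM   S = S + N
    guard = lt(n, 10)      # LT(N, 10)
    if guard is None:      # failure: the assignment is skipped, fall through
        break
    n = n + 1              # null concatenated with N + 1 is just N + 1, :S(SUM)
```

When the loop ends, `s` holds 55 and `n` is still 10, because the failing statement performs no assignment.
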
Comparing Strings
• Cannot compare using eq, ne
  • Since these work only for Integer, Real
  • For example EQ("123","00123") succeeds
  • But EQ("ABC","ABC") is a run-time error
• So to compare two strings
  • Use IDENT(A,B) or DIFFER(A,B) to compare
  • Missing args are null so
  • IDENT(A) or DIFFER(A) checks for being equal to null or not equal to null

Input-Output
• To write to standard output:
          OUTPUT = string
• To write to standard error:
          TERMINAL = string
• To read from standard input:
          LINE = TERMINAL
  • fails if no more input (end of file)

To Read/Write Files
• Dynamically associate variables with the files; subsequent assignments write the file and subsequent references read it.
• Here is a file copy program
          INPUT('IN',1,"filename1")
          OUTPUT('OUT',2,"filename2")
    CL    OUT = IN                   :S(CL)
    END
• The END label ends the program (always true)
• 1 and 2 are unit numbers, must be unique

Pattern Matching
• General format is
          subject ? pattern
          subject ? pattern = value
• The ? can be omitted
• Match may fail
• If match succeeds in second form, value replaces matched part of subject
• Pattern can contain strings or special pattern primitives

Pattern Matching Examples
• Example:
          X = "123AABCTHECAT"
          X ? "A" ARB "THE" = "HELLO"
• Here ARB matches anything (special primitive)
• Match is to the leftmost occurrence
• So ARB matches "ABC"
• Resulting value in X is "123HELLOCAT"

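For readers more at home with regular expressions, the replacement on this slide behaves roughly like a single non-greedy substitution. This is an approximate Python analogue, not how SPITBOL implements it; ARB is modelled as a lazy `.*?`.

```python
import re

x = "123AABCTHECAT"

# "A" ARB "THE": the leftmost A, then the shortest stretch that lets THE follow.
pattern = re.compile(r"A.*?THE", flags=re.DOTALL)

# Only the first (leftmost) match is replaced, as in the SPITBOL statement.
result = pattern.sub("HELLO", x, count=1)
matched = pattern.search(x).group(0)
```

Here `matched` is the part of the subject that gets replaced, and `result` is the new value of the subject.
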
Other primitives
• These can be used as pattern components
    LEN(int)         matches int characters
    ANY("AB")        matches A or B
    SPAN(" ")        matches the longest run of spaces
    BREAK("A")       matches up to but not including 'A'
    REM              matches rest of string
    BAL              matches a parenthesis-balanced string

Pattern Constructors
• Alternation
          P1 | P2
  • Matches either P1 or P2, try P1 first
• Concatenation
          P1 P2
  • Matches P1 then P2

Pattern Output
• The use of the dot operator
          STM = "label x = terminal"
          STM ? BREAK(' ') . L SPAN(' ') REM . S
• If (and only if) the match succeeds, the dot assigns the matched part to the given variable
• After the above match
          L = "label"
          S = "x = terminal"

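The dot operator plays a role similar to capture groups in a regular expression: nothing is assigned unless the whole match succeeds. A hedged Python analogue, with BREAK(' ') modelled as `[^ ]*`, SPAN(' ') as ` +`, and REM as `.*`:

```python
import re

stm = "label x = terminal"

label = rest = None                    # stay unassigned if the match fails
m = re.search(r"([^ ]*) +(.*)", stm, flags=re.DOTALL)
if m:                                  # assign only after the whole match succeeded
    label, rest = m.group(1), m.group(2)
```

The names `label` and `rest` correspond to L and S on the slide.
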
Pattern Output
• The $ operator is like the dot operator, but assignment is immediate
          "ABC" ? ARB $ TERMINAL 'x'
    END
• Output is ten lines:
    (blank line)     ARB matches null string before A
    A
    AB
    ABC
    (blank line)     ARB matches null string between A and B
    B
    BC
    (blank line)     ARB matches null string between B and C
    C
    (blank line)     ARB matches null string after C

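The ten lines come from the scanner trying every start position and, at each one, every length ARB can take, since the trailing 'x' never matches. A small Python sketch of that search order (the function name is illustrative, and the model assumes no scanner shortcuts prune the attempts):

```python
def arb_immediate_trace(subject):
    """List every substring ARB would hand to $ while the match keeps failing."""
    lines = []
    for start in range(len(subject) + 1):           # scanner moves the start position
        for end in range(start, len(subject) + 1):  # ARB grows one character at a time
            lines.append(subject[start:end])
    return lines


trace = arb_immediate_trace("ABC")
```
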
Patterns as Values
• Patterns can be assigned etc.
          Vowel = 'oe' | 'ae' | 'a' | 'e' | 'i' | 'o' | 'u'
          Cons = Notany("aeiou")
• Now can use Vowel in a pattern
• So a big pattern can be built up
  • Using a series of assignments to build it from component parts
          Vowelseq = Arbno(Vowel)
          Isolatedcons = Vowelseq Cons Vowelseq
• etc.

Fancy Recursive Patterns
• Here is a BNF grammar for simple expressions
    EXPR   ::= TERM | EXPR + TERM
    TERM   ::= PRIM | PRIM * TERM
    PRIM   ::= LETTER | ( EXPR )
    LETTER ::= 'a' | 'b' | 'c' … 'z'
• Generates strings like
    a+b*(c+d)

First attempt at pattern
• Here is a pattern matching that grammar
          EXPR = TERM | EXPR '+' TERM
          TERM = PRIM | PRIM '*' TERM
          PRIM = LETTER | '(' EXPR ')'
          LETTER = ANY("abc .. xyz")
• Neat, but wrong
• Why? Because when you execute the assignment to EXPR, both EXPR and TERM are still null, so it is the same as
          EXPR = '' | '' '+' ''
• That's not what you want

Second attempt at pattern
• Here is a pattern that works
          EXPR = *TERM | *EXPR '+' *TERM
          TERM = *PRIM | *PRIM '*' *TERM
          PRIM = *LETTER | '(' *EXPR ')'
          LETTER = ANY("abc .. xyz")
• This works, because unary * means don't look in the variable until pattern matching time.

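Deferred references behave much like function names that are looked up only when called, which is what lets mutually recursive definitions work. As a loose analogue, here is a small Python recognizer for the same grammar. It is a sketch under assumptions: the function names are illustrative, and the left-recursive EXPR rule is rewritten as a loop, which is not how the SPITBOL pattern proceeds.

```python
def parse_prim(text, pos):
    """PRIM ::= LETTER | ( EXPR ). Return the position after it, or None."""
    if pos < len(text) and "a" <= text[pos] <= "z":
        return pos + 1
    if pos < len(text) and text[pos] == "(":
        inner = parse_expr(text, pos + 1)      # resolved at call time, like *EXPR
        if inner is not None and text[inner:inner + 1] == ")":
            return inner + 1
    return None


def parse_term(text, pos):
    """TERM ::= PRIM | PRIM * TERM."""
    pos = parse_prim(text, pos)
    if pos is not None and text[pos:pos + 1] == "*":
        longer = parse_term(text, pos + 1)
        if longer is not None:
            return longer
    return pos


def parse_expr(text, pos=0):
    """EXPR ::= TERM | EXPR + TERM, with the left recursion unrolled into a loop."""
    pos = parse_term(text, pos)
    while pos is not None and text[pos:pos + 1] == "+":
        longer = parse_term(text, pos + 1)
        if longer is None:
            break
        pos = longer
    return pos


def recognizes(text):
    """True when the whole string is an EXPR."""
    return parse_expr(text) == len(text)
```

For example, `recognizes("a+b*(c+d)")` accepts the sample string from the grammar slide, while `recognizes("a+")` rejects a dangling operator.
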
More neat patterns
• Match palindromes
          PAL = POS(0) ARB $ STR *REVERSE(STR) RPOS(0)
• POS(0) matches null string at start
• RPOS(0) matches null string at end
• The unary * actually means don't evaluate the expression until pattern matching time, so REVERSE is called during the pattern match.
• As written, STR followed by its reverse must cover the whole subject, so this accepts palindromes of even length such as "abba"

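One way to read the PAL pattern is as a search over split points: ARB proposes a prefix, and the deferred REVERSE call must then match everything that is left. A hedged Python model of that reading (`matches_pal` is an illustrative name):

```python
def matches_pal(subject):
    """Model POS(0) ARB $ STR *REVERSE(STR) RPOS(0) as a search over prefixes."""
    for cut in range(len(subject) + 1):
        prefix = subject[:cut]                 # ARB $ STR
        if subject[cut:] == prefix[::-1]:      # *REVERSE(STR) must reach RPOS(0)
            return True
    return False
```

Under this model `matches_pal("abba")` succeeds, while an odd-length palindrome such as `"aba"` is rejected because no prefix is followed exactly by its own reverse.
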
Arrays
• Array created by call to array function
          AR = ARRAY(50)
• To index, we use <>, fail if out of range
• To fill AR with integers 1 .. 50
          N = 0
    LP    AR<N = N + 1> = N          :S(LP)
• Multidimensional arrays allowed etc.

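The fill loop stops because an out-of-range subscript fails rather than raising an error. A rough Python model of that behaviour, assuming 1-based indexing and a hypothetical `store` method that reports failure by returning `False`:

```python
class SnobolArray:
    """Toy 1-based array whose out-of-range store fails instead of raising."""

    def __init__(self, size):
        self.size = size
        self.cells = [""] * size           # elements start out null

    def store(self, index, value):
        if not 1 <= index <= self.size:
            return False                   # failure
        self.cells[index - 1] = value
        return True                        # success


ar = SnobolArray(50)
n = 0
while True:
    n = n + 1                              # the embedded assignment N = N + 1
    if not ar.store(n, n):                 # AR<N> = N   :S(LP)
        break
```

After the loop `n` is 51, the first subscript that was out of range.
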
Tables
• Like arrays but subscript can be anything
• Implemented typically by hash tables
          R = TABLE(100)
    LP    S = TERMINAL               :F(END)
          TERMINAL = NE(R<S>) S " given " R<S> " times"
          R<S> = R<S> + 1            :(LP)
    END

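The table program counts how often each input line has been seen and reports repeats. A Python dictionary gives a close analogue. This is a sketch with illustrative names; the message text includes spaces around the words for readability, which is an assumption about the intended output.

```python
def count_lines(lines):
    """Return the final counts and the messages printed for repeated lines."""
    counts = {}
    messages = []
    for s in lines:
        seen = counts.get(s, 0)            # a missing entry acts like null (zero)
        if seen != 0:                      # NE(R<S>) succeeds only when nonzero
            messages.append(f"{s} given {seen} times")
        counts[s] = seen + 1               # R<S> = R<S> + 1
    return counts, messages


counts, messages = count_lines(["a", "b", "a", "a"])
```
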
Functions
• Functions are defined dynamically
• Everything in SNOBOL4 is dynamic
• Factorial function
          DEFINE("FACT(X)")
          TERMINAL = FACT(6)         :(END)
    FACT  FACT = EQ(X,1) 1           :S(RETURN)
          FACT = X * FACT(X - 1)     :(RETURN)
    END
• RETURN is a special label to return from a function

More on functions
• Wrong modification of previous program
          DEFINE("FACT(X)")
    FACT  FACT = EQ(X,1) 1           :S(RETURN)
          FACT = X * FACT(X - 1)     :(RETURN)
          TERMINAL = FACT(6)
    END
• This is wrong because execution "falls into" the definition of the function. If you run the above program you get a message like "RETURN from outer level"

More on functions
• Correct modification of previous program
          DEFINE("FACT(X)")          :(FACT_END)
    FACT  FACT = EQ(X,1) 1           :S(RETURN)
          FACT = X * FACT(X - 1)     :(RETURN)
    FACT_END TERMINAL = FACT(6)
    END
• That's a very standard style for defining functions
• Similar to jumping past data in assembler

More on Functions
• Can have multiple arguments
          DEFINE("ACKERMAN(X,Y)")
• Can have local variables
          DEFINE("MYFUNC(A,B,C)L1,L2")
• No static scoping
• The way both arguments and locals work
  • On entry, save old values, set arguments, set locals to all null values
  • On return, restore saved values

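The save-and-restore discipline described above can be sketched in Python with one global dictionary standing in for the variable store. All names here (`variables`, `call`, `double_body`) are hypothetical and exist only for illustration.

```python
variables = {"X": "outer x", "T": "outer t"}    # the single global variable store


def call(params, local_names, args, body):
    """Run body with arguments and locals bound, then restore the old values."""
    names = params + local_names
    saved = {name: variables.get(name, "") for name in names}    # save old values
    for position, name in enumerate(params):
        # missing arguments are null
        variables[name] = args[position] if position < len(args) else ""
    for name in local_names:
        variables[name] = ""                                     # locals start null
    try:
        return body()
    finally:
        variables.update(saved)                                  # restore on return


def double_body():
    variables["T"] = variables["X"] * 2
    return variables["T"]


result = call(["X"], ["T"], [21], double_body)
```

During the call the body sees X as 21 and T as null; afterwards both hold their outer values again, which is the effect of having no static scoping.
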
The EVAL function
• The function EVAL takes a string and evaluates it as a SNOBOL-4 expression
• Here is a simple calculator program
    LP    TERMINAL = EVAL(TERMINAL)  :S(LP)
    END
• Note that since we are within a single program, variables etc. stick around, so this is more powerful than it looks
• Also assignments are expressions in SPITBOL!

Running the Calculator Program
    b = 12
    12
    a = 32
    32
    b + a
    44
    c = "str"
    str
    c ? arb . q 'r'
    str
    q
    st

The CODE function
• Even more fun and games
• The function CODE(str) takes a string, treats it as a sequence of SNOBOL-4 statements, and compiles them.
• The result is an object of type CODE
• The special goto form :<obj> will jump to the compiled code.

A More Powerful Calculator
• Here is a more interesting calculator
    LP    C = CODE(TERMINAL "; :(LP)")    :S<C>
    END
• Here we take the input from the terminal and concatenate a goto LP so that control will return to the loop. If there is no end of file and the code compiles successfully, we execute the code.

Calculator 2 at Work
    a = 6
    b = 5
    terminal = a + b
    11
    define("f(x)") :(e);f f = eq(x,1) 1 :s(return); f = x * f(x - 1) :(return) ;e
    terminal = f(6)
    720

Use of Predicates in Patterns
• Here is a pattern that matches a^n b^n c^n
• That is: a string of a's, b's, c's with an equal number of each
          abc = Pos(0)
    .           Span('a') $ a
    .           Span('b') $ b *eq(size(a),size(b))
    .           Span('c') $ c *eq(size(a),size(c))
    .           Rpos(0)
• The calls to eq are made at pattern matching time and either fail or return the null string.

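A regular expression alone cannot express the equal-count condition, which is exactly what the embedded eq calls add. A hedged Python analogue keeps the same split: a regex for the shape, then ordinary predicates on the captured pieces (`matches_abc` is an illustrative name).

```python
import re

# Span('a') Span('b') Span('c'): one or more of each letter, in that order.
SHAPE = re.compile(r"(a+)(b+)(c+)")


def matches_abc(subject):
    """True for a^n b^n c^n with n >= 1."""
    m = SHAPE.fullmatch(subject)           # Pos(0) ... Rpos(0)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))
```
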
Using Your Own Predicates
• Here is a pattern that matches only strings of digits whose value is prime
          prime = span('0123456789') $ n
    .             *Is_Prime(n)
• You now write an Is_Prime function that returns the null string on true and fails on false.
• A function fails by branching to FRETURN
          eq(a,b)          :s(return)f(freturn)

Let's end with a fun pattern
• Find longest numeric string
          DIG = '0123456789'
          LI = NULL $ W FENCE
    .           BREAKX(DIG)
    .           (SPAN(DIG) $ N *GT(SIZE(N),SIZE(W))) $ W
    .           FAIL
          T = 'abc123def1234789xyz99!'
          T LI
          TERMINAL = W
    END
• Output of running this program is
    1234789

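The pattern works by forcing failure so that every digit run is visited, keeping a run only when it is strictly longer than the best so far. The same idea in Python, as a rough analogue rather than a translation of the backtracking (`longest_digit_run` is an illustrative name):

```python
import re


def longest_digit_run(text):
    """Return the first longest run of digits in text, or the null string."""
    best = ""                                   # NULL $ W
    for run in re.findall(r"[0-9]+", text):     # each SPAN(DIG) candidate
        if len(run) > len(best):                # *GT(SIZE(N),SIZE(W))
            best = run                          # $ W
    return best


w = longest_digit_run("abc123def1234789xyz99!")
```

Because the comparison is strict, a later run of equal length does not replace an earlier one.
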