Programming Parrot
Dan Sugalski
dan@sidhe.org

Important safety tips!
• Parrot has no safety net
• You can crash pretty easily
• Parrot’s not entirely bug-free
• All the ops are documented
    perldoc ops/file.ops
• (Knowing the right file can be interesting)

PASM & PIR
• PASM - Parrot assembly
• PIR - Parrot Intermediate Representation
• PASM is simple
• PIR is convenient
• Mix and match across files

Standard extensions
• .pasm - Parrot assembly
• .imc - Parrot intermediate representation
• .pbc - Parrot bytecode

Invoking Parrot
    parrot foo.pbc
    parrot -o foo.pbc foo.imc
    parrot -o foo.pbc foo.pasm
    parrot foo.pasm
    parrot foo.imc
    parrot -o foo.pasm foo.imc

Extension autodetect
• Source automatically compiled and run
• Input type based on extension
• Output type based on extension
• No automatic decompilation, though

PASM
• Implicit main routine
• Everything manual
• Simple
• No help from parrot

PIR
• Compilation units required
• Infinite register store
• Calling convention shortcuts

Behold, the opcode
• General form
    opname [dest,] [source[, source…]]
    add I2, I3, I4
    print "some thing\n"
• PIR shortcut
    dest = op source1, source2
  is the same as
    op dest, source1, source2

Hello, world!
    print "Hello, PASM world!\n"
    end

Hello, world!
    .sub foo @MAIN
        print "Hello, PIR world!\n"
        end
    .end

Constants
• Signed integer constants
• Floating point constants
• Single or double quoted string constants
• Standard string backslash escapes apply

Named constants
    .const type name = value
    .const int five = 5
    .const str error = "Don't do that!"
• Constants are scoped to the compilation unit

Include files
    .include "filename"
• Includes the file as if it were right there
• Nothing fancy
• No path searching right now

Defining Subroutines
    .sub sub_name
        # Sub body here
    .end

Subroutine properties
• Go after sub name, comma-separated
• method, prototyped, @LOAD, @MAIN
• method - self filled in with calling object
• prototyped - takes or returns non-PMCs
• @MAIN - main sub for bytecode file
• @LOAD - sub to execute on bytecode load

Virtual Registers
• Sub-local
• $In, $Pn, $Nn, $Sn
• Assembler auto-spills for you

Basic Math
• add, sub, mul, div
• mod and cmod
• inc and dec
• Transcendental math
• Docs: ops/math.ops

PIR shortcuts
• Infix math works
• Normal notation works
    I3 = I5 + I6
    $P6 = $P6 - 5
    result = source * 4
• C shortcuts work
    $P6 += 4

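The opcode form and the PIR shortcuts can be put side by side. This is a minimal sketch in the dialect these slides use (.sub with @MAIN, virtual $I registers); the sub name main and the register numbers are arbitrary choices, and other Parrot releases may spell the sub properties differently.

```pir
.sub main @MAIN
    $I3 = 2
    $I4 = 5

    # Plain opcode form: opname dest, source, source
    add $I2, $I3, $I4

    # PIR shortcut: dest = op source1, source2
    $I5 = add $I3, $I4

    # Infix notation for the same addition
    $I6 = $I3 + $I4

    # C-style shortcut: add 1 to $I6 in place
    $I6 += 1

    print $I2
    print "\n"
    print $I5
    print "\n"
    print $I6
    print "\n"
    end
.end
```

The first two results are the same sum written two ways; the third is that sum plus one.
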
Safety tip
• PMC destination generally must exist
• The op docs tell you which case applies
    abs(in PMC, in PMC)                     in: destination must exist
    find_global(out PMC, in STR, in STR)    out: destination is returned

Creating PMCs
    new dest, type
    dest = new type
• Type can be a predefined constant
    line = new Integer
    line = foo + 12
• Undef is a good type for temps (it changes type on first assignment)

Basic PMC types
• Boolean
• Integer
• String
• Float
• Null
• Undef
• Env
• Fixed<type>Array
• Resizable<type>Array

Finding PMC types
• Basic types have predefined constants
• User types can be found with find_type
    $I4 = find_type 'String'
• Type of an existing PMC with typeof
    $I5 = typeof $P5
    $S5 = typeof $P5

Named temps
    .local type name
• Sub-local
• Automatically spilled, just like virtual registers
• PMC locals must still be new’d
• Virtual registers don’t automagically have a valid destination

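The "destination must exist" rule and named temps fit in one short sketch. It assumes the slide dialect, including the bare Integer type constant from the Creating PMCs slide; the names total and bonus are made up for illustration.

```pir
.sub main @MAIN
    .local pmc total        # reserves a name only; no PMC exists yet
    .local int bonus

    total = new Integer     # create the PMC before using it as a destination
    total = 30              # stores 30 into the existing Integer
    bonus = 12
    total = total + bonus   # add writes into a PMC that already exists

    $S0 = typeof total      # type name as a string

    print total
    print "\n"
    print $S0
    print "\n"
    end
.end
```

Dropping the new line leaves total without a valid destination, which is exactly the crash the safety tip warns about.
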
Basic string ops
• Concatenation
    $S4 = concat $S5, $S6
• Repetition
    $S5 = repeat " ", 10
• Finding substring
    offset = index source, substring[, start]
• String length
    strlen = length $S103

String length is tricky
• Bytes, code points, and characters
• bytelength, codepointlength, graphemelength, respectively
• length is shortcut for graphemelength
• Part of the encoding upgrade in progress

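The basic string ops combine as follows. This is a sketch using only the ops named on the slides, in the slide dialect; the register numbers and string contents are arbitrary, and the newer bytelength and codepointlength ops are left out because the slides describe them as still in progress.

```pir
.sub main @MAIN
    $S5 = "Parrot"
    $S6 = repeat "!", 3          # "!!!"
    $S4 = concat $S5, $S6        # "Parrot!!!"

    $I0 = length $S4             # length in characters
    $I1 = index $S4, "rot"       # offset of the first "rot"
    $I2 = index $S4, "r", 3      # search for "r" starting at offset 3

    print $S4
    print "\n"
    print $I0
    print "\n"
    print $I1
    print "\n"
    print $I2
    print "\n"
    end
.end
```
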
Calling Subroutines
    foo()
    foo(1, 2, $S4, $P5)
    result = foo(1, 2, 3)
    (result1, result2) = foo(1, 2, 3)
• Can call a PMC representing a sub, or a sub in the current namespace
• No typechecking!

Returning values
    .pcc_begin_return
    .result xxx
    .pcc_end_return
• No type checking. Get it right or find bizarre bugs later

Moving around
• Labels end with a colon
    exit_spot:
• Labels are sub local
• branch goes to a label
    branch exit_spot
• jump goes to a label too
• branch is relative, jump absolute
• Use branch

Branch basics
• Branches are relative
• 2G offset plus or minus
• We’ve not found this to be a limitation in practice

Tiny subs
• bsr branches to a label too
• Pushes return address on stack
• ret will return from a bsr
• Much lighter-weight than a sub call
• No calling conventions
• Must stay within a sub

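A tiny sub is just a label reached with bsr and left with ret. The sketch below assumes the slide dialect; the labels double_it and finished are invented for the example, and the branch around the tiny sub's body keeps the main flow from falling into it.

```pir
.sub main @MAIN
    $I0 = 3
    bsr double_it          # pushes the return address, branches to the label
    bsr double_it          # call it again; no calling conventions involved
    print $I0
    print "\n"
    branch finished        # skip over the tiny sub's body

double_it:
    $I0 = $I0 * 2          # works directly on the caller's register
    ret                    # returns to the op after the matching bsr

finished:
    end
.end
```

Because there are no calling conventions, the tiny sub and its caller share registers, so $I0 is doubled twice before it is printed.
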
Conditional Branch: if
    if thing, label
• Tests thing for truth, branches if true
• Numbers: 0 is false
• Strings: the empty string and the string "0" are false
• PMCs: We ask them

Conditional Branch: unless
    unless thing, label
• Branch if false

Comparison branches
• eq, ne, lt, gt, ge, le
• Same type, or low type and PMC
• Each has an _str and _num variant to force string or numeric comparisons

Existence branches
    isnull thing, dest
• Used for strings and PMCs
• Branches if the register has a NULL pointer or a Null PMC in it

PIR shortcuts
    if thing1 op thing2 goto label
• Op is <, >, <=, >=, !=, ==
• No then in there

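The branch forms are enough to build a loop and a truth test. This is a sketch in the slide dialect; the label and variable names are hypothetical.

```pir
.sub main @MAIN
    .local int i
    .local string flag
    i = 0

loop_top:
    if i >= 3 goto loop_done   # PIR shortcut for: ge i, 3, loop_done
    print i
    print "\n"
    inc i
    branch loop_top

loop_done:
    flag = "0"
    unless flag, was_false     # the string "0" tests as false
    print "flag is true\n"
    branch finished

was_false:
    print "flag is false\n"

finished:
    end
.end
```
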
Assignment and value setting
• Ints and Floats are value types
• PMCs and Strings are reference types
• Simple assignment with set
    set I5, I6
    set S4, S6
• Copies contents of register
• Aliases PMCs and Strings

Assignment and value setting
• Assign copies the data
    assign $P4, $P5
• Calls destination’s set function
• Works for strings and PMCs

Cross-type assignment
• Source and destination different basic types
• Performs automatic conversion
• PIR = does assignment
    $S5 = $I5
    $N5 = $P5

Safety tip
• PIR = operator is set
    $P5 = $P6
• Same as
    set $P5, $P6
• Not assignment!

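The difference between set and assign shows up as soon as the source PMC changes. A minimal sketch, assuming the slide dialect and the bare Integer type constant; register numbers are arbitrary.

```pir
.sub main @MAIN
    $P0 = new Integer
    $P0 = 1                # store a value in the Integer

    $P1 = $P0              # set: $P1 now refers to the same PMC as $P0

    $P2 = new Integer      # assign needs an existing destination
    assign $P2, $P0        # copies the value into $P2's own PMC

    inc $P0                # change the original PMC

    print $P1              # the alias sees the change
    print "\n"
    print $P2              # the copy does not
    print "\n"
    end
.end
```
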
Input and output
• Basic filehandle-based IO system
• Async & event driven under the hood
• (When we get that finished)

Output
• Simple output with print
• Prints any type, as well as constants
• Integers and floats get stringified
• PMCs get their __get_string methods called
• To stdout by default
• Or provide a filehandle

Output
    print "Hello, world\n"
    printerr "Hello, error world\n"
    print some_filehandle, "Hello, filehandle\n"

Input
• Block-oriented reads
    $S1 = read 10
    $S2 = read $P4, 10
• Line-oriented reads
    $S2 = readline $P4
• Line reads stop at newline or 64k, whichever comes first

Standard filehandles
• getstdin, getstdout, getstderr fetch filehandles
    $P5 = getstdin
    $S5 = readline $P5

Opening files
• Open with open
    $P5 = open "filename", file_mode
• Returns an undef on error
• File modes are perl-like:
    >, <, >>, +<, +>
• fopen r, w, a, r+, and w+ modes
• Only files right now

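Opening a file and reading it line by line might look like the sketch below. Several things here are assumptions and not taken from the slides: the file name input.txt is hypothetical, the close op is not shown in this deck, and the loop assumes readline hands back an empty string once nothing is left. It also skips the check for the undef that open returns on error.

```pir
.sub main @MAIN
    .local pmc fh
    .local string line
    .local int len

    fh = open "input.txt", "<"     # hypothetical file, perl-like read mode

next_line:
    line = readline fh
    len = length line
    if len == 0 goto finished      # assumed: an empty read means no more input
    print line
    branch next_line

finished:
    close fh                       # assumed op, not shown on the slides
    end
.end
```
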
File info
• stat and fstat ops return stat info on files
• stat uses filenames, fstat file numbers
    $I5 = stat 'foo.txt', STAT_FILESIZE
    $I6 = fstat 2, STAT_FILESIZE
• Constants defined in runtime/parrot/include/stat.pasm
• Both portable and non-portable data there

Global variables
• Parrot has a hierarchic, unified global variable store
• Each sub has a default namespace
• Set with .namespace directive
    .namespace ['Foo'; 'Bar']
• Global store only holds PMCs

Fetching globals
• Fetch with find_global
    $P6 = find_global 'globalname'
    $P6 = find_global ['foo'; 'bar'], 'globalname'
• Fetches a pointer!
• Changes made to the PMC show up in the global store
• Throws exception on failure

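"Fetches a pointer" means a change made through the fetched PMC is visible to the next fetch. The sketch below assumes a store_global op taking a name and a PMC, which these slides do not show, and uses a made-up global called counter; the rest follows the slide dialect.

```pir
.sub main @MAIN
    $P0 = new Integer
    $P0 = 1
    store_global "counter", $P0    # assumed op; not covered on these slides

    $P1 = find_global "counter"    # a pointer to the stored PMC, not a copy
    inc $P1                        # changes the PMC held in the global store

    $P2 = find_global "counter"    # a second fetch sees the change
    print $P2
    print "\n"
    end
.end
```
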