An Introduction to Parrot Dan Sugalski dan@sidhe.org January 28, 2004
Overview What’s it all about
Purpose • Optimized for Dynamic Languages • Perl 5, Python, Ruby specifically • Run really, really fast • Or at least as fast as reasonable under the circumstances • Easily extendable • Easily embeddable • Play Zork
History How we got where we are
OSCON 2000 • Infamous mug pitching incident • Perl 6 started • Language and software developed separately
Perl 6 -- not too much bigger • That hasn’t lasted • Allison’s talking about that one • The start was smallish, though • Fix the annoyances • Amazing how many things turned out to be annoying
Big language umbrella • Not much semantic difference between Perl 5, Python, and Ruby • Perl 6 was obviously going to borg them and a bit more • Even ML and Haskell haven’t been safe • More concepts have gone in as time has progressed
Parrot went for them all • Yeah, we were getting bored • Had to do something • We liked Ruby and even Python • We hated having multiple interpreters around
Parrot and the Parrot Prank • April Fools' joke, 2001 • Perpetrated by Simon Cozens • Parrot -- New language • Perl & Python amalgam • Pretty funny as these things go
Timeline • The project came first • Then, the Parrot Joke • We grabbed the name
Non-Purpose • Don’t care about non-dynamic languages • Not much, at least • Other people can worry • Engineering tradeoffs favor dynamic languages
True language neutrality is impossible • Vicious sham • All engines have a bias • Even the hardware ones • Processors these days really like C
Architecture How it’s supposed to look
Buzzwords • Register based, object-oriented, language agnostic, threaded, event-driven, async I/O capable virtual machine • No, really
Software goals • Fast • Safe • Extendable • Embeddable • Maintainable
Administrative goals • Resource efficient • Controllable • Not suck when used as an Apache module • Cautious about whole-system impact
Driving assumptions • C function calls are inexpensive • L1 & L2 caches are large • Memory bandwidth is limited • CPU pipeline flushes are expensive • Interpreter must be fast • JIT a bonus, not a given
Interpreter Core in Pictures • [Diagram labels: Interpreter Core, Integer registers, Float registers, String registers, PMC registers, Lexicals, Globals, User Stack, Control Stack, four Frame Stacks]
Parser • Source goes in, AST comes out • Built in part on Perl 6 rules engine • Pluggable parser architecture
Compile and optimize (IMCC) • Turns the output of the parser into executable code • Optional optimizing step • Register coloring algorithms provided here
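The register coloring step can be illustrated with a greedy graph-coloring sketch. The slides do not say which algorithm IMCC uses, so this is only one plausible approach; the function name `color_registers`, the interference-graph input format, and the spill list are assumptions made for the example.

```python
def color_registers(interference, num_registers=32):
    """Greedily map virtual registers onto a fixed set of VM registers.

    `interference` maps each virtual register name to the set of names that
    are live at the same time and therefore cannot share a register.
    Returns (assignment, spilled). Hypothetical helper, not IMCC's API.
    """
    assignment = {}
    spilled = []
    # Handle the most-constrained names first; break ties by name so the
    # result is deterministic.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        if free:
            assignment[var] = free[0]
        else:
            spilled.append(var)  # no register left; would live in memory
    return assignment, spilled


# "a" overlaps with both "b" and "c", but "b" and "c" never overlap,
# so two registers are enough.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
assignment, spilled = color_registers(graph, num_registers=2)
```

Names that interfere always end up in different registers; names that never overlap are free to share one.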
Execution • Interpreter • JIT • C code • Native executables
Base Engine • Bytecode driven • Platform-neutral bytecode • Register-based system • Stacks • Continuation-passing style
Bytecode • Directly executable • Resembles native executable format • Code • Constants • Metadata • No BSS, though
Designed for efficiency • Directly executable • mmap()ped in • Only complex constants (strings, PMCs) need fixup • Converts on size and/or endian mismatch
Platform Neutrality • If native format, used directly • Otherwise endian-swapped • Off-line utility to convert • Only difference is speed hit on startup
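A minimal sketch of the "use directly if native, otherwise swap" idea, using Python's `struct` module. The 32-bit word size, the function names `load_words` and `convert_offline`, and the way the file's byte order is passed in are all assumptions for illustration; the real bytecode header layout is not shown in the slides.

```python
import struct
import sys


def load_words(raw, file_byteorder):
    """Return (words, swapped) for a buffer of 32-bit bytecode words.

    `file_byteorder` is "little" or "big" (assumed to come from a header).
    """
    if file_byteorder == sys.byteorder:
        # Native format: view the buffer in place, no conversion pass.
        return memoryview(raw).cast("i"), False
    # Foreign format: pay a one-time byte-swap cost at startup.
    prefix = "<" if file_byteorder == "little" else ">"
    count = len(raw) // 4
    return list(struct.unpack(f"{prefix}{count}i", raw)), True


def convert_offline(raw, file_byteorder):
    """Rewrite a buffer into native byte order so later loads skip the swap."""
    words, _ = load_words(raw, file_byteorder)
    return struct.pack(f"={len(words)}i", *words)


big_endian_code = struct.pack(">3i", 1, 2, -3)
words, swapped = load_words(big_endian_code, "big")
native_code = convert_offline(big_endian_code, "big")
```

Either way the interpreter sees the same word values; only the startup cost differs.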
Registers • All operations revolve around VM registers • Essentially CPU registers • Four types • Integer • Float • String • PMC • 32 of each
Registers • Parrot’s one RISC concession • Non-load/store must operate on registers or constants • JIT maps VM registers to platform registers if there are some • Otherwise pure (and absolute) memory addressing to VM registers
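The register model from the two slides above can be sketched as a toy interpreter: four typed register files of 32 slots each, with every operation reading registers or constants and writing a register. The opcode names (`set`, `add`, `concat`) and the tuple-based program format are illustrative assumptions, not Parrot's actual assembly syntax.

```python
class Interpreter:
    NUM_REGISTERS = 32

    def __init__(self):
        n = self.NUM_REGISTERS
        # One register file per type: Integer, float (N), String, PMC.
        self.regs = {"I": [0] * n, "N": [0.0] * n, "S": [""] * n, "P": [None] * n}

    def read(self, operand):
        # An operand is either a register reference like ("I", 3) or a constant.
        if isinstance(operand, tuple):
            kind, index = operand
            return self.regs[kind][index]
        return operand

    def write(self, register, value):
        kind, index = register
        self.regs[kind][index] = value

    def run(self, program):
        for op, dest, *sources in program:
            values = [self.read(s) for s in sources]
            if op == "set":
                self.write(dest, values[0])
            elif op == "add":
                self.write(dest, values[0] + values[1])
            elif op == "concat":
                self.write(dest, values[0] + values[1])
            else:
                raise ValueError(f"unknown op {op!r}")


vm = Interpreter()
vm.run([
    ("set", ("I", 0), 2),
    ("set", ("I", 1), 3),
    ("add", ("I", 2), ("I", 0), ("I", 1)),   # register + register
    ("add", ("N", 0), ("I", 2), 0.5),         # register + constant
    ("concat", ("S", 0), "par", "rot"),       # constant + constant
])
```

There is no operand stack in sight: every instruction names its inputs and its destination explicitly.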
Stacks • Six stacks • One general purpose typed stack • Four register backing stacks • Push/pop half register frames in one go • Faster than push/pop of frames to general stack • One control stack
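A rough sketch of a register backing stack that saves half a register frame (16 of the 32 registers) in a single operation. The class name `RegisterFile` and the method names `push_half` and `pop_half` are hypothetical; the slides only state that half frames move in one go.

```python
class RegisterFile:
    SIZE = 32
    HALF = 16

    def __init__(self, default):
        self.regs = [default] * self.SIZE
        self.backing = []  # the backing stack for this register type

    def push_half(self, upper=False):
        """Save registers 0-15 (or 16-31) as one block."""
        start = self.HALF if upper else 0
        self.backing.append((start, self.regs[start:start + self.HALF]))

    def pop_half(self):
        """Restore the most recently saved half frame."""
        start, saved = self.backing.pop()
        self.regs[start:start + self.HALF] = saved


ints = RegisterFile(0)
ints.regs[0] = 7
ints.push_half()        # save I0-I15 before a call
ints.regs[0] = 99       # callee clobbers a low register
ints.regs[20] = 5       # upper half was not saved, so this change persists
ints.pop_half()         # restore I0-I15 after the call
```

One slice copy replaces sixteen individual pushes, which is the point of giving each register type its own backing stack.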
Stacks • Bit of a misnomer • Really tree of stack frames • Confusing, though
Continuation Passing Style • Used for calling conventions • Parrot makes heavy use of continuations • If you don’t know they’re there you’ll not care • All Ruby’s fault, really • Hidden from HLL code
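Continuation-passing style is easier to see in a small example. In the sketch below each function receives an extra argument, the continuation, and "returns" by invoking it. Python closures stand in for continuation objects here; the function names are made up for illustration, and this says nothing about how Parrot lays out its actual calling conventions.

```python
def add_cps(a, b, ret):
    # Instead of returning to the caller, hand the result to the continuation.
    return ret(a + b)


def square_cps(x, ret):
    return ret(x * x)


def sum_of_squares_cps(a, b, ret):
    # Each step names "what happens next" explicitly as a closure.
    return square_cps(
        a,
        lambda a2: square_cps(
            b,
            lambda b2: add_cps(a2, b2, ret),
        ),
    )


results = []
sum_of_squares_cps(3, 4, results.append)  # the final continuation stores the value
```

Code written in an ordinary high-level language never sees these extra arguments, which matches the "hidden from HLL code" bullet.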
Parrot’s data Where the magic lives
Data isn’t passive • Lots of functionality hidden in data • Partly OO • Or as OO as you get in C
Strings • Language neutral • Encapsulate language behavior, encoding, and character set • Annoyingly complex
Basic String Diagram • [Diagram labels: Buffer Info, Encoding, Charset, Language, Flags]
Encoding • Represents how the bits are turned into ‘characters’ • Code points, really • Even for non-Unicode encodings • Handles transformations from/to storage
Character Set • Which characters the code points represent • Basic character manipulation happens here • Case mangling, substrings • Transformations to other character sets
Language • Nuances of sorting and case mangling • Interpretation of most Asian text when using Unicode • Ignorable if you don’t care
Unicode • Parrot does Unicode • Used as pivot encoding/charset • IBM’s ICU library • Didn’t want to write another badly done Unicode library
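The string layout and the Unicode pivot can be sketched together. The record below carries the pieces named on the string slides (buffer, encoding, charset, language, flags), and `transcode` converts between encodings by going through Unicode. The names `VMString` and `transcode` are hypothetical, and Python's codecs stand in for ICU; Python also folds encoding and charset into one codec name, so the `charset` field is simply carried along here.

```python
from dataclasses import dataclass


@dataclass
class VMString:
    buffer: bytes        # raw storage
    encoding: str        # how bytes become code points
    charset: str         # which characters the code points mean
    language: str = ""   # optional sorting/casing nuances
    flags: int = 0


def transcode(s, target_encoding):
    """Convert via Unicode as the pivot, but only when actually needed."""
    if s.encoding == target_encoding:
        return s  # no conversion, no copy
    text = s.buffer.decode(s.encoding)  # source encoding -> Unicode pivot
    return VMString(
        text.encode(target_encoding),   # Unicode pivot -> target encoding
        target_encoding,
        s.charset,
        s.language,
        s.flags,
    )


latin = VMString("caf\u00e9".encode("latin-1"), "latin-1", "iso-8859-1")
utf8 = transcode(latin, "utf-8")
same = transcode(utf8, "utf-8")
```

With a pivot, each encoding needs only a converter to and from Unicode rather than one for every other encoding.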
Efficiency concerns • Multiple encodings/charsets means less conversion • Transform data only when needed • Strings are mutable • COW system for space/speed efficiency
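A minimal copy-on-write sketch for mutable strings: copies share one buffer until somebody writes. The class name `CowString` and its shared flag are assumptions for illustration; the slides do not describe how Parrot tracks sharing.

```python
class CowString:
    def __init__(self, text=""):
        self._buf = bytearray(text.encode("utf-8"))
        self._shared = False

    def copy(self):
        """Cheap copy: both strings point at the same buffer."""
        clone = CowString.__new__(CowString)
        clone._buf = self._buf
        clone._shared = True
        self._shared = True
        return clone

    def append(self, text):
        if self._shared:
            # First write after sharing: take a private copy of the buffer.
            # (The other holder keeps its flag and may copy once more than
            # strictly necessary; that is safe, just slightly wasteful.)
            self._buf = bytearray(self._buf)
            self._shared = False
        self._buf += text.encode("utf-8")

    def __str__(self):
        return self._buf.decode("utf-8")


a = CowString("par")
b = a.copy()
shared_before_write = a._buf is b._buf
b.append("rot")
```

The copy is free until the first mutation, and the original never observes the change.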
The PMC • Represents a HLL variable • Language agnostic • Everything pivots off PMCs
PMC diagram • [Diagram labels: Vtable, Flags, Cache, Data Pointer, Metadata, GC handle, Synchronization]
The Vtable • How all the functionality is implemented • Almost everything defers to PMCs • Large part of interpreter logic in PMCs • Allows fast operator overloading and tying
Some vtable operations • Addition • Subtraction • Multiplication • Division • Bitwise operations • Loading • Storing • Comparison • Truth • Type conversion • Logical operations • Finalization
Vtable functions may be written in Parrot • How languages implement user operator overloading • Used for Perl-style tying • Usable for operator wrapping
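The vtable idea can be sketched with a table of functions attached to each PMC. The interpreter's add operation just looks up the entry and calls it, so replacing one entry is enough to overload the operator. A plain dict stands in for the C vtable struct, and the names (`PMC`, `op_add`, the saturating type) are invented for this example.

```python
class PMC:
    def __init__(self, vtable, data):
        self.vtable = vtable  # table of operations for this PMC's type
        self.data = data


def int_add(self, other):
    return PMC(self.vtable, self.data + other.data)


def int_truth(self):
    return self.data != 0


INTEGER_VTABLE = {"add": int_add, "truth": int_truth}


def op_add(left, right):
    # The opcode holds no arithmetic of its own; it defers to the PMC.
    return left.vtable["add"](left, right)


# User-level overloading: same entries, except "add" now saturates at 100.
def saturating_add(self, other):
    return PMC(self.vtable, min(self.data + other.data, 100))


SATURATING_VTABLE = dict(INTEGER_VTABLE, add=saturating_add)

plain = op_add(PMC(INTEGER_VTABLE, 60), PMC(INTEGER_VTABLE, 70))
capped = op_add(PMC(SATURATING_VTABLE, 60), PMC(SATURATING_VTABLE, 70))
```

The calling code is identical in both cases, which is what lets overloading and tying stay invisible to the user program.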
PMCs are typed • Types can change • Allows customized behavior • Cuts out some overhead
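One way to picture a PMC changing type: assigning a value of a different kind swaps the vtable while the PMC itself stays in place. This is a sketch under assumptions; `set_value`, the two vtables, and the type lookup table are hypothetical names, not Parrot's interface.

```python
class PMC:
    def __init__(self, vtable, data):
        self.vtable = vtable
        self.data = data


INT_VTABLE = {"name": "Integer", "get_string": lambda p: str(p.data)}
STR_VTABLE = {"name": "String", "get_string": lambda p: p.data}

# Which vtable handles which kind of stored value (illustrative mapping).
VTABLES = {int: INT_VTABLE, str: STR_VTABLE}


def set_value(pmc, value):
    """Store a value, morphing the PMC's type if the value requires it."""
    wanted = VTABLES[type(value)]
    if pmc.vtable is not wanted:
        pmc.vtable = wanted  # type change: only the vtable reference moves
    pmc.data = value


variable = PMC(INT_VTABLE, 42)
type_before = variable.vtable["name"]
set_value(variable, "hello")
type_after = variable.vtable["name"]
```

Anything holding a reference to `variable` sees the new behavior immediately, because the identity of the PMC never changed.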
All PMCs indexable • As array or hash • Operations may be delegated • PMC may be both hash and array • Scalar as well
Multimethod dispatch • Core interpreter functionality • Used for many PMC operations • Beats hand-rolling it • Dispatch surprisingly fast
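Multimethod dispatch picks an implementation from the types of both operands rather than just the first. The sketch below uses a lookup table keyed by operation and type pair, falling back along each operand's class hierarchy. The names `register` and `dispatch` and the left-operand-first search order are assumptions; the slides do not describe Parrot's actual resolution rules.

```python
_table = {}


def register(op, left_type, right_type):
    """Decorator that files an implementation under (op, left, right)."""
    def decorator(fn):
        _table[(op, left_type, right_type)] = fn
        return fn
    return decorator


def dispatch(op, left, right):
    # Try the most specific types first, then walk up both hierarchies.
    for lt in type(left).__mro__:
        for rt in type(right).__mro__:
            fn = _table.get((op, lt, rt))
            if fn is not None:
                return fn(left, right)
    raise TypeError(f"no {op!r} for {type(left).__name__}, {type(right).__name__}")


@register("add", int, int)
def add_ints(a, b):
    return a + b


@register("add", str, object)
def add_str_anything(a, b):
    return a + str(b)


total = dispatch("add", 1, 2)
label = dispatch("add", "n=", 5)
```

Adding support for a new type pair means registering one more function; no existing implementation has to grow another branch.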
Magic all hidden • User code never knows about magic • Allows transparent behaviour changes • One big pivot point for dispatch
Objects • Standard but optional object system • Standard object protocols • Standard object opcodes
Everything can be an object • Objects have attributes • Objects can have methods called on them • All PMCs have get/set attribute vtable entries • All PMCs have a method call entry • Therefore, all PMCs are objects
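The argument on this slide can be sketched directly: if every PMC exposes attribute get/set entries and a method-call entry, then any PMC can be treated as an object. The entry names (`get_attr`, `set_attr`, `call_method`) and the `Counter` type are illustrative assumptions, not the real vtable slot names.

```python
class PMC:
    """Base PMC: carries the three entries that make anything an object."""

    methods = {}  # per-type method table; subtypes supply their own

    def __init__(self):
        self._attributes = {}

    def get_attr(self, name):
        return self._attributes[name]

    def set_attr(self, name, value):
        self._attributes[name] = value

    def call_method(self, name, *args):
        method = type(self).methods.get(name)
        if method is None:
            raise AttributeError(f"no method {name!r}")
        return method(self, *args)


def _increment(self, by=1):
    self.set_attr("count", self.get_attr("count") + by)


class Counter(PMC):
    methods = {"increment": _increment}


counter = Counter()
counter.set_attr("count", 0)
counter.call_method("increment")
counter.call_method("increment", 5)
```

The caller only ever uses the three generic entries, so it does not need to know which language defined the object.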