Mango A General Purpose Programming Language
My Background • Early experience on Apple II • University of Illinois – Champaign – Urbana. • Bachelor's degree in computer engineering • Three years at Neoglyphics: software • Three years at Alpha: hardware
How Mango Got Started • Frustrated with C/C++/Java • There had to be a better way • Foolishly began designing my own language • Foolishness can be a virtue • It’s been a seven-year voyage
The Problem • Common computing infrastructure is written in C/C++ • C/C++ is inadequate • Lack of higher level abstractions • Programmers must use low level constructs • Results of C/C++ use • Unsafe/unstable software • Slower development times • Higher development costs
A lack of alternatives • Java, Python, Perl, Pascal, Ada, Modula, C# • Not viable replacements • Lack C’s virtues in performance and flexibility • Dependent on C for core tasks • Lack of widespread appeal (clumsy, e.g. Ada?) • Not sufficiently different to switch
The Solution: Core Goals • Provide higher level abstractions • Avoid low level constructs when not needed • Make programming easier, more enjoyable • Retain performance and flexibility • Allow unrestricted operations as necessary • Avoid overhead • Match machine execution model • Overall: make a better experience for programmers
Overview • High level design goals and decisions • Feature walk through • Future directions
Design goals • Syntax • Static Typing vs. Dynamic Typing • How Vs. What • Large Languages Vs. Small Languages • Object Orientation: Yes or No
Goal #1: A Good Syntax • Syntax is key • It’s underrated (focus on semantics) • Makes the language easier to learn • Makes it accessible to non-programmers • Makes the language self-documenting • Marketing versus engineering • Bad marketing of a good product will fail • Bad syntax around good semantics will have a harder time gaining acceptance
Static versus Dynamic Typing • Dynamic languages are very popular • Due to poor implementations of static languages • Advantages of static typing • Critical for performance • Types act as documentation • They catch many errors at compile time • Types allow overloading of names
How versus What • CS fantasy to forget how and focus on what • How is a hard problem • Distinguish features that are theoretically equivalent but practically different • Everything can be a list, but it’ll be slow • Rich set of primitive and aggregate types • Side effects to use memory more effectively • Reduce copying • Manual memory management • GC is not possible for some applications • Done right beats a garbage collector
Large Vs. Small Languages • Small language • Secondary features are in a standard library • Advantages: easier to learn core features, make a compiler • Large language • First class treatment of secondary features • Allows specialized operations, makes programs more readable • Advantage: a smoother user experience • Ease of learning is dependent on more than language size • You still have to learn library APIs • Intangible quality: how the language corresponds to human cognition • Open source makes compilers easier to write • Open front end acts as the spec
Object-Orientation: Yes or No? • Stepanov’s criticism • programming = data structures + algorithms • multi-sorted algebras • Object orientation has shown itself useful in certain circumstances • Offer OO as an option • Leave inheritance behind • Inheritance hierarchies are difficult to follow • Fragile base class problem requires reanalysis of class behavior
Mango walk through • Influences • Syntax plus some basic examples • Module system + incremental compilation (Include requires clause, parameter and options) (platform and foundation files) • Naming and overloading • Literals (include string format) • Primitive Types • Memory model (include pointer/reference syntax) • Records and Abstracts • Procedures, functions, constants (pure functions?) • Statements: Control flow • Statements: I/O • Abstract Data Type (objects) • Iterators and iteration operators • Mutexes • Exception Handling • Strings, arrays, buffers • Collection Types • Global variables/External symbols/Module Constructors • Calling convention part of the type system • Packet/Device types • Development aids • Genericity
Influences • SETL - set operations • ALGOL - imperative block structure and syntax • C - low level ops, low overhead • ML - type inference, type syntax • ADA - fine grained control over primitives • PYTHON - indentation based syntax • JAVA - interfaces • C++ - STL, operator overloading, IO syntax • CLU - iterators • PASCAL - sets (bit masks) • PERL - richness of expressibility • COBOL - readable syntax • SIMULA - objects
Mango’s Syntax • Mango looks like pseudo code • Indentation based syntax • Reduces clutter and typing • Allows more code to be shown on the screen • Blocks can be enclosed with begin/end delimiters if desired • All directives, definitions, statements begin with a keyword • Simple style that is easy to remember • User’s symbolic names cannot conflict with reserved words • This makes the language easy to extend • There are no exclusively reserved keywords • Legacy of lex?
Modules • Module is a set of symbols • Each file is a module • Module name must correspond to pathname • A module’s symbols can be public or private • Public: symbols are visible to other modules • Private: symbols are invisible to other modules • Modules may import and include other modules • Import: foreign symbols are localized (private) • Include: foreign symbols are exported (public)
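The difference between import and include can be modelled in a few lines. This is a Python sketch rather than Mango code; the `Module` class, the `import_` and `include` method names, and the module and symbol names used with it are all hypothetical.

```python
class Module:
    """Toy model of a module as two sets of symbol names."""

    def __init__(self, name, public=(), private=()):
        self.name = name
        self.public = set(public)    # visible to other modules
        self.private = set(private)  # invisible to other modules

    def import_(self, other):
        # Import: foreign public symbols become private here (localized).
        self.private |= other.public

    def include(self, other):
        # Include: foreign public symbols become public here (re-exported).
        self.public |= other.public


core = Module("core", public={"alloc"})
util = Module("util", public={"trim"})
util.include(core)   # util now re-exports alloc
app = Module("app")
app.import_(util)    # app can use trim and alloc, but exports neither
```

In this model a module that imports `app` would see nothing from `util` or `core`, while a module that imports `util` would see both `trim` and `alloc`.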
Incremental Compilation • Mango supports incremental compilation • Module has changed • A dependency has changed • Major or minor revision? • Compares syntax trees • Major revision: public change • Minor revision: private change • Comparing syntax • Advantage: Eases implementation • Disadvantage: Set of major revisions is obviously larger
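The major/minor distinction can be sketched as a comparison of public and private syntax. The Python below is only an illustration: `Symbol`, `classify_revision`, and the use of a plain string in place of a syntax tree are assumptions, and the comments about which modules get recompiled are one reading of the slide rather than a description of the Mango compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    public: bool
    tree: str  # stand-in for the symbol's syntax tree


def classify_revision(old, new):
    """Return 'none', 'minor', or 'major' for two versions of a module."""
    old_public = {s.name: s.tree for s in old if s.public}
    new_public = {s.name: s.tree for s in new if s.public}
    if old_public != new_public:
        return "major"  # public syntax changed (assumed: dependents rebuild)
    if set(old) != set(new):
        return "minor"  # only private syntax changed
    return "none"


v1 = [Symbol("open", True, "proc(path)"), Symbol("helper", False, "x + 1")]
v2 = [Symbol("open", True, "proc(path)"), Symbol("helper", False, "x + 2")]
v3 = [Symbol("open", True, "proc(path, mode)"), Symbol("helper", False, "x + 2")]
```

Because the comparison is purely syntactic, any edit to a public symbol counts as major even when its meaning is unchanged, which is the larger set of major revisions the slide mentions.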
Naming • Naming is not hierarchical, but geometric • Symbol names exist within a four-d grid • Namespace, keyword, extension, auxiliary • Only namespace and keyword are mandatory • Each module is part of a namespace • Public symbols use declared namespace • Private symbols use special namespace “local” • Format: • namespace::keyword@extension$auxiliary
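The name format can be illustrated with a small parser. This is a Python sketch, not part of Mango: `SymbolName` and `parse_name` are hypothetical, the assumption that each part is a run of word characters is mine, and the example names are invented.

```python
import re
from typing import NamedTuple, Optional


class SymbolName(NamedTuple):
    namespace: str
    keyword: str
    extension: Optional[str]  # optional coordinate
    auxiliary: Optional[str]  # optional coordinate


# namespace::keyword@extension$auxiliary, with the last two parts optional
_NAME = re.compile(
    r"^(?P<namespace>\w+)::(?P<keyword>\w+)"
    r"(?:@(?P<extension>\w+))?"
    r"(?:\$(?P<auxiliary>\w+))?$"
)


def parse_name(text):
    match = _NAME.match(text)
    if match is None:
        raise ValueError(f"not a full symbol name: {text!r}")
    return SymbolName(**match.groupdict())


full = parse_name("io::open@file$fast")
short = parse_name("local::count")
```

Here `short` has `None` for both optional coordinates, matching the rule that only namespace and keyword are mandatory.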
Shallow Overloading • Overloading occurs for every imported or included module • 4d namespace is collapsed into 1d namespace • Utilizing keyword or extension • Other partial namespaces as well • Symbol can be accessed using proper name or alias • Ensures all overloaded symbols have a unique name • As a result, all overloading is superficial or shallow • Operator overloading is also supported
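One way to picture the collapse from four dimensions to one is an alias table in which several proper names may share a short alias. The Python below is a guess at the idea, not Mango's resolution algorithm; `build_aliases` and the example symbol names are hypothetical, and how the compiler picks among candidates for a shared alias is not covered.

```python
from collections import defaultdict


def build_aliases(proper_names):
    """Map each proper name, keyword, and extension to the symbols it may mean."""
    table = defaultdict(list)
    for proper in proper_names:
        _, rest = proper.split("::", 1)
        keyword = rest.split("@", 1)[0].split("$", 1)[0]
        extension = rest.split("@", 1)[1].split("$", 1)[0] if "@" in rest else None
        table[proper].append(proper)   # the proper name is always unique
        table[keyword].append(proper)  # alias by keyword
        if extension:
            table[extension].append(proper)  # alias by extension
    return dict(table)


aliases = build_aliases(["io::write@file", "net::write@socket"])
```

The alias `write` is overloaded across two symbols, yet each symbol remains reachable through its own unique proper name, which is what keeps the overloading shallow.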
Mango’s Memory Model • Value semantics • Put stuff on the stack, particularly primitives • Key for performance • Offers more flexibility • Three types of memory • Static: • Compiled data, global variables • Heap items that are never deleted • Arbitrary: • Heap items that are eventually deleted • Local: • Items that live on the stack
Safe manual memory management • Static datums • Always safe to use • Arbitrary datums • Need to be guarded to avoid dangling pointer references • Local datums • Compiler must enforce restrictions to avoid dangling pointer references
Sentries • Pointer guards are called sentries • Pointers to arbitrary datums are fat • One address to the datum on the heap • One address to the datum’s sentry • Sentries live on the heap too • Have a static lifetime (i.e. never deallocated) • They are very small: about 5 bytes
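A fat pointer guarded by a sentry can be modelled as follows. This Python sketch assumes the sentry holds nothing but a liveness flag; the slides do not say what a sentry actually contains, and the names `Sentry`, `FatPointer`, `allocate`, and `delete` are hypothetical.

```python
class DanglingPointerError(Exception):
    pass


class Sentry:
    """Guard that outlives its datum; here it only records liveness."""

    def __init__(self):
        self.alive = True


class FatPointer:
    """Pairs a datum with its sentry, like the two addresses in a fat pointer."""

    def __init__(self, datum, sentry):
        self._datum = datum
        self._sentry = sentry

    def copy(self):
        return FatPointer(self._datum, self._sentry)

    def deref(self):
        # Every access consults the sentry before touching the datum.
        if not self._sentry.alive:
            raise DanglingPointerError("datum was deleted")
        return self._datum


def allocate(value):
    return FatPointer(value, Sentry())


def delete(pointer):
    # The datum goes away; the sentry stays behind to warn other pointers.
    pointer._sentry.alive = False
    pointer._datum = None
```

Every copy of a pointer shares the same sentry, so deleting through one copy makes all the others fail safely instead of reading freed memory.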
Sentry Performance • When # sentries is small • Good performance on modern hardware • Sentries stay in the cache • Half of the processor’s time is spent waiting on memory • As # sentries increases • cache starts to overflow • We need to reduce the number of sentries
Arenas and Recycling • Method #1: Allocate in pools • A group of datums share a sentry • Allocated arbitrarily, but deallocated at once • Method #2: Recycling • Use static datums instead • When static datums are deleted • Initialized to zero • Stay on the heap until datum of the same type is requested • Incorrect results are possible, catastrophic failures are not • The program cannot break the type system
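Method #2 can be sketched with a per-type free list. The Python below is an illustration under assumptions: `Recycler` and `Point` are hypothetical names, and zeroing is modelled by resetting every field to 0.

```python
from collections import defaultdict


class Point:
    def __init__(self):
        self.x = 0
        self.y = 0


class Recycler:
    """Deleted datums are zeroed and kept until the same type is requested."""

    def __init__(self):
        self._free = defaultdict(list)

    def new(self, cls):
        free = self._free[cls]
        return free.pop() if free else cls()

    def delete(self, obj):
        for field in list(vars(obj)):
            setattr(obj, field, 0)  # initialized to zero, type unchanged
        self._free[type(obj)].append(obj)


heap = Recycler()
first = heap.new(Point)
first.x = 7
stale = first          # a pointer kept past deletion
heap.delete(first)
second = heap.new(Point)
second.x = 3
```

Reading `stale.x` now yields the value written through `second`: an incorrect result, but `stale` still refers to a valid `Point`, so the type system is not broken.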
Literals • Definition • A value which is known at compile time • Types • Immediate values • Numeric primitives and text • Literal expressions • Interpreted during compilation • Parameters • Values used to configure the compiler • Options • User supplied values to customize a build • Literals can be named
Numeric Literals • Six types • Integer, Decimal, Hex, Address, Binary, Signal • Integers and decimals also include complex plane signifiers • Can be anonymous • Anonymous literals are stored as untyped strings • Converted to a real value when type is known • There are no constraints on range • Can be typed • Type is specified with value • Converted immediately to the desired type value • There are no constraints on range
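Deferring conversion until the type is known can be shown in a few lines. This is a Python analogy, not Mango's literal machinery: `AnonymousLiteral` and `as_type` are hypothetical, and Python's `int`, `Decimal`, and `Fraction` stand in for Mango's numeric primitives.

```python
from decimal import Decimal
from fractions import Fraction


class AnonymousLiteral:
    """Keeps the literal's text verbatim until a target type is chosen."""

    def __init__(self, text):
        self.text = text  # untyped string: no range or precision is lost

    def as_type(self, target):
        # Conversion happens only once the desired type is known.
        return target(self.text)


big = AnonymousLiteral("123456789012345678901234567890")
tenth = AnonymousLiteral("0.1")
```

The same literal `tenth` becomes an exact `Decimal` or `Fraction` when one is requested and a binary float only when a float is requested, so nothing is rounded before the type is decided.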
Text Literals • Two types • Characters and Strings • Stored as untyped strings until desired type is known • Characters enclosed with the back tick • Text string enclosed with the double quote • Literals can be inserted into characters and text strings • Named literals • Parameters • Options • Character codes • Character aliases
Literal Expressions • Two forms of literal expressions • Normal literals: immediate evaluation • Macro literals: deferred evaluation • Macros are evaluated over arguments • Result value is optionally typed • Expressions can include • Condition construct • Conversion construct • Local aliasing
Parameters and Options • Changes cause recompilation of module • Part of the public interface of the module • Checked when comparing syntax • Only options that are used are included in dependency analysis • Parameters included in dependency analysis • Specified by user • Have a major impact on compilation
Core Types • Primitives • Tuples and Unions • Addresses, Pointers, References • Polymorphic Types • Strings, Arrays and Buffers • Collections • Records and Abstracts
Type Qualifiers • Pliancy: Immutable, Mutable • Reactivity: Volatile, Inert • Duration: Local, Static, Arbitrary • Memory: IO, Virtual, Physical? • Useful for embedded systems with multiple memories • Undecided: means to access hardware registers directly
Primitives • Logical • Bit (2 State), Boolean (3 State), Signal (4 state) • Ordinal • Range from 0 to N, where N is user specified • Character • ASCII, UTF 8, UTF 16, UTF 32, 8 Bit Data • Register • Binary register, user specified dimensions • Signal • Signal bus, user specified dimensions
Primitives (cont’d) • Cardinal • Unsigned integer, 1/2/4/8/16/32/64 bits • Subrange • Signed range, upper/lower bound, default value • Integer • Signed integer, 8/16/32/64 bits • Rational • Signed rational, fixed or floating denominator • Decimal • Fixed point decimal number • Whole and fractional component • Number • Floating point number, specified size
Primitives (cont’d) • Complex numbers • Primitives qualified with units • Enumerations • Matrices • Coordinates
Primitive Modifiers • Conversion • Automatic, manual, none, universal • Evaluation • None, Fixed, Fluid • Approximation • Round, Truncate, Conserve • Overflow • Check, Limit, Wrap • Byte Order • Big, Little, Host, Network
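The three overflow modifiers can be illustrated for an unsigned value. The slide gives only the names, so the semantics in this Python sketch are assumptions drawn from those names: Check raises an error, Limit saturates at the nearest bound, and Wrap keeps the low bits. `store_cardinal` is a hypothetical helper.

```python
def store_cardinal(value, bits, overflow):
    """Fit `value` into an unsigned field of `bits` bits."""
    top = (1 << bits) - 1
    if 0 <= value <= top:
        return value
    if overflow == "check":
        raise OverflowError(f"{value} does not fit in {bits} bits")
    if overflow == "limit":
        return min(max(value, 0), top)  # saturate at the nearest bound
    if overflow == "wrap":
        return value & top              # modulo 2**bits
    raise ValueError(f"unknown overflow mode: {overflow!r}")


limited = store_cardinal(300, 8, "limit")  # 255
wrapped = store_cardinal(300, 8, "wrap")   # 44
```

Making the mode part of the type, as the modifier list suggests, would let the programmer choose this behaviour once per type rather than at every arithmetic operation.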
Tuples and Unions • Anonymous type products • Fields can be labeled • Unions • Each term of product is overlapped • Unsafe • Elaborate types decompose into tuples
Addresses, Pointers, References • Addresses have bitwise resolution • Address is 2 product tuple • Upper value: byte count • Lower value: bit count • Addresses still optimal • types with byte-wise alignment will drop bit count • Pointers • Address of type where address is significant • References • Address of type where type is significant
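An address with bitwise resolution can be modelled as a pair of counts. This Python sketch assumes 8-bit bytes and a bit count that is always normalized to the range 0 to 7; the `Address` class and its `advance` method are hypothetical.

```python
from typing import NamedTuple


class Address(NamedTuple):
    byte: int  # upper value: byte count
    bit: int   # lower value: bit count, kept in 0..7

    def advance(self, bits):
        """Return the address `bits` bits further on."""
        total = self.byte * 8 + self.bit + bits
        return Address(total // 8, total % 8)


field = Address(2, 6).advance(5)     # crosses a byte boundary
aligned = Address(4, 0).advance(16)  # byte-aligned: bit count stays 0
```

For byte-aligned types the bit count is always zero, which is why it can be dropped without losing information.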
Polymorphic Types • Sums • Superposition of multiple types • Qualified with type tag to ensure safety • Handle • Anonymous pointer (points to anything) • Anything • Stores any type value
Strings, Arrays, Buffers • Arrays • Dimension type can be customized • Slices of arrays preserve range information • Strings • Array of items from 1 to x • Dimension type can be customized • Slices of strings begin at 1 • Buffers • Fixed length string that wraps around itself • Varying start and end positions • Dimension type can be customized
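A buffer that wraps around itself behaves like a ring buffer. The Python below is a minimal sketch of that idea, not Mango's buffer type; the `Buffer` class and its `append` and `pop_front` methods are hypothetical names.

```python
class Buffer:
    """Fixed-length storage whose start and end positions move and wrap."""

    def __init__(self, capacity):
        self._items = [None] * capacity
        self._start = 0  # index of the oldest item
        self._count = 0

    def __len__(self):
        return self._count

    def append(self, item):
        if self._count == len(self._items):
            raise OverflowError("buffer is full")
        end = (self._start + self._count) % len(self._items)  # wraps around
        self._items[end] = item
        self._count += 1

    def pop_front(self):
        if self._count == 0:
            raise IndexError("buffer is empty")
        item = self._items[self._start]
        self._start = (self._start + 1) % len(self._items)
        self._count -= 1
        return item
```

After a `pop_front` frees the first slot, the next `append` to a buffer that has reached its last slot lands back in slot 0, so the storage never has to be shifted or reallocated.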
Collections • Entry • a node of a linked list • Segment • a combination of string and pointer • List • appends new data at the end • elements allocated in pages • Stack • LIFO • Can be prioritized • Sequence • inserts new data • elements allocated in pages • Queue • FIFO • Can be prioritized
Collections (cont’d) • Mask • A bit mask • Range • A numeric range with upper and lower bounds • Set • A hashed set • Doubles as a one-to-one map • Table • A hashed one-to-many mapping • Group • A special collection used for comparisons • Graph • Used for frequency tables
Records • Three visibility states • Public: accessible anywhere • Protected: accessible by modules that declare access • Private: accessible within the module • Two layout types • Fixed: in order of declaration • Packed: ordered to reduce record size • Fields can be qualified to remove them from the build
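The effect of the two layout types on record size can be estimated with a short calculation. This Python sketch assumes each field is aligned to its own size and that packing means placing the largest fields first; the slides do not state Mango's actual alignment or packing rules, and `record_size` and `packed_order` are hypothetical helpers.

```python
def record_size(field_sizes):
    """Size in bytes when each field is aligned to its own size."""
    offset = 0
    for size in field_sizes:
        offset += -offset % size  # padding before the field
        offset += size
    largest = max(field_sizes, default=1)
    return offset + (-offset % largest)  # tail padding


def packed_order(field_sizes):
    """One simple reordering: largest fields first."""
    return sorted(field_sizes, reverse=True)


declared = [1, 8, 1, 4, 2]                        # field sizes in declaration order
fixed_size = record_size(declared)                # 32
packed_size = record_size(packed_order(declared)) # 16
```

Under these assumptions the same five fields take half the space once reordered, which is the motivation for offering a packed layout alongside the fixed one.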
Abstracts • Interface to multiple record types • Records mapped to abstracts using link directive • Gives more flexibility in mapping abstracts • Mappings can occur after declaration • i.e. library types can still be mapped to abstracts • Simplifies remapping of fields • Fields can be qualified to remove them from the build • Medley: combines abstract with a group of records
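Mapping records to an abstract after declaration can be pictured as a registry of field mappings. This Python sketch is an analogy only: the `Abstract` class and its `link` and `get` methods are hypothetical and merely stand in for the link directive, and Python's built-in `complex` plays the role of a library type.

```python
class Abstract:
    """An interface whose fields are mapped onto existing record types."""

    def __init__(self, *fields):
        self.fields = fields
        self._links = {}  # record type -> {abstract field: record field}

    def link(self, record_type, **mapping):
        missing = set(self.fields) - set(mapping)
        if missing:
            raise ValueError(f"unmapped fields: {sorted(missing)}")
        self._links[record_type] = mapping

    def get(self, record, field):
        return getattr(record, self._links[type(record)][field])


class Pixel:
    def __init__(self, col, row):
        self.col = col
        self.row = row


position = Abstract("x", "y")
position.link(Pixel, x="col", y="row")
position.link(complex, x="real", y="imag")  # a type declared elsewhere
```

Neither `Pixel` nor `complex` had to mention `position` when it was declared, which mirrors the point that library types can still be mapped to abstracts.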