1. Stanley B. Lippman slippman@microsoft.com Evolving C++ onto the CLI Environment
Integrating a Static and Dynamic Programming Model
2. How Should We Think about Programming Languages? I used to think about them under the conceit of Aristotelian and Copernican cosmology, with these forming two poles of a continuum …
Aristotelian languages impose a top-down vision independent of the actual machine technology of the day …
CLU, Scheme, Self
Copernican languages emerge from a bottom-up discovery of the actual machine technology of the day …
FORTRAN, C/C++/Java
And you could then cluster the other languages – CLOS, Smalltalk, Ada – along the continuum …
3. How Should We Expect to be Thought of as Programmers?
Under this conceit, one would expect
the Copernican languages to be widely used, but
the Aristotelian languages to be widely admired.
the Copernican languages to come out of industry [labor], but
the Aristotelian languages to come from the academy [thought].
the Copernican programmer to be considered less intelligent and therefore less worthy of respect, and
the Aristotelian programmer to be … well, you get the idea …
Admittedly, this only gets us so far … but I see nothing in the history of programming to invalidate the conceit.
4. Let’s Step Back a Moment
One of the cool things about computer science is that it is a very young field – therefore, we can hold it in our mind’s eye – that is, if we constrain it to the digital era:
Forget Leibniz, although he was a genius and a visionary, and deserved better from Newton and the Royal Society.
Forget Babbage, although he came very, very close, and
Forget Ada Lovelace, although she provides an attractive alternative to the image of the programmer as a soulless nerd.
So, what was the language used by the ENIAC computer, which I am going to start from?
5. There Was No Language Oddly enough, the idea of an independent program – let alone the idea of a programming language – wasn’t part of the original invention of the computer.
Rather, cables were plugged into one configuration for that formulation or reconfigured for this formulation, and so on.
This is where the term tight coupling originated … ?
6. Programs Dwell in a Computational Environment
The modern computing era began without the concept of either a program or a programming language.
Of course, a transcription of a mathematical formula into a format within the computer was necessary, but
it was not thought of as a symbolic program notation
there was no concept of inventing a language …
there was no such entity thought of as a computer programmer …
The immediately intractable problems were hardware –
could the vacuum tubes persist long enough to maintain a computation.
the math was done in base 10 …
7. Von Neumann Hijacked the EDVAC The successor to ENIAC was the EDVAC.
Von Neumann arranged to be taken on as a consultant to the ENIAC project…. He now became an avid supporter of the Moore School’s work, legitimizing it in the eyes of the scientific establishment, and he was helpful in its getting the EDVAC contract.
On June 30, 1945, a 101-page document arrived at the Moore School from von Neumann…. Von Neumann’s paper on EDVAC was replete with references to neurons and other parts of the human nervous system, comparing them to the automatic computer… The machine he described had a stored, programmable memory.
Again, the ENIAC computer had no concept of a stored program, let alone a programming language. It cost $486,804.22. We’re told it
solved problems in 15 seconds that would have required several weeks’ work by a trained man.
solved in 2 hours a problem which would have taken 100 trained men a year to solve manually.
8. Programs and Program Languages A Dialectic with the Computational Environment
The introduction of the program solved logistical bottlenecks of the pre-existing computational environment …
Trade-off of decoupling the processing from the program:
Faster, `automatic’ loading of the program
the need to invent and implement a software abstraction layer … in this case, a loader …
This decoupling has been accelerating …
9. The Evolution of Complex Structure … Software did not begin as software – it was a hard-wired configuration – a dance without a separate dance notation.
This evolved into a reproducible program bit map that could be loaded and flushed from memory. A purely numeric representation.
The first abstraction level, in a sense, was the use of hexadecimal over binary …
The assembler formed a nucleus of a symbolic representation of a program but still at the level of individual instructions that could be grouped by function. A mnemonic representation.
This was controversial and Grace Hopper reports that it was resisted by a portion of the very small number of programmers who felt it was getting too far from the machine.
At each stage, more software complexity is introduced between the program representation and the machine.
10. The Invention of Programming Languages Created a Tradition of Incivility …
FORTRAN, of course, was a proof of concept that we could program at a higher level of abstraction and still generate efficient code …
They eliminated aspects of the design if they proved too difficult to compile … this in itself was proof against its purity of design …
FORTRAN also began the language wars …
The disappointed ALGOL team described it as graffiti written on a bathroom wall
They attributed its success to the 800-lb gorilla that was IBM
These kinds of battles have never ceased – they’ve just changed as the languages have. Why?
11. A Darwinian Way to Think about Programming Languages
Programming languages are a response to a particular computational environment:
facilitates expression within a current environment …
improves on one or a set of existing program solutions …
provides a vocabulary and shared point of view … a community
12. All Languages Become Extinct …
As the computational environment changes, the more specialized the language to the previous computational environment, the less adaptive it proves in the new environment …
But the historical accumulation of structure seems to overwhelm these efforts.
The conditions that give rise to a language lead to its eventual extinction …
There are many more extinct than active programming languages
13. All Languages Compete for Scarce Resources …
Although a language is not an organism, there is a continual struggle for survival among its population …
There is a competition for finite budgetary resources to feed new projects and sustain existing ones …
There is competition to reproduce in the minds of a new generation of programmers.
Language wars are virtually bloody both in tooth and claw
14. All Languages Resist Extinction …
Typically, the language leaders at some point cease resisting and attempt to readapt the language to the changing environment …
this, however, may backfire … emphasizing its current maladaptation to the new environment …
The population of a language constricts when it fails to reproduce in the minds of the new members of the community
Dropping below a certain threshold, it no longer has the critical mass to command finite budgetary resources
It is relegated to unique niche environments – the deserts and swamps of the software development landscape …
15. So, Where Is This Leading Us? The Common Language Infrastructure (CLI) is a major consolidation of thoughts about a software abstraction layer between the program and the Operating System.
This is not in itself new – Smalltalk carried its own environment, and Java targets its own virtual machine.
What is new is its inclusiveness: it supports over 30 languages ... It rightly diminishes the focus on languages …
This is what we look at briefly in the next section.
16. So, Where Is This Leading Us? C++/CLI is an adaptation of ISO-C++ to the dynamic programming object model of the CLI. It follows a tradition of C++ adaptations:
C with Classes (~1979) (ADT)
Object-Oriented Programming (~1984) (OO)
Generic Programming (~1991) (Templates)
Dynamic Programming (~2005) (CLI)
This is what we look at in the last section of this talk
17. The CLI Changes the Computational Environment Therefore, it puts existing languages at risk, and provides an opportunity for new languages to thrive.
18. A Note on Terminology I will be speaking of both the CLI and CLR as kind of counterpoints of one another. Here is their formal relationship …
Common Language Infrastructure (CLI)
This is an ECMA/ISO platform-independent standard. It represents the abstract facilities/architecture.
Common Language Runtime (CLR)
This is the Windows Operating System implementation of the CLI. This is what we mean when we speak of .NET …
19. The CLI/CLR Provides a VES A Virtual Execution System (VES) provides an environment for executing managed code
It provides a software layer between the managed code and the native operating system.
It is responsible for loading and running programs …
It provides the services needed to execute managed code and data …
Garbage collection, for example, is an aspect of the VES, not of a particular language …
20. Architectural Overview
21. The CLI/CLR Provides a CIL Each CLI language is compiled down into a Common Intermediate Language (CIL) based on a stack program model.
All tools ideally work off of the CIL, and are therefore shared across all languages … browsers, debuggers, and so on.
New tools target the CIL, not a language …
Metadata is generated in parallel describing both the program and its environment … this allows automation of much previously manual `plumbing’ …
An extensive object-oriented Base Class Library (BCL) framework is shared across all CLI languages …
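To make the phrase "stack program model" concrete, here is a minimal sketch in ISO C++ of a toy operand-stack evaluator. It is not the real CIL instruction set or encoding: the `Op`, `Instr`, `run`, and `example` names are hypothetical, and the opcodes are only loosely named after CIL-style mnemonics (load constant, add, multiply).

```cpp
#include <cassert>
#include <vector>

// Hypothetical opcodes for illustration; not the actual CIL opcode table.
enum class Op { LoadConst, Add, Mul };

struct Instr {
    Op  op;
    int operand;   // used only by LoadConst
};

// Evaluate a straight-line program on an operand stack and return the
// single value left on the stack.
int run(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (const Instr& in : program) {
        if (in.op == Op::LoadConst) {
            stack.push_back(in.operand);
            continue;
        }
        // Binary operators pop two operands and push one result.
        assert(stack.size() >= 2);
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    assert(stack.size() == 1);
    return stack.back();
}

// (2 + 3) * 4 expressed as a stack program.
int example() {
    return run({ {Op::LoadConst, 2}, {Op::LoadConst, 3}, {Op::Add, 0},
                 {Op::LoadConst, 4}, {Op::Mul, 0} });
}
```

Because every source language lowers to the same kind of instruction stream, a tool written against the instruction stream (such as `run` here) does not need to know which language produced it.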
22. Metadata The Lifeblood of the CLI
23. The CLI Provides a CTS The Common Language Infrastructure (CLI) defines a Common Type System (CTS) over which all CLI languages are built.
A unified type system rooted in an Object base class.
All types and literal values have an underlying class representation
All types are guaranteed to be a kind of Object and share a common set of operations
All types can be converted to an instance of type Object.
All types have an associated Type class that provides runtime reflection support.
24. The CLI Provides a CTS The Common Language Infrastructure (CLI) defines a Common Type System (CTS) over which all CLI languages are built.
Separation of class types based on behavior/design characteristics:
Reference class is polymorphic: supports OO design
Value class is blittable: supports small, efficient independent types
Interface class is abstract: supports defining families of services
25. The CLI Provides a CTS A secondary set of types including numeric types, a delegate type, an event type, an array type, an enum type …
A single class inheritance model with support for multiple interface inheritance
Each CLI language generally exposes these to the programmer as built-in language facilities … this is a first order design aspect of building a CLI language.
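The inheritance rule (one base class, any number of interfaces) can be approximated in ISO C++ by treating abstract classes with only pure virtual members as interfaces. This is a sketch by analogy, not C++/CLI code; `IDrawable`, `IScalable`, and this `Shape`/`Cube` pair are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// "Interfaces": abstract classes that carry no state and no implementation.
struct IDrawable {
    virtual std::string draw() const = 0;
    virtual ~IDrawable() = default;
};
struct IScalable {
    virtual double scale(double factor) = 0;
    virtual ~IScalable() = default;
};

// The single implementation-carrying base class.
class Shape {
public:
    explicit Shape(double size) : size_(size) {}
    virtual ~Shape() = default;
    double size() const { return size_; }
protected:
    double size_;
};

// One base class plus two interfaces.
class Cube : public Shape, public IDrawable, public IScalable {
public:
    using Shape::Shape;
    std::string draw() const override { return "cube"; }
    double scale(double factor) override {
        assert(factor > 0.0);   // a non-positive scale makes no sense here
        size_ *= factor;
        return size_;
    }
};
```

ISO C++ would also allow a second implementation-carrying base class; the CLI model described above does not, which is the constraint the sketch follows by convention only.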
26. The CLI Provides a CLS The Common Language Specification (CLS) defines a set of restrictions on the Common Type System (CTS) to ensure interoperability among CLI languages.
These rules apply to
types that are visible in assemblies other than those in which they are defined.
Members that are accessible outside the assembly.
CLS-compliant code is guaranteed to be both consumable and inheritable by all CLI languages.
The canonical example of a CLS constraint is to prohibit unsigned integral values as part of the public interface …
27. Adapting C++ to this New Environment
28. C++/CLI represents a tuple … C++
The first term, C++, refers of course to the C++ programming language invented by Bjarne Stroustrup at Bell Laboratories.
It supports a static object model that is optimized for the speed and size of its executables.
It does not support run-time modification of the program other than, of course, heap allocation.
It allows unlimited access to the underlying machine, but very little access to the types active in the running program and no real access to the associated infrastructure of that program.
29. C++/CLI represents a tuple … CLI
The third term, CLI, refers to the Common Language Infrastructure, a multi-tiered architecture supporting a dynamic component programming model.
In many ways, this represents a complete reversal of the C++ object model.
A runtime software layer, the virtual execution system, runs between the program and the underlying operating system.
Access to the underlying machine is fairly constrained.
Access to the types active in the executing program and the associated program infrastructure – both as discovery and construction – is supported.
30. C++/CLI represents a tuple … /
The second term, slash (/), represents a binding between C++ and the CLI.
So, a first approximation of an answer as to what is C++/CLI is to say that it is a binding of the static C++ object model with the dynamic component object model of the CLI.
The design of this binding is the focus of the rest of this talk.
31. The Architectural Underpinning The Design of C++/CLI
32. The 3 Elements of a CLI Language There are three aspects in the design of a CLI language that hold across all languages.
A mapping of language level syntax to the underlying Common Type System.
A choice of a level of detail to expose of the underlying CLI infrastructure to the direct manipulation of the programmer.
A choice of what additional functionality to provide over that supported directly by the CLI
Item #1 is largely the same across all CLI languages. Items #2 and #3 are what distinguish one CLI language from another.
I like to think of these three items as representing coordinates positioning each language in a three-dimensional design space supported by the CLI.
33. Mapping to the Common Type System This design aspect is common to all CLI languages – the syntax of course varies. So, for example,
public abstract class Shape {…} // C#
public ref class Shape abstract {…}; // C++/CLI
Shape s = new Cube(); // C#
Shape^ s = gcnew Cube; // C++/CLI
represents the C# and C++/CLI support to define an abstract CLI reference class and allocate a derived instance on the CLI heap.
Our choice of syntax is based on an attempt to closely integrate the CLI class types with that of ISO-C++.
34. The C++/CLI Types
ref class R abstract {};
value class V{};
interface class I{};
ref class R2 : R, I {};
enum class E : short { e1, e2 };
delegate void D( signature );
event D handler;
array< T, dim > a;
35. Level of Detail The second design aspect reflects the level of detail of the underlying CLR implementation model to incorporate into the language.
How does one go about determining this?
What are the kinds of problems the language is likely to be tasked to solve?
What are the kinds of programmers the language is likely to attract and be used by?
Let’s look at an example: the issue of value types occurring on the managed heap.
36. Value Types on the Managed Heap Value types can find themselves on the managed heap in a number of circumstances:
Implicit boxing
we assign an object of a value type to an Object
we invoke a virtual method through a value type that is not overridden
When a value type serves as a member of a reference class type.
When a value type is being stored as the element type of a CLI array.
The design question a CLI language has to ask is,
should we allow the programmer to manipulate the address of a value type of this sort?
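As a rough ISO C++ analogy for implicit boxing, the sketch below models a box as a heap-allocated copy of the value. `BoxedInt`, `box`, and `unbox` are hypothetical names; the real CLI boxing mechanism is performed by the runtime and is not shown in the talk.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a boxed value: a heap-allocated copy of an int.
using BoxedInt = std::shared_ptr<int>;

// "Boxing" copies the value into a fresh heap object.
BoxedInt box(int value) { return std::make_shared<int>(value); }

// "Unboxing" copies the value back out of the box.
int unbox(const BoxedInt& b) {
    assert(b != nullptr);   // an empty handle has nothing to unbox
    return *b;
}

// True when a box is unaffected by a later change to the original value.
bool box_is_independent_copy() {
    int ival = 1024;
    BoxedInt boxed = box(ival);
    ival = 2048;            // modifies the stack value only
    return unbox(boxed) == 1024;
}

// True when boxing the same value twice produces two distinct heap objects.
bool each_boxing_allocates() {
    int ival = 1024;
    BoxedInt first = box(ival);
    BoxedInt second = box(ival);
    return first.get() != second.get();
}
```

The design question on this slide is whether the programmer may reach inside such a box and address the copy directly, rather than only replacing it with a new one.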
37. What Are the Issues? Any object located on the managed heap is subject to relocation during the compaction phase of a sweep of the garbage collector.
Any pointers to that object must be tracked and updated by the runtime; the programmer has no way to manually track it herself.
Therefore, if we were to allow the programmer to take the address of a value type potentially resident on the managed heap, we would need to introduce a tracking form of pointer in addition to the existing native pointer.
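The need for a tracking pointer can be illustrated with a toy compacting heap in ISO C++. This is a simulation under simplifying assumptions, not how the CLR is implemented: `ToyHeap`, its `Handle`, and `kReleased` are hypothetical, and the handle table stands in for the runtime's bookkeeping of every tracked reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kReleased = static_cast<std::size_t>(-1);

// Toy heap: values live in cells_, and location_[h] records where the object
// behind handle h currently lives. owner_[i] names the handle owning cell i.
class ToyHeap {
public:
    using Handle = std::size_t;

    Handle allocate(int value) {
        cells_.push_back(value);
        owner_.push_back(location_.size());
        location_.push_back(cells_.size() - 1);
        return location_.size() - 1;
    }

    void release(Handle h) {
        assert(location_[h] != kReleased);
        owner_[location_[h]] = kReleased;   // mark the cell as garbage
        location_[h] = kReleased;
    }

    // Raw position of an object right now; stale after the next compact().
    std::size_t raw_index(Handle h) const { return location_[h]; }

    // Tracking access: always goes through the up-to-date location table.
    int& deref(Handle h) {
        assert(location_[h] != kReleased);
        return cells_[location_[h]];
    }

    // Slide live cells toward the front and update every tracked location.
    void compact() {
        std::size_t next = 0;
        for (std::size_t i = 0; i < cells_.size(); ++i) {
            if (owner_[i] == kReleased) continue;   // skip garbage
            cells_[next] = cells_[i];
            owner_[next] = owner_[i];
            location_[owner_[next]] = next;         // the "tracking" update
            ++next;
        }
        cells_.resize(next);
        owner_.resize(next);
    }

private:
    std::vector<int> cells_;
    std::vector<std::size_t> owner_;
    std::vector<std::size_t> location_;
};
```

A raw index captured before `compact()` plays the role of a native pointer: it may refer to the wrong cell afterward. A handle keeps working because the heap itself updates the location it resolves to.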
38. What Are the Trade-Offs? Simplicity and Safety on the One Hand Directly introducing support in the language for one or a family of tracking pointers makes it a more complicated language.
By not supporting this, we expand the available pool of programmers to hire from by requiring less sophistication.
Allowing the programmer access to these ephemeral value types increases the possibility of programmer error – she may purposely or by accident do bad things to the memory.
By not supporting this, we create a potentially safer runtime environment.
39. What Are the Trade-Offs? Efficiency and Flexibility on the Other Hand Each time we assign the same Object with a value type, a new boxing of the value occurs …
Allowing access to the boxed value type allows in-memory update, which may provide significant performance …
Without a form of tracking pointer, we cannot iterate over a CLI array using pointer arithmetic. This means that the CLI array cannot participate in the STL iterator pattern and work with the generic algorithms.
Allowing access to the boxed value type allows significant design flexibility.
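The STL iterator pattern referred to above relies on pointer arithmetic over contiguous elements. The following ISO C++ sketch shows the native-pointer version of that pattern; the `sum` helper is a hypothetical name, and a native `const int*` here stands in for whatever tracking form of pointer a CLI language would have to supply.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Sum `count` contiguous ints, using a [begin, end) pointer pair as iterators.
int sum(const int* begin, std::size_t count) {
    assert(begin != nullptr || count == 0);
    const int* end = begin + count;          // pointer arithmetic forms the range
    return std::accumulate(begin, end, 0);   // a generic algorithm accepts the pointers
}
```

For example, calling `sum(&ia[0], 6)` on an array holding 1, 1, 2, 3, 5, 8 walks the elements exactly as an STL algorithm would walk any other iterator range.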
40. The Level of Detail Reflects the Target Programmer We chose to provide a collection of addressing modes that handle value types on the managed heap.
int ival = 1024;
int^ boxedi = ival;
array<int>^ ia = gcnew array<int>{1,1,2,3,5,8};
interior_ptr<int> begin = &ia[0];
value struct smallInt { int m_ival; … } si;
pin_ptr<int> ppi = &si.m_ival;
41. A Language Layer over the CLI A third design aspect is a language-specific layer of functionality over that directly supported by the CLI.
This requires a mapping between the language-level support and the underlying CLI …
Or it may be handled by tagging a type with a language-specific attribute discoverable at run-time …
In some cases, it just isn’t possible …
value types are blitted …
virtual function resolution within ctors/dtors …
So this is a compromise between what we might wish to do, and what we find ourselves able to do.
42. Three General Categories of Extension A form of Resource Acquisition Is Initialization (RAII) for reference types. In particular, to provide an automated facility for deterministic finalization of garbage-collected types that hold scarce resources.
A form of deep-copy semantics associated with the C++ copy constructor and copy assignment operator – this could not be extended to value types.
Direct support of C++ templates for CLI types, in addition to the CLI generic support.
43. Non-Deterministic Finalization
Before the memory associated with an object is reclaimed by the garbage collector, an associated Finalize() method, if present, is invoked.
You can think of this method as a kind of super-destructor since it is not tied to the program lifetime of the object.
We refer to this as finalization. The timing of just when or even whether a Finalize() method is invoked is undefined.
This is what is meant when we say that garbage collection exhibits non-deterministic finalization.
44. The Problem of Scarce Resources
Non-deterministic finalization works well with dynamic memory management. When available memory gets sufficiently scarce, the garbage collector kicks in and things pretty much just work.
Non-deterministic finalization does not work well, however, when an object maintains a critical resource such as a database connection, a lock of some sort, or perhaps native heap memory.
In this case, we would like to release the resource as soon as it is no longer needed. The solution currently supported by the CLI is for a class to free the resources in its implementation of the Dispose() method of the IDisposable interface.
The problem here is that Dispose() requires an explicit invocation, and therefore is liable not to be invoked.
45. Automating Disposal …
A fundamental design pattern in C++ is spoken of as resource acquisition is initialization.
That is, a class acquires resources within its constructor.
Conversely, a class frees its resources within its destructor.
This is managed automatically within the lifetime of the class object.
This is what we would like to do with reference types in terms of the freeing of scarce resources:
Use the destructor to encapsulate the necessary code for the freeing of any resources associated with the class.
Have the destructor invoked automatically tied with the lifetime of the class object.
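For readers less familiar with the idiom, here is a minimal ISO C++ sketch of resource acquisition is initialization. The `ConnectionPool` and `Connection` types and the `use_connection` function are hypothetical; a counter stands in for a real scarce resource such as a database connection.

```cpp
#include <cassert>

// Hypothetical scarce resource: a count of currently open connections.
struct ConnectionPool {
    int open = 0;
};

// Acquires a connection in the constructor, releases it in the destructor.
class Connection {
public:
    explicit Connection(ConnectionPool& pool) : pool_(pool) { ++pool_.open; }
    ~Connection() { --pool_.open; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
private:
    ConnectionPool& pool_;
};

// Returns the number of open connections observed inside the scope.
int use_connection(ConnectionPool& pool) {
    Connection c(pool);      // acquired here
    assert(pool.open > 0);
    return pool.open;
}                            // released here, with no explicit call
```

The release happens at a known point, the end of the scope, which is exactly the property that non-deterministic finalization does not provide.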
46. A Two-Step Solution … Step 1 Mapping the Destructor to Dispose()
The CLI has no notion of the class destructor for a reference type. So the destructor has to be mapped into something else in the underlying implementation.
Internally, then, the compiler does the following transformations:
the class has its base class list extended to inherit from the IDisposable interface.
the destructor is transformed into the Dispose() method of IDisposable.
That gets us halfway to our goal. We still need a way to automate the invocation of the destructor.
47. A Two-Step Solution … Step 2 Mapping the object to a lifetime
A special stack-based notation for a reference type is supported; that is, one in which its lifetime is associated within the scope of its declaration.
Internally, the compiler transforms the notation to allocate the reference object on the managed heap.
With the termination of the scope, the compiler inserts an invocation of the Dispose() method – the user-defined destructor.
Reclamation of the actual memory associated with the object remains under the control of the garbage collector.
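The two steps can be mimicked in ISO C++ to show the shape of the transformation. This is an illustrative simulation only: the code the C++/CLI compiler actually generates is not shown in the talk, `IDisposable` here is a hand-written stand-in for the CLI interface, `DisposeAtScopeExit` is a hypothetical helper, and `std::shared_ptr` merely plays the part of the garbage-collected heap.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hand-written stand-in for the CLI IDisposable interface.
struct IDisposable {
    virtual void Dispose() = 0;
    virtual ~IDisposable() = default;
};

// Step 1: the body of the user's destructor becomes Dispose().
class Wrapper : public IDisposable {
public:
    explicit Wrapper(std::vector<std::string>& log) : log_(log) {
        log_.push_back("acquire");
    }
    void Dispose() override { disposed_ = true; log_.push_back("dispose"); }
    void foo() {
        assert(!disposed_);   // using a disposed object would be a bug
        log_.push_back("foo");
    }
private:
    std::vector<std::string>& log_;
    bool disposed_ = false;
};

// Calls Dispose() when the enclosing scope ends, including on exceptions.
class DisposeAtScopeExit {
public:
    explicit DisposeAtScopeExit(IDisposable& target) : target_(target) {}
    ~DisposeAtScopeExit() { target_.Dispose(); }
    DisposeAtScopeExit(const DisposeAtScopeExit&) = delete;
    DisposeAtScopeExit& operator=(const DisposeAtScopeExit&) = delete;
private:
    IDisposable& target_;
};

// Step 2: a stack-style declaration is lowered to a heap allocation plus a
// Dispose() call at scope exit. The memory itself is reclaimed later.
std::shared_ptr<Wrapper> f1(std::vector<std::string>& log) {
    std::shared_ptr<Wrapper> w2 = std::make_shared<Wrapper>(log);
    DisposeAtScopeExit guard(*w2);
    w2->foo();
    return w2;   // the object outlives the scope but has already been disposed
}
```

Returning the pointer from `f1` is only there to make the separation visible: disposal is tied to the scope, while the lifetime of the memory is decided elsewhere.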
48. An Example
ref class Wrapper {
Native *pn;
public:
Wrapper( int val ) { pn = new Native( val ); } // RAII
~Wrapper(){ delete pn; } // destructor: mapped to Dispose()
void foo();
protected:
!Wrapper() { delete pn; } // finalizer: invoked by the garbage collector
};
void f1() {
Wrapper^ w1 = gcnew Wrapper( 1024 );
Wrapper w2( 2048 ); // no ^ token !
w1->foo(); w2.foo();
// …
// w2 is disposed of here
// w1 will be finalized at some point
}
49. The CLI Represents a Language Framework Programming languages are over-emphasized, much as national identity …
50. Language as a Unit of Deployment A language is often used as a vehicle for the deployment of a new programming model – that is, of a new paradigm.
It tends to demonstrably improve on existing models that have run into some bottleneck of scale.
Or it supports a new model either of technology or abstraction.
These languages tend to be pure – that is, to provide support for their program model only.
This makes the language both simpler and more elegant.
It requires a relinquishment of the past
51. In Like a Lion, Out Like a Lamb When there is a reinvention of the dominant program model, there is also a programming language extinction.
The current generation of languages has no vocabulary to directly express the new model.
Adding that vocabulary compromises the elegance of the original purity of design.
A pure language moves from a youthful development community to an acknowledged design influence.
This passionate sweeping in and hangdog slinking out of programming languages has taken its toll socially on the professional programmer class.
This is not really working, imo. What kinds of solutions suggest themselves?
52. Where the CLI Comes in However, there is a possible language model we can glean from C++.
What has been surprisingly successful for C++ has been its ability to support multiple program models.
What has been less successful is the absence of a unifying architecture and crafted boundaries.
Well, perhaps what we need is a conscious design – a deliberate mosaic of component paradigms using a common type system and virtual machine model.
Oh, this is where the CLI comes in …
53. A Language Framework The CLI seems to offer the glimmerings of a framework for the design of a possible mosaic of component language gems.
I would like to see you guys come up with a new paradigm of how we should program – all two thousand of them.
There is so much hard work and invention that is required of us in the 21st century.
The university must deliver up the science; industry will deliver up the engineering.
54. Questions?
Concerns?
Criticisms?
slippman@microsoft.com