On the design of CAS Lecture 5
Programming languages and CA If one follows the development of programming languages from a computer algebra point of view, one comes to realize that many issues that modern programming languages try to address were first encountered in the area of CA. Among these issues are: • memory management, • program verification, • abstract data types, • modularization, • parallelization of systems and • the extensibility of systems.
Memory Management • Exact arithmetic for various areas of algebra requires data structures that adapt dynamically to the size of the represented algebraic object. • For practical algebraic computing, memory management is of prime importance, because large computations may fail to finish for lack of memory; memory can be a more critical resource than processor speed. • The reference count method for memory management was mentioned in the literature for the first time in connection with the implementation of a polynomial system. • Reference counting is used today in several language implementations, for example CPython, Swift, and C++ (std::shared_ptr); Java, by contrast, relies on tracing garbage collection.
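The reference count method mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `Cell`, `retain`, `release`, and `freed` are hypothetical and do not come from any particular CAS. Each cell counts how many owners refer to it and is reclaimed as soon as that count drops to zero, which lets two polynomials share a common subterm safely.

```python
freed = []  # hypothetical log: values of cells in the order they were reclaimed


class Cell:
    """A heap cell with an explicit reference count (illustrative only)."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)
        self.refcount = 0
        for child in self.children:
            retain(child)  # this cell now owns a reference to each child


def retain(cell):
    """Register one more owner of the cell."""
    cell.refcount += 1
    return cell


def release(cell):
    """Drop one owner; reclaim the cell and release its children at zero."""
    cell.refcount -= 1
    if cell.refcount == 0:
        freed.append(cell.value)
        for child in cell.children:
            release(child)
        cell.children.clear()


# A subterm x shared by two larger objects p and q.
x = Cell("x")
p = retain(Cell("p", [x]))
q = retain(Cell("q", [x]))
release(p)  # x survives: q still refers to it
release(q)  # now x is reclaimed as well
```

A known limitation of plain reference counting is that cyclic structures are never reclaimed, which is one reason tracing collectors are often used instead or in addition.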
Program Verification and Abstract Data Types • Dynamic algebraic objects are generally implemented with pointers. • However, explicit pointer manipulation is known to be rather error prone. • CA therefore introduced data types to detect errors at the level of data structures early on, without having to dereference pointers. • The method of specifying data types algebraically proved useful for general programming languages: • a well-defined set of values of objects and operations on them lays the foundation for addressing the verification problem successfully. • Abstract data types follow up on issues of program verification insofar as they integrate an essential part of a type’s specification, the set of legal operations on it. • Along with encapsulation and the separation of interface and implementation, abstract data types support programming algorithms in an abstract form, a method in computer science usually referred to as object-based programming. • Only a few CAS, for example Axiom, support abstract data types.
The Concept of Types • Looking at the development of general purpose programming languages, we can see the proliferation of a general type and class concept in that area as well. • Algebraic algorithms realize objects from structures that have been systematically developed and well analyzed in algebra. • Carried over to the formalism of types in programming languages, mathematical structures prove to go beyond the type concept introduced in programming languages. • To a certain degree, CA is a pioneer in the development of programming languages, both in a positive and in a negative sense. • On the positive side, we have the testing of new concepts like recursive and dependent types, expansion and reduction, composition and homomorphy of structures. • Negative issues are speed, user friendliness, and the safety of systems.
Genericity • Of the various aspects of genericity, overloading is the one that is directly inspired by algebra. • Known in particular in the context of algebraic operators, overloading allows algorithms of related functionality to be denoted by the same identifier. • Since overloading helps to avoid artificial function identifiers, almost every CAS and programming language today makes use of operator overloading. • Declarative languages generalize the concept to overloading of functions, procedures, and constants. • Another aspect of genericity, parametric polymorphism, is realized in Lisp-based CAS. • Parametric polymorphism refers to functions that work uniformly on different types; recursive data types can be handled in an elegant way. • A third aspect of polymorphism applies to CAS based on object-oriented languages, which model relations between mathematical structures by an inheritance hierarchy. • Here, subtyping as a form of inclusion polymorphism makes functions applicable to objects not just of one class, but of any of its subclasses. • Recently developed CA packages in C++ exploit the C++ template feature to develop functions for parameterized types whose instantiation can be statically checked.
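As a small illustration of two of these aspects, the sketch below overloads `+` and `*` for a hypothetical `Poly` class and then defines one `power` function that works uniformly on any type providing `*`. Python is dynamically typed, so the uniformity here rests on duck typing rather than on statically checked parametric types; the class and function names are assumptions made for this example only.

```python
from fractions import Fraction


class Poly:
    """Dense univariate polynomial; coeffs[i] is the coefficient of x**i."""

    def __init__(self, coeffs):
        coeffs = list(coeffs)
        while len(coeffs) > 1 and coeffs[-1] == 0:
            coeffs.pop()  # drop trailing zero coefficients
        self.coeffs = coeffs

    def __add__(self, other):  # overloaded '+'
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return Poly(u + v for u, v in zip(a, b))

    def __mul__(self, other):  # overloaded '*'
        out = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(other.coeffs):
                out[i + j] += a * b
        return Poly(out)

    def __eq__(self, other):
        return isinstance(other, Poly) and self.coeffs == other.coeffs


def power(base, n):
    """Compute base**n for n >= 1 by repeated squaring; only '*' is required."""
    result = None
    while n > 0:
        if n & 1:
            result = base if result is None else result * base
        base = base * base
        n >>= 1
    return result


int_result = power(3, 4)                 # machine integers
frac_result = power(Fraction(1, 2), 3)   # exact rationals
poly_result = power(Poly([1, 1]), 2)     # (1 + x)**2
```

The same `power` body serves integers, rationals, and polynomials because each type supplies its own meaning for `*`.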
Modularization • It is natural for CAS to be modularized the same way algebraic structures are. • However, there are differences from other program modules in CS: in algebraic algorithms, subroutines more often have their own purpose, functionality, and re-usability. • Consequently, the application interface of algebraic modules is more extensive than one might expect, and the proper design of name spaces becomes significantly more sophisticated. • Only a few CAS explicitly support some notion of modules.
Parallel Implementation • CA is a treasure trove of non-trivial and time-consuming algorithms which are particularly suited for distributed and parallel computing. • An almost classic example is the family of algorithms based on the Chinese remainder theorem. • They are often used to benchmark novel parallel computer systems.
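Why Chinese remainder algorithms parallelize well can be seen in a minimal sketch: the same computation is carried out independently modulo several coprime moduli, and only the final recombination is sequential. The example computes a 2x2 integer determinant this way; the function names, the matrix, and the choice of moduli are assumptions for illustration, and the per-modulus calls are run in a plain loop here although each could go to a separate processor.

```python
def crt(residues, moduli):
    """Combine x = r_i (mod m_i), for pairwise coprime m_i, into x mod prod(m_i)."""
    x, m = 0, 1
    for r, mi in zip(residues, moduli):
        # Find t with x + m*t = r (mod mi); pow(m, -1, mi) is the inverse of m mod mi.
        t = ((r - x) * pow(m, -1, mi)) % mi
        x += m * t
        m *= mi
    return x, m


def det2_mod(matrix, p):
    """Determinant of a 2x2 integer matrix, computed entirely modulo p."""
    (a, b), (c, d) = matrix
    return ((a % p) * (d % p) - (b % p) * (c % p)) % p


matrix = [[123456, 789], [1011, 654321]]
moduli = [10007, 10009, 10037]  # pairwise coprime; product exceeds the true determinant

# Each residue is independent of the others, so this loop is trivially parallel.
residues = [det2_mod(matrix, p) for p in moduli]
det, modulus = crt(residues, moduli)
```

The reconstruction is only correct when the product of the moduli exceeds the size of the true result; for possibly negative results one would map the value into a symmetric range around zero.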
Interfaces to Word Processors and Graphics • Most of the current systems are able to produce output in TeX or LaTeX format. • Moreover, the document-style interfaces of both Maple and Mathematica are able to generate documents approaching publication quality. • A major development has been the adoption of XML by many software vendors, and the support for MathML. • The quality of the graphics generated by computer algebra systems has improved considerably in recent years. • Some packages exploit standards for platform-independent graphical primitives such as OpenGL and VRML.
Interfaces to Numerical Software • In current CAS, there exist three different types of interfaces for linking them to numerical packages: • The first type allows a user to generate expressions in another programming language, like C. • The second type connects modules, originally created by the compiler of an arbitrary programming language, to the computer algebra system. • With the third type, numerical and symbolic software are integrated into a common environment. • The Gentran system of Reduce • supports the generation and segmentation of large expressions, the translation of Reduce expressions, and the generation of entire programs from built-in program skeletons and templates. • It offers the target languages Fortran, C and Pascal. • It can also be linked to the Scope system, which is capable of optimizing code by common subexpression elimination and other reductions (e.g. for computing symbolic derivatives or Jacobians). • Gentran was originally written for Macsyma and an updated version has been ported to Common Lisp. • There exist similar packages for Axiom, Maple and Mathematica. • All these systems can be used to generate expressions and program fragments, and some can be used to create complete, runnable programs.
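The first type of interface, together with common subexpression elimination, can be sketched as follows. This is not the algorithm used by Gentran or Scope; it is a minimal illustration in which expressions are hypothetical nested tuples `(op, left, right)`, every repeated subterm is assigned to a temporary once, and the output is a sequence of C assignments.

```python
def emit_c(expr, name="f"):
    """Translate a nested-tuple expression into C assignments, sharing repeated subterms."""
    temps = {}   # subexpression -> name of the temporary that holds it
    lines = []   # generated C statements, in evaluation order

    def walk(e):
        if isinstance(e, (int, float)):
            return repr(float(e))      # numeric literal
        if isinstance(e, str):
            return e                   # variable name
        if e in temps:
            return temps[e]            # subterm already computed: reuse it
        op, left, right = e
        code = f"({walk(left)} {op} {walk(right)})"
        temps[e] = f"t{len(temps)}"
        lines.append(f"double {temps[e]} = {code};")
        return temps[e]

    result = walk(expr)
    lines.append(f"double {name} = {result};")
    return "\n".join(lines)


# (x*y)*(x*y) + x*y, where the product x*y occurs three times.
xy = ("*", "x", "y")
c_code = emit_c(("+", ("*", xy, xy), xy))
print(c_code)
```

For this input the product `x * y` is emitted once as `t0` and then reused, which is the effect that dedicated code optimizers aim for on much larger expressions.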
Interfaces to Numerical Software • The second approach to linking symbolic and numerical systems is through a direct link between two such systems at runtime. • This is most easily achieved by starting two separate processes and allowing them to communicate via sockets, although with modern operating system technology it is increasingly the case that dynamic linking of shared objects is both practical and more efficient. • Currently Mathematica, Maple and MuPAD offer a general API for this kind of application. • However, these are often clumsy to use because of the need to convert data to and from the computer algebra system’s native format. • The OpenMath project seeks to address this by creating a standard format for mathematical software packages to use for input and output of objects. • The most advanced approaches usually build on this second kind of facility, and consist of developing high-level interfaces to numerical libraries, particularly the NAG library: • the Irena interface between Reduce and the NAG library; • the interface incorporated in Axiom; • Maple includes over a hundred linear algebra routines from the NAG library. • The obvious commercial success of closed systems is recognized. However, it is desirable • to have tools freely available, • to promote the continued development and adaptation of computer algebra techniques for novel applications and research in special fields. • In this way the increasingly pressing problem of embedding CA components into other program systems (such as logic programming, databases, expert systems, as well as numeric and graphic systems) would come somewhat closer to a satisfactory solution.
User Interfaces • Traditionally, CAS used to have rather rudimentary user interfaces: • an interpreter • would process commands, and • display the resulting mathematical formulas in a text representation, which has the big advantage of being portable but can be ugly and hard to read. • More and more systems now make use of bitmaps for improved graphical representation of the expressions which they produce. • The widespread availability of components based on XML should make the production of high-quality user interfaces much easier in future. • Experiments to recognize handwriting as input data are underway!
Standardization • There are a number of interfaces between one computer algebra system and another. • For example, GB, which is one of the fastest systems for computing Groebner bases, uses Axiom and MuPAD to provide a user interface and other kinds of general functionality. • In many ways this is an example of how CA technology could develop in the future: • a few general systems using a large number of fast, highly specialized servers to perform computations. • For this to be practical, standard mechanisms for linking systems together are needed. • Any standards will need to address two issues: • the transport mechanism to be used to move data from one machine to another, • and the data representation to be used.
Standardization • In the case of transport mechanisms, it suffices to adopt standard network technologies. • Data representation, on the other hand, does require a solution designed specifically for CA. • The objects manipulated by a CAS are often quite complicated, notation can vary, and the precise semantics associated with an object can differ quite significantly from one system to another. • Thus there is a requirement for a framework in which one can describe both the abstract semantics and concrete representation of an object. • One general mechanism that has been developed to address this issue is MP. • More recently, the biggest activity in this area has been the OpenMath project, which included developers of Axiom, Maple and Reduce as well as representatives of the XML and electronic publishing communities.
MathML • MathML is an XML representation for mathematical objects, allowing expressions to be stored in databases, transmitted between applications and operated upon by programs. • MathML can be used to express mathematical content in web pages and digital libraries, and has become an accepted form for input and output of CAS. • MathML provides a vocabulary for such things as identifiers, numbers, operators, grouping etc. • There are two broad classes of constructions: • the elements that describe the appearance, or notation, of an expression form what is called presentation MathML; • the elements that describe the meaning, or semantics, of an expression are known as content MathML.
Presentation MathML • MathML has a complete set of elements to describe mathematical notation. • There are primitives for various kinds of tokens, and others to describe relative position and grouping of subexpressions. • For example, the presentation MathML for the expression x² × y² is <mrow> <msup> <mi>x</mi> <mn>2</mn> </msup> <mo>&times;</mo> <msup> <mi>y</mi> <mn>2</mn> </msup> </mrow>
Presentation MathML • <mi> elements give math identifiers (variables or parameters), • <mn> denotes a number, • <mo> denotes an operator, • <msup> elements express superscripts, • <mrow> is used for a horizontal sequence, • &times; is a named entity that expands to the Unicode character ×.
Presentation MathML <mfrac> <mrow><mi>a</mi><mo>±</mo> <msqrt><mi>b</mi></msqrt></mrow> <mi>c</mi> </mfrac> <mfenced open="[" close="]"> <mtable> <mtr><mtd><mi>a</mi></mtd> <mtd><mi>b</mi></mtd></mtr> <mtr><mtd><mi>c</mi></mtd> <mtd><mi>d</mi></mtd></mtr> </mtable></mfenced>
Presentation MathML <mrow> <msubsup> <mo>∑</mo> <mrow><mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </msubsup> <mrow> <msup><mi>e</mi><mi>i</mi></msup> <mo>&InvisibleTimes;</mo> <msub><mi>ω</mi> <mi>i</mi></msub> </mrow> </mrow> <mrow> <msub> <mo>lim</mo> <mrow><mi>h</mi> <mo>→</mo> <mn>0</mn></mrow> </msub> <mfrac> <mrow> <mi>f</mi> <mo>&ApplyFunction;</mo> <mfenced><mrow><mi>t</mi><mo>+</mo><mi>h</mi></mrow></mfenced> </mrow> <mi>h</mi> </mfrac> </mrow>
Presentation MathML The expression a = b + c should be marked up with a nested <mrow> for the right-hand side, <mrow> <mi>a</mi> <mo>=</mo> <mrow><mi>b</mi><mo>+</mo><mi>c</mi></mrow> </mrow> and not as one flat sequence, <mrow> <mi>a</mi> <mo>=</mo> <mi>b</mi><mo>+</mo><mi>c</mi></mrow> This way, line breaking and sub-expression selection can be handled correctly.
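A program that exports presentation MathML typically builds the element tree rather than concatenating strings. The sketch below does this with Python's standard `xml.etree.ElementTree`; the helper names `token`, `group`, `mi`, `mn`, `mo`, and `serialize` are hypothetical, and the MathML namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def token(tag, text):
    """Create a token element such as <mi>x</mi>."""
    el = ET.Element(tag)
    el.text = text
    return el


def group(tag, *children):
    """Create a layout element such as <mrow> or <msup> around its children."""
    el = ET.Element(tag)
    el.extend(children)
    return el


def mi(name):
    return token("mi", name)


def mn(number):
    return token("mn", number)


def mo(operator):
    return token("mo", operator)


def serialize(el):
    return ET.tostring(el, encoding="unicode")


# x^2 × y^2, with each power grouped by <msup>.
product = group("mrow",
                group("msup", mi("x"), mn("2")),
                mo("×"),
                group("msup", mi("y"), mn("2")))

# a = b + c, with the right-hand side in its own nested <mrow>.
equation = group("mrow", mi("a"), mo("="),
                 group("mrow", mi("b"), mo("+"), mi("c")))

print(serialize(product))
print(serialize(equation))
```

Because the nesting is explicit in the tree, the recommended grouping of subexpressions follows directly from how the expression is constructed.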
Content MathML • MathML provides facilities to describe the meaning of mathematical expressions. • The subset designed for this purpose is content MathML. • This has a set of built-in tags to express the concepts which occur in elementary mathematics. • More advanced concepts are expressed using markup with external references. • Content MathML expressions typically consist of operators applied to arguments (reminiscent of Lisp expressions). • In content MathML the sum of two squares is represented by <apply> <plus/> <apply> <power/> <ci>x</ci> <cn>2</cn> </apply> <apply> <power/> <ci>y</ci> <cn>2</cn> </apply> </apply> The <ci> and <cn> elements give content markup for identifiers and numbers respectively. The <plus/> and <power/> elements denote operators. Expressions are formed from these leaves by wrapping an operator and its arguments in an <apply> element.
Content MathML <apply> <power/> <apply> <sin/> <ci>θ</ci> </apply> <cn>2</cn> </apply> <apply> <eq/> <apply> <inverse/> <log/> </apply> <exp/> </apply>
Content MathML <apply> <int/> <bvar><ci>t</ci></bvar> <lowlimit><ci>a</ci></lowlimit> <uplimit><ci>b</ci></uplimit> <apply> <ci>f</ci> <ci>t</ci> </apply> </apply> <apply> <forall/> <bvar><ci>x</ci></bvar> <condition> <apply> <and/> <apply><in/> <ci>x</ci> <reals/></apply> <apply><gt/> <ci>x</ci> <cn>1</cn></apply> </apply> </condition> <apply> <gt/> <apply><power/><ci>x</ci><cn>2</cn></apply> <ci>x</ci> </apply> </apply>
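Because content MathML is essentially prefix notation in XML, a recursive evaluator is short. The following sketch, which assumes input without an XML namespace and supports only four hypothetical table entries (`plus`, `minus`, `times`, `power`), evaluates the sum-of-two-squares example numerically; the names `OPS` and `evaluate` are illustrative.

```python
import operator
import xml.etree.ElementTree as ET
from functools import reduce

# Operators this sketch understands; real content MathML defines many more.
OPS = {
    "plus": operator.add,
    "minus": operator.sub,
    "times": operator.mul,
    "power": operator.pow,
}


def evaluate(node, env):
    """Evaluate a content MathML tree; env maps identifier names to numbers."""
    if node.tag == "cn":
        return float(node.text.strip())
    if node.tag == "ci":
        return env[node.text.strip()]
    if node.tag == "apply":
        head, *args = list(node)          # first child is the operator
        values = [evaluate(arg, env) for arg in args]
        return reduce(OPS[head.tag], values)
    raise ValueError(f"unsupported element: {node.tag}")


SUM_OF_SQUARES = """
<apply> <plus/>
  <apply> <power/> <ci>x</ci> <cn>2</cn> </apply>
  <apply> <power/> <ci>y</ci> <cn>2</cn> </apply>
</apply>
"""

value = evaluate(ET.fromstring(SUM_OF_SQUARES), {"x": 3.0, "y": 4.0})
```

If the input carries the MathML namespace, ElementTree reports tags in the form `{namespace}plus`, so the lookup keys would need the same prefix.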
Annotations • MathML objects can have additional associated information. • For example, if a CAS generates MathML output, it may be desirable to associate the system’s original expression with the MathML. • The <semantics> element is used for this purpose. • The first child of this element is the expression to be annotated, and the second and any subsequent children are annotations in either textual or XML form. • For example: <semantics> <mrow><mi>x</mi> <mo>×</mo> <mi>y</mi></mrow> <annotation encoding="Maple"> x * y </annotation> <annotation encoding="TeX"> x \times y </annotation> <annotation-xml encoding="OpenMath"> <OMOBJ xmlns="http://www.openmath.org/OpenMath"> <OMA> <OMS cd="arith1" name="times"/> <OMV name="x"/> <OMV name="y"/> </OMA> </OMOBJ> </annotation-xml> </semantics>
Combining Presentation and Content • It is not uncommon to work with both presentation and content for the same mathematical expression. • This can be done with a semantics element, giving either the presentation or the content as the first child and the other as the annotation. • For example: <semantics> <mrow><mi>a</mi> <mo>+</mo> <mi>b</mi></mrow> <annotation-xml encoding="MathML-Content"> <apply> <plus/><ci>a</ci> <ci>b</ci> </apply> </annotation-xml> </semantics> • Joining a presentation expression and content expression in this way gives what is known as top-level parallel markup.
Combining Presentation and Content • In many applications, it is desirable to be able to select subexpressions and to be able to find both their content and presentation markup. • Top-level parallel markup is insufficient for this purpose. • MathML provides id and xref attributes which may be used to cross-reference the subexpressions of content and presentation trees. • This gives what is known as fine-grained parallel markup. • For example, <semantics> <mrow id="G1"> <mi id="G2">a</mi> <mo id="G3">+</mo> <mi id="G4">b</mi> </mrow> <annotation-xml encoding="MathML-Content"> <apply xref="G1"> <plus xref="G3"/> <ci xref="G2">a</ci> <ci xref="G4">b</ci> </apply> </annotation-xml> </semantics>
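Resolving the cross-references in fine-grained parallel markup amounts to an index lookup. The sketch below parses the example above with Python's standard `xml.etree.ElementTree`, indexes the presentation elements by `id`, and pairs each content element with the presentation element its `xref` points to; the variable names are assumptions for illustration and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

PARALLEL = """
<semantics>
  <mrow id="G1"> <mi id="G2">a</mi> <mo id="G3">+</mo> <mi id="G4">b</mi> </mrow>
  <annotation-xml encoding="MathML-Content">
    <apply xref="G1"> <plus xref="G3"/> <ci xref="G2">a</ci> <ci xref="G4">b</ci> </apply>
  </annotation-xml>
</semantics>
"""

doc = ET.fromstring(PARALLEL)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in doc.iter() if el.get("id")}

# Walk the content tree and follow each xref to its presentation counterpart.
content = doc.find("annotation-xml")
pairs = [(node.tag, by_id[node.get("xref")].tag)
         for node in content.iter() if node.get("xref")]

for content_tag, presentation_tag in pairs:
    print(content_tag, "->", presentation_tag)
```

An editor can use such a table to highlight the rendered subexpression that corresponds to a selected piece of content markup, or the reverse.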
MathML import and export • MathML can be imported into and exported from major CAS, is supported natively or via plug-ins in the most popular web browsers, and is handled by certain editors. • MathML may be imported or exported by both Maple and Mathematica. • Maple places greater emphasis on content MathML • while Mathematica emphasizes presentation MathML. • The browsers Netscape, Mozilla and Amaya support MathML natively. • TechExplorer and MathPlayer can both be used to display MathML in Internet Explorer. • It is possible to write browser-independent pages by making use of the MathML universal style sheet; for example, an XHTML page containing MathML: <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="http://www.w3.org/Math/XSL/mathml.xsl"?> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Example</title> </head> <body> <h1> Math text </h1> <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> <mi>a</mi> <mo>+</mo> <mi>b</mi> </mrow> </math> </body> </html>
Hardware Implementation of CA Algorithms • The connection between CA and hardware lies mostly in the use of CA for the design or improvement of hardware with algebraic methods. • Sometimes specially designed hardware can be used to speed up CA calculations. • Among the successful projects are machines for number theory (the Analytik constructed by Russian designers) and machines for finite field arithmetic. • In a more general sense, the Symbolic computer, which uses a second processor for on-line support of memory management, also belongs to this category. • Cryptography, coding theory, and digital signal processing are areas in which major efforts to realize basic algorithms of CA on the chip level are underway. • Several implementations of long integer arithmetic, mostly based on the Karatsuba algorithm, originated as part of RSA hardware implementations. • The arithmetic of elliptic curves has also been realized in CMOS technology. • Various fast transforms of signals have been laid out in parametric VLSI designs. • The IDEAS environment (Intelligent Design Environment for Algorithms in Signal processing) is able • to optimize algorithms and to synthesize architectures by incorporating algebraic optimization algorithms; • these algorithms are used to determine good bases for finite field arithmetic or the • inherent parallelism in signal transformations; • according to technological parameters like gate sizes, specialization algorithms are used to optimize the gate count or chip size.
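For readers unfamiliar with the Karatsuba algorithm mentioned above, here is a minimal software sketch for non-negative integers. It replaces the four half-size products of schoolbook multiplication by three, at the cost of a few extra additions. The base-case threshold of 2**32 is an arbitrary assumption, and a hardware realization would of course differ substantially from this recursive form.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using three recursive half-size products."""
    if x < 2**32 or y < 2**32:
        return x * y                      # small operands: multiply directly
    n = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << n) - 1
    xh, xl = x >> n, x & mask             # x = xh * 2**n + xl
    yh, yl = y >> n, y & mask             # y = yh * 2**n + yl
    high = karatsuba(xh, yh)
    low = karatsuba(xl, yl)
    # (xh + xl)(yh + yl) - high - low equals xh*yl + xl*yh: one product instead of two.
    middle = karatsuba(xh + xl, yh + yl) - high - low
    return (high << (2 * n)) + (middle << n) + low


a = 3**200
b = 7**150
product_ab = karatsuba(a, b)
```

Applied recursively, this reduces the cost of multiplying two n-digit numbers from order n² to roughly order n^1.585 digit operations.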