280 likes | 447 Views
Lazy Evaluation in Numeric Computing. 2 0 Октября 2006 Игорь Николаевич Скопин. Agenda. Elementary introduction to functional programming: reduction, composition and mapping functions Lazy Evaluation: Scheme Principles Imperative Examples Analysis Necessity in Procedure Calls
E N D
Lazy Evaluation in Numeric Computing 20 Октября 2006 Игорь Николаевич Скопин
Agenda • Elementary introduction to functional programming: reduction, composition and mappingfunctions • Lazy Evaluation: • Scheme • Principles • Imperative Examples • Analysis • Necessity in Procedure Calls • Dependencies of Lazy Evaluation on the Programming Style • Newton-Raphson Square Roots • Functional Program • Numerical Differentiation • FC++ library • “Lazy list” data structure in FC++ • Memoization • Lazy Evaluation in Boost/uBLAS: Vectors/Matrices Expressions • Vectors/matrices expressions and memoization • style of programming • The Myths about Lazy Evaluations • Q&A
append a b = reducecons b a append [1,2] [3,4] = reduce cons [3,4] [1,2] = (reduce cons[3,4]) (cons1 (cons 2 nil)) = cons1 (reduce cons(cons 3 (cons 4 (cons nil))))(cons2 nil)) = cons 1 (cons2(reduce cons(cons 3 (cons 4 (cons nil)))nil)) = cons 1 (cons 2 ((cons 3 (cons 4 (cons nil)))) f xal f xal Gluing Functions Together: reduce list of x = nil | cons x (list of x) [ ] nil [1] cons 1 nil [1,2,3] cons 1 (cons 2 (cons 3 nil)) specific to sum sum nil = 0 sum (cons num list) = num + sum list sum = reduce add 0 add x y = x + y (reduce f x) l reduce f x nil = x reduce f x (cons a l) = f a (reduce f x l) reduce sum 0 ( 1 ,2, 3, [ ] ) ⇒1+2+3+0 reduce multiply 1( 1 ,2,3 , [ ] ) ⇒1 * 2 * 3 * 1 product = reduce multiply 1 (reducecons nil) α⇒α //copy of α reduce (:) [ ] // — Haskell reduce — is a function of 3 arguments, but it applies to 2 only ⇒the result is a function!
Gluing Functions Together: Composition and Map A function to double all the elements of a list doubleall = reduce doubleandcons nil where doubleandcons num list = cons (2*num) list reduce f nil givesexpansion f to list specific to double Further: doubleandcons = fandcons double where double n = 2*n fandcons f el list = cons (f el) list An arbitrary function Function composition — standard operator “.”: (f . g) h = f (g h) So fandcons f = cons . f Next version of doubleall: doubleall = reduce (cons . double) nil This definition is correct: fandcons f el = (cons . f) el = cons (f el) fandcons f el list = cons (f el)list specific to double Functionmap (for all the elements of list): map f = reduce (cons. f) nil Final version of doubleall : doubleall = map double One more example: summatrix = sum . map sum
Lazy Evaluation: Scheme Using a temporary file F and G — programs: ( G.F ) input G ( F input )It is possible: F input → tF; G tF, but it is not good! The attractive approach is to make requests for computation: More precisely: Hold up Hold up G: G: Needed data produced by F Resume G ResumeF Hold up ResumeG F: Hold up Needed data Data are ready F: … Data are ready ResumeF
Lazy Evaluation: Principles Postulates: • Any computation is activated if and only if it is necessary for one or more other computations. This situation is named as necessityof computation. • An active computation is stopped if and only if its necessity vanishes (for example it has been satisfied). • The computation, as a whole, is activated forcibly as a request (necessity) to obtain the results of computational system execution. Consequences: • The activation of all computational units is driven by a dataflow started when a necessity of computation as a whole arises. • A control flow of the computation is not considered as a priori defined process. It is formed up dynamically by the necessities of computations only.
Lazy Evaluation: Imperative Examples Necessityof computation may not appear! without compu-tationof(…)! Arithmetic expression x = ( … ) * 0; ⇒ x = 0; Boolean expression ℜ ≡ α&β&γ:(α== false) ⇒ℜ== false (α== true) &(β== false) ⇒ℜ== false if ( (precond) && (init) && (run) && (close) ) { printf (“OK!”); } else … Vector/matrix expressions When we write vector a(n), b(n), c(n); a = b + c + d; Lazyand eager evaluation of files handling cat File_F | grep WWW | head -1 Subject of next slide • The compiler does the following: • Vector* _t1 = new Vector(n); • for(int i=0; i < n; i++)_t1(i) = b(i) + c(i); • Vector* _t2 = new Vector(n); • for(int i=0; i < n; i++)_t2(i) = _t1(i) + d(i); • for(int i=0; i < n; i++)a(i) = _t2; • delete _t2; • delete _t1; The same for matrices Not everything is so good! We’ll discuss this later So we have created and deleted two temporaries! for(int i=0; i < n; i++)a(i) = b(i) + c(i) + d(i);
Note: It is the same object in both cases! Analysis ofcat File_F | grep WWW | head -1Lazyand Eager Evaluation Eager variant of execution: stdin stdin stdout stdout stdout cat File_F grep WWW head -1 All strings of File_F Strings with ‘WWW’ One string Lazy variant of execution: stdin stdin stdout stdout stdout cat File_F grep WWW head -1 For strings of File_F One string with ‘WWW’ One string F A nice question: is this version more correct? Suppose that the needed string was not detected at first We omit intermediate steps here UNIX pipeline may be considered as optimization of lazy evaluation (in this case)!
a b c d a b c d I II + = t1 + t2 + a[i] + III d[i] I ⇒ II ⇒ III (or I ⇒ II U III) b[i] c[i] Analysis of Vector/matrix Example Necessityof computation (↓) appears only when a [i] is needed Consider the example more closely: a = b + c + d; t1 = b + c; t2 = t1 + d; a = t2; Order of calculations Traditional scheme Lazy evaluation = + a[i] b[i] c[i] d[i]
P ( in a, out b); { … a… … b = …; … } 6*8 X Necessityin Procedure Calls … r = x*5; … procedure P (in a, out b); … P (6*8, x); When does the necessityof computations appear? in parameters • Real and forced necessity • Thereare many details • Ingerman’s thunks (algol 60) out parameters • Only forced necessity is possible in imperative languages What hampers the real necessity? • Dependences on context • Possibility of reassignment for variables SISAL and others languages with a single assignment —this palliative seems to be not good Q ( in a ); { … a = 7; … … r = 9; … … t = 2 + a; … … } … r = 1; Q ( r + 5);
Functional programming Exactly needed necessities Automatic dataflow driven necessities Combining functions and compositionoriented approach Declarative programming Operational programming Forciblyarisingnecessities Manual control flow and agreement driven necessities Control flow and data transforming oriented approach Imperative programming Dependencies of Lazy Evaluation on the Programming Style Nevertheless, there are useful possibilities to apply lazy evaluation in both cases! The key notion is the definition of necessities. Functional style may be characterized as a style that allows automatic dataflow driven necessities.
Newton-Raphson Square Roots Functional programs are inefficient. Is it true? Algorithm: • starting from an initial approximation a0 • computing better approximation by the rule a(n+1) = (a(n) + N/a(n)) / 2 If the approximations converge to some limit a, then a = (a + N/a) / 2 so 2a = a + N/a, a = N/a, a*a = N ⇒a = squareroot(N) Imperative program (monolithic): X = A0 Y = A0 + 2.*EPS 100 IF (ABS(X-Y).LE.EPS) GOTO 200 Y = X X = (X + N/X) / 2. GOTO 100 200 CONTINUE • This program is indivisible in conventional languages. • We want to show that it is possible to obtain: • simple functional program • technique of its improving • The result is a very expressive program!
Newton-Raphson Square Roots: Functional Program • First version: next N x = (x + N/x) / 2 [a0, f a0, f(f a0), f(f(f a0)), ..] repeat f a = cons a (repeat f (f a)) repeat (next N) a0 within eps (cons a (cons b rest)) = b, if abs(a-b) <= eps = within eps (cons b rest), otherwise sqrt a0 eps N = within eps (repeat (next N) a0) • Improvement: relative eps (cons a (cons b rest)) = b, if abs(a-b) <= eps*abs(b) = relative eps (cons b rest), otherwise relativesqrt a0 eps N = relative eps (repeat (next N) a0)
Numerical Differentiation easydiff f x h = (f (x + h) - f (x)) / h A problem: small h ⇒ small (f ( x + h ) - f (x)) ⇒ error differentiate h0 f x = map (easydiff f x) (repeat halve h0) halve x = x/2 within eps ( differentiate h0 f x ) (1) But the sequence of approximations converges fairly slowly elimerror n (cons a (cons b rest)) = = cons ((b*(2**n)-a)/(2**n-1)) (elimerror n (cons b rest)) But n is unknown order (cons a (cons b (cons c rest))) = round(log2((a-c)/(b-c)-1)) So a general function to improve a sequence of approximations is improve s = elimerror (order s) s More efficient variants: within eps (improve (differentiate h0 f x)) (2) Using halveproperty of the sequence we obtain the fourth order method within eps (improve (improve (improve (differentiate h0 f x)))) (3) Using the following super s = map second (repeat improve s) second (cons a (cons b rest)) = b Here repeat improve is used to get a sequence of more and more improved sequences. So we obtain very difficult algorithm in easy manner within eps (super (differentiate h0 f x))(4) n ≈ log2( (ai+2 – ai) / (ai+1 – ai) – 1 ) Let A is the right answer and B is the error term B*hn. Thena(i) = A + B*2n*hn and a(i+1) = A + B*(h**n). A = (a(i+1)*(2**n) — a(i)) / 2**n – 1
FC++ library • High order functions —functions with functional arguments • FC++ library is a general framework for functional programming • Polymorphic functions—passing them as arguments to other functions and returning them as results. • Support higher-order polymorphic operators like compose(): a function that takes two functions as arguments and returns a (possibly polymorphic) result • Large part of the Haskell • Support for lazy evaluation • transforming FC++ data structures ⇔ data structures of the C++ Standard Template Library (STL) • operators for promoting normal functions into FC++ functoids. Finally, the library supplies • “indirect functoids”: run-time variables that can refer to any functoid with a given monomorphic type signature.
“Lazy list” data structure in FC++ List<int> integers = enumFrom(1); // infinite list of all the integers 1, 2, List<int> evens = filter(even, integers); // infinite list of all the integers 2, 4, bool prime( int x ) { ... } // simple ordinary algorithm filter( ptr_to_fun(&prime), integers ); // ptr_to_fun transform normal function to functoid plus ( x, y ) ⇒ x + y plus ( 2 ) ⇒ 2 + x Limitations • Lambda functions • Dependences on context (?) • Possibility of reassignment for variables (?) • Template technique is insufficient (blitz++)
Imperative scheme We need to call expressions explicitly only Procedure calls depend on the context Strong sequence of computation units (hard for parallelization) If we want to memoize previous results we should do this explicitly Memoization process is controlled by programmer Only manual transforming to a suitable scheme of data representation is possible Circle head and circle body are joined monolithicly Controlflow centric approach Require a difficult technique for def-usechains analysis Functional scheme Expressions do not depend on context Context independent procedure calls The sequence of computation units is chosen by execution system (more flexible for parallelization) Automatic memoization Memoization process is not controlled, but filters are allowed (indirect control) Stack technique of program representation and execution is not appropriate Constructions like reduce and composition of functions allow considering circle head and bodyindependently Dataflow centric approach Suitable for def-use chains analysis Memoization Fibonacci example: F (n) = F (n-1) + F (n-2)It is a classical bad case for imperative computations. Why? Previous functional programs are easier for development and understanding. Why?
Lazy Evaluation in Boost/uBLAS: Vectors/Matrices Expressions Example: A + prod (B,V) • Assignment activate evaluation • Indexes define evaluation of expression tree: if we write the example, we initiate the following computationfor all i { A [i] + ∑k (B [i,k] *V[k]) } (1) • “for all” means a compatibility of computations (order of computations is chosen by the execution system) • It is possible (but not necessary) to have the following representation: A +[ ∑k (B [i,k] *V[k])] (”[ ]“denotes a vector constructor) • Necessity of computation is defined by this fragment for each i (may be dynamically) • Postulate that “=“ operator always leads to appearance of necessity of computations: its left hand side should be computed and assigned to the right hand side: D =A + prod (B,V); • Common expression tree is used for computations for each i in (1) • Types coordination is hold: • Correct vectors/matrices expression includes constituents with types allowed by operators of expression (including prod and others)
Boost/uBLAS: vectors/matrices expressions: temporary and memoization problems • Let us consider x = A*x expression (this is an error from the functional style viewpoint, but correct for operational C++) • Naive implementation is for all i { x [i] =∑k (A [i,k] * x[k]); } It is not correct! • The suitable implementation should use a temporary t : for all i { t [i] =∑k (A [i,k] *x[k]); } x = t; The last assignment should not be a copy of the value, but a reference coping and deleting the previous value of x. • Let us consider A * ( B * x). • If we don’t use the lazy evaluation, we obtain an n2complexity (C1*n2 for B * x plus C2*n2 for other multiplication). • But in a straightforward lazy case the obtained complexity would be n3. • It is not a problem for a real functional language implementation because of a value propagation technique (automatic memoization) • Instead of the value propagation technique in C++ implementation, we can provide a temporary. Its assignment breaks the expression • We use temporaries in both cases. But what part of information should be really saved?
Boost/uBLAS: style of programming • Object-oriented style • Standard technique and patterns should be used • C++ style (as addition to the previous) • Standard template technique • Vectors/matrices expressions (as addition to the previous) • Tendency to use matrix and vector objects instead of variables with indexes • Tendency to write expressions instead of simple statements • Use uBLAS primitives as specializations of general templates • Don’t use direct classes extensions by multilevel inheritance
Boost/uBLAS. Example task: Jacobi method • Let us consider the Jacobi method of solving the linear system • We are able to write it using such formulas as • Instead of this, we should use it in matrix terms as • As the result, we obtain the following program (next slide)
X B For input data D * = For output data Diagonal Example task in matrix terms Diagonal-1 U 1 1 L L(A) 1 1 1 1 1 1 D-1 . . . . U(A) . . 1 1 Boost/uBLAS. Example program Jacobi method A matrix<double> A (n, n); vector<double> B (n); vector<double> X (n); matrix<double> D (n, n); matrix<double> D_1 (n, n); triangular_adaptor<matrix<double>, unit_lower> L (A); triangular_adaptor<matrix<double>, unit_upper> U (A); identity_matrix<double> I(n,n); … D = A - L - U + 2*I; D_1 = inverse_matrix(D); for (i = 0; i < count; ++ i) X = -prod(prod( D_1,L+U-2*I),X) + prod (D_1,B); Element of uBLAS data structure Diagonal receiving We need to write a special function!
Boost/uBLAS style of programming in comparison of indexes using We force a temporary using! for ( k = 0; k < count; ++ k ) { for (i = 0; i < A.size1 (); ++ i) { T (i) = 0; for (j = 0; j < i; ++ j) T (i) += A (i, j) * Y (j); for (j = i+1; j < A.size1 (); ++ j) T (i) += A (i, j) * Y (j); T (i) = ( B (i) – T (i) ) / A (i, i); } Y = T; } • Obviousness is lost • Sequence of computations is stated hard (losses of possibilities for compiler’s optimization) • It is harder for development of programs than in the alternative case • Possibilities of indexes using are not lost in vectors/matrices expressions • Vectors/matrices expressions are more suitable to finding patters needed for using special optimized external libraries We force to divide the process off for selecting an diagonal activities with diagonal Using D-1 (This case may be better than vector/matrix expression)
Jacoby method: X(k) X(k-1) X(k) X(k-1) Boost/uBLAS: Gauss-Seidel method Gauss-Seidel method: Jacoby method: … D_1 = inverse_matrix(D); for (i = 0; i < count; ++ i) X = - prod(prod( D_1,L + U-2*I),X) + prod (D_1,B); Gauss-Seidel method: … D_1 = inverse_matrix(D+L); for (i = 0; i < count; ++ i) X = prod( D_1, B - prod (U, X)); Recall x=Ax problem. It solves by temporary: t=Ax; x=t. But Seidel method may be consideras Jacoby method when the temporary is avoided! It is a new approach to organizing of computations!
Boost/uBLAS: style of programming (limitations) • Frequentlyarejection of the vectors/matrices style is required • A problem of styles compatibility • Not closed set of operators with matrices and vectors are presented • Instead of vectors/matrices operator “*” one should use a generic function prod (mv1, mv2) provided for all needed cases • uBLAS Vectors and Matrices operators are not presented as an algebraic system (for instance “-1”, “|| ||” are not offered because of the problems of vectors/matrices expression lazy evaluation) An acceptable approach may be proposed as follow: • The results of “-1”, “|| ||” and so on computation are considered as attributes of matrix and vector classes • These attributes are computed out of the expression computation by outside control if it is necessity • Expression’s constituents with these operators are replaced by extraction needed items from corresponded attributes The direction to Vectors and Matrices expression is very promising
The myths about lazy evaluations Examples above indicate that it is not right • Lazy evaluations is possible only in functional languages • Lazy evaluations and functional languages may be applied only in artificial intelligence area • We are able to realize lazy evaluations using an arbitrary programming language • Using lazy evaluations decreases performance As we have seen it is not right: high order functions using is very prospective in many cases Language states a lot of limitation for lazy evaluations using* This statement depends on quality of algorithms programming only * C++ is subjected to criticism from this point of view by blitz++ project developers