Portability by Automatic Translation: Two Case Studies
Yishai A. Feldman
The Interdisciplinary Center Herzliya, Israel
First Case: Bogart
Large-Scale Translation from Assembly Language to C
Joint Work with Doron A. Friedman
The Problem
• 400,000 lines of IBM 370 assembly code
• Customers downsizing mainframes
• Hand-optimized code over 15 calendar years
• Live system
Success Criteria
• Portability
• Efficiency
• Minimum manual work
BUT
• Readability is not important
Difficult Assembly Features
• Registers
• Condition code
• Untyped language
• Unstructured code
• Large unstructured memory areas
• Portability:
  • Different byte order
  • Different word size
  • Different pointer size
The Simulating Translator

Assembly:
         STM   R14,R12,12(R13)
         LR    R12,R15
         USING HORNER,R12
         ST    R13,SAV+4
         LA    R13,SAV
         LA    R7,COEF
         L     R5,0(R7)
         LA    R9,0
LOOP     CR    R9,R2
         BNL   OUT
         LA    R9,1(R9)
         LA    R7,4(R7)
         MR    R4,R3
         A     R5,0(R7)
         B     LOOP
OUT      LR    R0,R5
         LM    R1,R12,24(R13)
         BR    R14

Simulating translator:
void HORNER(tagSAPReg *Reg)
{
    T_stm(14,12,((Reg[13].ucp+12)),Reg);
    Reg[12].sw = Reg[15].sw ;
    SAV[1] = Reg[13].sw ;
    Reg[13].pv = &(SAV[0]) ;
    Reg[7].pv = &(COEF[0]) ;
    Reg[5].sw = *(sWord *)Reg[7].ucp;
    Reg[9].sw = 0 ;
LOOP:
    if ((Reg[9].sw) >= Reg[2].sw) goto OUT;
    Reg[9].sw += 1;
    Reg[7].sw += 4;
    T_mult(&Reg[4],Reg[3].sw) ;
    Reg[5].sw += *((sWord *)(Reg[7].ucp));
    goto LOOP ;
OUT:
    Reg[0].sw = Reg[5].sw;
    T_lm(1,12,((Reg[13].ucp+24)),Reg);
    return;
}
Bogart
Better Optimizing General-purpose Abstract Representation Translator
Translation by Abstraction, Transformation, and Reimplementation
• Control-flow and data-flow analysis
• Typing by constraint propagation
• Automatic cliche recognition
(Diagram stages: Abstraction, Transformation, Re-implementation)
The Code Produced by Bogart

Simulating translator:
void HORNER(tagSAPReg *Reg)
{
    T_stm(14,12,((Reg[13].ucp+12)),Reg);
    Reg[12].sw = Reg[15].sw ;
    SAV[1] = Reg[13].sw ;
    Reg[13].pv = &(SAV[0]) ;
    Reg[7].pv = &(COEF[0]) ;
    Reg[5].sw = *(sWord *)Reg[7].ucp;
    Reg[9].sw = 0 ;
LOOP:
    if ((Reg[9].sw) >= Reg[2].sw) goto OUT;
    Reg[9].sw += 1;
    Reg[7].sw += 4;
    T_mult(&Reg[4],Reg[3].sw) ;
    Reg[5].sw += *((sWord *)(Reg[7].ucp));
    goto LOOP ;
OUT:
    Reg[0].sw = Reg[5].sw;
    T_lm(1,12,((Reg[13].ucp+24)),Reg);
    return;
}

Bogart:
sWord HORNER(sWord r2sw, sWord r3sw)
{
    sWord r5sw;
    sWordPtr r7swp;
    sWord r9sw;

    r7swp = (sWord *)(&COEF[0]);
    r5sw = *r7swp;
    r9sw = 0;
    while (r9sw < r2sw) {
        r9sw++;
        r7swp++;
        r5sw = r5sw*r3sw + *r7swp;
    }
    return r5sw;
}
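The two HORNER versions can be compared on sample data with a small self-contained sketch. Everything not on the slides is an assumption made for illustration: the `Reg_t` union, the `load_word` and `t_mult` helpers (a simplified stand-in for `T_mult`), the contents of `COEF`, and the function names. The save/restore calls (`T_stm`, `T_lm`) are left out, and the simulated routine returns `Reg[0]` for convenience instead of being `void`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t sWord;                      /* assumed: 32-bit signed word */

/* Hypothetical register cell: a signed word or a byte address. */
typedef union {
    sWord sw;
    const unsigned char *ucp;
} Reg_t;

/* Hypothetical coefficient table, highest-degree coefficient first. */
static const sWord COEF[] = { 1, 2, 3 };    /* x^2 + 2x + 3 */

/* Load a word from a byte address without assuming alignment. */
static sWord load_word(const unsigned char *p)
{
    sWord w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Stand-in for T_mult: MR multiplies the odd register of an even/odd
   pair and leaves the 64-bit product in the pair. Assumes wrapping
   narrowing and arithmetic right shift, as on common compilers. */
static void t_mult(Reg_t *pair, sWord m)
{
    int64_t prod = (int64_t)pair[1].sw * m;
    pair[0].sw = (sWord)(prod >> 32);       /* high word */
    pair[1].sw = (sWord)prod;               /* low word  */
}

/* Simulation style: one statement per instruction, registers in memory. */
sWord horner_simulated(Reg_t *Reg)
{
    Reg[7].ucp = (const unsigned char *)&COEF[0];
    Reg[5].sw = load_word(Reg[7].ucp);
    Reg[9].sw = 0;
LOOP:
    if (Reg[9].sw >= Reg[2].sw) goto OUT;
    Reg[9].sw += 1;
    Reg[7].ucp += 4;                        /* advance one 4-byte word */
    t_mult(&Reg[4], Reg[3].sw);
    Reg[5].sw += load_word(Reg[7].ucp);
    goto LOOP;
OUT:
    Reg[0].sw = Reg[5].sw;
    return Reg[0].sw;
}

/* Abstraction style: typed parameters, locals, and a structured loop. */
sWord horner_abstracted(sWord r2sw, sWord r3sw)
{
    const sWord *r7swp = &COEF[0];
    sWord r5sw = *r7swp;
    sWord r9sw = 0;

    /* The degree must stay inside the coefficient table. */
    assert(r2sw >= 0 && r2sw < (sWord)(sizeof COEF / sizeof COEF[0]));
    while (r9sw < r2sw) {
        r9sw++;
        r7swp++;
        r5sw = r5sw * r3sw + *r7swp;
    }
    return r5sw;
}
```

With degree 2 in register 2 and x = 2 in register 3, both routines evaluate x^2 + 2x + 3. The simulated version needs the whole register file passed in, while the abstracted version exposes only the two values the routine actually reads.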
Accumulating Compound Expressions

Assembly:
         L     R11,NIG
         L     R5,0(R4)
         BAL   R14,INCTAB
         LPR   R7,R0
         MR    R2,R7
         LR    R0,R3
         BR    R14

Simulating translator:
Reg[11].sw = NIG;
Reg[5].sw = *(sWord *)Reg[4].ucp;
INCTAB(Reg);
Reg[7].sw = labs(Reg[0].sw);
T_mult(&Reg[2], Reg[7].sw);
Reg[0].sw = Reg[3].sw;
return;

Bogart:
return r3sw * labs(INCTAB(NIG,(*r4swp)));
Example: Condition Code Support

Assembly:
CGLOOP   CH    R7,0(R5,R4)
         SRL   R2,1
         BH    CGADD
         BNH   CGSUB

Simulating translator:
if (Reg[7].sw == *((sHalf *)((Reg[4].ucp+Reg[5].sw))))
    __CC = _CZero;
else if (Reg[7].sw < *((sHalf *)((Reg[4].ucp+Reg[5].sw))))
    __CC = _COne;
else
    __CC = _CTwo;
Reg[2].uw >>= 1;
if (__CC & 0x4) goto CGADD;
if (__CC & 0x3) goto CGSUB;

Bogart:
r2sh >>= 1;
temp = r4ucp + r5sh;
if (r7sh > temp) goto CGADD;
if (r7sh <= temp) goto CGSUB;
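The condition-code example can be sketched as two small functions that should always take the same branch. This is an illustration under stated assumptions, not the translators' actual runtime: the one-hot values `CC_ZERO`, `CC_ONE`, and `CC_TWO` are chosen only so that the masks 0x4 and 0x3 from the slide select "high" and "not high", and the function names and the return convention (1 for the CGADD path, 0 for the CGSUB path) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t sHalf;                      /* assumed: 16-bit signed halfword */

/* Assumed one-hot encoding of the condition code after a compare. */
enum { CC_ZERO = 0x1, CC_ONE = 0x2, CC_TWO = 0x4 };

/* Simulation style: materialize the condition code so that a later
   branch can test it. */
static int compare_cc(sHalf a, sHalf b)
{
    if (a == b) return CC_ZERO;             /* operands equal     */
    if (a < b)  return CC_ONE;              /* first operand low  */
    return CC_TWO;                          /* first operand high */
}

/* Returns 1 for the CGADD path (BH) and 0 for the CGSUB path (BNH). */
int branch_simulated(sHalf r7, sHalf operand, uint32_t *r2)
{
    assert(r2 != 0);
    int cc = compare_cc(r7, operand);
    *r2 >>= 1;                              /* the logical shift leaves cc untouched */
    if (cc & CC_TWO) return 1;              /* mask 0x4: high */
    return 0;                               /* mask 0x3: equal or low */
}

/* Abstraction style: the compare is folded into the branch itself. */
int branch_abstracted(sHalf r7, sHalf operand, uint32_t *r2)
{
    assert(r2 != 0);
    *r2 >>= 1;
    return r7 > operand;
}
```

Because the shift between the compare and the branches does not affect the condition code, the comparison can be moved past it and expressed directly in the `if`, which is what removes the explicit condition-code variable.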
The Plan Representation

CGR      CH    R2,GMF
         BL    CONGR
         SRL   R2,1
         B     CGR
CONGR    ....
Time and Space Comparison: Bogart and Simulating Translator

            Simulator                       Bogart
Routine     Time (sec.)   Space (bytes)     Time (sec.)   Space (bytes)
BIN         63            4170              33            2802
HORNER      10            3302              3             2465
RANDOM      9             5447              4             2741
SAPDBMS     18            41700             9             29073

(SAPDBMS is a central Sapiens module)
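Time Performance on Several Platforms (for example routine BIN)
Platforms: IBM 370, RS/6000, AS/400, PC (DOS)
Versions: Original Assembly, Simulator, Bogart, Hand Crafted
Original Assembly (IBM 370): 0.32
Simulator: 1 1 1 1
Remaining values, in extracted order: 1.74 1.91 failed 1.26; 0.91 0.97 0.49 1.07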
Results
• Bogart produces more portable code
• Bogart supports a larger portion of the source language
• Bogart requires less manual work in code preparation
• Bogart produces more efficient code in terms of time and space performance
Conclusions
• Translation by abstraction produces better results than simulation on all criteria
• Simulation is simpler and faster to implement
• Simulated code is easier for the programmers to debug
• The advantages of the abstraction approach grow in the long term
• “Research-then-transfer” versus “Industry-as-laboratory” (Colin Potts, 1993)
Second Case: MIDAS
Automatic High-Quality Reengineering of Database Programs by Temporal Abstraction
Joint Work with Yossi Cohen
The Problem: Legacy Database Software
• Much legacy software is data processing (DP), and many of these programs are database-related
• Conversion from older models (indexed-sequential, hierarchical, network) to relational/object-oriented databases
• Need to convert:
  • Schema
  • Data
  • Software
Network Database Program (1)

MOVE 0 TO STATUS1.
PERFORM UNTIL STATUS1 IS NOT EQUAL TO ZERO
    FETCH NEXT STUDENT WITHIN DEPT-OF-STUDENT
        AT END MOVE 1 TO STATUS1
    IF STATUS1 IS EQUAL TO 0 THEN
        IF STUDENT-DEGREE IS EQUAL TO 2 THEN
            MOVE 0 TO GRADES-SUM
            MOVE 0 TO GRADES-COUNT
            PERFORM SUM-STUDENT-GRADES
            DIVIDE GRADES-SUM BY GRADES-COUNT
                GIVING GRADES-AVG
            IF GRADES-AVG > 95 THEN
                DISPLAY ..., GRADES-AVG
            END-IF
        END-IF
    END-IF
END-PERFORM.
Network Database Program (2)

SUM-STUDENT-GRADES.
    MOVE 0 TO STATUS2
    PERFORM UNTIL STATUS2 IS NOT EQUAL TO ZERO
        FETCH NEXT GRADES WITHIN STUDENT-OF-GRADES
            AT END MOVE 1 TO STATUS2
        IF STATUS2 IS EQUAL TO 0 THEN
            ADD GRD-GRADE TO GRADES-SUM
            ADD 1 TO GRADES-COUNT
        END-IF
    END-PERFORM.
Naive Translation (1)

EXEC SQL DECLARE CRS1 CURSOR FOR
    SELECT ... FROM STUDENT
    WHERE DEPT-NAME = :DEPT-NAME
END-EXEC.
EXEC SQL DECLARE CRS2 CURSOR FOR
    SELECT ... FROM GRADES
    WHERE STUDENT-ID = :STUDENT-ID
END-EXEC.
Naive Translation (2)

MOVE 0 TO STATUS1
EXEC SQL OPEN CRS1 END-EXEC
PERFORM UNTIL STATUS1 IS NOT EQUAL TO 0
    EXEC SQL FETCH CRS1 INTO ... END-EXEC
    IF SQL-STATUS = SQL-NOT-FOUND THEN
        MOVE 1 TO STATUS1
    END-IF
    IF STATUS1 IS EQUAL TO 0 THEN
        IF STUDENT-DEGREE IS EQUAL TO 2 THEN
            MOVE 0 TO GRADES-SUM
            MOVE 0 TO GRADES-COUNT
            PERFORM SUM-STUDENT-GRADES
            DIVIDE GRADES-SUM BY GRADES-COUNT GIVING GRADES-AVG
            IF GRADES-AVG > 95 THEN
                DISPLAY ..., GRADES-AVG
            END-IF
        END-IF
    END-IF
END-PERFORM.
EXEC SQL CLOSE CRS1 END-EXEC.
Naive Translation (3)

SUM-STUDENT-GRADES.
    MOVE 0 TO STATUS2
    EXEC SQL OPEN CRS2 END-EXEC
    PERFORM UNTIL STATUS2 IS NOT EQUAL TO 0
        EXEC SQL FETCH CRS2 INTO ... END-EXEC
        IF SQL-STATUS = SQL-NOT-FOUND THEN
            MOVE 1 TO STATUS2
        END-IF
        IF STATUS2 IS EQUAL TO 0 THEN
            ADD GRD-GRADE TO GRADES-SUM
            ADD 1 TO GRADES-COUNT
        END-IF
    END-PERFORM
    EXEC SQL CLOSE CRS2 END-EXEC.
MIDAS: Translation by Abstraction, Transformation, and Re-implementation
(Diagram labels: Network DB code, Plan, Temporal Plan, Relational DB code; steps: Abstraction, Temporal Abstraction, Re-implementation)
MIDAS Translation

EXEC SQL DECLARE CRS1 CURSOR FOR
    SELECT STUDENT.STUDENT-ID, FIRST-NAME, LAST-NAME, AVG(GRADE)
    FROM STUDENT, GRADES
    WHERE DEGREE = 2
      AND DEPT-NAME = :DEPT-NAME
      AND GRADES.STUDENT-ID = STUDENT.STUDENT-ID
    GROUP BY STUDENT.STUDENT-ID, FIRST-NAME, LAST-NAME
    HAVING AVG(GRADE) > 95
END-EXEC.
PERFORM UNTIL SQL-STATUS = SQL-NOT-FOUND
    EXEC SQL FETCH CRS1 INTO ... END-EXEC
    DISPLAY ..., GRADES-AVG
END-PERFORM
The Internal Representation: Query Graphs
• Temporal abstraction
• Generate / Join
• Filter
• Map
• Aggregate
• Wide-spectrum formalism
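The query-graph operators listed above can be read off a loop-based program stage by stage. The sketch below is only an illustration of that reading, not MIDAS itself: the `Student` and `Grade` structures, the sample rows, and the `honor_students` function are hypothetical, and in-memory arrays stand in for the database. Comments mark which part of the nested loop corresponds to which operator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int id; int degree; const char *dept; } Student;
typedef struct { int student_id; int grade; } Grade;

/* Hypothetical sample rows standing in for the STUDENT and GRADES records. */
static const Student STUDENTS[] = {
    { 1, 2, "CS" }, { 2, 2, "CS" }, { 3, 1, "CS" }, { 4, 2, "EE" },
};
static const Grade GRADES[] = {
    { 1, 98 }, { 1, 96 }, { 2, 90 }, { 2, 100 }, { 3, 99 }, { 4, 100 },
};

#define COUNT(a) (sizeof (a) / sizeof (a)[0])

/* Writes the ids of students in dept with the given degree whose grade
   average exceeds min_avg into out; returns how many ids were written. */
size_t honor_students(const char *dept, int degree, double min_avg,
                      int *out, size_t out_cap)
{
    size_t n = 0;

    assert(dept != NULL && out != NULL);
    for (size_t i = 0; i < COUNT(STUDENTS); i++) {           /* generate */
        const Student *s = &STUDENTS[i];
        if (strcmp(s->dept, dept) != 0 || s->degree != degree)
            continue;                                         /* filter */

        long sum = 0;
        int count = 0;
        for (size_t j = 0; j < COUNT(GRADES); j++) {          /* join */
            if (GRADES[j].student_id == s->id) {
                sum += GRADES[j].grade;                       /* aggregate */
                count++;
            }
        }
        /* Filter on the aggregate, the role HAVING plays in SQL. */
        if (count > 0 && (double)sum / count > min_avg && n < out_cap)
            out[n++] = s->id;                                 /* map to output */
    }
    return n;
}
```

Viewing the outer loop as a generator, the inner loop as a join plus aggregate, and the two `if` tests as filters is what lets the whole computation collapse into a single query with `GROUP BY` and `HAVING` rather than two cursors.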
Conclusions
• Translation by abstraction, transformation, and re-implementation demonstrated in two domains
• Query graphs as abstraction for database operations
• Adapts to different schema transformations
• Scalability
Conclusions and Future Work
• Appropriate domain
• Same host language
• Few cliches give wide coverage
• Important commercially
• Generalizations
• Other legacy models
• OODB / 4GL as targets
Questions? Papers can be downloaded from http://www.idc.ac.il/yishai