1. Semantic Analysis in Action
2. Who Is Bull Information Systems?
8000 Employees Worldwide
$1.6 Billion Annual Revenue
Headquartered in Paris
Servers / Services / Supercomputers / Open Source
R&D Partners with Intel, IBM, EMC, Oracle, Microsoft,…..
Phoenix, Arizona R&D
Mainframe / Open System Architects
Intel / IBM chips, EMC Storage, Cisco networks, …..
Database Integration/Migration Services
Oracle, IMS/IDS2, DB2, PostgreSQL, SQL Server,…..
Targeting Fortune 500 customers
3. Large Customer Migrations This Past Year
CNAF
Migrating from Mainframe Relational Database
CAIXA Bank and Banco do Brasil
Migrating from Oracle to PostgreSQL
Oklahoma Employment Security Commission (OESC)
Migrating from Network Model to PostgreSQL
4. Semantic Tools from Rever SA
Rever (founded 2005)
DB-MAIN (1995, University of Namur Belgium)
Database Modeling
Source Code analysis for Programmatic Relationships
Semantic Schema – Logical Schema – Physical Schema
Generated Data Migration Automation
Application Code Conversion
Semantics of Conversion into Generated I/O Modules
Data Rules
Database specific rules
Application accesses target database through the rule generated modules
5. Why Rever Tools
The REVER approach is semantic, and for this reason the tools are generic
Source Environments
DBMS (IMS, ADABAS, IDMS, IDS/II,…)
languages (COBOL, PLI, NATURAL,…)
Target Environments
DBMS (ORACLE, DB2, Microsoft SQL-Server, PostgreSQL,…)
languages (COBOL, PLI, C, JAVA,...)
6. Evolving the Schema with the Rever Tools
IDS/II Schema Capture
Subschema Capture
Copy Book Capture
Data Modeling (DB-MAIN) to create the Relational Schema
Automated Phase
Manual Override
Handling Redefines, for example
Cobol Analysis for Application-Created Relationships
Relational Schema Creation
All Tools are PC Based
All Results are Repository Contained
7. Identifying Data Relationships
8. Evolving the Schema
9. Evolving the Schema
10. Evolving the Data
Automatically Generated Cobol Programs to Unload the IDS/II Data
Maintain the relationships of the IDS/II data
Explicit Sets
Implicit DBKEY
Maintain the implied Order of Data
Analyze the Data with SQL
Clean the Data
Automatically Load Data into Postgres
Check the Content
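The unload step above has to carry the set relationships into the flat files. A minimal sketch of the idea in Python (the record layout and field names are invented for illustration, not taken from the generated programs): each member record is emitted with its owner's DBKEY and a sequence number, so the explicit set and its implied order survive as ordinary relational columns.

```python
# Hypothetical sketch: each IDS/II member record is emitted with the
# DBKEY of its set owner, so the set relationship becomes a
# foreign-key column, and a sequence number preserves set order.
def unload_set(owner_records, members_of):
    """Flatten owner/member sets into rows carrying an owner_dbkey column."""
    rows = []
    for owner in owner_records:
        for seq, member in enumerate(members_of(owner), start=1):
            rows.append({
                "owner_dbkey": owner["dbkey"],  # explicit set becomes a FK
                "member_seq": seq,              # preserves implied order
                **{k: v for k, v in member.items() if k != "dbkey"},
            })
    return rows

# Toy data standing in for an IDS/II unload
owners = [{"dbkey": 101}, {"dbkey": 102}]
members = {101: [{"dbkey": 9, "amt": 5}, {"dbkey": 10, "amt": 7}],
           102: [{"dbkey": 11, "amt": 3}]}

rows = unload_set(owners, lambda o: members[o["dbkey"]])
```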
11. Tool to Check Data Quality
12. Migrating the Cobol
13. Oklahoma Employment Security Commission (OESC)
Match Jobs to Available Local Workers
Provide Compensation to Unemployed Workers
Increase the Skills of the Workforce to Meet Requirements
Disseminate Information on Labor Force to Improve Local Governmental Decisions
14. Project – Move the IDS/II Database to PostgreSQL
3 IDS/II Schemas
500 Record Types
560 TPRs
2000 Batch Programs
800 JCL
15. PostgreSQL
“The World’s Most Advanced Open Source Database”
ANSI SQL Compliant
Including Embedded SQL Support
Performance Comparable to Oracle
All the Technical Features
Stored Procedures, Triggers, JDBC, Unicode, Native SSL support,…
No Licensing Cost
Better Support than Proprietary Vendors
Minimized Management Effort
Legendary Reliability
Designed for High-Volume Requirements
16. Applying the Tools to the OESC Environment
Customer desire:
Convert the Database (IDS/II to PostgreSQL)
Maintain the Cobol for now
DB-MAIN - PC Based …bring the source to the PC
Database Modeling …IDS/II Schema/ Subschema / COPYs
Source Code analysis … It’s Cobol
Data Analysis ….Scan the data looking for Rules
Source Code Conversion … Scan the Cobol for Logical Relations
Semantics of Analysis forged into I/O Modules
Data Rules ….. Special meaning for ‘11111’
Database specific rules (Meaning of Current of Set in IDS/II)
Cobol Rules …Redefines / OCCURs from the Cobol Source
Application accesses target database through the rule generated modules…these are Cobol modules generated from the Semantic analysis
Data Migration …..
Cobol modules generated to gather the data from IDS/II database while maintaining the semantics
Relational Schema designed to maintain the IDS/II semantics expected by the existing programs
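As a rough illustration of what a rule-generated I/O module preserves, the sketch below keeps the IDS/II notion of “current of set” as cursor state, so unchanged FIND FIRST / FIND NEXT logic maps onto an ordered SQL result set. Class and method names are hypothetical, and Python stands in for the generated Cobol modules.

```python
# Illustrative only: a generated I/O module must remember the
# "current of set" so record-at-a-time Cobol logic keeps working.
class SetCursor:
    def __init__(self, rows):
        self.rows = rows      # rows pre-fetched in set order from SQL
        self.pos = -1         # "current of set", initially none

    def find_first(self):
        self.pos = 0
        return self.rows[self.pos]

    def find_next(self):
        self.pos += 1
        if self.pos >= len(self.rows):
            return None       # end of set, like an end-of-set DB status
        return self.rows[self.pos]

cur = SetCursor(["REC-A", "REC-B"])
first = cur.find_first()
second = cur.find_next()
third = cur.find_next()
```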
17. OESC Questions Addressed
Will the automated tools handle a large, complicated conversion?
Will the performance be equivalent to IDS/II?
TP
Batch
Will all data issues be covered?
How will the testing be accomplished?
Handling Database Save / Restore
…
18. Automated Tools
Tools automated 80% of the conversion, saving time
Choices with Schema Definition
Handling Data Issues
Shared responsibility for finalizing the schema
Programmers know the application
Bull provides IDS/II and Postgres knowledge
Dedicated Interface on both OESC and Bull sides
Frequent teleconferences (Go-to-Meeting style)
20. Performance for TP
Cobol application did not change
Only the I/O routines changed
SQL is slower than IDS2 (microseconds per call)
TP Performance Analysis
Less than 10 IDS2 calls per transaction
A 0.250-second transaction is impacted by only a few milliseconds
Open Partition with DBMS off-loads CPU from GCOS
No IDS2 execution
No Buffer Pool Management
Integrity control in the SQL server
Reduced GCOS CPU improves TP Throughput
TP Performance is not impacted
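The arithmetic behind that conclusion can be checked directly; the per-call overhead figure below is an assumed value for illustration, not a measurement from the project.

```python
# Back-of-envelope check of the TP impact claim above.
calls_per_txn = 10          # "less than 10 IDS2 calls per transaction"
overhead_per_call_ms = 0.3  # assumed extra cost of SQL vs IDS2 per call
txn_time_ms = 250           # "0.250 seconds per transaction"

added_ms = calls_per_txn * overhead_per_call_ms
impact_pct = 100 * added_ms / txn_time_ms   # ~1% of transaction time
```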
21. Performance for Batch
Issue of 1-to-1 IDS/II-to-SQL Conversion
Programs identified needing Performance improvement
Each was reduced from running with IDS/II in 5 or 10 minutes to run with Postgres in less than a minute
Replace IDS/II logic with a single SQL statement or two
Two approaches
Write SQL Procedure stored in Database
Maintained by DBA
Add SQL to Program logic
Maintained by programmer
22. Simple Optimization Examples
A program that repeatedly walks IDS/II structures, finding specific record types in order to select a subset of the records, can be changed to use a JOIN statement, saving significant processing time
Using a WHERE clause to select only the desired database records, or requesting only the fields of interest, can substantially reduce the amount of data returned to the GCOS program
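Both optimizations can be sketched with a toy owner/member pair standing in for IDS/II record types; the table and column names are invented for the example. One JOIN with a WHERE clause replaces the repeated record walk, and only the rows and fields of interest come back.

```python
import sqlite3

# SQLite stands in here for PostgreSQL; the SQL shape is the same.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE claim (claim_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE payment (pay_id INTEGER PRIMARY KEY,
                          claim_id INTEGER, amount INTEGER);
    INSERT INTO claim VALUES (1, 'OPEN'), (2, 'CLOSED');
    INSERT INTO payment VALUES (10, 1, 500), (11, 1, 250), (12, 2, 900);
""")

# One statement does what a loop of FIND OWNER / FIND NEXT calls did:
rows = db.execute("""
    SELECT p.pay_id, p.amount          -- only the fields of interest
    FROM payment AS p
    JOIN claim  AS c ON c.claim_id = p.claim_id
    WHERE c.status = 'OPEN'            -- only the desired records
    ORDER BY p.pay_id
""").fetchall()
```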
23. Data Content Issues
Many Date Fields in the OESC Databases
Date Fields in IDS/II were just character strings
Date Fields in Postgres
Allow for comparison of dates
Statement paid within a month
Allow for Subtraction of Dates
Average time to deliver
24. Data Content Issues
Date Analysis:
Prior to migrating the database to SQL format, data analysis was done for specific fields.
Some date fields were found to have invalid data (invalid month or day combinations, zeros, etc.).
Some dates were negative numbers. These were left as numeric.
Some dates were combined as datetime fields.
Some dates were defined in alternate keys and the Cobol code searched for records using data that was not a valid date. Examples of such search criteria might include zeros in the month or day fields. This caused issues even when the database fields were valid dates:
SQL cannot compare a Date column to an invalid Date value (< or > comparisons).
Comparing a valid date such as 12/31/2009 to 02/29/2009, 12/00/2009, or 04/31/2009 is not a valid comparison.
Data Analysis helped define the Relational Schema
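The kind of check the date analysis performed can be sketched as follows. The IDS/II fields were MM/DD/YYYY character strings, and the sample values are the ones quoted above; the helper name is invented.

```python
from datetime import datetime

# A date string is valid only if it parses as a real calendar date.
def is_valid_date(s):
    try:
        datetime.strptime(s, "%m/%d/%Y")
        return True
    except ValueError:
        return False

checks = {d: is_valid_date(d)
          for d in ["12/31/2009", "02/29/2009", "12/00/2009", "04/31/2009"]}
```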
25. How to Clean the Data
Move a draft copy of the unloaded data into Postgres
Tools to scan the data based on rules
Are numerics valid
Are dates valid
Find special values to indicate ‘not known’
Un-initialized values from Cobol Working Storage
Reports generated by Record detail the issues
Fix the data during the load of Postgres
It’s in Flat Files and easy to correct
Fix it in IDS/II at your own risk
Applications are coexisting with the ‘bad’ data
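A minimal version of such a rule scan, in the spirit of the tools above: the ‘11111’ sentinel and the rule categories follow the deck, but the record layout and function name are invented for illustration.

```python
# Scan unloaded flat-file rows against simple data-quality rules,
# reporting issues per record for the cleaning report.
SENTINELS = {"11111"}        # special value meaning 'not known'

def scan(rows):
    issues = []
    for i, r in enumerate(rows):
        if not r["amount"].strip().isdigit():
            issues.append((i, "amount", "non-numeric"))
        if r["code"] in SENTINELS:
            issues.append((i, "code", "sentinel value"))
        if r["name"].strip("\x00 ") == "":   # uninitialized Working Storage
            issues.append((i, "name", "uninitialized"))
    return issues

rows = [{"amount": "0042", "code": "11111", "name": "SMITH"},
        {"amount": "12A4", "code": "00007", "name": "\x00\x00"}]
issues = scan(rows)
```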
26. Evolving the Customer Cobol
IDS/II calls are removed automatically
Access modules are generated automatically to do the equivalent of the IDS/II calls with Postgres
Cobol routines
Can be modified as needed
Specific to the Customer Schema
Transform all IDS/II Characteristics to SQL
Minimal Change to the Customer Cobol by Default
27. Testing
A duplicate of the OESC environment was set up at Bull in Phoenix
Gather SAVEs of IDS/II Data from OESC
Setup IDS/II Test Environment
TP Workstations
IDS/II Databases
JCL for batch
Train the Bull testers
Test with IDS/II
Duplicate the test with Postgres and the converted Cobol
28. Resolving Test Difficulties
The application had been written and initially tested many years ago
No overall application change since Y2K
In recent years, application changes were very focused, so testing usually covered not the whole system but only the changed areas of the application
Developed Plan for Testing
29. Basic Testing
Identify the Most Important TPRs
Test with IDS/II
Test with Postgres
Identify Most Important Batch
Check the performance
Batch
Single TPRs
TP under load using Drivers
30. Extended Testing
Extend the Scope of Tests
Monitor the Test Success
Multiple screen choices through TPRs
Arranging remote testing
Training the Staff
Review Fixes
31. Finalizing the Turnover
Training
Postgres
Maintaining the programs subsequent to the transformation
Trace Activity through I/O modules
Arranging for the customer to attempt database reloads
Save / Restore Approach
32. Summary
Because of the large number of programs and the large amount of data, tools are necessary to reduce the effort
Features of IDS/II accomplished with equivalent PostgreSQL technology
TP performance is a given; batch performance requires some effort
Testing is a major component in the evolution
Required an overall plan and teamwork with OESC