GENERIC ETL DESIGN
VARADARAJAN VASU (varadarajan.v@polaris.co.in)
SENIOR PROJECT MGR/ARCHITECT, POLARIS SOFTWARE LAB
OBJECTIVE
• The application area is split into ETL and Reporting.
• Major operations: Select / Insert / Update / Delete.
• Replace the existing primitive methods used for ETL design and automation.
• The system should be intelligent enough to do all jobs on behalf of users.
• Build a comprehensive solution once and use it across verticals.
PERT PROCESS
• PERT stands for PROGRAM EXECUTION on REMOTE TERMINALS.
• It is different from the Program Evaluation and Review Technique used by SEI.
• The technology is used in client/server architecture.
PERT PROCESS FLOW
PERT START
• FREE SPACE CHECK
• ORACLE PROCESSES CHECK
• EXECUTABLE PRESENCE CHECK
• PROCEDURE VALIDITY CHECK
• CHECK FOR PARALLEL RUN
• CHECK FOR RESTARTABILITY
1. SYSTEM INTELLIGENT CHECKS - PARAMETERISED
2. DATE CHANGE - PARAMETERISED
3. DETERMINE STAGING RUN INFORMATION - PARAMETERISED
4. STAGE REFRESH LOADER
5. GATHER FINAL REFRESH INFORMATION - PARAMETERISED
6. FINAL REFRESH LOADER
7. DATA VALIDATION CHECKS - PARAMETERISED
8. MAKE SYSTEM READY FOR NEXT DAY RUN - PARAMETERISED
SUCCESS
PERT END
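The numbered flow above can be pictured as an ordered list of steps driven by a small runner. The following is only a minimal sketch, not the author's implementation: the function name `run_pert`, the step names, and the JSON checkpoint file are all hypothetical, chosen to show how a run could resume from the aborted step instead of starting over.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# A step is a (name, action) pair; the action receives the run parameters.
Step = Tuple[str, Callable[[Dict], None]]


def run_pert(steps: List[Step], params: Dict, checkpoint: Path) -> List[str]:
    """Run steps in order, skipping those recorded as done by an earlier run.

    Returns the names of the steps executed in this invocation.
    (Hypothetical helper; the checkpoint format is an assumption.)
    """
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    executed = []
    for name, action in steps:
        if name in done:
            continue  # finished before the previous run aborted
        action(params)  # an exception aborts the run; progress so far is kept
        done.append(name)
        checkpoint.write_text(json.dumps(done))
        executed.append(name)
    # Whole cycle succeeded: clear the state so the next day's run starts fresh.
    checkpoint.unlink(missing_ok=True)
    return executed
```

If the stage refresh loader fails on the first attempt, a second call with the same checkpoint path skips the steps that already completed and continues from the loader.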
SYSTEM INTELLIGENT CHECKS - Examples
• SPACE CHECK
• OBJECTS VALIDITY CHECK
• EXECUTABLES VALIDITY CHECK
• PROCESS RUNNING CHECK
• PREVENT SUCCESS RUN
• PREVENT PARALLEL RUN
• RESTARTABILITY
• HANDLE UNAVOIDABLE INTERRUPTS FROM OS
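A few of the checks listed above can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the deck's actual code: the helper names, the byte threshold, and the lock-file approach to preventing a parallel run are all hypothetical.

```python
import os
import shutil
from pathlib import Path


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Space check: is there at least min_free_bytes free on path's filesystem?"""
    return shutil.disk_usage(path).free >= min_free_bytes


def executable_present(name: str) -> bool:
    """Executable presence check: can the program be found on PATH?"""
    return shutil.which(name) is not None


def acquire_run_lock(lock_file: Path) -> bool:
    """Parallel-run check: atomically create the lock file.

    Returns False when the file already exists, i.e. another run holds it.
    """
    try:
        fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for diagnostics
    return True


def release_run_lock(lock_file: Path) -> None:
    """Remove the lock at the end of a run (no error if it is already gone)."""
    lock_file.unlink(missing_ok=True)
```

A lock file left behind by a crashed run would block later runs, so a real system would also need a way to detect and clear stale locks; that part is not shown here.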
OPERATION READINESS - Examples
• ARCHIVE
• INDEXING
• COMMUNICATING WITH EXTERNAL PARTIES
• MAILING
• COMPILING ETL EXECUTION STATISTICS
• MOVING OBJECTS TO RESPECTIVE LOCATION
• ANALYZING
• CLEANUP EXERCISE
SALIENT FEATURES OF PERT
• SPACE CHECK
• PROCEDURE OBJECTS VALIDITY CHECK
• EXECUTABLES VALIDITY CHECK
• PREVENT SUCCESS RUN
• PREVENT PARALLEL RUN
• RESTARTABILITY
• PROVISION TO SCHEDULE FOR UPCOMING RUN FREQUENCIES
• BETTER ERROR LOGGING
• HANDLE UNAVOIDABLE INTERRUPTS FROM OS
• LOAD CHECK FOR STAGING AND FINAL
• PROVISION FOR MANUAL RUN
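One possible reading of "prevent success run" is that a cycle which already completed for a given business date should not be run again by the scheduler, while a manual run can still be forced. The sketch below assumes that reading; the ledger file, the function names, and the `manual` override are hypothetical and not taken from the deck.

```python
import json
from datetime import date
from pathlib import Path


def already_succeeded(ledger: Path, run_date: date) -> bool:
    """True if the ledger records a successful run for run_date."""
    if not ledger.exists():
        return False
    return run_date.isoformat() in json.loads(ledger.read_text())


def record_success(ledger: Path, run_date: date) -> None:
    """Append run_date to the ledger of successful runs (idempotent)."""
    done = json.loads(ledger.read_text()) if ledger.exists() else []
    if run_date.isoformat() not in done:
        done.append(run_date.isoformat())
    ledger.write_text(json.dumps(done))


def should_run(ledger: Path, run_date: date, manual: bool = False) -> bool:
    """Assumption: a manual run overrides the guard, a scheduled run does not."""
    return manual or not already_succeeded(ledger, run_date)
```

Keying the ledger by business date rather than wall-clock time keeps the guard consistent with the parameterised date-change step in the process flow.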
ADVANTAGES
• The design is dynamic in nature.
• New facilities can be plugged in within a limited time.
• Avoids redundant coding and testing effort.
• The hidden benefit ("sleeping beauty") is cost effectiveness.
• The restart facility resumes from the point of abort during data extraction and population.
• The ETL solution can be reused for other similar ETL applications.
CHALLENGES
• Requirements Gathering
• Database Design
• Performance in Execution