1 / 33

Large Database and What to Do Bigger is Better!

Large Database and What to Do Bigger is Better!. Floria Hanspard-Foote. Bigger Is Better Agenda. New Limits Estimating File Size ACCESS File Horizontal Partitioning Vertical Partitioning Intelligent USES. EMPLOYEE Lots of Employees. Bigger Is Better Employee File. EMP_ID

jensen
Download Presentation

Large Database and What to Do Bigger is Better!

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Large Database and What to DoBigger is Better! Floria Hanspard-Foote

  2. Bigger Is BetterAgenda • New Limits • Estimating File Size • ACCESS File • Horizontal Partitioning • Vertical Partitioning • Intelligent USES

  3. EMPLOYEE Lots of Employees

  4. Bigger Is BetterEmployee File EMP_ID LAST_NAME FIRST_NAME HIRE_DATE I PAY_DATE GROSS NET CHECK_NO BANK_NAME BANK_CODE BANK_ACCT EFFECT_DATE DAT_INC PCT_INC SALARY JOBCODE I

  5. Bigger Is BetterLimits for FOCUS/FUSION files • Number of Pages: • FUSION -- 512K pages • FOCUS -- • Release 7.2 512K pages 2gig • Release 7.1 512K pages 2gig • Release 7.0 256K pages 1gig • Number of Segments 64 • Number of Indexes + text fields + segments 189 • Number of Fields 3072 • Segment Size (data + pointers) 3968 bytes

  6. Bigger is BetterEstimating File Size • Each Instance is comprised of Data and Pointers • Data • An n bytes of Storage • In 4 bytes • Dn.m 8 bytes • Fn.m 4 bytes • Pn.m (n <= 15) 8 bytes • Pn.m (n > 15) 16 bytes • Smart Dates 4 bytes • PLUS filler to pad to full word (4 bytes)

  7. Bigger is BetterEstimating File Size • Pointers • 4 Bytes: type, page #, Word Offset • Types: • Parent to Real Child • Parent to KU Child • Parent to KM Child • Child to Parent • Forward Chain • Deleted, Free, End-of-Chain

  8. Bigger is BetterEstimating File Size • Index = I • Internal Index – Updated as File is Updated • Per Entry: • Value • Address of Data Instance • Pages may be ½ full

  9. Bigger is BetterEstimating File Size SEGNAME=EMPINFO, SEGTYPE=S1 FIELDNAME=EMP_ID, ALIAS=EID, FORMAT=A9, INDEX=I, $ FIELDNAME=LAST_NAME, ALIAS=LN, FORMAT=A15, $ FIELDNAME=FIRST_NAME, ALIAS=FN, FORMAT=A10, $ FIELDNAME=HIRE_DATE, ALIAS=HDT, FORMAT=YYMD , $ FIELDNAME=DEPARTMENT, ALIAS=DPT, FORMAT=A10, $ FIELDNAME=CURR_SAL, ALIAS=CSAL, FORMAT=D12.2M, $ FIELDNAME=CURR_JOBCODE, ALIAS=CJC, FORMAT=A3, $ FIELDNAME=ED_HRS, ALIAS=OJT, FORMAT=F6.2, $ Pointers: 3 – Parent – to – Child ; 1 – Chain = 16 bytes Data: 63 Bytes + 1 “filler” = 64 bytes Total: 80 bytes Instances/Page = 49

  10. Bigger is BetterEX CALCFILE SEGMENT NUM FOCUS TOTAL TOTAL TOTAL SEG /INDEX PER 4K BYTES POINTER DATA NO NAME PAGE INSTANCES PAGES USED BYTES BYTES --- ------- ---- --------- ----- ----- ------- ----- SEGMENT SPACE ============== 1 EMPINFO 49 10000 205 800000 160000 640000 2 FUNDTRAN 99 10000 102 400000 80000 320000 3 PAYINFO 141 50000 355 1400000 400000 1000000 4 SALINFO 99 1040000 10506 41600000 8320000 33280000 INDEX SPACE (ASSUMES HALF FULL PAGES) ============== 1 EMP_ID 151 10000 68 130000 40000 90000 4 CHECK_NO 151 1040000 6889 13520000 4160000 9360000 TOTAL 2160000 18125 57850000 13160000 44690000

  11. Bigger is BetterEX CALCFILE SEGMENT NUM FOCUS TOTAL TOTAL TOTAL SEG /INDEX PER 4K BYTES POINTER DATA NO NAME PAGE INSTANCES PAGES USED BYTES BYTES --- ------- ---- --------- ----- ----- ------- ----- SEGMENT SPACE ============== 1 EMPINFO 49 100000 2041 8000000 1600000 6400000 2 FUNDTRAN 99 100000 1011 4000000 800000 3200000 3 PAYINFO 141 1000000 7093 28000000 8000000 20000000 4 SALINFO 99 10400000 105051 416000000 83200000 332800000 INDEX SPACE (ASSUMES HALF FULL PAGES) ============== 1 EMP_ID 151 100000 664 1300000 400000 900000 4 CHECK_NO 151 10400000 68876 135200000 41600000 93600000 TOTAL 22100000 184736 592500000 135600000 456900000

  12. Bigger is BetterACCESS File Points to the FOCUS Files MASTERNAME EMPLOYEE DATANAME 'EMPLOYEE FOCUS A' MASTERNAME JOBFILE DATANAME 'JOBFILE FOCUS F' FILENAME=EMPLOYEE, SUFFIX=FOC,ACCESS=EMPLOYEE,$ FILENAME=JOBFILE ,SUFFIX=FOC, ACCESS = EMPLOYEE,$

  13. Horizontal Partitioning

  14. Bigger Is BetterHEmploye File EMP_ID LAST_NAME FIRST_NAME HIRE_DATE I PAY_DATE GROSS NET CHECK_NO DAT_INC PCT_INC SALARY JOBCODE BANK_NAME BANK_CODE BANK_ACCT EFFECT_DATE I

  15. Bigger is BetterACCESS File Points to the FOCUS Files • MASTERNAME HEMPLOYE • DATANAME 'HEMPLOYE FOCUS A' • LOCATION FUNDS • DATANAME ‘FUNDS FOCUS B’ • LOCATION SALS • DATANAME ‘SALS FOCUS C’ • MASTERNAME JOBFILE • DATANAME 'JOBFILE FOCUS F'

  16. Bigger is BetterACCESS File Points to the FOCUS Files FILENAME=HEMPLOYE, SUFFIX=FOC,ACCESS=EMPLOYEE,$ SEGNAME=EMPINFO, SEGTYPE=S1 FIELDNAME=EMP_ID, ALIAS=EID, FORMAT=A9, INDEX=I,$ . . . SEGNAME=FUNDTRAN, SEGTYPE=U, PARENT=EMPINFO , LOCATION = FUNDS FIELDNAME=BANK_NAME, ALIAS=BN, FORMAT=A20, $ . . . SEGNAME=PAYINFO, SEGTYPE=SH1, PARENT=EMPINFO , LOCATION = HEMPLOYE FIELDNAME=DAT_INC, ALIAS=DI, FORMAT=MDYY , $ . . . SEGNAME=SALINFO, SEGTYPE=SH1, PARENT=EMPINFO , LOCATION = SALS FIELDNAME=PAY_DATE, ALIAS=PD, FORMAT=MDYY , $ . . .

  17. Bigger is BetterACCESS File Points to the FOCUS Files STATUS OF FOCUS FILE: HEMPLOYE FOCUS A ON 04/18/2001 AT 15.21.22 ACTIVE DELETED DATE OF TIME OF LAST TRANS SEGNAME COUNT COUNT LAST CHG LAST CHG NUMBER EMPINFO 12 04/18/2001 15.21.22 70 FUNDTRAN 00.00.00 PAYINFO 19 04/18/2001 15.21.22 70 SALINFO 00.00.00 *INDEXES* EMP_ID 04/18/2001 15.21.22 70 CHECK_NO 00.00.00 TOTAL SEGS 31 TOTAL CHAR 1148 TOTAL PAGES 3 LAST CHANGE 04/18/2001 15.21.22 70

  18. Bigger is BetterACCESS File Points to the FOCUS Files STATUS OF FOCUS FILE: FUNDS FOCUS B1 ON 04/18/2001 AT 15.21.22 ACTIVE DELETED DATE OF TIME OF LAST TRANS SEGNAME COUNT COUNT LAST CHG LAST CHG NUMBER EMPINFO 00.00.00 FUNDTRAN 12 04/18/2001 15.21.22 70 PAYINFO 00.00.00 SALINFO 00.00.00 EMP_ID 00.00.00 CHECK_NO 00.00.00 TOTAL SEGS 12 TOTAL CHAR 384 TOTAL PAGES 1 LAST CHANGE 04/18/2001 15.21.22 70

  19. Bigger is BetterACCESS File Points to the FOCUS Files STATUS OF FOCUS FILE: SALS FOCUS C1 ON 04/18/2001 AT 15.21.22 ACTIVE DELETED DATE OF TIME OF LAST TRANS SEGNAME COUNT COUNT LAST CHG LAST CHG NUMBER EMPINFO 00.00.00 FUNDTRAN 00.00.00 PAYINFO 00.00.00 SALINFO 70 04/18/2001 15.21.22 70 EMP_ID 00.00.00 *INDEXES* CHECK_NO 04/18/2001 15.21.22 70 TOTAL SEGS 70 TOTAL CHAR 2240 TOTAL PAGES 2 LAST CHANGE 04/18/2001 15.21.22 70

  20. Vertical Partitioning

  21. Bigger Is BetterUSEmploye File EMP_ID LAST_NAME FIRST_NAME HIRE_DATE I USEMP FOCUS PAY_DATE GROSS NET CHECK_NO DAT_INC PCT_INC SALARY JOBCODE BANK_NAME BANK_CODE BANK_ACCT EFFECT_DATE I USFUND FOCUS USSALS FOCUS

  22. Bigger Is BetterCAEmploye File EMP_ID LAST_NAME FIRST_NAME HIRE_DATE I CAEMP FOCUS PAY_DATE GROSS NET CHECK_NO DAT_INC PCT_INC SALARY JOBCODE BANK_NAME BANK_CODE BANK_ACCT EFFECT_DATE I CAFUND FOCUS CASALS FOCUS

  23. Bigger Is BetterEUEmploye File EMP_ID LAST_NAME FIRST_NAME HIRE_DATE I EUEMP FOCUS PAY_DATE GROSS NET CHECK_NO DAT_INC PCT_INC SALARY JOBCODE BANK_NAME BANK_CODE BANK_ACCT EFFECT_DATE I EUFUND FOCUS EUSALS FOCUS

  24. Bigger is BetterACCESS File Points to the FOCUS Files MASTERNAME HEMPLOYE DATANAME 'USEMP FOCUS A' LOCATION FUNDS DATANAME 'USFUND FOCUS A' LOCATION SALS DATANAME 'USSALS FOCUS A' DATANAME 'CAEMP FOCUS B' LOCATION FUNDS DATANAME 'CAFUND FOCUS B' LOCATION SALS DATANAME 'CASALS FOCUS B' DATANAME 'EUEMP FOCUS C' LOCATION FUNDS DATANAME 'EUFUND FOCUS C' LOCATION SALS DATANAME 'EUSALS FOCUS C'

  25. Bigger is BetterACCESS File Points to the FOCUS Files MASTERNAME HEMPLOYE DATANAME 'USEMP FOCUS A' WHERE DEPARTMENT EQ ' PRODUCTION' OR 'MIS' ; LOCATION FUNDS DATANAME 'USFUND FOCUS A' LOCATION SALS DATANAME 'USSALS FOCUS A' DATANAME 'CAEMP FOCUS B WHERE DEPARTMENT EQ ‘CANADA’; LOCATION FUNDS DATANAME 'CAFUND FOCUS B' LOCATION SALS DATANAME 'CASALS FOCUS B' DATANAME 'EUEMP FOCUS C' WHERE DEPARTMENT EQ ‘EUROPE’; LOCATION FUNDS DATANAME 'EUFUND FOCUS C' LOCATION SALS DATANAME 'EUSALS FOCUS C'

  26. Bigger is BetterIntelligent Partitions • Without WHERE test in ACCESS File • Each Partition is read in turn • IF/WHERE test applied to real data • WHERE test in ACCESS File • IF/WHERE test applied before partition is opened • Alternative: Build USEs

  27. Bigger is BetterIntelligent USE – no ACCESS File -SET &LIST =‘CANADA,EUROPE,MIS,PRODUCTION,ALL’; -PROMPT &DEPARTMENT.(&LIST).ENTER DEPARTMENT OR ALL. -IF &DEPARTMENT NE ‘ALL’ GOTO SMUSE; -ALLUSE USE USEMP FOCUS A AS HEMPLOYE USFUND FOCUS A AS FUNDS USSALS FOCUS A AS SALS CAEMP FOCUS B AS HEMPLOYE CAFUND FOCUS B AS FUNDS CASALS FOCUS B AS SALS

  28. Bigger is BetterIntelligent USE – no ACCESS File EUEMP FOCUS C AS HEMPLOYE EUFUND FOCUS C AS FUNDS EUSALS FOCUS C AS SALS END -GOTO DOIT --SMUSE USE -SET &PREF = IF &DEPARTMENT EQ ‘CANADA’ THEN ‘CA’ ELSE - IF &DEPARTMENT EQ ‘EUROPE’ THEN ‘EU’ ELSE ‘US’; &PREF|EMP FOCUS * AS HEMPLOYE &PREF|FUND FOCUS * AS FUNDS &PREF|SALS FOCUS * AS SALS END

  29. Bigger is BetterIntelligent USE – no ACCESS File -DOIT JOIN CJC IN HEMPLOYE TO JOBCODE IN JOBFILE AS AJ TABLE FILE HEMPLOYE PRINT PAY_DATE BY EID BY JOB_DESC -IF &DEPARTMENT EQ ‘ALL’ GOTO NOIF; WHERE DEPARTMENT EQ ‘&DEPARTMENT’ -NOIF END

  30. Bigger is BetterMaybe • Since each physical file can be 2 gig (512K pages) • Up to 250 partitions (horizontal and vertical) • Each partition can be on its own disk • MVS – span multiple packs • Maximum size of a database is now 500 gigabytes of data. • Intelligent Partitioning will speed access, even if the size is not the issue. • Remember FOCSORT can now be 2 gigabytes also.

  31. Bigger is BetterBut there are Seven of Them • Which is better? • One big database • Seven smaller ones – intelligently partitioned • What’s the predominant access • Data Maintenance - BIG database • Alternatives – Separate MASTERs for each smaller file; COMBINE or MAINTAIN • Reporting -- Seven smaller files

  32. Before You leave! Be Sure To Visit Our Problem Isolation Debugging Tool Site http://techsupport.informationbuilders.com/app/css_web_tool/default.htm

  33. QUESTIONS Thanks for Coming!

More Related