VSAM

VSAM • Virtual Storage Access Method • Allows for interactive updates (adds, changes, and deletes)

Three Types of VSAM files • KSDS • Keyed Sequence Data Set • RRDS • Relative Record Data Set • ESDS • Entry Sequence Data Set

KSDS • Allows for users to access records sequentially or randomly • Includes an index and data components • A CLUSTER consists of both the index and data components together • Index relates a value in a key field to the actual location of the record on disk • Index is only used in random (or dynamic) processing

VSAM Data Component Storage Concepts • Control Interval (CI) • Fixed amount of storage; Must be multiple of 512 • Usually 2048 or 4096 bytes • Data records are stored in a CI • A CI of 4096 bytes can store forty 100-byte records • Control Areas (CA) • One cylinder in size • Size of a cylinder varies by disk drive • Contains numerous CIs

Approximate number of records in a CI (CI size - 10) / record length

Records in a CI • Calculate the number of records in a CI formula on previous slide • Determine the percentage of records in a CI to be added or insert in an average CI • Concept of FreeSpace • Space where records can be added “on the fly”

CI in a CA • A fixed number based on CI size and disk drive • For disk drive with ACD001: • CI of 2048: 315 CI in a CA • CI of 4096: 180 CI in a CA • Determine the percent of the CI to be left open for additions • Known as CA freespace

Freespace in a CI • Is used to add records which belong in the CI • Records in a CI are shifted automatically within the CI to accommodate the inserted record

CI Splits • If there is no freespace in the CI in which the record is to be inserted: • CI split • Half of the records in the CI are moved to a free CI in the same CA • The inserted record is then inserted in the proper CI • These happen routinely and are accomplished quickly

CA splits • If a CI needs to split and there is no free CI in the CA: CA split • Half of the CI are moved to a free CA (usually at the end of the file, there is unused space) • Therefore, each of the two CI have 50% free space • The original CI can now split • CA splits have much overhead and should be avoided!!!

Avoid CA splits • Reorganization of files • Import: Copy the VSAM file to a sequential dataset • Export: Delete and reload the VSAM file from the sequential data set: Resets the freespace as in the define cluster • Allocate adequate freespace • Analyze the primary key

Primary Key Analysis • Is the pattern in the PK • Example: The Financial Aid office must keep three years of data on-line. • Previous year: State reporting • Current year: Distributing aid to students • Next year: Granting/guaranteeing aid for next year

Primary Key Analysis • Key options for the financial aid file • 1-digit year + SSN • Little on-line activity on first third of file • Most “adds” are in last third: CI/CS split likely • SSN + 1 digit year • Activity is spread evenly throughout the file • Recommended

VSAM Index Component • Every KSDS as an index component for each primary key AND each foreign key • Base cluster: Primary key index and Data components • Foreign key index is known as the alternate index

Primary Key Index and Data • Two parts of the index: • Sequence set • Index set

Primary Key Index and Data • Sequence Set • Lowest level of the index component • Contains information that relates key values to a specific CI • Links the highest PK in a CI to the address of that CI • Stores all the key-address pairs for the Cis in a CA in one CI of the sequence set • There is a separate CI in the sequence set for each CA in the data

Primary Key Index and Data • Index Set • Highest level of the index component • Key-address pairs stored in one CI (can be stored in main memory for processing efficiency) • Address Pointer links to address of the appropriate sequence set • Based on size of data component (number of cylinders or CA needed to store data component), you may need a intermediate index

Alternate Index and Data Component • Alternate index relates alternate key to primary key • Then uses the primary key index to locate the data • Can have unique or non-unique AI • Figure 2-4 (unique) • Figure 2-5 (non-unique)

Alternate Index • Systems analyst determines whether to update the AI each time the cluster is changed • Can cause much overhead to update all indices especially in the case of a CA split • Other alternative is to re-build the index periodically (every night)

Relative Record Data Set (RRDS) • Lets the user access each record at random without the overhead of maintaining an index • Instead each record in a RRDS is numbered, starting with 1 for the first record • RRDS consists of a specified number of areas or slots • Known as the relative record number (RRN)

RRDS • May need a routine to convert the PK of a record to a relative record number • Hashing • Most common hashing routine: The remainder option on the divide • Can cause empty slots. Can waste storage; But we avoid CI/CA splits • Difficult to HASH if PK is non-numeric

RRDS • Collision • If hashing routine results is same RRN for two different PK • Must set secondary searching technique in case of collisions • Usually linear probing: check the next record up to a maximum number of tries. • Needed to know if record to be added already exists • Needed to determine if record to be retrieved exists without reading entire file

RRDS • Advantages • No index overhead • Direct relationship between data and location of the data • Permits both random and sequential processing • If good hashing routine with minimal collisions: performance efficiency is excellent

RRDS • Disadvantages • Storage efficiency • Collisions • Difficulty in determining good hashing technique • Difficulty with alphabetic key • Does NOT support the concept of FK or AI • Not widely used

ESDS • Entry Sequenced Data Set • The simplest type of VSAM file • Records are stored sequentially at time of entry

ESDS • Similar to sequential processing • Does allow to OPEN EXTEND to add records to the end of the file

ESDS • Author does not recommend it • Says it is restricted to sequential processing • BUT you can build an AI for a FK • BUT as of my current COBOL manuals, COBOL cannot use the AI and processing must be sequential. • This is my current topic for career day questions

VSAM

VSAM

Presentation Transcript

VSAM ACCESS METHODS

VSAM ESDS and RRDS

VSAM Alternate Indexes

Moving COBOL/CICS/VSAM to iPhone

Working with Datasets Part 1, non VSAM

Overview of VSAM and Defining a Cluster

VSAM KSDS and COBOL

CICS VSAM Transparency