Object Based Disk: the key to intelligent I/O

George Gorbatenko
Data Machine International
St Paul, MN 55115
gorby@ece.umn.edu

Why are we interested?
• faster
• transportable
• more accessible
• cheaper
• facilitates holistic design
• improves reliability
DMI

I/O is considered the weak link in systems architecture
• I/O problem
• memory wall
• bottleneck

Issues
• randomness is painful
• mechanical time vs electronic time
• ratio of times is about 200:1
• operating system obscures the disk

Operating System
• seamless view of space
• legacy of data storage goes back to the punched card
• accommodates all applications

Data evolution
• tape reflected an 80-column card image
• disk reflected tape

In short…
• nothing much has changed data format-wise since the 1930s
• we are pretty much dealing with records in a linear format, one record after the next

The advantages of object based design
• encapsulates the data
• defines the application subset
• keeps the operating system from getting in the way

SQL object is a good choice
• broad user base
• de facto standard for databases
• high enough level to exploit the power in the I/O
• yesterday’s CPU in today’s disk (controller)
• aggregate compute power of the disks exceeds that of the host

Researchers in Intelligent Disks are motivated by…
• exploiting the latent processing potential
• filtering data in place

Consider a disk farm…

But where do we place the intelligence?
• host
• I/O controller
• disk

Disk basics
• platters (each with a fixed head): 10
• concentric tracks per platter: 10k
• sectors per track: 100
• total number of 512-byte sectors: 10M
• disk capacity: 5 GB
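The arithmetic behind the disk basics figures can be checked with a small sketch. The function names `total_sectors` and `capacity_bytes` are hypothetical and chosen only for illustration; the inputs are the round numbers from the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Total sectors = platters (one head each) x tracks per platter
 * x sectors per track. Names are illustrative, not from the talk. */
uint64_t total_sectors(uint64_t platters, uint64_t tracks_per_platter,
                       uint64_t sectors_per_track)
{
    return platters * tracks_per_platter * sectors_per_track;
}

/* Capacity in bytes for a given sector count and sector size. */
uint64_t capacity_bytes(uint64_t sectors, uint64_t sector_size)
{
    assert(sector_size > 0);
    return sectors * sector_size;
}
```

With 10 platters, 10,000 tracks per platter, and 100 sectors per track, `total_sectors` gives 10,000,000 sectors, and at 512 bytes each `capacity_bytes` gives 5,120,000,000 bytes, which is the roughly 5 GB quoted.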

To access a random block
• seek to track: 10-15 ms
• wait for block to roll around: 4-5 ms
• read block: 80 µs
hence… 200:1 (mechanical positioning time vs electronic transfer time)
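The 200:1 figure can be reproduced with a short calculation. This sketch assumes the seek and rotational delays are in the millisecond range (10 to 15 ms and 4 to 5 ms) and that reading one block takes about 80 microseconds; the function name `access_to_transfer_ratio` is hypothetical.

```c
#include <assert.h>

/* Ratio of mechanical positioning time (seek + rotational wait)
 * to electronic transfer time for one block.
 * All arguments are in microseconds. */
double access_to_transfer_ratio(double seek_us, double latency_us,
                                double read_us)
{
    assert(read_us > 0.0);
    return (seek_us + latency_us) / read_us;
}
```

Under these assumptions the best case (10 ms seek, 4 ms wait) gives a ratio of 175 and the worst case (15 ms seek, 5 ms wait) gives 250, which brackets the "about 200:1" on the slide.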

Design Goals
• synchronous operation
• next data you want is beneath the head
• process data in place (filter)
• touch the minimum amount of data
• for what you touch you pay in time and space
• exploit locality
• amortize random access read over a large data block

Access strategies…
• Amortize the (inefficient) access over a large block of data
• Make sure the data has utility
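Amortization can be made concrete with a simple model: the fraction of total time spent actually transferring data, as a function of how many sectors are read per random access. This is an illustrative model, not the talk's own formula; the function name `transfer_efficiency` and the sample values (16 ms positioning time, 80 microseconds per sector) are assumptions picked to be in line with the earlier slides.

```c
#include <assert.h>

/* Fraction of elapsed time spent transferring data when one random
 * access (seek + rotational wait) is followed by reading `sectors`
 * consecutive sectors. Times are in microseconds. */
double transfer_efficiency(double access_us, double per_sector_us,
                           unsigned sectors)
{
    assert(sectors > 0);
    double transfer_us = per_sector_us * (double)sectors;
    return transfer_us / (access_us + transfer_us);
}
```

With these sample values a single-sector read spends about 0.5% of its time transferring data, while a 100-sector read (one track in the earlier example) spends about 33%. The gain only matters if the extra data read is actually useful, which is the second bullet above.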

Optimum Block Size

SELECT name, address, salary WHERE salary > 22K

Data Utility…

Consider the travels of an inchworm…
A1 B1 C1 D1 E1
A2 B2 C2 D2 E2
A3 B3 C3 D3 E3
A4 B4 C4 D4 E4
A5 B5 C5 D5 E5

Travels of an inchworm…
A1 B1 C1 D1 E1
A2 B2 C2 D2 E2
A3 B3 C3 D3 E3
A4 B4 C4 D4 E4
A5 B5 C5 D5 E5

Locality of Reference
A1 B1 C1 D1 E1
A2 B2 C2 D2 E2
A3 B3 C3 D3 E3
A4 B4 C4 D4 E4
A5 B5 C5 D5 E5
(a) Logical view of two dimensional table.
A1 B1 C1 D1 E1 A2 B2 C2 D2 E2 A3 B3 C3 D3 ……
(b) Row ordered mapping (physical).
A1 A2 A3 A4 A5 B1 B2 B3 B4 B5 C1 C2 C3 C4 ……
(c) Column ordered mapping (physical)
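The two physical mappings in (b) and (c) reduce to two index formulas. The sketch below uses zero-based row and column numbers, so cell C2 in the table is row 1, column 2; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Row ordered mapping: all cells of a row are physically adjacent. */
size_t row_ordered_index(size_t row, size_t col,
                         size_t n_rows, size_t n_cols)
{
    assert(row < n_rows && col < n_cols);
    return row * n_cols + col;
}

/* Column ordered mapping: all cells of a column are physically adjacent. */
size_t col_ordered_index(size_t row, size_t col,
                         size_t n_rows, size_t n_cols)
{
    assert(row < n_rows && col < n_cols);
    return col * n_rows + row;
}
```

For the 5 x 5 table, C2 lands at position 7 in the row ordered layout and position 11 in the column ordered layout, matching sequences (b) and (c). A scan down one column touches adjacent positions only in the column ordered layout, and reading one whole record touches adjacent positions only in the row ordered layout.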

Preservation of Logical Topology
To preserve the logical topology of an n-dimensional logical data space, the physical space must be of at least the same dimension.
- for a 2D table (rows and columns) we need to view the disk as two dimensional

Observations:
• a SQL query can be decomposed into two operations
• select: favored by column order
• extract: favored by row order
• granular access permits touching the minimum amount of data
• map data so as to preserve topology when going from the logical to the physical medium
• reading a track’s worth of data appears reasonable

Treating disk as 2D space
• data objects are 2D spaces
• solves “design boundaries”
• disk is basically a 3D medium
• cylinder-track-sector

The disk is 3 dimensional

Consider the first cylinder of the set…

Examining a single cylinder…

which has tracks and sectors…

track for each head…

track read…

diagonal (sector block) read…

sector block shadow

Unwrap a cylinder…

2 dimensional space: head x sector

track read or sector block read…

Physical Sector Block Organization…
Figure labels: physical sector (512 bytes); logical sector size (LSS)

Logical Sector Block Organization…
Figure labels: physical sector (512 bytes); logical sector size (LSS)

record structure…
typedef struct _record {
    char employee_no [8];   // employee number; field A
    char name [12];         // name; field B
    char address [24];      // address; field C
    char zip [5];           // zip code; field D
    char salary [6];        // salary; field E
    char doh [6];           // date of hire; field F
    char dept [3];          // department; field G
    char tbd [16];          // reserved for future use; field H
} Record;

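modified best fit algorithm
LSS (8 bytes)
LSS = ceil (rec_len / num_hds) = ceil (64 / 10) = 7, rounded up to the next multiple of 4 (4n) = 8 bytes
rec_space = LSS * num_hds = 8 * 10 = 80 bytes
(rec_len = 64 bytes matches the combined size of fields A through G of the record structure)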
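The LSS calculation can be sketched as follows. Reading "4n" as "round up to the next multiple of 4" is an interpretation of the slide, not something it states outright, and the function names `logical_sector_size` and `record_space` are hypothetical.

```c
#include <assert.h>

/* Logical sector size: bytes of one record stored under each head.
 * Step 1: ceil(rec_len / num_hds).
 * Step 2: round up to a multiple of 4 (assumed meaning of "4n"). */
unsigned logical_sector_size(unsigned rec_len, unsigned num_hds)
{
    assert(num_hds > 0);
    unsigned per_head = (rec_len + num_hds - 1) / num_hds; /* ceiling */
    return (per_head + 3) / 4 * 4;                         /* next 4n */
}

/* Space one record occupies across all heads. */
unsigned record_space(unsigned lss, unsigned num_hds)
{
    return lss * num_hds;
}
```

For the 64-byte record and 10 heads this gives an LSS of 8 bytes and a record space of 80 bytes, as on the slide. The same formula also gives 8 bytes for the prototype's 168-byte records on 21 heads.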

modified best fit algorithm
LSS (8 bytes)
typedef struct _record {
    char employee_no [8];   // field A
    char name [12];         // field B
    char address [24];      // field C
    char zip [5];           // field D
    char salary [6];        // field E
    char doh [6];           // field F
    char dept [3];          // field G
    char tbd [16];          // field H
} Record;
Figure labels (field placement): A, B, C, D, G, E, F

SQL Decomposition…
• Select records
• scan the salary field
• store the ordinal position of each match in a bit vector
• Extract records
• optimizer decides strategy (track or sector block read)
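The select and extract phases can be sketched in a few lines. This is a simplified illustration, not the prototype's code: it assumes the salary column has already been parsed into an array of integers (the record structure stores it as 6 characters), and the names `select_salary_gt`, `bit_is_set`, and `extract_ordinals` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Select phase: scan the salary column and set bit i when
 * salaries[i] > threshold. `bits` must hold (n + 7) / 8 bytes.
 * Returns the number of matching records. */
size_t select_salary_gt(const long *salaries, size_t n, long threshold,
                        uint8_t *bits)
{
    assert(salaries != NULL && bits != NULL);
    size_t hits = 0;
    memset(bits, 0, (n + 7) / 8);
    for (size_t i = 0; i < n; i++) {
        if (salaries[i] > threshold) {
            bits[i / 8] |= (uint8_t)(1u << (i % 8));
            hits++;
        }
    }
    return hits;
}

/* Test whether record i was selected. */
int bit_is_set(const uint8_t *bits, size_t i)
{
    return (bits[i / 8] >> (i % 8)) & 1;
}

/* Extract phase, first step: list the ordinal positions of the
 * selected records so the matching rows can be fetched. */
size_t extract_ordinals(const uint8_t *bits, size_t n, size_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (bit_is_set(bits, i))
            out[count++] = i;
    }
    return count;
}
```

The bit vector costs one bit per record, so the select phase only has to touch the salary column; the full records are read later, and only for the ordinals that matched.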

Comparison Results…

Prototype
• two 4 GB Seagate Barracudas
• 21 heads (29 zones)
• 40 KLOC
• skew = 5 sectors
• Solaris 2.5.1 OS
• emulated intelligence in IOP
• context switch every 60 ms

Data particulars…
• 168 byte records
• LSS = 8 bytes
• 63 records per sector block
• 7,749 records per cylinder
• 3 fields (2 heads) involved in query
• 2 records extracted from disjoint blocks

Test Runs
(a) write a cylinder’s worth of data without the optimizer
(b) write the same data with the optimizer enabled
(c) scan a cylinder involving 3 columns; extract 2 blocks
(d) repeat operation (c)

Results…
Case        Observed    Calculated
case (a)    2.5 sec     2.427 sec
case (b)    196 ms      216 ± 4 ms
case (c)    51 ms       54.5 ± 4 ms
case (d)    42 ms       37.6 ± 4 ms

Benchmark Analysis
• 3 benchmarks selected
- Wisconsin
- Set Query
- TPC D/H
• selected non-join cases
• reverse-engineered the I/O detail

Wisconsin results…
[Chart: Wisconsin Benchmark; series WIS and 2D; y-axis: time (seconds), ticks from 0.100 to 1000 by decade; x-axis: benchmarks Q1 to Q5]
