C20.0046: Database Management Systems, Lecture #25. Matthew P. Johnson, Stern School of Business, NYU, Spring 2004 M.P. Johnson, DBMS, Stern/NYU, Sp2004
Agenda • Previously: hardware & sorting • Next: • Indices • Failover/recovery • Data warehousing & mining • Web search • HW3 due Thursday • no extensions! • 1-minute responses • XML links up
Let’s get physical: the DBMS stack • User/application issues queries, updates, and transaction commands • Query compiler/optimizer produces a query execution plan • Execution engine sends record/index requests to the index/record manager • Transaction manager: concurrency control, logging/recovery • Index/record manager issues page commands to the buffer manager • Buffer manager reads/writes pages via the storage manager, which talks to storage
Hardware/memory review • DBs won’t fit in RAM • Disk access is on the order of 100,000 times slower than RAM access • RAM model of computation • A single operation costs about the same as a single memory access • I/O model of computation • We read/write one block (e.g., 4 KB) at a time • Measure time in # of disk accesses • Ignore processor operations, which are ~100,000 times faster • Regular mergesort • Divide in half each time and recurse
Hardware/memory review • Big problem: how to sort 1 GB with 1 MB of RAM? • Could use plain mergesort, but it must read/write all the data 19+ times • Solution: TPMMS (Two-Phase Multiway Merge Sort, an external mergesort) • Sort the data in 1 MB chunks (runs) • Merge 249 of the 1 MB runs into a 249 MB run • Merge 249 of the 249 MB runs, and so on • Each pass multiplies the run size by about RAM size / block size
External Merge-Sort • Phase one: repeatedly load M bytes (1 MB) into main memory and sort • Result: SIZE/M sorted runs of length M bytes (1 MB) each • (Diagram: records streamed from disk through M bytes of main memory back to disk)
Phase Two • Merge M/B − 1 runs into a new run • M/B − 1 = 1 MB / 4 KB − 1 ≈ 250 • Result: runs of size M × (M/B − 1) bytes • 249 × 1 MB ≈ 250 MB • (Diagram: M/B − 1 input buffers and one output buffer in M bytes of main memory, streaming between disk files)
Phase Three • Merge M/B − 1 of the phase-two runs into a new run • Result: runs of size M × (M/B − 1)² bytes • 249 × 250 MB ≈ 62,250 MB ≈ 62 GB
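The two phases above can be sketched in Python. This is a minimal, illustrative version, not the production algorithm: it assumes the input is a list of integers, that "memory" holds at most `mem` keys at a time, and it uses temp files to stand in for on-disk runs; all helper names here are invented for the sketch.

```python
import heapq
import tempfile

def spill(chunk):
    """Write one sorted run to a temp file ('disk') and rewind it."""
    f = tempfile.TemporaryFile(mode="w+")
    f.writelines(f"{x}\n" for x in chunk)
    f.seek(0)
    return f

def sorted_runs(items, mem):
    """Phase one: sort memory-sized chunks and spill each as a run."""
    runs, chunk = [], []
    for x in items:
        chunk.append(x)
        if len(chunk) == mem:
            runs.append(spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(spill(sorted(chunk)))
    return runs

def merge_runs(runs):
    """Phase two: k-way merge of all runs; only one 'block' per run
    needs to be in memory at a time, which heapq.merge models lazily."""
    streams = ((int(line) for line in f) for f in runs)
    return list(heapq.merge(*streams))

data = [9, 3, 7, 1, 8, 2, 6, 5, 4, 0]
runs = sorted_runs(data, mem=4)   # three runs: 4 + 4 + 2 keys
print(merge_runs(runs))           # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With 1 MB of memory and 4 KB blocks, the real version merges up to M/B − 1 ≈ 249 runs at once, since one block per input run plus one output block must fit in memory.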
Next topic: File organization 1 • Heap files: unordered list of rows • One damn row after another • All-rows queries are easy: • SELECT * FROM T; • Insert is easy: just append at the end • Unique/subset queries are hard: • Must test each row
File organization 2 • Sorted file: sort rows on some field(s) • Since the datafile is likely to be large, must use an external sort like external mergesort • Equality and range selects are now easier: • Do binary search to find the first match • Walk through rows until one fails the test • Insert and delete are now hard • Must move half the rows, on average, forward or back • Possible solutions: • Leave empty space • Use “overflow” pages
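The search procedure above (binary search to the first match, then walk until a row fails the test) can be sketched in a few lines. This is an in-memory stand-in, assuming the sorted file is a Python list of key values; `range_select` is an invented name.

```python
import bisect

def range_select(rows, lo, hi):
    """Range select on a file sorted by one field."""
    i = bisect.bisect_left(rows, lo)        # binary search: first row >= lo
    out = []
    while i < len(rows) and rows[i] <= hi:  # walk forward until the test fails
        out.append(rows[i])
        i += 1
    return out

rows = [3, 8, 15, 15, 22, 40, 41]           # the sorted datafile
print(range_select(rows, 10, 40))           # → [15, 15, 22, 40]
```

On disk, the binary search costs O(log of the number of blocks) reads and the walk touches only the blocks that hold matching rows.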
Modifications • Insert: if the file is unsorted, easy: append • If the file is sorted: • Is there space in the right block? Then store it there • If all else fails, create an overflow block • Delete: free the space in the block • May be able to eliminate an overflow block • If not, use a tombstone (null record) • Update: if the new record is no longer than the old, easy: overwrite in place • If it’s longer, need to shift records or create overflow blocks
Overflow Blocks • After a while the file starts being dominated by overflow blocks: time to reorganize • (Diagram: Block n−1, Block n, Block n+1, with an overflow block chained off Block n)
File organization 3 • Datafile (sorted or not) + index • Speeds searches on the indexed fields • Any subset/list of the table’s fields • these are called the search key • not to be confused with the table’s keys/superkeys • Idea: trade disk space for disk time • may also cost processor/RAM time • Downsides: • Takes up more space • Must reflect changes in the data
Classification of indices • Primary v. secondary • Clustered v. unclustered • Dense v. sparse • Index data structures: • B-trees • Hash tables • More advanced types: • Function-based indices • R-trees • Bitmap indices
Dense indices • Index has an entry for each row • NB: index entries are smaller than rows • so more index entries fit per block than rows do
Sparse indices • Why make an index sparse? • Fewer disk accesses • Binary search on a shorter list: log(shorter N) • Analogy: “thumb” index in large dictionaries • Trades disk space for RAM space and computation time • May fit entirely in RAM
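A sparse index can be sketched as follows, assuming a datafile of sorted, fixed-size blocks: the index keeps only the first key of each block, so it is far smaller than a dense index and may fit in RAM. The names and block layout are illustrative.

```python
import bisect

blocks = [[3, 8, 12], [15, 15, 22], [40, 41, 57]]  # sorted datafile blocks
sparse_index = [b[0] for b in blocks]              # one entry per block

def lookup(key):
    """Find the block that could hold `key`, then scan within it."""
    # last index entry <= key marks the candidate block
    i = bisect.bisect_right(sparse_index, key) - 1
    if i < 0:
        return False            # key smaller than everything in the file
    return key in blocks[i]     # one block read, then an in-block scan

print(lookup(22))  # → True
print(lookup(23))  # → False
```

The cost is a binary search over the (small, possibly in-RAM) index plus a single block read, versus a binary search over the whole datafile without the index.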
Secondary/unclustered indices • To index attributes other than the primary key • Always dense (why? The data isn’t sorted on these fields, so every row needs its own pointer)
Clustered v. unclustered • Clustered means: data and index are sorted the same way • Sorted on the fields the index is indexing • Each index entry is stored “near” its data entry • Sparse indices must be clustered • Unclustered indices must be dense • Clustered indices can reduce disk latency • Related data stored together, so there’s less far to go • Good for range queries
Primary v. secondary • Primary indexes • usually clustered • Only one per table • Use PRIMARY KEY • Secondary indexes • usually unclustered • many allowed per table • Use UNIQUE or CREATE INDEX
Partial key searches • Situation: index on fields a1,a2,a3; we search on fields ai, aj • When will this work? • i and j must be 1 and 2 (in either order) • The searched fields must be a prefix of the indexed fields • E.g.: lastname,firstname in a phone book • Index must be clustered
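The prefix rule can be seen concretely: a composite index sorts its entries lexicographically, so all entries sharing a prefix of the search key sit in one contiguous run, just like all the "Jones" entries in a phone book. A small sketch, with a hypothetical index on (lastname, firstname, city) and invented names:

```python
import bisect

entries = sorted([
    ("Jones", "Ann", "NYC"),
    ("Jones", "Bob", "LA"),
    ("Smith", "Ann", "NYC"),
    ("Smith", "Zoe", "SF"),
])

def prefix_search(entries, prefix):
    """Find all index entries whose leading fields equal `prefix`."""
    lo = bisect.bisect_left(entries, prefix)   # first possible match
    return [e for e in entries[lo:] if e[:len(prefix)] == prefix]

print(prefix_search(entries, ("Jones",)))        # both Jones rows
print(prefix_search(entries, ("Smith", "Zoe")))  # one row
```

A search on a non-prefix like (firstname,) alone gains nothing: matching entries are scattered through the sort order, so the whole index would have to be scanned.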
New topic: Hash Tables • I/O-model hash tables are much like main-memory ones • Hash basics: • There are n buckets • A hash function f(k) maps a key k to {0, 1, …, n−1} • Store in bucket f(k) a pointer to the record with key k • Differences in the I/O model/DBMS: • bucket size = 1 block • use overflow blocks when needed
Example hash table • Assume: 10 buckets, each storing 5 keys and pointers (only 2 shown in the diagram) • h(0)=0 • h(25)=h(5)=5 • h(83)=h(43)=3 • h(99)=h(9)=9
Hash table search • Search for 82: • Compute h(82)=2 • Read bucket 2 • 1 disk access
Hash table insertion • Place the key in its bucket, if there is space • Insert 42…
Hash table insertion • Create an overflow block if there is no space • Insert 91… • More overflow blocks may be added as necessary
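The scheme on the last few slides can be sketched in Python: n buckets, each bucket a chain of fixed-capacity "blocks", with an overflow block chained on when the last block fills. The bucket count and the tiny block capacity are illustrative choices to force overflow, not values from the slides.

```python
N_BUCKETS = 10
BLOCK_CAP = 2                                # keys per block (tiny on purpose)

buckets = [[[]] for _ in range(N_BUCKETS)]   # each bucket: a chain of blocks

def h(key):
    return key % N_BUCKETS

def insert(key):
    blocks = buckets[h(key)]
    if len(blocks[-1]) == BLOCK_CAP:         # last block full:
        blocks.append([])                    # chain on a new overflow block
    blocks[-1].append(key)

def search(key):
    # read the bucket's block chain; each block costs one disk access
    return any(key in block for block in buckets[h(key)])

for k in [0, 25, 5, 83, 43, 99, 9, 42]:
    insert(k)
print(search(42), search(91))  # → True False
```

Search cost equals the length of the bucket's block chain, which is why performance is excellent with no overflow blocks and degrades as chains grow.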
Hash table performance • Excellent if there are no overflow blocks • For in-memory indices, hash tables are usually preferred • Performance degrades as the ratio keys / (n × block size) increases
Hash functions • Lots of ideas for “good” functions, depending on the situation • One obvious idea: h(x) = x mod n • Every x is mapped to one of 0, 1, …, n−1 • Roughly 1/nth of the x’s map to each bucket • Does this work for equality search? • Does this work for range search? • Does this work for partial-key search? • Good functions for hashing passwords? • What was the point of hashing in that case?
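To see why the answer differs for equality and range search, a two-line illustration: h(x) = x mod n always sends equal keys to the same bucket, but bucket order does not follow key order, so a range of keys cannot be read as one contiguous region.

```python
n = 10
# equality: the same key always hashes to the same bucket
print(42 % n == 42 % n)               # → True

# range: the keys 8..11 wrap around the modulus, landing in buckets
# 8, 9, 0, 1 — no single contiguous region to scan
print([k % n for k in range(8, 12)])  # → [8, 9, 0, 1]
```

This is the standard trade-off: hash indices excel at equality lookups, while sorted structures (sorted files, B-trees) serve range queries.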
Extensible hash table • The number of buckets grows to prevent overflows • Hashing is also used for crypto, password storage, etc. • And: Java’s HashMap and Object.hashCode()
New topic: B-trees • Saw connected, rooted graphs before: XML graphs • Trees are connected, acyclic graphs • Saw rooted trees before: • XML docs • directory structure on a hard drive • organizational/management charts • B-trees are one kind of rooted tree
Twenty Questions • What am I thinking of? • Large space of possible choices • Can ask only yes/no questions • Each answer gives ≤1 bit • Strategy: • ask questions that divide the search space in half • gain a full bit from each question • log2(1,000,000) ≈ 20, since 2^20 ≈ 1,000,000
BSTs • Very simple data structure in CS: BSTs • Binary Search Trees • Keep them balanced • Each node holds one item • Each node has two children: • Left subtree: < • Right subtree: >= • Can search, insert, delete in log time • log2(1M = 2^20) = 20
Search for DBMS • Big improvement: log2(1M) = 20 • Each op divides the remaining range in half! • But recall: all that matters is # of disk accesses • 20 is better than 2^20, but: can we do better?
BSTs → B-trees • Like BSTs, except each node fills one block • Branching factor is >> 2 • Each access divides the remaining range by, say, 300 • B-trees = BSTs + blocks • B+ trees are a variant of B-trees • Data stored only in leaves • Leaves form a (sorted) linked list • Better supports range queries • Consequences: • Much shorter depth → many fewer disk reads • Must find the element within a node • Trades CPU/RAM time for disk time
B-tree search efficiency • With parameters: • block = 4 KB • integer = 4 bytes • pointer = 8 bytes • the largest n satisfying 4n + 8(n+1) <= 4096 is n = 340 • Each node has 170..340 keys • assume on average (170+340)/2 = 255 • Then: • 255 rows → depth = 1 • 255^2 ≈ 64k rows → depth = 2 • 255^3 ≈ 16M rows → depth = 3 • 255^4 ≈ 4G rows → depth = 4
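The arithmetic on this slide can be checked in a few lines of Python: a node holds n keys and n+1 child pointers, so n must satisfy 4n + 8(n+1) ≤ 4096, and the reachable row count multiplies by the average fan-out at each level.

```python
BLOCK, KEY, PTR = 4096, 4, 8

# largest n with KEY*n + PTR*(n+1) <= BLOCK, i.e. 12n + 8 <= 4096
n = (BLOCK - PTR) // (KEY + PTR)
print(n)                         # → 340

# nodes are kept between half full and full: 170..340 keys, avg 255
avg = (n // 2 + n) // 2
print(avg)                       # → 255

for depth in range(1, 5):
    print(depth, avg ** depth)   # rows reachable at each depth
```

The loop reproduces the slide's table: 255, ~64k, ~16M, and ~4G rows at depths 1 through 4, which is why even billion-row tables need only about four disk reads per lookup.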
Next time • Next: failover • For next time: reading online • HW3 due next time • no extensions! • Now: one-minute responses