730 likes | 953 Views
Unit-V. Heap sort. The MAX-HEAPIFY procedure, which runs in O ( lg n ) time, is the key to maintaining the max-heap property. The BUILD-MAX-HEAP procedure, which runs in linear time, produces a max-heap from an unordered input array.
E N D
Heap sort The MAX-HEAPIFY procedure, which runs in O(lgn) time, is the key to maintaining the max-heap property. The BUILD-MAX-HEAP procedure, which runs in linear time, produces a max-heap from an unordered input array. The HEAPSORT procedure, which runs in O(nlgn) time, sorts an array in place. Exercises -Is the sequence〈23, 17, 14, 6, 13, 10, 1, 5, 7, 12〉 a max-heap?
Building a heap • BUILD-MAX-HEAP on the array- A = 〈5, 3, 17, 10, 84, 19, 6, 22, 9〉.
Heapsort algorithmworst-case running time of heapsort is Ω(nlgn). Ex-HEAPSORT on the array A = 〈5, 13, 2, 25, 7, 17, 20, 8, 4〉
Sorting on different keys- Ex-Illustrate the operation of HEAP-EXTRACT-MAX on the heap A = (15, 13, 9, 5, 12, 8, 7, 4, 0, 6, 2, 1) Exercises 6.5-2 Ex-Illustrate the operation of MAX-HEAP-INSERT(A, 10) on the heap A = (15, 13, 9, 5, 12, 8, 7, 4, 0, 6, 2, 1)
Quicksort Example- Demonstrate the operation of HOARE-PARTITION on the array- 1-A=(2,8,7,1,3,5,6,4) 2-A =(13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21)
Quicksort Analysis of Algorithms
Quicksort – Two Partioning Algorithms Analysis of Algorithms
Hoares’ Partitioning Algorithm Analysis of Algorithms
Quicksort Analysis of Algorithms
Hoare’s Partitioning Algorithm Analysis of Algorithms
Hoare’s Partitioning Algorithm Analysis of Algorithms
Hoare’s Partitioning Algorithm Analysis of Algorithms
Hoare’s Partitioning Algorithm - Ex1 (pivot=5) Analysis of Algorithms
Figure : The operation of merge sort on the array A = 〈5, 2, 4, 7, 1, 3, 2, 6〉. The lengths of the sorted sequences being merged increase as the algorithm progresses from bottom to top.
Storage Devices Fig:Storage-devicehierarchy.
Storage Devices • Cache. The cache is the fastest and most costly form of storage. Cache memory is small; its use is managed by the computer system hardware. • Main memory. The storage medium used for data that are available to be operated on is main memory. The general-purpose machine instructions operate on main memory.
Storage Devices • Flash memory. Also known as electrically erasable programmable read-only memory(EEPROM), flash memory differs from main memory in that data survive power failure. • Reading data from flash memory takes less than 100 nano seconds (a nano second is1/1000of a microsecond), which is roughly as fast asreadingdata from main memory.
Storage Devices • Magnetic-disk storage. The primary medium for the long-term on-line stor-age of data is the magnetic disk. Usually, the entire database is stored on mag-netic disk. The system must move the data from disk to main memory so that they can be accessed. After the system has performed the designated opera-tions, the data that have been modified must be written to disk.
Storage Devices • Optical storage. The most popular forms of optical storage are the compact disk(CD), which can hold about 640 megabytes of data, and the digital video disk(DVD) which can hold 4.7 or 8.5 gigabytes of data per side of the disk (or up to 17 gigabytes on a two-sided disk). Data are stored optically on a disk, and are read by a laser.
Storage Devices • Tape storage. Tape storage is used primarily for backup and archival data. Although magnetic tape is much cheaper than disks, access to data is much slower, because the tape must be accessed sequentially from the beginning. For this reason, tape storage is referred to as sequential-access storage. In contrast, disk storage is referred to as direct-access storage because it is possible to read data from any location on disk.
Storage Devices • The fastest storage media — for example, cache and main memory — are referred to as primary storage. • The media in the next level in the hierarchy — for example, magnetic disks — are referred to as secondary storage, or online storage. • The media in the lowest level in the hierarchy — for example, magnetic tape and optical-disk jukeboxes — are referred to as tertiary storage, or offline storage.
Magnetic Disks • Each disk platter has a flat circular shape. Its two surfaces are covered with a magnetic material, and information is recorded on the surfaces. • Platters are made from rigid metal or glass and are covered (usually on both sides) with magnetic recording material. • We call such magnetic disks hard disks, to distinguish them from floppy disks, which are made from flexible material.
Magnetic Disks • The disk surface is logically divided into tracks, which are subdivided into sectors. • A sector is the smallest unit of information that can be read from or written to the disk. • The read-write head stores information on a sector magnetically as reversals of the direction of magnetization of the magnetic material. There may be hundreds of concentric tracks on a disk surface, containing thousands of sectors.
Magnetic Disks • Each side of a platter of a disk has a read – write head, which moves across the platter to access different tracks. • A disk typically contains many platters, and the read – write heads of all the tracks are mounted on a single assembly called a disk-arm, and move together.
Magnetic Disks • The disk platters mounted on a spindle and the heads mounted on a disk arm are together known as head – disk assemblies. • Since the heads on all the platters move together, when the head on one platter is on the ithtrack, the heads on all other platters are also on the ithtrack of their respective platters. • Hence, the ithtracks of all the platters together are called the ith cylinder.
Performance Measures of Disks • Access time is the time from when a read or write request is issued to when data transfer begins. • The time for repositioning the arm is called the seek time, and it increases with the distance that the arm must move. • The average seek time is the average of the seek times, measured over a sequence of (uniformly distributed) random requests.
Performance Measures of Disks • Once the seek has started, the time spent waiting for the sector to be accessed to appear under the head is called the rotational latency time. • The data-transfer rate is the rate at which data can be retrieved from or stored to the disk. • The final commonly used measure of a disk is the mean time to failure (MTTF), which is a measure of the reliability of the disk. • A block is a contiguous sequence of sectors from a single track of one platter.
File organization • A file is organized logically as a sequence of records. These records are mapped onto disk blocks.
File Organization • Choosing a file organization is a design decision, hence it must be done having in mind the achievement of good performance with respect to the most likely usage of the file. The criteria usually considered important are: • Fast access to single record or collection of related records. • Easy record adding/update/removal, without disrupting. • Storage efficiency. • Redundance as a warranty against data corruption.
File Organization • Five organization models will be considered: • Pile. • Sequential. • Indexed-sequential. • Indexed. • Hashed.
Purpose of Data Indexing • It is a data structure that is added to a file to provide faster access to the data. • It reduces the number of blocks that the DBMS has to check.
Properties of Data Index • It contains a search key and a pointer. • Search key - an attribute or set of attributes that is used to look up the records in a file. • Pointer - contains the address of where the data is stored in memory. • It can be compared to the card catalog system used in public libraries of the past.
Two Types of Indices • Ordered index • (Primary index or clustering index) – which is used to access data sorted by order of values. • Hash index • (secondary index or non-clustering index) - used to access data that is distributed uniformly across a range of buckets.
Choosing Indexing Technique • Five Factors involved when choosing the indexing technique: • access type • access time • insertion time • deletion time • space overhead
Indexing Definitions • Access type- is the type of access being used. • Access time - time required to locate the data. • Insertion time - time required to insert the new data. • Deletion time - time required to delete the data. • Space overhead - the additional space occupied by the added data structure.