EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS 05011004 Ayhan KARGIN 05011027 Ahmet MERAL
Contents • External Sorting Needs and Usage Areas • External Sorting Algorithms • Environments for Implementing External Sorting Algorithms • Used Technologies • Phases of External Sorting • K-Way Merge Sort • Multi-Step K-Way Merge Sort • Replacement Selection Sort • SimpleDB • Layered Components of SimpleDB • Classes of SimpleDB • Query Layer of SimpleDB • Relational Algebra that SimpleDB Supports • Relational Algebra • Preparatory Work • External Sorting on SimpleDB
External Sorting Needs and Usage Areas • DBMS • Group By, Join, Order By • Data Warehouse (ETL) • Data Mining • Data Processing
External Sorting Algorithms • K-Way Merge Sort • Multi-Step K-Way Merge Sort • Replacement Selection Sort
Environments for Implementing External Sorting Algorithms • MinSQL • PostgreSQL • SimpleDB
Used Technologies • Java VM • Java SE • JDBC • Java RMI • Eclipse
Phases of External Sorting • Run Generation Phase • Merge Phase
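The two phases can be illustrated with a minimal in-memory sketch. It assumes integer records and uses lists in place of disk files; the class name `TwoPhaseSortSketch` and the `memorySize` parameter are hypothetical and do not come from the slides. Runs are produced by sorting memory-sized chunks, then merged in a single k-way pass with a min-heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: lists stand in for the files an external sort would use.
class TwoPhaseSortSketch {

    // Run generation phase: cut the input into chunks of at most memorySize
    // records and sort each chunk on its own.
    static List<List<Integer>> generateRuns(List<Integer> input, int memorySize) {
        if (memorySize <= 0) {
            throw new IllegalArgumentException("memorySize must be positive");
        }
        List<List<Integer>> runs = new ArrayList<>();
        for (int start = 0; start < input.size(); start += memorySize) {
            int end = Math.min(start + memorySize, input.size());
            List<Integer> run = new ArrayList<>(input.subList(start, end));
            Collections.sort(run);
            runs.add(run);
        }
        return runs;
    }

    // Merge phase: merge all runs in one pass, using a min-heap that holds
    // the current head of every run.
    static List<Integer> mergeRuns(List<List<Integer>> runs) {
        // Heap entries are {value, runIndex, positionInRun}.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            List<Integer> run = runs.get(top[1]);
            int next = top[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), top[1], next});
            }
        }
        return out;
    }

    static List<Integer> sort(List<Integer> input, int memorySize) {
        return mergeRuns(generateRuns(input, memorySize));
    }
}
```

With nine records and `memorySize` 3, the run generation phase yields three sorted runs, and the merge phase combines them in one pass. A real implementation would read and write blocks through a file layer rather than hold every run in memory.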
Replacement Selection • Test results depending on stage area size (chart not reproduced) • Note: Main memory is 8x.
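As a reminder of how replacement selection generates runs, here is a minimal sketch of the textbook technique. It assumes integer records held in lists; the class name `ReplacementSelectionSketch` and the `memorySize` parameter are hypothetical, and the code is not taken from the authors' SimpleDB classes. A record that is smaller than the last one written cannot join the current run, so it is held back for the next run.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of textbook replacement selection on in-memory lists.
class ReplacementSelectionSketch {

    static List<List<Integer>> generateRuns(List<Integer> input, int memorySize) {
        if (memorySize <= 0) {
            throw new IllegalArgumentException("memorySize must be positive");
        }
        Iterator<Integer> in = input.iterator();
        PriorityQueue<Integer> current = new PriorityQueue<>(); // records for the current run
        PriorityQueue<Integer> frozen = new PriorityQueue<>();  // records held for the next run

        // Fill the available memory with the first records.
        while (current.size() < memorySize && in.hasNext()) {
            current.add(in.next());
        }

        List<List<Integer>> runs = new ArrayList<>();
        List<Integer> run = new ArrayList<>();
        while (!current.isEmpty()) {
            int smallest = current.poll();
            run.add(smallest);
            if (in.hasNext()) {
                int next = in.next();
                if (next >= smallest) {
                    current.add(next); // still fits into the current run
                } else {
                    frozen.add(next);  // too small, must wait for the next run
                }
            }
            if (current.isEmpty()) {
                // Current run is finished; the frozen records start the next one.
                runs.add(run);
                run = new ArrayList<>();
                PriorityQueue<Integer> tmp = current;
                current = frozen;
                frozen = tmp;
            }
        }
        return runs;
    }
}
```

For the input 5, 1, 9, 2, 8, 3, 7, 4, 6 with `memorySize` 3, this sketch produces two runs (1 2 5 8 9 and 3 4 6 7), where sorting fixed chunks of three records would produce three. Fewer, longer runs mean less work for the merge phase.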
SimpleDB • The Client Side • Contains the JDBC interfaces and implements the JDBC driver • The Basic Server • Provides the complete functionality of a DB but ignores efficiency issues • Extensions • Extend the basic server to support efficient query processing.
Layered Components of SimpleDB • Remote • Perform JDBC requests received from clients. • Parse • Extract the tables, fields, and predicate mentioned in an SQL statement. • Planner • Create an execution strategy for an SQL statement, and translate it to a relational algebra plan. • Query • Implement queries expressed in relational algebra. • Metadata • Maintain metadata about the tables in the database, so that its records and fields are accessible.
Layered Components of SimpleDB • Record • Provide methods for storing data records in pages. • Transaction • Support concurrency by restricting page access. Enable recovery by logging changes to pages. • Buffer • Maintain a cache of pages in memory to hold recently-accessed user data. • Log • Append log records to the log file, and scan the records in the log file. • File • Read and write between file blocks and memory pages.
Query Layer of SimpleDB • Relational Algebra • Select • Project • Product • Sort
Relational Algebra that SimpleDB Supports • Relational Algebra → SQL • Select → Where • Project → Select • Product → Join • Sort → Order By
Relational Algebra • Select SId, SName, DName from STUDENT, DEPT where MajorId=DId and DName='math' • Q1: Product (STUDENT, DEPT) • Q2: Select (Q1, MajorId=DId) • Q3: Select (Q2, DName='math') • Q4: Project (Q3, {SId, SName, DName})
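The Q1 to Q4 steps can be traced with a small in-memory sketch. It assumes rows are plain maps from field name to value; the class `AlgebraSketch` and its helper methods are hypothetical and are not SimpleDB's plan or scan classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory operators; a row is a map from field name to value.
class AlgebraSketch {

    // Product: every row of the left input paired with every row of the right.
    static List<Map<String, Object>> product(List<Map<String, Object>> left,
                                             List<Map<String, Object>> right) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : right) {
                Map<String, Object> row = new LinkedHashMap<>(l);
                row.putAll(r);
                out.add(row);
            }
        }
        return out;
    }

    // Select: keep only the rows that satisfy the predicate.
    static List<Map<String, Object>> select(List<Map<String, Object>> in,
                                            Predicate<Map<String, Object>> pred) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : in) {
            if (pred.test(row)) {
                out.add(row);
            }
        }
        return out;
    }

    // Project: keep only the listed fields of each row.
    static List<Map<String, Object>> project(List<Map<String, Object>> in,
                                             List<String> fields) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : in) {
            Map<String, Object> slim = new LinkedHashMap<>();
            for (String f : fields) {
                slim.put(f, row.get(f));
            }
            out.add(slim);
        }
        return out;
    }

    // Helper that builds a row from alternating field names and values.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            r.put((String) kv[i], kv[i + 1]);
        }
        return r;
    }

    // The four steps of the example query, in order.
    static List<Map<String, Object>> mathStudents(List<Map<String, Object>> student,
                                                  List<Map<String, Object>> dept) {
        List<Map<String, Object>> q1 = product(student, dept);
        List<Map<String, Object>> q2 = select(q1, r -> r.get("MajorId").equals(r.get("DId")));
        List<Map<String, Object>> q3 = select(q2, r -> "math".equals(r.get("DName")));
        return project(q3, Arrays.asList("SId", "SName", "DName"));
    }
}
```

Each step takes the previous step's output as its input, which is why Q2 must read from Q1. Calling `mathStudents` with a made-up STUDENT list and DEPT list returns only the students whose major department is named 'math', reduced to the three projected fields.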
Preparatory Work • The order by operator (in SQL) has to be introduced to the parser layer of SimpleDB. • The columns to sort on must be present in the QueryData class in structured form. • When sorting is required, the BasicQueryPlanner class must call the sorting plans. • Count the number of random accesses per file in the FileMgr layer. • Measure the duration of a transaction by adding a start time and an end time to the Transaction class. • Count the number of unsorted records in each transaction. • Records of the same kind may have to be compared, copied, and exchanged across different scans, so the RecordExchange class was written for this purpose. • The StepCalculator class was written. It splits a number into two close integer factors whose product is at least that number. For example: 29 gives 6 and 5; 49 gives 7 and 7; 50 gives 8 and 7.
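The slides do not show how StepCalculator works. One rule that reproduces all three examples is to take the smallest integer a with a × a ≥ n, then the smallest b with a × b ≥ n. The sketch below implements that assumed rule; the class name `StepCalculatorSketch` and the method `split` are hypothetical and should not be read as the authors' implementation.

```java
// Hypothetical reconstruction: the rule below is inferred from the three
// examples on the slide (29, 49, 50), not taken from the authors' code.
class StepCalculatorSketch {

    // Returns {a, b} with a >= b, a * b >= n, and a, b as close as this rule allows.
    static int[] split(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // a = ceil(sqrt(n)), computed with an integer correction to avoid
        // relying on floating-point rounding.
        int a = (int) Math.sqrt(n);
        while ((long) a * a < n) {
            a++;
        }
        // b = ceil(n / a) using integer arithmetic.
        int b = (n + a - 1) / a;
        return new int[] {a, b};
    }
}
```

Under this rule, `split(29)` gives 6 and 5, `split(49)` gives 7 and 7, and `split(50)` gives 8 and 7, matching the slide.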
External Sorting on SimpleDB • Materialize Plan • Edward Sciore’s MergeSortPlan & Scan • K-Way MergeSortPlan & Scan • Multi-Step K-Way MergeSortPlan & Scan • ReplacementSelectionSortPlan & Scan • Multi-Step ReplacementSelectionSortPlan & Scan