370 likes | 437 Views
COP 4710 Databases Fall, 2000 Today’s Topic Chapter 9: Java and Web Sites Chapter 11: File Structures. David A. Gaitros October 16th, 2000 Department of Computer Science. Current Events. Java DB Connectivity (JDBC). Figure 8.4 Strategies for implementing JDBC packages.
E N D
COP 4710 DatabasesFall, 2000Today’s TopicChapter 9: Java and Web SitesChapter 11: File Structures David A. Gaitros October 16th, 2000 Department of Computer Science
Java DB Connectivity (JDBC) • Figure 8.4 Strategies for implementing JDBC packages
Examples of JDBC Applications • See SqlFilter.java • See Web sites • http://enp01.enp.fsu.edu • See code in examples directory
Executing Insert and Update Statements • Create new customer, using String + int rowcount = stmt.executeUpdate( ”insert into Customer ” +”(accountId,lastName,firstName) ” +”values (1239,’Brown’,’Mary’)”);if (rowcount == 0) // insert failed • Update • String updateSQL = “update TimeCard set “ +”TimeCard.paid = 'yes’ where “ +”paid<>'yes’”;int count = stmt.execute(updateSQL);// count is number of rows affected
Executing unknown SQL • Arbitrary SQL may return table (ResultSet) or row count (int) • Statement.execute method stmt.execute(sqlStatement); result = stmt.getResultSet(); while (true) {// loop through all results if (result != null) // process result else {// result is not a ResultSet rowcount = stmt.getUpdateCount(); if (rowcount == -1) break // no more results else // process row count } result = stmt.getMoreResults()) }
Chap. 11 Managing Information in Files • Hardware Characteristics of File Systems • Hierarchy of storage • Disk drives • Hardware architecture of storage systems • Software architecture of storage systems • Manipulating Streams and Files in Java • Java classes for input and output • Readers and writers • File classes • Methods for Representing Values in Streams • Length-based fields • Fixed-length fields
Fundamental File Structure Concepts • Information is stored in application memory • Information is made persistent in files • The format of information in memory is different from its format in files • Examples of changes in format • Pointers (references in Java) have no information content • Access to files is in large chunks (pages) • cannot read or write a byte or word • The performance of applications is determined by the file performance
Disk performance, 1998 and 2000 • Capacity Seagate 8 GB (Seagate 73) • 512 bytes per sector • 170 sectors per track (423) • 16 tracks per cylinder (24) • 6526 Cylinders (14,100) • Speed • Latency • seek min 0.78 (.6), max 19, avg 8 msec (6) • rotational delay? see spindle speed (3 msec) • Transfer rate • 10,000 rpm spindle speed • Maximum transfer rate? see above • 6 msec/track, 14 KB/msec (24 KB/msec) • IBM Travelstar 6, 4200 rpm, laptop • 268 sectors/track, 2 platters, 11,648 cylinders
Physical organization of hardware • Disk drive • Disk controller, attached to drive • buffer of data to be moved to/from • controls head position • transfer of data when sector is under head • I/O controller, attached to memory bus • DMA, direct memory access • moves data from memory to/from disk controller buffer • Memory • Contains application data and I/O buffers • CPU • moves data to/from I/O buffers
Java Classes for Input and Output • See JDK (Java Development Kit) API (application programming interface) • http://java.sun.com/products/jdk/1.1/docs/api/packages.html • class InputStream { public abstract int read() throws IOException; public int read (byte[]) throws IOException; public long skip(long n) throws IOException; public int available() throws IOException; • class PrintStream extends OutputStream { public void print(String); public void println(String); public void print(int); public void print(byte); • Notice • overloading, arrays, strings, exceptions
Samples from Chapter 11 • Look at examples in package dbjava.files • class Video in Video.java • Video has main method • notice use of ‘static’, argument list • Video supports reading and writing in a variety of styles • print, printLabeled: simple text • read, write: length-based • readObject, writeObject: serializable
Simple Text Output and Input • Labeled text output, one label and field per line • public void printLabeled (PrintStream out) { out.println ("name\t"+name); out.println ("id\t"+id); out.println ("genre\t"+genre); • Look at result using “od -xc -A x video.txt • Reader and Writer classes • take advantage of Unicode characters (2 byte) • Readers support readLine • Unlabeled text input, one field per line • public void read (BufferedReader in) throws IOException { // read a Video object into the calling object name = in.readLine(); id = Integer.parseInt(in.readLine()); genre = in.readLine();
Java Exceptions and Exception Handling • Java supports exceptions with • exception class definitions class IOException extends Exception {… }; • throw statement, signals exceptional situation • try/catch blocks, contain and handle exceptions • Some methods may throw exceptions: • FileInputStream(String name) throws IOException • If there is no such file, or it cannot be opened • Sample from java.io try {file = new FileInputStream(“video.txt”);} catch (IOException e) { e.printStackTrace (System.err); }
Storing Objects in Files • Class Video {String name; int id; String genre;} • Representation in memory uses memory addresses (picture on board) • Representation in file must be different • Memory addresses have no meaning after program quits execution • File representation must preserve meaning • Length-based Fields • Store String as length plus value • 12Men in Black0312313action comedy
Data Input and Output Streams class DataOutputStream {public final void writeBoolean (boolean v) throws IOException; public final void writeByte (int v) throws IOException; public final void writeShort (int v) throws IOException; public final void writeInt (int v) throws IOException; public final void writeLong (long v) throws IOException; public final void writeFloat (float v) throws IOException; • Features • Writes binary values of Java values • Platform independent • See methods read and write of class Video
Write Length-based Fields in Binary public class Video { … public void write(DataOutputStream out) throws IOException { // write object as length-based and // fixed-length fields out.writeShort(name.length()); out.writeBytes(name); out.writeInt(id); out.writeShort(genre.length()); out.writeBytes(genre); } }; • Look at output with od (octal dump)
Class FieldOps class FieldOps { // a class that supports static operations on length-based fields public static String readLength(DataInput in) throws IOException {...} public static void writeLength(DataOutput out, String str) throws IOException{...} • Class FieldOps encapsulates the representation of a String value as length-based field • uses static methods • Consider • class FieldInputStream extends DataInputStream • this causes big problems
Styles of Representing Fields • Delimited text • text followed by delimiter • transform value into text representation • eliminate delimiter ambiguity • Length-based binary • binary length field • block of bytes, uninterpreted • determine length of block, write length first • Fixed-length • block of bytes of known length • must know length to read field • Identified field • name, delimiter, value, delimiter
Using Files in Java • File classes • class FileInputStream extends InputStream • class FileReader extends Reader • new FileInputStream (“video.txt”); • opens a file named “video.txt” in input mode • creates a connection with OS file • Method close breaks connection between stream and OS file • To create a data stream from a file • new DataInputStream(new FileInputStream(…));
Java Serializable • Java supports fully self-describing files of objects • Two data structures make this possible • interface Serializable {}; • no methods, so no effort for programmer • class Video implements Serializable { ... • classes ObjectInputStream, and ObjectOutputStream • Video v; ObjectInputStream in; • in.read (v); // gets an object from object stream • Sample in class Video • java dbjava.app_a.Video owrite video.obj • analyze program execution and output
Octal dump of Video.obj 0000000 aced 0005 7372 0005 5669 6465 6fbc cd0b ¬ í \0 005 s r \0 005 V i d e o ¼ Í \v 0000010 077f 1bda ba02 0003 4900 0269 644c 0005 \a 177 033 Ú º 002 \0 003 I \0 002 i d L \0 005 0000020 6765 6e72 6574 0012 4c6a 6176 612f 6c61 g e n r e t \0 022 L j a v a / l a 0000030 6e67 2f53 7472 696e 673b 4c00 046e 616d n g / S t r i n g ; L \0 004 n a m 0000040 6574 0012 4c6a 6176 612f 6c61 6e67 2f53 e t \0 022 L j a v a / l a n g / S 0000050 7472 696e 673b 7870 0000 007b 74000d61 t r i n g ; x p \0 \0 \0 { t \0 \r a 0000060 6374 696f 6e20 636f 6d65 6479 7400 0c4d c t i o n c o m e d y t \0 \f M 0000070 656e 2069 6e20 426c 6163 6b00 e n i n B l a c k Note self-describing with type defs, fixed length and length-based fields
Files of Records (Section 7.6) • Unit of access for applications is the record • logical record: • storage for a single object (or collection) • physical record: unit of physical access to file • Usually enforced by OS, not application • Possible organizations • Fixed-length records • each record the same size • binary search available if sorted • Variable-length records • size depends on information content • sequential search only
Random Access Files • Class RandomAccessFile • includes read, write, and seeking • implements DataInput, DataOutput • hence acts like a data input stream and a data output stream • class RecordFile • part of package dbjava.app_a • Supports read and write of any record that supports read and write to DataInput and DataOutput • See interface InputOutputRecord
COP 4710 DatabasesFall, 2000Today’s TopicChapter 13 Query Processing and Query Optimization David A. Gaitros Department of Computer Science
Chapter 13 Query Processing • Strategies for processing queries • Query optimization • First: How to represent relational DB? • Each table is a file • Record structure to store tuples • File is a random access collection of records • Query is executed by reading records from files • Read record, create object in memory • Process object • Write result as a file of records or keep in memory
Indexing from Chapter 12 • Figure 12.1 A data file and associated index • Figure 12.9 A three-level index for simple key-reference pairs
Indexed Sequential File • Fig. 12.12 Example of an indexed sequential file • B+ tree index • Ordered sequential file, linked list of records ordered by primary key • Good for range queries • Search index for first record in range • Read data file sequentially (linked list order) for rest of records
Table entries Number of entries per block Number of blocks Index fields Index type Keys per node Depth of B+ tree Customer 10,000 10 1,000 accountId B+ tree and ordered sequential file 100 3 lastName B+ tree 50 3 zipcode hash 100 Rental 1,000,000 100 10,000 accountId B+ tree 100 3 movieId B+ tree 100 3 date B+ tree 100 2 Physical characteristics of sample tables • Table 13.1
Processing a range query • Figure 13.3 Illustration of query processing for query • select * from Customer where accountId >= 101 and accountId < 300
Using hashing to eliminate duplicates • A hash function partitions values so that • All values that are the same are in the same partition • Values that are different are often in different partitions • We can find duplicates by hashing • For each tuple in the table • Mash all attribute values in the tuple into a single value • Apply hash function • For each partition • Compare all pairs of tuples • Eliminate duplicates • Why does this work?
Processing join queries with nested loops • select * from Customer c, Rental r where c.accountId = r.accountId • Simple nested loops join while (not customer.eof()) { c = customer.read(); rental.reset(); while (not rental.eof()) { r = rental.read(); if (c.accountId==r.accountId) { result.write(c, r); } } • Cost is Bc + Rc × Br • Can reduce cost by comparing all records in block of customer against all rentals
Processing join queries with indexes • Indexed nested loop join while (not customer.eof()) { Customer c= customer.read(); rental.reset(); while (not rental.eof()) { Rental r[] = rental.readByAcctId(c.accountId); for (int i=0; i<r.length; i++) { result.write(c,r[i]); result.write(c,r[I]); } } } • Cost is Bc + Rrinstead of Bc + Rc × Br without index • Reduce cost by processing a block at a time?
Hash join • Much like hashing for duplicates • Using the shared attribute R.a and S.b • Hash R.a into one hash file • Hash S.b into another file • Look at the corresponding partitions of the two files • All duplicates will be found!
Complex queries • Figure 13.9 A more complex query • Can apply equivalence rules to improve the query plan
Equivalence rule equivalent expressions cascading selections c1 ⋀ c2⋀ c3 ⋀ c4(R) c1(c2(c3(c4 (R)))) commuting selections c1 (c2 (R)) c2(c1(R)) cascading projections, for attributes sets A1 ⊆ A2 ⊆ A3 A1(A2 (A3(R))) A1(R) commuting selection and projection, for condition c uses attributes in set A c(A(R)) A(c(R)) commuting joins R⋈S S⋈R distributing selection and join, for condition c1 uses attributes in R and c2 uses attributes of S c1 ⋀ c2(R⋈S) (c1(R))⋈ (c2(S)) commuting projection and join, for attribute set A1 has attributes from R and A2 has attributes from S A1∪A2(R⋈S) A1(R))⋈(A2(S)) associating joins R⋈(S⋈T) (R⋈S)⋈T Equivalence rules • Table 13.7