1 / 37

David A. Gaitros October 16th, 2000 Department of Computer Science

COP 4710 Databases Fall, 2000 Today’s Topic Chapter 9: Java and Web Sites Chapter 11: File Structures. David A. Gaitros October 16th, 2000 Department of Computer Science. Current Events. Java DB Connectivity (JDBC). Figure 8.4 Strategies for implementing JDBC packages.

Download Presentation

David A. Gaitros October 16th, 2000 Department of Computer Science

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. COP 4710 DatabasesFall, 2000Today’s TopicChapter 9: Java and Web SitesChapter 11: File Structures David A. Gaitros October 16th, 2000 Department of Computer Science

  2. Current Events

  3. Java DB Connectivity (JDBC) • Figure 8.4 Strategies for implementing JDBC packages

  4. Examples of JDBC Applications • See SqlFilter.java • See Web sites • http://enp01.enp.fsu.edu • See code in examples directory

  5. Executing Insert and Update Statements • Create new customer, using String + int rowcount = stmt.executeUpdate( ”insert into Customer ” +”(accountId,lastName,firstName) ” +”values (1239,’Brown’,’Mary’)”);if (rowcount == 0) // insert failed • Update • String updateSQL = “update TimeCard set “ +”TimeCard.paid = 'yes’ where “ +”paid<>'yes’”;int count = stmt.execute(updateSQL);// count is number of rows affected

  6. Executing unknown SQL • Arbitrary SQL may return table (ResultSet) or row count (int) • Statement.execute method stmt.execute(sqlStatement); result = stmt.getResultSet(); while (true) {// loop through all results if (result != null) // process result else {// result is not a ResultSet rowcount = stmt.getUpdateCount(); if (rowcount == -1) break // no more results else // process row count } result = stmt.getMoreResults()) }

  7. Chap. 11 Managing Information in Files • Hardware Characteristics of File Systems • Hierarchy of storage • Disk drives • Hardware architecture of storage systems • Software architecture of storage systems • Manipulating Streams and Files in Java • Java classes for input and output • Readers and writers • File classes • Methods for Representing Values in Streams • Length-based fields • Fixed-length fields

  8. Fundamental File Structure Concepts • Information is stored in application memory • Information is made persistent in files • The format of information in memory is different from its format in files • Examples of changes in format • Pointers (references in Java) have no information content • Access to files is in large chunks (pages) • cannot read or write a byte or word • The performance of applications is determined by the file performance

  9. Disk performance, 1998 and 2000 • Capacity Seagate 8 GB (Seagate 73) • 512 bytes per sector • 170 sectors per track (423) • 16 tracks per cylinder (24) • 6526 Cylinders (14,100) • Speed • Latency • seek min 0.78 (.6), max 19, avg 8 msec (6) • rotational delay? see spindle speed (3 msec) • Transfer rate • 10,000 rpm spindle speed • Maximum transfer rate? see above • 6 msec/track, 14 KB/msec (24 KB/msec) • IBM Travelstar 6, 4200 rpm, laptop • 268 sectors/track, 2 platters, 11,648 cylinders

  10. Physical organization of hardware • Disk drive • Disk controller, attached to drive • buffer of data to be moved to/from • controls head position • transfer of data when sector is under head • I/O controller, attached to memory bus • DMA, direct memory access • moves data from memory to/from disk controller buffer • Memory • Contains application data and I/O buffers • CPU • moves data to/from I/O buffers

  11. Java Classes for Input and Output • See JDK (Java Development Kit) API (application programming interface) • http://java.sun.com/products/jdk/1.1/docs/api/packages.html • class InputStream { public abstract int read() throws IOException; public int read (byte[]) throws IOException; public long skip(long n) throws IOException; public int available() throws IOException; • class PrintStream extends OutputStream { public void print(String); public void println(String); public void print(int); public void print(byte); • Notice • overloading, arrays, strings, exceptions

  12. Samples from Chapter 11 • Look at examples in package dbjava.files • class Video in Video.java • Video has main method • notice use of ‘static’, argument list • Video supports reading and writing in a variety of styles • print, printLabeled: simple text • read, write: length-based • readObject, writeObject: serializable

  13. Simple Text Output and Input • Labeled text output, one label and field per line • public void printLabeled (PrintStream out) { out.println ("name\t"+name); out.println ("id\t"+id); out.println ("genre\t"+genre); • Look at result using “od -xc -A x video.txt • Reader and Writer classes • take advantage of Unicode characters (2 byte) • Readers support readLine • Unlabeled text input, one field per line • public void read (BufferedReader in) throws IOException { // read a Video object into the calling object name = in.readLine(); id = Integer.parseInt(in.readLine()); genre = in.readLine();

  14. Java Exceptions and Exception Handling • Java supports exceptions with • exception class definitions class IOException extends Exception {… }; • throw statement, signals exceptional situation • try/catch blocks, contain and handle exceptions • Some methods may throw exceptions: • FileInputStream(String name) throws IOException • If there is no such file, or it cannot be opened • Sample from java.io try {file = new FileInputStream(“video.txt”);} catch (IOException e) { e.printStackTrace (System.err); }

  15. Storing Objects in Files • Class Video {String name; int id; String genre;} • Representation in memory uses memory addresses (picture on board) • Representation in file must be different • Memory addresses have no meaning after program quits execution • File representation must preserve meaning • Length-based Fields • Store String as length plus value • 12Men in Black0312313action comedy

  16. Data Input and Output Streams class DataOutputStream {public final void writeBoolean (boolean v) throws IOException; public final void writeByte (int v) throws IOException; public final void writeShort (int v) throws IOException; public final void writeInt (int v) throws IOException; public final void writeLong (long v) throws IOException; public final void writeFloat (float v) throws IOException; • Features • Writes binary values of Java values • Platform independent • See methods read and write of class Video

  17. Write Length-based Fields in Binary public class Video { … public void write(DataOutputStream out) throws IOException { // write object as length-based and // fixed-length fields out.writeShort(name.length()); out.writeBytes(name); out.writeInt(id); out.writeShort(genre.length()); out.writeBytes(genre); } }; • Look at output with od (octal dump)

  18. Class FieldOps class FieldOps { // a class that supports static operations on length-based fields public static String readLength(DataInput in) throws IOException {...} public static void writeLength(DataOutput out, String str) throws IOException{...} • Class FieldOps encapsulates the representation of a String value as length-based field • uses static methods • Consider • class FieldInputStream extends DataInputStream • this causes big problems

  19. Styles of Representing Fields • Delimited text • text followed by delimiter • transform value into text representation • eliminate delimiter ambiguity • Length-based binary • binary length field • block of bytes, uninterpreted • determine length of block, write length first • Fixed-length • block of bytes of known length • must know length to read field • Identified field • name, delimiter, value, delimiter

  20. Using Files in Java • File classes • class FileInputStream extends InputStream • class FileReader extends Reader • new FileInputStream (“video.txt”); • opens a file named “video.txt” in input mode • creates a connection with OS file • Method close breaks connection between stream and OS file • To create a data stream from a file • new DataInputStream(new FileInputStream(…));

  21. Java Serializable • Java supports fully self-describing files of objects • Two data structures make this possible • interface Serializable {}; • no methods, so no effort for programmer • class Video implements Serializable { ... • classes ObjectInputStream, and ObjectOutputStream • Video v; ObjectInputStream in; • in.read (v); // gets an object from object stream • Sample in class Video • java dbjava.app_a.Video owrite video.obj • analyze program execution and output

  22. Octal dump of Video.obj 0000000 aced 0005 7372 0005 5669 6465 6fbc cd0b ¬ í \0 005 s r \0 005 V i d e o ¼ Í \v 0000010 077f 1bda ba02 0003 4900 0269 644c 0005 \a 177 033 Ú º 002 \0 003 I \0 002 i d L \0 005 0000020 6765 6e72 6574 0012 4c6a 6176 612f 6c61 g e n r e t \0 022 L j a v a / l a 0000030 6e67 2f53 7472 696e 673b 4c00 046e 616d n g / S t r i n g ; L \0 004 n a m 0000040 6574 0012 4c6a 6176 612f 6c61 6e67 2f53 e t \0 022 L j a v a / l a n g / S 0000050 7472 696e 673b 7870 0000 007b 74000d61 t r i n g ; x p \0 \0 \0 { t \0 \r a 0000060 6374 696f 6e20 636f 6d65 6479 7400 0c4d c t i o n c o m e d y t \0 \f M 0000070 656e 2069 6e20 426c 6163 6b00 e n i n B l a c k Note self-describing with type defs, fixed length and length-based fields

  23. Files of Records (Section 7.6) • Unit of access for applications is the record • logical record: • storage for a single object (or collection) • physical record: unit of physical access to file • Usually enforced by OS, not application • Possible organizations • Fixed-length records • each record the same size • binary search available if sorted • Variable-length records • size depends on information content • sequential search only

  24. Random Access Files • Class RandomAccessFile • includes read, write, and seeking • implements DataInput, DataOutput • hence acts like a data input stream and a data output stream • class RecordFile • part of package dbjava.app_a • Supports read and write of any record that supports read and write to DataInput and DataOutput • See interface InputOutputRecord

  25. COP 4710 DatabasesFall, 2000Today’s TopicChapter 13 Query Processing and Query Optimization David A. Gaitros Department of Computer Science

  26. Current Events

  27. Chapter 13 Query Processing • Strategies for processing queries • Query optimization • First: How to represent relational DB? • Each table is a file • Record structure to store tuples • File is a random access collection of records • Query is executed by reading records from files • Read record, create object in memory • Process object • Write result as a file of records or keep in memory

  28. Indexing from Chapter 12 • Figure 12.1 A data file and associated index • Figure 12.9 A three-level index for simple key-reference pairs

  29. Indexed Sequential File • Fig. 12.12 Example of an indexed sequential file • B+ tree index • Ordered sequential file, linked list of records ordered by primary key • Good for range queries • Search index for first record in range • Read data file sequentially (linked list order) for rest of records

  30. Table entries Number of entries per block Number of blocks Index fields Index type Keys per node Depth of B+ tree Customer 10,000 10 1,000 accountId B+ tree and ordered sequential file 100 3 lastName B+ tree 50 3 zipcode hash 100 Rental 1,000,000 100 10,000 accountId B+ tree 100 3 movieId B+ tree 100 3 date B+ tree 100 2 Physical characteristics of sample tables • Table 13.1

  31. Processing a range query • Figure 13.3 Illustration of query processing for query • select * from Customer where accountId >= 101 and accountId < 300

  32. Using hashing to eliminate duplicates • A hash function partitions values so that • All values that are the same are in the same partition • Values that are different are often in different partitions • We can find duplicates by hashing • For each tuple in the table • Mash all attribute values in the tuple into a single value • Apply hash function • For each partition • Compare all pairs of tuples • Eliminate duplicates • Why does this work?

  33. Processing join queries with nested loops • select * from Customer c, Rental r where c.accountId = r.accountId • Simple nested loops join while (not customer.eof()) { c = customer.read(); rental.reset(); while (not rental.eof()) { r = rental.read(); if (c.accountId==r.accountId) { result.write(c, r); } } • Cost is Bc + Rc × Br • Can reduce cost by comparing all records in block of customer against all rentals

  34. Processing join queries with indexes • Indexed nested loop join while (not customer.eof()) { Customer c= customer.read(); rental.reset(); while (not rental.eof()) { Rental r[] = rental.readByAcctId(c.accountId); for (int i=0; i<r.length; i++) { result.write(c,r[i]); result.write(c,r[I]); } } } • Cost is Bc + Rrinstead of Bc + Rc × Br without index • Reduce cost by processing a block at a time?

  35. Hash join • Much like hashing for duplicates • Using the shared attribute R.a and S.b • Hash R.a into one hash file • Hash S.b into another file • Look at the corresponding partitions of the two files • All duplicates will be found!

  36. Complex queries • Figure 13.9 A more complex query • Can apply equivalence rules to improve the query plan

  37. Equivalence rule equivalent expressions cascading selections c1 ⋀ c2⋀ c3 ⋀ c4(R) c1(c2(c3(c4 (R)))) commuting selections c1 (c2 (R)) c2(c1(R)) cascading projections, for attributes sets A1 ⊆ A2 ⊆ A3 A1(A2 (A3(R))) A1(R) commuting selection and projection, for condition c uses attributes in set A c(A(R)) A(c(R)) commuting joins R⋈S S⋈R distributing selection and join, for condition c1 uses attributes in R and c2 uses attributes of S c1 ⋀ c2(R⋈S) (c1(R))⋈ (c2(S)) commuting projection and join, for attribute set A1 has attributes from R and A2 has attributes from S A1∪A2(R⋈S) A1(R))⋈(A2(S)) associating joins R⋈(S⋈T) (R⋈S)⋈T Equivalence rules • Table 13.7

More Related