500 likes | 681 Views
DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications. John Goodson Vice President, DataDirect, R&D. Agenda. Topic: Understanding how drivers work and what factors influence performance Topic: Guidelines for improving performance Topic: Review.
E N D
DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications John Goodson Vice President, DataDirect, R&D
Agenda • Topic: Understanding how drivers work and what factors influence performance • Topic: Guidelines for improving performance • Topic: Review DB-15 ODBC/JDBC OpenEdge™ Applications
At The End, You Should Understand: • How driver implementations trade off performance for functionality • How to design for performance • How to avoid pitfalls in some drivers DB-15 ODBC/JDBC OpenEdge™ Applications
ODBC • Provides a common API to all data stores • Cross platform • Initially a Microsoft standard based on a draft pre-release of the X/Open SQL Access Group CLI • ODBC 3.0 aligns with the ISO and ANSI CLI standards DB-15 ODBC/JDBC OpenEdge™ Applications
JDBC • JDBC is a Java™ interface for accessing SQL data sources • Based on ODBC, Open Group SQL CLI, and ANSI/ISO CLI • JDBC 1.0 designed for basic usage with emphasis on ease of use • JDBC 2.0 enhanced for advanced features and server-side use • JDBC 3.0 goal was to “round out” API by providing missing functionality DB-15 ODBC/JDBC OpenEdge™ Applications
Performance Issues • Developing code that works is hard enough without having to make it work fast! • Tough to know all the specifics of a single backend • No exceptions are thrown to say your code is running too slow • Most performance “help” is DBMS server focused … not middleware focused DB-15 ODBC/JDBC OpenEdge™ Applications
How Do Drivers Work? • Each database vendor defines a communication protocol of how the database server will communicate with clients • This communication protocol is defined in terms of packets that will be transferred using a network transport (i.e. information will be encoded into a specific database defined format that will be sent over the network (tcp-ip) to/from clients) • The database server only understands requests that are sent in this format DB-15 ODBC/JDBC OpenEdge™ Applications
How Do Drivers Work – Part 2 • Every database vendor defines their packet descriptions and their model differently • Oracle and DB2 are cursor based models • SQL Server and Sybase are streaming models • Progress OpenEdge is a cursor based model • Every JDBC request will most likely result in many packets being sent to/from the database • Why? • (1) packets are fixed size and data might not fit in one packet • (2) JDBC API does not map 1:1 with database communication protocol DB-15 ODBC/JDBC OpenEdge™ Applications
Why Do We Care How Drivers Work? • The two slowest things in a database system are • Network i/o • Disk i/o • The goal is to keep applications using the CPU and not waiting on movement of data across a network or to/from a storage device. DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do? • Connect • Look for data • Read data • Change data • Delete data • Make changes permanent (commit) • Disconnect DB-15 ODBC/JDBC OpenEdge™ Applications
What Happens When We Connect? • Login credentials verified by dbms • Shared memory (allocated and) assigned to connection • Depending on db between 24k and 64k • Various queues updated with connection presence • Connection recorded in audit files on disk • Database context established • Code page set • Transaction information initialized • Connection properties applied • Initialization statements executed DB-15 ODBC/JDBC OpenEdge™ Applications
Connecting • Connecting is the single slowest thing you can do in a well tuned application • Lots of network i/o’s … disk i/o’s … CPU cycles • Using connection pooling, we can make the slowest (network i/o and disk i/o intensive) operation in your application one of the fastest (using no network i/o or disk i/o) • With JDBC, Connection pooling requires use of JDBC DataSources … don’t connect via Driver Manager DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do? • Connect • Look for data • Read data • Change data • Delete data • Make changes permanent (commit) • Disconnect DB-15 ODBC/JDBC OpenEdge™ Applications
Looking for Data – Catalog Functions/DatabaseMetaData • Compared to other xDBC methods, catalog calls/DatabaseMetaData methods that return ResultSet objects are slow • getTables, getColumns, getProcedures, … • xDBC drivers have to execute complex queries – and many times multiple queries – in order to return the results mandated by the xDBC specs. DB-15 ODBC/JDBC OpenEdge™ Applications
Getting a List of Tables • What you want: a list of all tables that I have starting with the prefix NYC • ResultSet WSrs = WSc.getTables (null, null, “NYC%”, null); DB-15 ODBC/JDBC OpenEdge™ Applications
Getting a List of Tables (part 2) • What a driver thinks you want: A list of all tables, views, synonyms, system tables, aliases, or temporary tables owned by any schema in any catalog/database starting with the prefix NYC • ResultSet WSrs = WSc.getTables (“cat1”, “johng”, “NYC%”, “TABLE”); DB-15 ODBC/JDBC OpenEdge™ Applications
Avoid Search Patterns for DatabaseMetaData Methods • Using null arguments or search patterns to Catalog/DatabaseMetaData methods results in time-consuming queries and, potentially, increases network traffic due to unwanted results. • Common places to see null that should be “fixed” • Schema name (most all) • Table type (getTables/SQLTables) • Catalog name (most all) DB-15 ODBC/JDBC OpenEdge™ Applications
select a.owner, a.object_name, INSTR(a.object_type, 'PROC'), '1' from sys.all_objects a where a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID' union all select c.owner, c.synonym_name, INSTR(a.object_type, 'PROC'), '1' from sys.all_objects a, sys.all_synonyms c where c.table_owner = a.owner and c.table_name = a.object_name and a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID' union all select distinct b.owner, CONCAT(b.package_name, '.' || b.object_name), min(b.position), max(b.overload) from sys.all_arguments b where b.package_name IS NOT NULL group by b.owner, CONCAT(b.package_name, '.' || b.object_name) union all select distinct c.owner, CONCAT(c.synonym_name, '.' || b.object_name), min(b.position), max(b.overload) from sys.all_arguments b, sys.all_synonyms c where c.table_owner = b.owner and c.table_name = b.package_name and b.package_name IS NOT NULL group by c.owner, CONCAT(c.synonym_name, '.' || b.object_name) union all select distinct c.owner, c.synonym_name, min(b.position), max(b.overload) from sys.all_arguments b, sys.all_synonyms c where c.owner = b.owner and c.table_owner=b.package_name and c.table_name=b.object_name group by c.owner, c.synonym_name Oracle SQL … getProcedures … yuck! DB-15 ODBC/JDBC OpenEdge™ Applications
getTypeInfo • Even getTypeInfo may result in a network round-trip • Do this once and cache the information DB-15 ODBC/JDBC OpenEdge™ Applications
Use Prepared Statement Metadata Instead of getColumns (when possible) • getColumns/SQLColumns • driver sends a potentially complex SQL query to the server • SQL is preprocessed and executed on the server • driver must then obtain result set descriptions, bind results, and retrieve rowset from the server • Usually at least 3 network i/o’s (probably more) • ResultSetMetaData/SQLDescribeCol • Simple query, which evaluates to no rows, sent to server – NOT EXECUTED • Only result set information obtained • Usually 2 network i/o (prepare + close) DB-15 ODBC/JDBC OpenEdge™ Applications
Example PreparedStatement pstmt = WScon.prepareStatement (“select * from StockPurchasePlan where 0 = 1”); ResultSetMetaData rsmd = pstmt.getMetaData(); int count = rsmd.getColumnCount; // this code would replace // getColumns (“ct1”, “johng”, // “StockPurchasePlan”, “%”); DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do? • Connect • Look for data • Read data • Change data • Delete data • Make changes permanent (commit) • Disconnect DB-15 ODBC/JDBC OpenEdge™ Applications
Prepared Statements • Many times our applications do the same things over and over again by specifying SQL that is parameterized • SELECT firstname, lastname, address, phone FROM employees WHERE empid = ? • UPDATE balance SET balance = balance + ? WHERE acctid = ? DB-15 ODBC/JDBC OpenEdge™ Applications
Prepared Statements (2) • Many times prepared statements are not as efficient as executing the SQL directly (using the Statement object in JDBC/SQLExecDirect in ODBC) • Use prepared statements if the query is to be executed more than twice in this session OR if it will be executed frequently by the system (JDBC) • Execute directly if the query is to be executed once or twice DB-15 ODBC/JDBC OpenEdge™ Applications
Prepared Statement Side Effect • Side effect: There is no way to control what gets put in the prepared statement pool. If you create a prepared statement, then it’s going into the pool and might remove some other statement you want to stay in the pool. DB-15 ODBC/JDBC OpenEdge™ Applications
Even More on Prepared Statements • Prepare, Execute, Describe • Always preferred • Saves network roundtrip with most databases • Prepare, Describe, Execute • Always slower DB-15 ODBC/JDBC OpenEdge™ Applications
Stored Procedure Invocation • Use parameter markers for arguments to stored procedures … do not use literal arguments {call getFunds (?,?,?)} // good {call getFunds (1234, ‘JPG’, 53.23)} // bad DB-15 ODBC/JDBC OpenEdge™ Applications
Why Not Use Literal Arguments? • The spec says not to use literals • More code for the programmer, but executions can be done via RPCs (remote procedure calls) instead of executed as language events • Use of RPCs improves performance by • Eliminating parameter processing on server • Parsing done on server DB-15 ODBC/JDBC OpenEdge™ Applications
Example: • {call foo (12,’abc’)} • Server gets a generic SQL packet • Parse SQL • Semantically validate • Does procedure foo exist? • Is foo defined to have 2 arguments? • SQL is always a Unicode character string • Look up data type for foo’s parameter 1 (int) • Convert ’12’ to an integer • Look up data type for foo’s parameter 2 (varchar) • Convert ‘abc’ to varchar • Execute DB-15 ODBC/JDBC OpenEdge™ Applications
Example: • {call foo (?,?)} • Server gets a specific stored procedure packet where the packet structure contains the arguments in the defined server format for the parameters • Execute only DB-15 ODBC/JDBC OpenEdge™ Applications
Understand Cursor Types • Forward only cursors are always the fastest cursor type • (most of the time) static/insensitive scrollable cursors will retrieve all rows for the query to the client • (most of the time) keyset/sensitive scrollable cursors will retrieve all keys to the client, then each fetch (next) will result in an execute DB-15 ODBC/JDBC OpenEdge™ Applications
Forward-Only Cursors • driver will pre-fetch rows from the server • For small result sets, can fit entire contents of the result set in one network roundtrip • Side effects: • First fetch will be slower than subsequent fetches for application • Cursor position is not where you “think” it is • Read Committed data might have already been read DB-15 ODBC/JDBC OpenEdge™ Applications
Scroll Insensitive/Static Cursors • drivers will usually cache the entire contents of the result set on the client • “Lazy fetch” model … only fetch the data when you have to. • If you request a scroll sensitive cursor and the driver can’t support a scroll sensitive cursor, then it will downgrade to scroll insensitive • Not scalable! DB-15 ODBC/JDBC OpenEdge™ Applications
Keyset/Scroll Sensitive Cursors • drivers usually cache a set of keys to the client • Example, ‘select * from table’ with Oracle • Results in the driver executing: ‘select rowid from table’ • Retrieves all rowids • Each call to SQLFetch/rs.next() will result in the driver executing ‘select * from table where rowid = ?’ and retrieving the result DB-15 ODBC/JDBC OpenEdge™ Applications
Use Batches Instead of Prepared Statements • Bulk load of data has typically been done by preparing an INSERT statement and executing many times. • Preparing an INSERT statement and executing 100 times: • 101 network round-trips • Using batch queries/addBatch() to compound 100 inserts: • 1 network round-trip • Network round-trips are more expensive than DBMS CPU cycles DB-15 ODBC/JDBC OpenEdge™ Applications
Arrays • In ODBC, use SQLParamOptions to execute multiple parameterized queries with one network roundtrip • Not as easy in JDBC; however, use batch queries and eventually all drivers will optimize to use parameter arrays DB-15 ODBC/JDBC OpenEdge™ Applications
Retrieving JDBC Data • The hard work is not just positioning to the data, follow a few more guidelines when putting data into buffers • Use getter methods with result column ordinals instead of with column names. String fullname = rs.getString (3); // good String fullname = rs.getString (“lname”); // bad DB-15 ODBC/JDBC OpenEdge™ Applications
JDBC Retrieving Data (2) • Only use getObject() unless it’s absolutely necessary. String children = (String) rs.getObject (2); // bad Int children = rs.getInt (2); // fast DB-15 ODBC/JDBC OpenEdge™ Applications
Avoid Long Data (Including LOBs) • Select * from table • Easy! Most people do this! • Slow! • Even if you don’t ask for column data, a performance penalty has been paid as the data has already been moved across the network to the client • Particularly bad when the table contains long varchar/long varbinary/LOB data • Compounded because the driver is trying to array fetch … which is might not be able to do DB-15 ODBC/JDBC OpenEdge™ Applications
CLOB/BLOB • Using CLOB and BLOB objects sparingly • Although you might think that only parts of the LOB are being transported across the network, most database wire protocols retrofit LOBs into existing packet structures • JDBC drivers emulate getClob/getBlob by retrieving the entire object locally DB-15 ODBC/JDBC OpenEdge™ Applications
Various other Guidelines • Use getGeneratedKeys() • Saves network i/o’s • Don’t call rs.last() or fetch absolute last • Don’t write code that relies on the number of result rows from a query • Drivers have no idea how many rows a query will return • Drivers have to fetch all rows to know how many rows the query will return DB-15 ODBC/JDBC OpenEdge™ Applications
More guidelines • Understand: • setMaxRows • setMaxFieldSize • Both limit network i/o by requesting that the server not send data across the wire • Side effects: • setMaxRows also applies to updates/deletes in some databases • These options will persist in the connection DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do? • Connect • Look for data • Read data • Change data • Delete data • Make changes permanent (commit) • Disconnect DB-15 ODBC/JDBC OpenEdge™ Applications
Transactions • Committing transactions is expensive! • Network i/o’s • Numerous non-sequential disk i/o’s • Don’t run in auto-commit mode • Distributed txs are at least 4 times as slow as local transactions • Use DTC/JTA only if required! DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do? • Connect • Look for data • Read data • Change data • Delete data • Make changes permanent (commit) • Disconnect DB-15 ODBC/JDBC OpenEdge™ Applications
Summary • Avoid: • Network i/o • Disk i/o • Think about what drivers are doing to process your requests • Stop lazy programming ... Fill in the null arguments DB-15 ODBC/JDBC OpenEdge™ Applications
Summary • Avoid network I/O • Avoid disk I/O • Avoid sloppy programming DB-15 ODBC/JDBC OpenEdge™ Applications
Questions? DB-15 ODBC/JDBC OpenEdge™ Applications
Thank you for your time! DB-15 ODBC/JDBC OpenEdge™ Applications