EM407 Sound Application Development with SQL Anywhere

EM407Sound Application Development with SQL Anywhere Mark Culp Software Development Manager iAnywhere Solutions mark.culp@ianywhere.com

To help you develop applications that are robust, well designed, have good performance, and can scale with your database and number of users To highlight some differences between 7.0 (and prior) and 8.0 that can affect can affect your application’s design Goals of this presentation

Contents General considerations • Schema design tips • Application development tips Technical overview of • Isolation levels • Join semantics • Cursor support in SQL Anywhere – 7.0 vs. 8.0

When should you think about performance and scalability? • During the design and planning stages • Capacity planning • Improving performance for deployed databases Earlier is better!

Common areas for performance problems • Physical database organization • Database file characteristics • Indexing considerations • Schema design • Server characteristics • CPU, disk activity • Network characteristics • Application design • Query complexity • Trigger design • Locking • Workstation processing (CPU, disk)

Schema Design

Schema design • Define your tables • Normalize your data • Entity/Relationship (ER) modeling • Define appropriate primary keys for all tables • Helps in replication environments (reduces amount of data in log file) • Define appropriate foreign key relationships • Relationships needed to generate efficient join strategies • Define appropriate indexes • Don't need to create indexes for PKs or FKs • Don't over-do it ! Only define ones that are useful

Schema design:primary keys Data administration issues: • Ensure your application has complete control over key assignment and usage • Usually a very bad idea to update a primary key (especially in a replication environment) • It is exceedingly difficult to “hide” primary keys from users (or your customers) • Key formats are very difficult to change after deployment

Schema design:primary key generation • Common problem: large, composite primary keys that are difficult to search efficiently • Both retrieval and update performance can suffer • Consider surrogate primary keys; change existing keys into unique constraints or secondary indexes • Integer representation is the most efficient, for both storage and indexing • Permits the use of autoincrement PK column; hence the server does all the work • Global autoincrement can generate unique PK values in a replicated system (now in UltraLite!)

Schema design:PK generation - autoincrement • Autoincrement scales very well; useful in many situations; highly recommended • Some disadvantages of autoincrement: • Often wish to differentiate keys of different business objects • May desire randomized key generation for some applications • Alphanumeric values can aid in data consistency during data entry • Autoincrement cannot support self-checking identifiers • Think about these tradeoffs when deciding on identifier data types

Schema design:PK generation – manual Two other options: • Manually assign key ranges • Cumbersome, but can work in some business environments • Most prone to data entry errors • consider self-checking or alphanumeric identifiers to reduce data entry problems • Example: Canadian postal codes • e.g. N2L 3X2

Schema design:PK generation – key generation table • Create a separate “key generation” table, with one row per business object • To add a new key: • Initiate a new connection • Compute the next key using the existing one as its basis • Update the table, COMMIT immediately • Several disadvantages: logging, locking, possible contention • Avoid design that serializes lengthy transactions

Application Design

Application design:coding SQL statements • How you write a SQL statement does matter • Watch for join conditions involving user-defined functions, expressions, or type conversion • User defined functions that have queries in them tie the hands of the optimizer and can be inefficient

Application design:coding SQL statements • Simplify the query’s syntax if at all possible • Select list aliases are useful to identify common subexpressions (including subqueries) • e.g. Select (X+10)/2 as quotient • Eliminate unnecessary predicates, DISTINCT processing, etc. • Rewrite the WHERE clause to contain as many conjunctive conditions as possible • Critical for 7.0 • 8.0 now automatically rewrites where clause • Don't replace outer joins with a subselect

Application design:programming considerations • PREPARE/DESCRIBE once during initialization (or on first use) • Remember to drop statements at termination - de-allocate statements with SQLFreeStmt() • Bind columns whenever possible • use SQLBindCol() instead of SQLGetData() • Avoid COMMITing after every statement • This is the default behavior for both JDBC and ODBC • Avoid DDL in applications (including TRUNCATE TABLE) to avoid implicit COMMITs or CHECKPOINTs

Application design:locking & blocking • Use cursor type appropriate to application function to reduce unnecessary locks • Use SQLSetStmtOption() to set cursor attributes • SQL_CONCURRENCY to read only • SQL_CURSOR_TYPE to dynamic or forward-only • Use the BLOCKING option to specify whether or not an application blocks on a locking conflict, or receives an error

Application design:open … with hold • OPEN ... WITH HOLD only where appropriate • All locks (except on the current row) are released upon COMMIT; no guarantees about the state of the other rows • Semantics are unclear if ROLLBACK was issued • Consider setting the option ANSI_CLOSE_CURSORS_ON_ROLLBACK to force the closure of all cursors on a ROLLBACK statement • Usually there is little point to using WITH HOLD cursors at isolation levels 2 or 3

Application design:estimating result set size • Avoid doing so if at all possible • Results will not be consistent in the face of concurrent updates • At OPEN, SQLCA (sqlerrd[2]) contains an estimate of the result set size from the optimizer • Use SQLRowCount() in ODBC • If positive, estimate is accurate at the time the query was executed (i.e. single table scan) • If negative, estimate is from the optimizer

Application design:estimating result set size • Use the ROW_COUNTS option to return an accurate result • For DYNAMIC cursors, query is executed twice • Result may still change due to concurrent updates • Consider SCROLL or INSENSITIVE cursors instead • Result is computed only once • INSENSITIVE: result set size is fixed at OPEN • SCROLL: perform a FETCH ABSOLUTE n where n is “large enough” to force materialization of the entire result • sqlerrd[2] contains (n – result set size)

Application design:use of prefetch • Prefetch is designed to reduce communication in a client-server environment by transferring sets of rows to the client in advance of a FETCH request • Prefetch is ON by default • To disable outright: use the DisableMultiRowFetch connection parameter or set the Prefetch option to OFF • 8.0 turns off prefetch on cursors declared sensitive

Application design:use of prefetch • Avoid the use of prefetch for updateable cursors at isolation level 1 • In 7.0, cursor stability is not guaranteed, as the client and the server are positioned on different rows • In 8.0, new options to guarantee cursor stability • In ESQL, use BLOCK n to limit the number of rows prefetched for each FETCH request • If n is 0, prefetch is disabled

Application design:use of prefetch • Two new connection string parameters in 7.0.1: PreFetchRows and PreFetchBuffer • can specify prefetch row limit and prefetch buffer size, respectively, on a per-connection basis • Prefetch may decrease performance if: • Application requires fewer rows than the prefetched amount • Application performs FETCH ABSOLUTE or scrolls randomly through the rowset

Application design:bulk fetches • For relatively large result sets, use wide (multi-row) fetches • each request to the server obtains several rows • 6.x, 7.x, and 8.0 prefetching automatically improves bulk fetch performance • Application gives up cursor stability if prefetching is used at isolation level 1(7.0 and earlier)

Application design:bulk inserts • Use LOAD TABLE where appropriate • Use wide (multi-row) inserts • COMMIT at regular intervals to limit size of rollback log • 7.x & 8.0 uses large I/O’s to grow database and transaction log

Application design:improving performance • Objective is to reduce the work the engine is being asked to perform • Reduce number of requests sent to the server • Use procedures for multi-part transactions • Consider populating and then join to a temporary table if needing to select multiple specific rows • SELECT list should only contain attributes required by the application • Use declared temporary tables instead of permanent tables to avoid logging, locking

Application design:improving performance • Java in the database • Tradeoff how Java objects are fetched • often more flexible: fetch entire objects • often more efficient: fetch individual object values • Minimize the need to deserialize Java columns • use indexes to reduce the scan factor of tables containing Java objects • For OLTP workloads, minimize the number of rows that need to be processed • ensure that indexes can be used for ORDER BY, GROUP BY and DISTINCT

Application design:improving scalability • Be aware of the length of time your application will hold a lock on any specific row or table • Request-level logging tool can assist in pinpointing performance problems • PERL script to analyze log output • Can identify the most expensive SQL queries on a server-wide basis • Use TRANTEST to simulate production workload • Use DBTRAN to review the transaction log to identify unnecessary updates, places to use temporary tables

Application design:use of compression (new in 8.0) • Communication compression may improve between client and server over a modem or WAN: • Enable using Compress=YES in client connection string, or –pc server command line switch • Packets are compressed before encryption • Compressed data can be less than 10% of original size, but depends completely on data and the application • Consider increasing packet size to achieve greater compression and less number of packets • Compression requires additional ~46K per connection • You must analyze your application's throughput

Some Technical Details

Isolation levels • Isolation levels only affect behavior of read requests from other connections/transactions; writes are always locked • Isolation levels for read requests: • 0 (default) - no locking; a latch ensures that the entire row is consistent when retrieved from the disk page • 1,2 - lock rows in the query’s result, but with level 1 the lock is held only while the cursor is on that row • 3 - lock every row read and every insertion point crossed during query execution • Add/remove foreign key row requires a read lock on the primary row

Isolation levels: recommendations • Use the isolation level that offers your application the best trade-off of consistency with concurrency • NB. nothing is guaranteed at level 0 (“dirty read”) • For isolation level 3, ensure the server can exploit the use of indexes to limit the amount of locking performed • Specify ISOLATION LEVEL on a cursor basis instead of modifying the option setting

Join semantics • ASA supports: • INNER • LEFT OUTER • RIGHT OUTER • CROSS (Cartesian product) • FULL OUTER (new to 8.0) • Variants: • KEY: generates FK-PK equijoin condition • NATURAL: generates equijoin condition over columns with identical names

Join semantics: some details • 7.x release introduces restrictions on permitted syntax • Restrictions on outer references in an ON condition • New option: Extended_join_syntax • By default, table expressions are left-deep nested • R Join S Join T = (R Join S) Join T • “Star” joins are specified by using the same table reference (correlation name) more than once Select * From R Left Outer Join S On (R.x = S.x), R Left Outer Join T On (R.x = T.x and T.y = 25)

Join semantics: some details • Semantics of KEY joins are complex • Verify that your query is producing the intended result • See documentation for details

Join semantics: new to 8.0 • Support for FULL OUTER JOIN • In addition to naïve nested loop join, 8.0 introduces new join operators • nested block, hash, sort merge • if no ORDER BY by clause specified, order of results may vary depending on chosen plan; so don't make any assumptions • dbisql's query plan viewer can be used to visually display plan chosen for complex joins See AM33: ASA Internals

Cursors • ESQL cursor types: • no scroll, dynamic scroll (default), scroll, insensitive • sensitive (new to 8.0) • ODBC cursor types • static, dynamic, keyset, mixed, forward-only (default)

Cursors semantics • Cursor semantics are dependent on: • Membership sensitivity • Value sensitivity • Scrollability (forward only or scrollable) • Updatability (read-only or updateable)

Cursors – membership sensitivity Membership sensitivity: • Insensitive: result rows are fixed at open; no changes • Repeatable: result rows will not change once fetched • Sensitive: result rows will change with respect to concurrent inserts, deletes, and updates • Asensitive: result rows may or may not change depending on update activity and chosen plan

Cursors – some definitions Value-sensitivity (VS): • Insensitive: data values will not change once fetched • Sensitive: data values will change with respect to concurrent updates • Asensitive: data values may or may not change depending on update activity and chosen plan

Cursor combinations Cursor type can be altered by server to be more restrictive than what was requested

iAnywhere Solutions Highlights • Ask the Experts - about Mobile & Wireless Solutions • -Mezzanine Level Room 15B • Mon./Tues. 11:30 am - 3:30 pm; Wed. 11:30 - 1:30; • Thurs. 9 am - 12 noon • -Exhibit Hall - Demo Center(truck) exhibit hall hours • SIG (Special Interest Group) • - Tuesday 5:30pm Mobile & Wireless SDCC, Upper level, Room 11 • Keynote - Enabling m-Business Solutions • Wednesday 1:30 pm - 3:00 pm • iAnywhere Solutions Developer Community • -Excellent resource for commonly asked questions, newsgroups, bug • fixes, newsletters, event listings - visit www.ianywhere.com/developer

Thank you Questions ? mark.culp@ianywhere.com www.ianywhere.com

EM407 Sound Application Development with SQL Anywhere

EM407 Sound Application Development with SQL Anywhere

Presentation Transcript

SQL921: SQL Anywhere Roadmap

SAS916: ADO.NET in SQL Anywhere Studio

SQL Anywhere 9 Developer Tour

Mobile Application Development with ANDROID

Accelerating Application Development with Objects

Anywhere With Jesus

EM410 Upgrading Your SQL Anywhere Studio Applications

Application Development with BRL-CAD

MES905: Building Mobile Web Applications with SQL Anywhere

MES908: Mobile Messaging with SQL Anywhere

Application Development with VB.NET

SQL 929: SQL Anywhere Performance Evaluation and Benchmarking

Application Development with HTML5

Mobile Application Development with ANDROID

Anywhere With Jesus

SAS914 – SQL Anywhere 9 New Feature Briefing

Application development with Java

Application development with Java