Learn about the scalability and reliability of MobiLink for supporting a large number of remote databases and maximizing performance. Discover performance tips and recommendations for optimizing throughput.
EM419 MobiLink Advanced Scalability and Reliability • Reg Domaratzki • Sustaining Engineering • iAnywhere Solutions • Reg.Domaratzki@sybase.com
Do you use MobiLink yet?
• Which version of MobiLink?
• Which type of clients? Adaptive Server Anywhere (ASA) or UltraLite
• Which type of consolidated database? ASA, ASE, MS SQL Server, Oracle8, IBM DB2 UDB
• How many remote databases?
Common questions
We are often asked the following:
• What is the maximum number of remote users?
• How scalable is MobiLink?
Goals of this presentation
I will try to convince you that MobiLink:
• Scales ideally with increasing remote databases
• Makes efficient use of hardware
• Has modest hardware requirements
I want you to:
• Use MobiLink for a large number of remote databases
• Get the best performance
Benefits
You can:
• Support a large number of remote databases
• Predict performance for a large number of remote databases from tests with a small number
• Maximize throughput by following performance tips
Performance of MobiLink
• MobiLink overview
• What takes time in a MobiLink synchronization?
• How performance was measured
• Results of performance testing
  • Optimum number of worker threads
  • Number of clients
  • Size of synchronizations
  • Parallel efficiency
• Recommendations and next steps
What is MobiLink?
• A two-way synchronization technology for large-scale mobile database deployment
  • remote database (mobile, embedded, or workgroup server database)
  • consolidated database (enterprise, workgroup, or desktop database)
• A server that processes synchronization requests from mobile databases
MobiLink design goals
• Heterogeneous consolidated database
• Scalable and robust (tens of thousands of remote databases)
• Manageable in large deployments
• Support handheld and wireless devices
• Flexible
Designed for scalability
• Connection pooling
• Worker threads
• Little or no disk access
• Almost no contention in MobiLink
What takes time in a synchronization?
• Connections
• Upload
• Download
Connections
• Remote database (client) to MobiLink
  • Overhead of creating network connection
  • Client may have to wait for an available MobiLink worker thread
• MobiLink to consolidated database
  • Worker thread uses database connection from pool
  • Each database connection is tied to a sync version
  • Reconnection on error or change in sync version
• Tip: # db connections = # versions × # workers
Upload: client to MobiLink
• Data transfer from client to MobiLink worker thread
  • upload size
  • bandwidth
  • packing reduces transfer of zero-valued bytes
  • some client processing with UltraLite clients
• worker does character set translation to Unicode
• all in memory, unless upload or BLOB cache overflows to disk
• Tip: upload cache (-u) ≥ largest upload × # workers
• Tip: BLOB cache (-bc) ≥ 2 × largest BLOB data in a row × # workers
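The sizing tips for connections and caches reduce to simple products, so they can be sketched as a small calculator. This is only a rough illustration: the function and parameter names are invented here, and the relations assume the tips read as "connections = versions × workers", "upload cache ≥ largest upload × workers" and "BLOB cache ≥ 2 × largest BLOB data in a row × workers". The 4,096-byte BLOB size is a made-up example value.

```python
def db_connections(num_versions: int, num_workers: int) -> int:
    """Pool size so that every worker can hold one connection per script version."""
    return num_versions * num_workers


def upload_cache_bytes(largest_upload_bytes: int, num_workers: int) -> int:
    """Lower bound for the upload cache: every worker holds the largest upload."""
    return largest_upload_bytes * num_workers


def blob_cache_bytes(largest_blob_row_bytes: int, num_workers: int) -> int:
    """Lower bound for the BLOB cache: twice the largest BLOB data in a row, per worker."""
    return 2 * largest_blob_row_bytes * num_workers


if __name__ == "__main__":
    workers = 5
    # 1,000 rows of 92 bytes each, as in the tests described in this talk.
    largest_upload = 1_000 * 92
    print("db connections:", db_connections(num_versions=2, num_workers=workers))
    print("upload cache  :", upload_cache_bytes(largest_upload, workers), "bytes")
    # Hypothetical 4,096-byte BLOB row, for illustration only.
    print("BLOB cache    :", blob_cache_bytes(4_096, workers), "bytes")
```

Treat the results as starting points for the -u and -bc settings, then confirm with measurements on your own workload.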
Upload: MobiLink to consolidated
• MobiLink worker thread applies upload to consolidated database via your upload synchronization scripts
• time dictated by consolidated database performance
  • simultaneous connections
  • concurrency
  • size of transactions
  • network bandwidth
Download: consolidated to MobiLink
• MobiLink worker thread fetches data to be downloaded via your download synchronization scripts
• time dictated by consolidated database performance
• MobiLink uses same BLOB cache as for upload
Download: MobiLink to client
• Data transferred from MobiLink worker thread to client
• worker does character set translation from Unicode
• more client processing than in upload
• download size
• bandwidth
• client processing
• MobiLink worker thread waits for client acknowledgement
  • This is optional in v8
  • With very slow clients, we found that a MobiLink worker thread could spend the majority of its time waiting for acknowledgement of the download stream
Scaling up to more clients
• More worker threads allow more simultaneous syncs
• Ideally: total time ≈ single sync time × # clients / # workers (assuming # clients ≥ # workers)
• Neglects contention and multitasking overhead
• In practice, should hit a limit where increasing worker threads does not reduce total time
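The ideal scaling estimate can be written down directly. The sketch below assumes the relation is total time ≈ single sync time × # clients / # workers; the function name and the 2-second sync time are illustrative only, and the model deliberately ignores contention and multitasking overhead.

```python
def ideal_total_time(single_sync_s: float, num_clients: int, num_workers: int) -> float:
    """Ideal time for all clients to finish, with no contention or overhead."""
    if num_clients < 1 or num_workers < 1:
        raise ValueError("need at least one client and one worker")
    # Workers beyond the number of clients sit idle, so cap the divisor.
    busy_workers = min(num_workers, num_clients)
    return single_sync_s * num_clients / busy_workers


if __name__ == "__main__":
    # Hypothetical 2-second synchronization, 1,000 clients.
    for workers in (2, 5, 10, 50):
        print(workers, "workers:", ideal_total_time(2.0, 1_000, workers), "s")
```

Measured totals that stop shrinking as workers are added mark the point where the model's "no contention" assumption has broken down.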
Potential bottlenecks
Throughput may be limited by:
• client processing speed
• bandwidth for client-to-MobiLink communications
• speed of the computer running MobiLink
• number of MobiLink worker threads
• bandwidth for communication between MobiLink and the consolidated database
• performance of the consolidated database
• contention in your synchronization scripts
Performance tests
• Determine performance characteristics of MobiLink
  • optimal number of worker threads for many clients
  • differing number of clients
  • synchronization size
  • parallelism
• Testing methodology
  • vary one thing at a time
  • stress MobiLink and/or consolidated database
  • keep it simple
Schema
• Single table
• Two-column primary key to avoid primary key pool
• Representative data types

CREATE TABLE Purchase (
    emp_id     INT NOT NULL,
    purch_id   INT NOT NULL,
    cust_id    INT NOT NULL,
    cost       NUMERIC NOT NULL,
    order_date TIMESTAMP NOT NULL,
    notes      VARCHAR(64),
    PRIMARY KEY ( emp_id, purch_id )
)
Values
• emp_id maps to remote client via employee table (which is not synchronized)
• Mutually exclusive partitioning of data between clients (to avoid contention and conflicts)
• Large values chosen for integer data (so packing would not shrink data transferred)
• Each row is 92 bytes when transferred
Timing framework
• Extra tables in consolidated database
• MobiLink synchronization scripts
• Small, efficient client application
  • Win32 console application
  • Spawns multiple child processes that act as clients
  • UltraLite with no file-based persistent storage
• Supervisor program
  • Coordinates clients on different computers
Ensuring simultaneous synchronizations
• Clients kept in step via gates
  • At a gate, each client waits for all the others
  • Win32 event objects for clients on one computer
  • Named pipes to supervisor for multiple computers
  • Efficient (1 to 2 seconds for 1000 clients on 10 PCs)
• Gates before and after each synchronization
• Times recorded between gates and synchronization on both client and server
Timing a synchronization
1. Client: prepare for synchronization
2. Client: wait for all other clients (“gate”)
3. Client: record client start time
4. Client: start synchronization, via ULSynchronize()
5. ML: record start (begin_synchronization script)
6. Perform synchronization
7. ML: record end (end_synchronization script)
8. Client: record client end
9. Client: wait for all other clients (“gate”)
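The client side of this gated timing sequence can be sketched with threads inside one process. This is not the test harness from the talk: `threading.Barrier` stands in for the Win32 event objects and named pipes, `synchronize` is a placeholder for the real ULSynchronize() call, and `run_gated_clients` is a name made up for this example.

```python
import threading
import time


def run_gated_clients(num_clients, synchronize):
    """Run synchronize(i) in num_clients threads with a gate before and after."""
    gate = threading.Barrier(num_clients)
    results = [None] * num_clients

    def client(i):
        gate.wait()                     # step 2: wait for all other clients
        start = time.perf_counter()     # step 3: record client start time
        synchronize(i)                  # steps 4-7: placeholder for the real sync
        end = time.perf_counter()       # step 8: record client end
        gate.wait()                     # step 9: wait for all other clients
        released = time.perf_counter()  # taken once the closing gate has opened
        results[i] = (start, end, released)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Simulated syncs of different lengths; sleep stands in for real work.
    timings = run_gated_clients(4, lambda i: time.sleep(0.01 * (i + 1)))
    for i, (start, end, _) in enumerate(timings):
        print(f"client {i}: {end - start:.3f} s")
```

Because no client leaves the closing gate until every client has recorded its end time, a following round of synchronizations cannot overlap with the current one.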
Times and throughput definitions
• Client-measured time (for a single synchronization): t_client_end - t_client_start
• Server-measured time (for a single synchronization): t_server_end - t_server_start
• Total server time (for a set of simultaneous syncs): max(t_server_end) - min(t_server_start)
• Throughput: total # rows / total server time
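As a worked example of these definitions, the sketch below computes total server time and throughput from a list of per-synchronization (start, end) pairs. The function names and the sample timestamps are invented for illustration, and throughput is taken to be total rows divided by total server time.

```python
def total_server_time(server_times):
    """max(t_server_end) - min(t_server_start) over a set of simultaneous syncs."""
    starts = [start for start, _ in server_times]
    ends = [end for _, end in server_times]
    return max(ends) - min(starts)


def throughput(total_rows, server_times):
    """Rows per second over the whole set of synchronizations."""
    return total_rows / total_server_time(server_times)


if __name__ == "__main__":
    # Three overlapping syncs, times in seconds (made-up values).
    times = [(0.0, 4.0), (1.0, 6.0), (2.0, 10.0)]
    print("total server time:", total_server_time(times), "s")
    print("throughput       :", throughput(3 * 1_000, times), "rows/s")
```

Note that total server time is measured from the earliest start to the latest end, so idle gaps between synchronizations count against throughput.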
Test environment
• Sybase SQL Anywhere Studio 7.0.1 and 8.0.0
• Isolated test rack
• MobiLink and ASA on Dell PowerEdge 6300/550 (4P3-550, 512 MB, database file on array drive, database log file on separate drive)
• Clients on 10 Dell Optiplex GXa 266Mbr (P2-266, 64 MB)
• 100 Mbps Fast Ethernet hub (with utilization gauge)
Results of performance testing
Four main tests:
• Number of worker threads
  • Fast clients
  • Slower clients
  • Slowest clients
  • Upload cap
• Number of clients
• Size of synchronizations
• Number of server processors
Test 1-A: Varying worker threads
• Constants:
  • 1,000 clients
  • 1,000 rows per client synchronization (92 bytes per row)
  • total of 1,000 synchronizations
• Varied ML worker threads
  • 2, 4, 5, 10, 20, 50
Optimal number of worker threads
• Throughput rises then drops with increasing workers
• Two likely causes for drop:
  • Hardware contention due to CPU or disk access saturation on server computer
  • Software contention between connections in the consolidated database (blocking)
• In this case, 100% CPU utilization reached with 5 worker threads
• Clients fast enough to saturate ML/ASA (no difference increasing from 10 to 12 computers running clients)
Client perspective
• 0.5% of syncs active at any time with 5 worker threads
• Rest are either queued waiting, or already finished
• Client times:
  • Longest client time ≈ total server time
  • Average client time ≈ ½ total server time
• Maximizing throughput also minimizes average and longest client sync times
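A toy queue model shows where these client-side figures come from. It assumes all clients request a synchronization at the same instant, every synchronization takes the same time once a worker picks it up, and there is no contention; `simulate_finish_times` and the 1-second sync time are illustrative, not measurements from the tests.

```python
import heapq


def simulate_finish_times(num_clients, num_workers, sync_time):
    """Finish time of each client when all arrive at t=0 and queue for workers."""
    free_at = [0.0] * num_workers          # time at which each worker is next free
    finish_times = []
    for _ in range(num_clients):
        start = heapq.heappop(free_at)     # earliest available worker
        done = start + sync_time
        finish_times.append(done)
        heapq.heappush(free_at, done)
    return finish_times


if __name__ == "__main__":
    clients, workers = 1_000, 5
    finish = simulate_finish_times(clients, workers, sync_time=1.0)
    total = max(finish)
    average = sum(finish) / len(finish)
    print(f"active at any time: {workers / clients:.1%}")
    print(f"longest client time: {total} s, average: {average} s")
    print(f"average / total: {average / total:.4f}")
```

Under these assumptions the last client finishes when the server does, and the average client waits about half that long, since finish times are spread evenly across the run.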
Test 1-B: Varying worker threads with slower clients
• Constants as before, except client hardware and network
  • clients now run on 15 P-75 computers
  • 10 Mbps Ethernet hub
• Varied ML worker threads
  • 5, 10, 20, 50, 100
Effects of slower clients
• All types of synchronization slowed
• Downloads depend more on client speed than uploads
• With 5 MobiLink worker threads, downloads slowed by 46%, deletes slowed by 18%, updates and inserts slowed by 10%
• Adding worker threads reduces shortfall
• Uploads best with 10, downloads best with 50
• High variability for downloads: 25-30% instead of usual 2%
Timings vs. worker threads for slower clients • You may not want to optimize for download • add ~400 s to upload to save 20 s in download!
Test 1-C: Simulating very slow clients
• Wanted to simulate 1000 Palm devices on wireless WAN network
• Actual timings with Palm IIIx connected at 4800 baud
• Single Win32 client slowed to match or exceed Palm timings (using special UL runtime with optional delays)
• Use same delays for 1000 Win32 clients to simulate 1000 Palm devices connecting at 4800 baud
Varying worker threads with very slow clients
• Constants:
  • 1,000 clients (delayed to match Palm timings)
  • 1,000 rows per client synchronization (92 bytes per row)
  • total of 1,000 synchronizations
• Varied ML worker threads
  • 5, 10, 20, 50, 100, 200, 500
Optimal number of worker threads for very slow clients
• Download improves almost linearly
  • Long times to apply downloads are overlapped more with more workers
• Uploads best at 100 or 200 worker threads
• Optimal # of workers very different for upload and download!
Upload cap
• Limits number of worker threads that can apply uploads simultaneously
  • Referred to as “uploaders”
• Other worker threads can still download or receive upload
• Allows independent optimization of worker threads for upload and download throughput
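The idea behind an upload cap can be illustrated with a counting semaphore: many worker threads run, but only a fixed number may be in the "apply upload" phase at once. This is a conceptual sketch only, not how MobiLink implements the feature; `run_with_upload_cap`, `apply_upload` and `download` are hypothetical names.

```python
import threading
import time


def run_with_upload_cap(num_workers, upload_cap, apply_upload, download):
    """Run num_workers threads; at most upload_cap apply uploads at the same time.

    Returns the peak number of simultaneous uploaders that was observed.
    """
    uploaders = threading.BoundedSemaphore(upload_cap)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker(i):
        with uploaders:                      # blocks while the cap is reached
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            try:
                apply_upload(i)
            finally:
                with lock:
                    state["active"] -= 1
        download(i)                          # downloads are not limited by the cap

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    peak = run_with_upload_cap(
        num_workers=50,
        upload_cap=5,
        apply_upload=lambda i: time.sleep(0.005),   # stand-in for database work
        download=lambda i: time.sleep(0.001),       # stand-in for a slow client
    )
    print("peak simultaneous uploaders:", peak)
```

Separating the two limits is what lets the worker count be tuned for slow downloads while the uploader count is tuned for the consolidated database.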
Test 1-D: Varying uploaders with very slow clients
• Constants:
  • 1,000 clients (delayed to match Palm timings)
  • 1,000 rows per client synchronization (92 bytes per row)
  • total of 1,000 synchronizations
  • 500 ML worker threads
• Varied ML upload cap
  • 2, 5, 10, 20, 50, 100
Test 1-E: Varying worker threads with upload cap and very slow clients
• Constants:
  • 1,000 clients (delayed to match Palm timings)
  • 1,000 rows per client synchronization (92 bytes per row)
  • total of 1,000 synchronizations
  • 5 for upload cap (i.e. 5 uploaders)
• Varied ML worker threads
  • 50, 100, 200, 334, 500
Throughput vs. worker threads for upload cap and very slow clients