9.1 Proof of Concepts and Benchmarks etc.
Definitions • Benchmark • The customer may know the product works, but are we the best?? • Maybe speed rather than facilities • Proof-of-Concept (PoC) • Does the product work ? (basic tick in the box) • Can it do what I want it to do ? (facilities) • Can it handle my data ? (Volumes)
Differences • There is usually a greater sense of urgency with a Benchmark • A benchmark is practically always competitive • A rule of thumb could be • Proof of Concept - Customer is with you • Benchmark - Customer is generally not with you
Benchmark as a Last Resort Benchmarks can be very risky • Competition uses losses as proof points in future deals, advertisements etc. • Every loss can require Five wins to compensate • Prematurely exposes Minor Shortcomings • Each may be minor and easily dealt with, but together, may contribute to customer feeling “buyer’s remorse” - before they buy!
Questions • Do we do the work ? • Can we do this technically ? • Do we want to do this commercially ? • The first question is ours to answer • The second we can give “advice” on - but should not answer • “He who fights and runs away, lives to fight another day”
Resource-Intensive • Other Vendors May Be Able to Throw More Bodies at it • This definitely was Oracle’s strategy with Early Sybase System 10 • Ensure Adequate Resource Commitment From Sybase and Customer • Play up the partnership commitment and long-term value to the customer • Inadequate resources and preparation almost guarantee failure
Requirements • Plan, Plan and Plan • Resources • Technical buy in from both the Customer and your company • Your time • The plan - including decision points • Hardware/Software Availability
The Plan • What we must have to make this work • People (customer and company), computers, software • How long is this going to take (multiply by 2 at least!) • Decision Points • Where can we stop, survey and decide to continue or cut our losses
Time Richard’s First Law of Benchmarks Everything has to be done 4 times • 1st Time - It will not work • 2nd Time - It partially works - then crashes • 3rd Time - It works, but no-one believes it, and you forgot to time it • 4th Time - It works and you did time it.
Resources - Your company • Your Company • Your time • Technical Support • Sales Person • Management buy-in • Technical stand-in
The Sales Person (yes they do have uses) • This person is worth their weight in gold • Well… Maybe silver • Their job is to shield you from interference from • The Customer • The Company • Their job is also to get you more resources or time if you need it
Resources - The Customer • Customer • Technical Assistance • Someone who KNOWS the system (not the guy the Technical Director first thought of) • Data and Schema (or at least some form of data definition) • Queries, or at least list of questions • Timescale (when is the finish date for the project)
Customer Technical Assistance • You must have someone from the Customer at your side during the PoC • Phone calls to the Customer eat a lot of time • Trying to find the “right” person to speak to takes even longer! • Beware the phrase “Oh, didn’t I mention that” • Treat all given information as unproven (if not actually wrong)
Success/Failure SUCCESS CRITERIA • Without the above • How do we know if we have failed? • How do we know if we have succeeded? • What is the next step if we have succeeded? • Criteria also mean we have a target to aim for, and we limit the work required
Time and Success Good enough, in time = good Perfect, too late = bad • You must stick to the timing plan and aim ONLY for the success criteria • “Wouldn’t it be nice if we could run …” is the most horrible phrase ever to be heard in a benchmark
Good Benchmark Practice - 1 • Take Notes • Have a complete list of what you did and when you did it. • It will save time in the long run and will allow you to write up the project • Script Everything you do on the system • You WILL have to do everything more than once!
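One way to take notes and time every step without relying on memory is to wrap each step in a small logger. This is only a sketch: the `RunLog` class and its field names are made up for illustration, and the steps you pass in would be your own load or query scripts.

```python
import time
from datetime import datetime


class RunLog:
    """Keeps a record of what was run, when, how long it took and how it ended."""

    def __init__(self):
        self.entries = []

    def run(self, label, func, *args, **kwargs):
        started = datetime.now()
        t0 = time.perf_counter()
        status = "interrupted"  # overwritten below unless the step is killed
        try:
            result = func(*args, **kwargs)
            status = "ok"
            return result
        except Exception as exc:
            status = f"failed: {exc}"
            raise
        finally:
            # Always record the attempt, including the ones that crash.
            self.entries.append({
                "label": label,
                "started": started.isoformat(timespec="seconds"),
                "seconds": time.perf_counter() - t0,
                "status": status,
            })
```

Calling `log.run("load customer table", my_load_step)` for every step leaves `log.entries` as a ready-made list for the write-up, including the first three attempts that did not work.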
Good Benchmark Practice - 2 • Be aware of the clock • If the timing is looking tight, or impossible • Discuss with the Sales Person, he may be able to buy you more time, or extra resources • Don’t be afraid to ask for help • You cannot do everything yourself • You do not know everything • Ask - not asking means lost time and lost sales
Finally • OK we have • A Plan • Resources • Customer and Management Buy in • A Sales Person • A target • Computers and Time to do it • Let’s go do Step 2
The System • Processors • Memory • Disk (sub-system) inc. RAID • Operating System • IQ 12 • Other Software
The Free Hand • If we are not constrained and have a free hand • If we oversize - the customer may consider IQ too hardware hungry • If we undersize - the queries will run slowly or not at all • We have to get this about right
CPUs • Proof of Concept • More is better • Benchmark • IQ 12 is not parallel, so if competitive with a small number of users - a small number of CPUs • If competitive with a large number of users - as many CPUs as you can get in the box!
Memory • More is better • We can always use more memory • Consider 15MB per user (that is a very bad generalisation - but almost accurate!)
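The 15MB-per-user figure can be turned into a quick first estimate. A minimal sketch, assuming the figure covers only per-user memory and that anything else (cache, operating system) is added separately; the function name is made up for illustration.

```python
MB_PER_USER = 15  # the slide's rule of thumb, a generalisation and not a measured value


def user_memory_mb(concurrent_users, mb_per_user=MB_PER_USER):
    """First-guess memory for user connections only, in MB."""
    return concurrent_users * mb_per_user
```

For example, `user_memory_mb(40)` gives 600, so 40 concurrent users suggest roughly 600MB before anything else is counted.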
Disk - 1 • Two sorts • Simple Disk • Disk Farm (Storage Array) • RAID x • Suggestion • Disk Farm - as many spindles as possible • RAID 0/1 (Mirror/Stripe) - fast and reliable
Disk - 2 • How Much…. • IQ Main • 90% Raw - max. • IQ Temp • 25% Raw - max. • Staging Area • How long is a piece of string….. • You may need more than one “copy” of the data
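The percentages above can be sketched as a rough disk budget. This assumes "Raw" means the size of the raw input data and treats both percentages as upper bounds; the number of staging copies is a guess you supply, and the function name is made up for illustration.

```python
def disk_budget_gb(raw_gb, staging_copies=1):
    """Upper-bound disk budget in GB from the raw data size.

    IQ Main: at most 90% of raw. IQ Temp: at most 25% of raw.
    Staging: one full copy of the raw data per staging copy.
    """
    budget = {
        "iq_main": raw_gb * 90 / 100,
        "iq_temp": raw_gb * 25 / 100,
        "staging": raw_gb * staging_copies,
    }
    budget["total"] = sum(budget.values())
    return budget
```

With 100GB of raw data and one staging copy this gives 90GB for IQ Main, 25GB for IQ Temp and 100GB of staging, 215GB in total.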
Disk Farm • EMC, SSA, MTI etc. • These may have complex set-up routines • If possible, let the H/W supplier (or the customer) set up the disks • Watch ! And Take Notes!
Other Hardware • Extra Ethernet Adapters • One 100Mbps adapter is worth more than ten 10Mbps adapters • Two 100Mbps adapters are better still • Dedicated LAN/Hardware • If not, be aware of other users - especially when running timed tests • Tape Units • Are you testing Backup?
Operating System • The correct Revision to run IQ • All the required OS Patches • Has it been installed correctly? • Do you need a Hardware Rep. to help with the install? • If this is a “system” benchmark, maybe you should plan for a Hardware Rep. to be on site
Software • IQ • Have you the latest revision? • Have you read the release notes? • Are there any new EBFs? • Speak with Tech. Support, PSE or Engineering to get the latest revision (that works!) • Other Software • Replication Server, Distribution Director etc. • Are these all the latest revision? • Does all the software work together?
Installation • Install IQ • Decide on IQ Page Size • Build the Database • Create the IQ Main Store • Create the IQ Temp. Store
IQ Page Size • 64 Kbytes, unless • Big database then 128 Kbytes • Big, Big database then 256 Kbytes • Not 512 – remember the bug…
Catalogue Store • Nearly always forgotten • More space needed for general “ASA” staging space • If larger use RAW otherwise use Filesystem • This store was intended to fit into memory
IQ Store Questions • RAW or Filesystem • Unless there are overwhelming reasons, and I can’t think of any, then RAW • Few Bigger, or Many Smaller • Many Smaller is better, but you may not have the choice
After Install and DB Create • Test using sp_iqstatus • Is the database the correct size, did all the dbspaces create OK • Test using sp_iqcheckdb • If we have the time, let’s make sure that we have no errors at this stage • Re-check - are you sure we have enough space in the database?
What to Do • Create the tables • Decide on the “fast” indexes • Create the “fast” indexes • Decide on the HNG indexes • Create the HNG indexes • Test the installation
Table Creation • Strip out ALL constraints except • PRIMARY KEY on single columns • FOREIGN KEY on single columns • UNIQUE • IQ UNIQUE - must use • Generally in a PoC or Benchmark do not use constraints or permissions • In addition, run everything as user DBA unless the customer has a real problem with this
Fast Index Decision • A fast index is the primary performance index on an IQ system • Low Fast - Low Cardinality • High Group - High Cardinality • Join Columns – HG Index • Cardinality breakpoint defined at around 1000-2000 for this case
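The decision above can be sketched as a small function. The default breakpoint of 1500 is an assumption, picked from the middle of the 1000-2000 range quoted on the slide, and the function name is made up for illustration.

```python
def fast_index(cardinality, is_join_column=False, cardinality_breakpoint=1500):
    """Pick the "fast" index type for a column.

    Join columns always get High Group (HG). Otherwise columns above the
    breakpoint get HG and columns at or below it get Low Fast (LF).
    """
    if is_join_column or cardinality > cardinality_breakpoint:
        return "HG"
    return "LF"
```

A country code column with 50 distinct values comes back as `"LF"`, a customer id with two million values as `"HG"`, and any join column as `"HG"` whatever its cardinality.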
For EVERY Column • Is the column EVER going to be used for more than just projection? • If the column fails the above test then do not waste time and space applying any more indexes on this column Remember the column you do not index is the one that will be used as a search column come the day of the presentation!
Cardinality • If the column is to have an index on it, decide on the cardinality • You may not have this information • I use the WAG method, Wild Ass Guess • If you have the time and disk space, create a High Group on the column, load the data and perform a Select Distinct; this gets the exact cardinality (don’t do this on the 2.9 billion row table)
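If the data arrives as a delimited flat file, a guess can be improved before anything is loaded by counting distinct values in the file itself. Note this is a different technique from the Select Distinct described above: it runs outside IQ, and on a sample of the file it only gives a lower bound. The pipe delimiter and the function name are assumptions for illustration.

```python
import csv


def column_cardinality(lines, column, delimiter="|"):
    """Count distinct values in one column (0-based) of delimited text lines.

    Rows too short to contain the column are skipped rather than counted.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    return len({row[column] for row in reader if len(row) > column})
```

Passing an open file object as `lines` works, since the reader accepts any iterable of text lines; keep in mind that the set of distinct values is held in memory.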
Fast Index • For High Cardinality put a High Group • For Low Cardinality put a Low Fast • Warning : Treat the Customer information as unproven “I never knew we had that many suppliers” • If you have to drop and recreate the index so what? You did allow time in the plan for reloading the data - didn’t you?
High Non Group Index • For EVERY Column that has a “fast” index • Is this column going to be used in the following • range searches • between search • avg(), sum() • root string searches (like “Syb%”) • If the answer is yes (regardless of the cardinality) add a High Non Group index • Remember the IQ UNIQUE clause
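The checklist above lends itself to a script that decides on the HNG index and writes out the index statements, which also helps with scripting everything. This is a sketch only: the `ColumnUse` fields and the index naming scheme are made up for illustration, and the generated `CREATE <type> INDEX` text should be checked against the documentation for your IQ release before it is run.

```python
from dataclasses import dataclass


@dataclass
class ColumnUse:
    name: str
    fast_index: str = ""       # "LF" or "HG"; empty if the column is only projected
    range_search: bool = False
    between: bool = False
    aggregates: bool = False   # avg(), sum()
    prefix_like: bool = False  # root string searches such as LIKE 'Syb%'


def needs_hng(col):
    """HNG is only considered for columns that already have a fast index."""
    if not col.fast_index:
        return False
    return col.range_search or col.between or col.aggregates or col.prefix_like


def index_ddl(table, col):
    """Return one CREATE INDEX statement per index planned for the column."""
    kinds = [col.fast_index] if col.fast_index else []
    if needs_hng(col):
        kinds.append("HNG")
    return [
        f"CREATE {kind} INDEX {table}_{col.name}_{kind.lower()} ON {table} ({col.name});"
        for kind in kinds
    ]
```

A column that is only ever projected yields an empty list, which matches the advice not to spend time and space indexing it.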
Test • Check by looking at the sys.sysobjects table in the catalogue server • You did script everything didn’t you? • We may have to retrofit indexes at a later time, but let’s TRY and get most of them built now
Loading the data • Configure the server for load • Pre-fix the data • Stage the data • Load the data • Test the installation • Do it all again (Probably!)
Configure for Load • Not much required here for IQ 12 • Ensure that you use sp_iqstatus to check that you have allocated the memory you thought you had • Consider increasing Temp and decreasing Main
Where does the data come from • Another database • Consider conversions here, unload, modify and reload may be faster than CONVERT() • Generally UNIX commands like AWK and SED can run quicker than CONVERT() and certainly are quicker than aggregate statements from general RDBMS products • If you do an unload and reload, you will need staging space (twice the size of the data)
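As an illustration of fixing data outside the database, here is a minimal sketch that rewrites a date field while the unloaded file is streamed, so that no CONVERT() is needed at load time. Python stands in here for the AWK or SED script the slide has in mind; the pipe delimiter, the source format DD/MM/YYYY and the function name are assumptions.

```python
from datetime import datetime


def convert_date_field(line, field, delimiter="|",
                       src="%d/%m/%Y", dst="%Y-%m-%d"):
    """Rewrite one date field (0-based) in a delimited line.

    Raises ValueError if the field does not match the source format,
    which is how the odd record in the file gets found early.
    """
    parts = line.rstrip("\n").split(delimiter)
    parts[field] = datetime.strptime(parts[field], src).strftime(dst)
    return delimiter.join(parts) + "\n"
```

For example, `convert_date_field("42|31/12/1998|ACME\n", 1)` returns `"42|1998-12-31|ACME\n"`. The converted file is a second copy of the data, which is where the staging space estimate of twice the data size comes from.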
Flat Files • This is where the most fun is • I will state as a “fact” • Most of the time the customer cannot tell you what the input file format is exactly • Print the file - in ASCII and HEX • Row and column delimiters generally are killers • The 7,400,000th record will be different from all the others - always • Same advice on CONVERT() applies here • Remember load performance switches (row delimited by etc.)
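Two checks from this slide are easy to script: printing part of the file in HEX and ASCII, and finding the rows whose field count differs from the rest. A minimal sketch, assuming a pipe column delimiter and a newline row delimiter; both function names are made up for illustration, and the row check reads the whole input into memory, so on a large file feed it in pieces.

```python
def hex_ascii_dump(data, width=16):
    """Return dump lines: offset, hex bytes, then printable ASCII (others as '.')."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return lines


def odd_rows(data, expected_fields, col_delim=b"|", row_delim=b"\n"):
    """Return (row_number, field_count) for rows with an unexpected field count."""
    rows = data.split(row_delim)
    if rows and rows[-1] == b"":
        rows.pop()  # ignore the empty piece after the final row delimiter
    result = []
    for number, row in enumerate(rows, 1):
        fields = row.count(col_delim) + 1
        if fields != expected_fields:
            result.append((number, fields))
    return result
```

The hex dump is what shows up a stray carriage return or tab that the customer's description of the format did not mention; the row check only counts delimiters, so a delimiter character inside a data value will also be reported.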