Estimating – with confidence!

Case study… • The amount of potassium in the blood varies slightly from day to day. This fact (along with routine measurement errors) means that successive readings of a patient’s potassium level will show a small variation, with a standard deviation of σ = 0.2 mmol/L. A range of 3.5 – 5.0 mmol/L is considered normal. A significantly different value could indicate possible renal failure. • A patient has one potassium (K) test performed and presents a reading of 3.4 mmol/L. How reliable is this single measurement? Can we quantify our confidence in this reading?

A Confidence Interval • If we assume that the errors in measuring K are normally distributed, then we can use z-scores to help quantify our confidence in this reading. • The reading of X = 3.4 mmol/L is our estimate of the true value of the parameter “K blood-level concentration”. • A 90% confidence interval would represent a range of readings that we would expect to get 90% of the time.

Solution… • Use the correct z-value for 90%: the central 90% of the area under the standard Normal curve lies between the point with 5% of the area to its left and the point with 95% of the area to its left. • The correct z-values are -1.645 and +1.645. They are usually denoted z* to indicate that they are special values chosen with a particular confidence level “C” in mind. In this example C = 90%.
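The z* lookup can also be done in software instead of from a table. The following is a minimal sketch using Python's standard library; the helper name `z_star` is made up for this illustration and is not part of the course material.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """Critical value z* that leaves (1 - C)/2 of the area in each tail."""
    area_to_left = 1 - (1 - confidence) / 2   # 0.95 when C = 0.90
    return NormalDist().inv_cdf(area_to_left)

# Print z* for a few commonly used confidence levels.
for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
```

For C = 90% this reproduces the table value of 1.645 used in the case study.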

Using the z-score formula we get: x ± z*·σ = 3.4 ± (1.645)(0.2) = 3.4 ± 0.33 mmol/L, where σ = 0.2 mmol/L is the standard deviation of the readings. • 90% of the readings will be expected to fall in the range (3.1, 3.7) mmol/L. • Suppose 5 tests were performed over a number of days and each time gave a result of 3.4 mmol/L. How would that change the range of numbers in the confidence interval?

Using Confidence Intervals when Determining the True Value of a Population Mean • We rarely know the population mean; instead we can construct SRS’s and measure sample means. • A confidence interval gives us a measure of how precisely we know the underlying population mean. • We assume 3 things: • We can construct “n” SRS’s • The underlying population of sample means is Normal • We know the standard deviation

This gives … • Confidence interval for a population mean: μ = x̄ ± z*·σ/√n • μ is the population mean (we infer this) • x̄ is the sample mean (we measure this) • n is the number of samples or tests
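As a rough illustration, the standard interval x̄ ± z*·σ/√n can be wrapped in a small function and applied to the potassium case study. This is a sketch that assumes a known σ and Normal errors, as stated on the slides; the function name `mean_confidence_interval` is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence=0.90):
    """Return (low, high) for x_bar +/- z* * sigma / sqrt(n), sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# One potassium reading of 3.4 mmol/L with sigma = 0.2 mmol/L.
low, high = mean_confidence_interval(3.4, 0.2, n=1)
print(f"n = 1: ({low:.2f}, {high:.2f}) mmol/L")

# Five readings that average 3.4 mmol/L: same centre, narrower interval.
low5, high5 = mean_confidence_interval(3.4, 0.2, n=5)
print(f"n = 5: ({low5:.2f}, {high5:.2f}) mmol/L")
```

The centre of the interval does not move; only the width shrinks, by a factor of √5, when five readings are averaged.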

Example: Fish or Cut Bait? • A biologist is trying to determine how many rainbow trout are in an interior BC lake. To do this he uses a large net that filters 6000 m³ of lake water in each trial. He drops the net in a specific area and records the mean number of fish caught in 10 trials. This represents one SRS. From this he is able to determine a mean and standard deviation for the number of fish in 100 SRS’s. Each SRS has the same standard deviation σ = 9.3 fish, with a sample mean of 17.5 fish. • How precisely does he know the true mean number of fish per 6000 m³? Use C = 90%. • If the volume of the lake is 60 million m³, how many trout are in the lake?

Solution: • Since C = 0.90, z* = 1.645, so x̄ ± z*·σ/√n = 17.5 ± (1.645)(9.3)/√100 = 17.5 ± 1.5 fish. • We are 90% confident that the true mean number of fish per 6000 m³ lies in the range (16.0, 19.0). • Total number of fish: the lake holds 60 000 000 / 6000 = 10 000 net volumes, so he is 90% confident that there are between 160 000 and 190 000 fish in the lake. • Why should you be skeptical of this result?
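The arithmetic for the trout example can be checked with a few lines of Python. This is only a sketch of the calculation using the numbers given in the example; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 17.5, 9.3, 100     # fish per 6000 m^3, from the example
z = NormalDist().inv_cdf(0.95)       # z* for C = 90%

margin = z * sigma / sqrt(n)
low, high = x_bar - margin, x_bar + margin
print(f"Mean fish per net volume: ({low:.1f}, {high:.1f})")

# Scale up from one 6000 m^3 net volume to the whole 60 million m^3 lake.
nets_in_lake = 60_000_000 / 6000
print(f"Fish in lake: {low * nets_in_lake:,.0f} to {high * nets_in_lake:,.0f}")
```

Note that the scaling step silently assumes the fish are spread evenly through the lake, which is one reason to be skeptical of the total.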

Margin of Error • When testing confidence limits you are saying that your statistical measure of the mean is: estimate +/- the margin of error • e.g. X = 3.2 cm +/- 1.1 cm with a 90% confidence

Math view… • Mathematically the margin of error is: m = z*·σ/√n, where σ is the standard deviation and n is the number of samples • You can reduce the margin of error by • increasing the number of samples you test (makes n larger) • making more precise measurements (makes σ smaller)
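To see how the margin of error m = z*·σ/√n responds to the number of samples, here is a small sketch that tabulates it for several values of n. The function name `margin_of_error` and the choice of σ = 0.2 (borrowed from the potassium case study) are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.90):
    """Margin of error m = z* * sigma / sqrt(n) for a known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)

# Each fourfold increase in n cuts the margin of error in half.
for n in (1, 4, 16, 64):
    print(f"n = {n:2d}: m = {margin_of_error(0.2, n):.3f}")
```

Because n sits under a square root, halving the margin of error costs four times as many samples.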

Matching Sample Size to Margin of Error • An IT department in a large company is testing the failure rate of a new high-end graphics card in 200 of its workstations. 5 cards were chosen at random with the following lifetime per failure (measured in 1000’s of hours) and standard deviation σ = 0.5: • Provide a 90% confidence interval for the mean lifetime of these boards.

IT is 90% confident that the mean lifetime of these boards is between 1290 and 2030 hours. • However, these are expensive boards and accounting wants to have the margin of error reduced to 0.10 (that is, 100 hours) with a 90% confidence level. What should IT do? • Solving the margin of error m = z*·(standard deviation)/√n for n gives n = (1.645 × 0.5 / 0.10)² ≈ 67.7, so IT needs to test 68 machines!
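The sample-size calculation can be sketched in Python by rearranging m = z*·σ/√n to n = (z*·σ/m)² and rounding up. The function name `required_sample_size` is hypothetical, and the code assumes the standard deviation of 0.5 (in 1000's of hours) is known, as in the example.

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.90):
    """Smallest n with z* * sigma / sqrt(n) <= margin (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# Margin of error with the original 5 cards, in 1000's of hours.
z90 = NormalDist().inv_cdf(0.95)
margin_with_5 = z90 * 0.5 / sqrt(5)
print(f"Margin with n = 5: {margin_with_5:.2f}")

# Number of cards needed to bring the margin down to 0.10.
n_needed = required_sample_size(sigma=0.5, margin=0.10)
print(f"Cards needed for a margin of 0.10: {n_needed}")
```

The margin for n = 5 comes out at about 0.37, which matches the half-width of the stated interval (1290 to 2030 hours), and the required sample size is always rounded up so the target margin is actually met.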

Important Caveats… • Read page 426 carefully! • Data must be an SRS • Outliers can wreak havoc! • We “fudged” our knowledge of the standard deviation σ; in general we don’t know this • Poorly collected data or bad experiment design cannot be overcome by fancy formulas!

Examples… • 6.13 • 6.18 • 6.19 • 6.30

In conclusion… • This whole discussion rests on your understanding of z-scores. If you are OK with this, then just review the new terms and try the previous examples • If you are still “rusty” or unsure about z-scores, come and see me!
