
Data Analysis Examples

Data Analysis Examples. Anthony E. Butterfield CH EN 4903-1. #1: The Normal PDF.


Presentation Transcript


  1. Data Analysis Examples Anthony E. Butterfield CH EN 4903-1

  2. #1: The Normal PDF • Your coworker tells you that the outlet temperature of a certain coal gasifier averages 1304 K and stays within 12 K of that mean for 95% of her measurements, over months of operation. If we assume the temperature measurements are normally distributed, what is the standard deviation, and what is the probability that a temperature measurement is above 1310 K? • T = 1304 ± 12 K (95% confidence level)

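  3. Normal Distribution • Probability density function (PDF): $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$, where $\mu$ is the mean and $\sigma$ is the standard deviation.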

  4. #1: The Normal PDF
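The solution to #1 can be checked numerically. The following is a minimal sketch, assuming that "within 12 K for 95% of measurements" is a two-sided interval, so that 12 K corresponds to about 1.96 standard deviations. It uses only base MATLAB functions (`erfinv`, `erfc`); with the Statistics Toolbox, `norminv(0.975)` and `1 - normcdf(...)` would be the equivalent calls. The variable names are illustrative.

```matlab
% Problem #1: normal distribution, T = 1304 +/- 12 K at a 95% confidence level
mu         = 1304;   % mean outlet temperature, K
half_width = 12;     % two-sided 95% half-width, K (assumed two-sided)
T_limit    = 1310;   % temperature of interest, K

% Two-sided 95% critical value of the standard normal (about 1.96)
z95 = sqrt(2) * erfinv(0.95);

% Standard deviation implied by the 95% interval
sigma = half_width / z95;

% Upper-tail probability P(T > T_limit) = 1 - Phi(z) = 0.5*erfc(z/sqrt(2))
z       = (T_limit - mu) / sigma;
p_above = 0.5 * erfc(z / sqrt(2));

fprintf('sigma = %.2f K\n', sigma);
fprintf('P(T > %d K) = %.1f%%\n', T_limit, 100 * p_above);
```

Under these assumptions the standard deviation comes out near 6.1 K and the probability of a reading above 1310 K is roughly 16%.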

  5. #2: Error Propagation • In a falling bead viscometer, the viscosity may be found by the following equation (Stokes' law): $\mu = \dfrac{2 r^2 g (\rho_B - \rho_F)}{9V}$ • Where r is the bead radius, g is gravitational acceleration, V is the terminal velocity, ρB is the bead density and ρF is the fluid density. Suppose we find, within a 95% confidence level, that the bead density is 2 ± 0.1 g/cm³, the radius is 3 ± 0.1 mm, the fluid density is 1.1 ± 0.2 g/cm³, and, after terminal velocity is achieved, the bead falls 10 ± 0.2 cm in 12 ± 0.5 seconds. What is the calculated viscosity and the uncertainty in its value? Which measurement is the greatest source of error?

  6. #2: Error Propagation • A couple options:

  7. #2: Error Propagation
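One way to work #2 is to propagate the uncertainties numerically. The sketch below makes several assumptions that are not stated in the transcript: the viscometer equation is Stokes' law, μ = 2 r² g (ρB − ρF) / (9 V) with V = L / t; the five measurements are independent; g = 981 cm/s² is exact; and the contributions combine in quadrature. Partial derivatives are estimated by central differences, so no hand differentiation is needed. All variable names are illustrative.

```matlab
% Problem #2: error propagation for a falling bead viscometer (CGS units)
names = {'rho_B', 'rho_F', 'r', 'L', 't'};
x0 = [2.0, 1.1, 0.30, 10, 12];    % nominal values: g/cm^3, g/cm^3, cm, cm, s
dx = [0.1, 0.2, 0.01, 0.2, 0.5];  % 95% half-widths, same units
g  = 981;                         % cm/s^2, assumed exact

% Stokes' law (assumed form), with terminal velocity V = L/t
visc = @(x) 2 * x(3)^2 * g * (x(1) - x(2)) / (9 * (x(4) / x(5)));

mu0     = visc(x0);               % nominal viscosity, g/(cm s) = poise
contrib = zeros(size(x0));        % |d(mu)/d(x_k)| * dx_k for each input

for k = 1:numel(x0)
    h  = 1e-6 * x0(k);            % small step for the central difference
    xp = x0;  xp(k) = xp(k) + h;
    xm = x0;  xm(k) = xm(k) - h;
    dmu_dxk    = (visc(xp) - visc(xm)) / (2 * h);
    contrib(k) = abs(dmu_dxk) * dx(k);
end

dmu = sqrt(sum(contrib.^2));      % combined uncertainty (quadrature)
[~, kmax] = max(contrib);

fprintf('viscosity = %.1f +/- %.1f g/(cm s)\n', mu0, dmu);
fprintf('largest contribution: %s (%.1f g/(cm s))\n', names{kmax}, contrib(kmax));
```

With these inputs the fluid density carries the largest share of the uncertainty, because its ± 0.2 g/cm³ is large relative to the 0.9 g/cm³ density difference that the viscosity depends on.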

  8. #3: Log Normal • You find the following particle size distribution from a spray dryer experiment: (table of data) • If we assume this distribution of particle sizes is log-normal, what are the mean and standard deviation of the log-normal PDF? • Nonlinear fitting problem, like #6.

  9. #3: Log Normal
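The particle size table did not survive in the transcript, so the sketch below cannot reproduce the answer to #3. It only illustrates the approach the slide names (a nonlinear least-squares fit) on hypothetical, noise-free data generated from a known log-normal PDF, so the fit can be checked against the values used to generate it. The helper `lognpdf_fun`, the diameter grid, and the "true" parameters are all assumptions made for illustration. `fminsearch` is part of base MATLAB.

```matlab
% Problem #3 (method sketch only): fit a log-normal pdf by least squares.
% The data here are HYPOTHETICAL, generated from known parameters.

% Log-normal pdf; p(1) = mean of ln(d), p(2) = standard deviation of ln(d)
lognpdf_fun = @(p, d) exp(-(log(d) - p(1)).^2 ./ (2 * p(2)^2)) ...
                      ./ (d .* abs(p(2)) * sqrt(2 * pi));

p_true = [1.0, 0.5];                 % illustrative parameters, not from the slides
d      = linspace(0.5, 10, 25)';     % illustrative particle diameters
f_obs  = lognpdf_fun(p_true, d);     % stand-in for the measured distribution

% Sum of squared residuals between "measured" and model densities
sse = @(p) sum((f_obs - lognpdf_fun(p, d)).^2);

% Minimize from a deliberately poor starting guess
p_fit    = fminsearch(sse, [0.5, 1.0]);
mu_ln    = p_fit(1);
sigma_ln = abs(p_fit(2));

% Mean and standard deviation of the particle size itself
mean_d = exp(mu_ln + sigma_ln^2 / 2);
std_d  = sqrt((exp(sigma_ln^2) - 1) * exp(2 * mu_ln + sigma_ln^2));

fprintf('mu = %.3f, sigma = %.3f (parameters of ln d)\n', mu_ln, sigma_ln);
fprintf('mean size = %.3f, std of size = %.3f\n', mean_d, std_d);
```

With real measurements, `d` and `f_obs` would be replaced by the tabulated sizes and their normalized frequencies.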

  10. #4: Hypothesis Testing • On a certain stage of a distillation column, theory predicts that the ethanol concentration should be 27%. You take the following measurements over several runs: • What is the likelihood that your measurements match theory?

  11. #4: Hypothesis Testing • Student's t-test. • Mean = 24.52 • StDev = 4.2163 • Degrees of freedom: ν = n_a − 1 = 10 − 1 = 9

  12. #4: Hypothesis Testing • T-Statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n_a}} = \dfrac{24.52 - 27}{4.2163/\sqrt{10}} \approx -1.860$

  13. #4: Hypothesis Testing • Evaluate the Student's t CDF at the t-statistic to find the two-tailed probability. Answer = 9.6%
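The 9.6% answer to #4 can be reproduced from the summary statistics on slide 11. This is a minimal sketch, assuming a two-tailed one-sample t-test. It evaluates the tail area with base MATLAB's regularized incomplete beta function `betainc`, using the identity that the two-tailed area equals I_x(ν/2, 1/2) with x = ν/(ν + t²); with the Statistics Toolbox, `2*tcdf(-abs(t), nu)` would give the same number.

```matlab
% Problem #4: one-sample Student's t-test from summary statistics
xbar = 24.52;    % sample mean, % ethanol
s    = 4.2163;   % sample standard deviation
n    = 10;       % number of measurements
mu0  = 27;       % value predicted by theory, % ethanol

nu = n - 1;                          % degrees of freedom
t  = (xbar - mu0) / (s / sqrt(n));   % t-statistic

% Two-tailed area of the t distribution beyond |t|
p_two = betainc(nu / (nu + t^2), nu / 2, 0.5);

fprintf('t = %.3f with %d degrees of freedom\n', t, nu);
fprintf('two-tailed probability = %.1f%%\n', 100 * p_two);
```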

  14. #5: Hypothesis Testing 2 • You are measuring the effectiveness of a new catalyst on a reaction with a great deal of normally distributed variability. You measure the time to 99% conversion of your reactants with both your new and old catalyst for several experimental runs and find the following data: • Given this data, what is the probability that the new catalyst is more effective than the old? What is the probability that they are equally effective?

  15. #5: Hypothesis Testing 2 • Mean A = 10.25, Mean B = 9.50 • StDev A = 1.071, StDev B = 1.066 • Number A = 22, Number B = 20 • Degrees of freedom: ν = n_a + n_b − 2 = 40

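  16. #5: Hypothesis Testing 2 • T-Statistic: $t = \dfrac{\bar{x}_b - \bar{x}_a}{s_p\sqrt{1/n_a + 1/n_b}}$, with pooled variance $s_p^2 = \dfrac{(n_a - 1)s_a^2 + (n_b - 1)s_b^2}{n_a + n_b - 2}$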

  17. #5: Hypothesis Testing 2 • Simple rule: • Greater-than or less-than tests use one tail (two unequal areas); looking at the means tells you which percentage to use. • An equality test uses two equal tails. • For the t CDF with ν = 40 at a t-statistic of −2.295, the lower-tail area is P ≈ 1.35% (2.7% for both tails together). • The probability that the new catalyst is more effective is a one-tail test. • More effective (one tail) = 100% − 1.35% ≈ 98.6% • Equal (two tail) = 2 × 1.35% = 2.7%
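The tail areas for #5 can be checked numerically. The sketch below assumes a pooled-variance two-sample t-test, which is consistent with ν = n_a + n_b − 2 on slide 15, and takes the difference as B minus A to match the negative t-statistic. Recomputing t from the rounded summary statistics gives about −2.27 rather than −2.295; the gap is presumably rounding in the tabulated means and standard deviations, which is an assumption since the raw data are not in the transcript. The tail areas are therefore evaluated at the slide's t = −2.295, using base MATLAB's `betainc` in place of the Statistics Toolbox `tcdf`.

```matlab
% Problem #5: two-sample (pooled) t-test from summary statistics
mean_a = 10.25;  s_a = 1.071;  n_a = 22;
mean_b = 9.50;   s_b = 1.066;  n_b = 20;

nu = n_a + n_b - 2;                                   % degrees of freedom
sp = sqrt(((n_a - 1) * s_a^2 + (n_b - 1) * s_b^2) / nu);  % pooled std dev
t_summary = (mean_b - mean_a) / (sp * sqrt(1/n_a + 1/n_b));

% t-statistic quoted on the slide (assumed to come from unrounded data)
t_slide = -2.295;

% Two-tailed area beyond |t|, and the single lower tail below t_slide
p_two   = betainc(nu / (nu + t_slide^2), nu / 2, 0.5);
p_lower = p_two / 2;

fprintf('t from rounded summary statistics = %.3f\n', t_summary);
fprintf('lower-tail area at t = %.3f: %.2f%%\n', t_slide, 100 * p_lower);
fprintf('two-tailed area: %.2f%%\n', 100 * p_two);
fprintf('one-tail complement: %.2f%%\n', 100 * (1 - p_lower));
```

Evaluated this way, the lower tail at t = −2.295 is about 1.35% and the two tails together are about 2.7%, so the 2.7% figure quoted on the slide matches the two-tailed area.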

  18. #6: Non-Linear Fit • The rates of population growth in a bacterial culture are found to be: • It is thought that these data could be fit to the equation Rate = b1*sin(b2*t), where b1 and b2 are constants to be determined and t is time. Determine the least-squares estimates of b1 and b2 and give an appropriate confidence interval for a confidence level of 90%. Also, what would you anticipate the rate to be at 24 hr? What would the confidence interval for a 95% confidence level be at 24 hr?

  19. #6: Non-Linear Fit

  20. %Anthony Butterfield 2009
      %Example of nonlinear fit with CIs
      clear
      close all
      b(1)=1/3;
      b(2)=1;
      re=0.1; %random noise strength
      x=linspace(0,6,20)'; %x data for fitting
      x2=linspace(0,6,100)'; %x data for plotting
      n=length(x);
      y=b(1)*sin(b(2)*x)+re*randn(n,1); %y data for fitting, note the random error added in to make it realistic
      yt=b(1)*sin(b(2)*x2); %theoretical y data for plotting
      [beta,r,J]=nlinfit(x,y,@nlinfitsin,[1 1]); %numerically performs a nonlinear fit
      bci = nlparci(beta,r,J); %returns the c.i. for the parameters, beta
      [ypred,delta] = nlpredci(@nlinfitsin,x2,beta,r,J); %returns a predicted y and the c.i. for each y
      disp('Fit to equation: y = b1 sin(b2 * x)')
      disp(' x data y data')
      for i=1:n
          txt=sprintf(' %5.3f %5.3f',x(i),y(i));
          disp(txt)
      end
      txt=sprintf('b1 was %3.1f, and is estimated to be: %f ± %f (95%% CL)',b(1),beta(1),abs(beta(1)-bci(1,1)));
      disp(txt)
      txt=sprintf('b2 was %3.1f, and is estimated to be: %f ± %f (95%% CL)',b(2),beta(2),abs(beta(2)-bci(2,1)));
      disp(txt)
      figure(1)
      hold on
      grid on
      scatter(x,y,10,'r')
      plot(x2,yt,'Color',[1 0.5 0]) %just wanted to give you an example of how to change the line color to something not preset
      plot(x2,ypred,'b',x2,ypred+delta,'b:',x2,ypred-delta,'b:')
      hold off

  21. #6: Non-Linear Fit • nlparci • In “theory” b1 = 0.3; estimated b1 = 0.35 ± 0.05 (90% CL) • In “theory” b2 = 1.0; estimated b2 = 1.04 ± 0.04 (90% CL) • nlpredci • At 24 hr “theory” predicts: • Rate = -0.3019 • Fit predicts: • Rate = -0.1090 ± 0.3839 (95% CL)
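The script on slide 20 calls a helper function, `nlinfitsin`, that is not included in the transcript. The sketch below assumes it simply evaluates b1*sin(b2*x) and uses an anonymous function, `model`, as a stand-in. It first reproduces the "theory" rate at 24 hr from b1 = 1/3 and b2 = 1, then, only if the Statistics Toolbox functions are available, repeats the fit on freshly generated random data and requests 90% parameter intervals and a 95% prediction interval through the `'alpha'` option of `nlparci` and `nlpredci`. Because the data are random, the fitted numbers will not match slide 21.

```matlab
% Problem #6: assumed stand-in for the missing nlinfitsin helper
model  = @(b, x) b(1) * sin(b(2) * x);
b_true = [1/3, 1];                  % "theory" values used in the slide script

% Rate predicted by theory at t = 24 hr
rate_theory = model(b_true, 24);
fprintf('theory rate at 24 hr = %.4f\n', rate_theory);

% The fit itself needs the Statistics Toolbox, so guard those calls
if exist('nlinfit', 'file') && exist('nlparci', 'file') && exist('nlpredci', 'file')
    x = linspace(0, 6, 20)';                        % same grid as the slide script
    y = model(b_true, x) + 0.1 * randn(size(x));    % synthetic noisy data

    [beta, r, J] = nlinfit(x, y, model, [1 1]);

    % 90% confidence intervals on b1 and b2 (alpha = 0.10)
    bci90 = nlparci(beta, r, 'jacobian', J, 'alpha', 0.10);
    half90 = (bci90(:, 2) - bci90(:, 1)) / 2;

    % Predicted rate at 24 hr with a 95% confidence half-width (alpha = 0.05)
    [y24, d24] = nlpredci(model, 24, beta, r, 'jacobian', J, 'alpha', 0.05);

    fprintf('b1 = %.3f +/- %.3f (90%% CL)\n', beta(1), half90(1));
    fprintf('b2 = %.3f +/- %.3f (90%% CL)\n', beta(2), half90(2));
    fprintf('fit rate at 24 hr = %.4f +/- %.4f (95%% CL)\n', y24, d24);
end
```

Predicting at 24 hr extrapolates well beyond the fitted range of 0 to 6, which is why a small uncertainty in b2 turns into a wide interval on the predicted rate.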
