Report Writing
A report should be self-explanatory. It should be capable of being read and understood without reference to the original project description. Thus, for each question, it should contain all of the following:
• (a) a statement of the problem;
• (b) a full and careful description of how it is investigated;
• (c) all relevant results, including graphical and numerical analyses; variables should be carefully defined, and figures and tables should be properly labelled, described and referenced;
• (d) relevant analysis, discussion, and conclusions.
It should be written in the third person.
NOT: I think the Central Limit Theorem is true for this example because I see that the graph is normal.
INSTEAD: It can be clearly seen that the graph displays a normal distribution, confirming that the Central Limit Theorem holds.
The Central Limit Theorem
Let $X_1, X_2, \ldots, X_n$ be independent, identically distributed random variables with mean $\mu$ and variance $\sigma^2$. Let $S = X_1 + X_2 + \cdots + X_n$. Then elementary probability theory tells us that $E(S) = n\mu$ and $\operatorname{var}(S) = n\sigma^2$. The Central Limit Theorem (CLT) further states that, provided $n$ is not too small, $S$ has an approximately normal distribution with the above mean $n\mu$ and variance $n\sigma^2$.
In other words, $S \overset{\text{approx}}{\sim} N(n\mu, n\sigma^2)$. The approximation improves as $n$ increases. We will use R to demonstrate the CLT.
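For reference, the moment results quoted above can be written out as a short worked derivation. This is only a sketch of the standard argument: linearity of expectation gives the mean, independence gives the variance, and the last line restates the CLT in standardised form.

```latex
\begin{align*}
E(S) &= \sum_{i=1}^{n} E(X_i) = n\mu, \\
\operatorname{var}(S) &= \sum_{i=1}^{n} \operatorname{var}(X_i) = n\sigma^2
  \quad \text{(using independence)}, \\
\frac{S - n\mu}{\sigma\sqrt{n}} &\overset{\text{approx}}{\sim} N(0, 1).
\end{align*}
```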
Let $X_1, X_2, \ldots, X_6$ come from the uniform distribution $U(0,1)$.
[Figure: density of U(0,1), equal to 1 on the interval from 0 to 1]
For any uniform distribution on $[A, B]$, the mean $\mu$ is equal to $(A + B)/2$ and the variance $\sigma^2$ is equal to $(B - A)^2/12$. So for our distribution, $\mu = 1/2$ and $\sigma^2 = 1/12$.
The Central Limit Theorem therefore states that $S$ should have an approximately normal distribution with mean $n\mu$ (i.e. $6 \times 0.5 = 3$) and variance $n\sigma^2$ (i.e. $6 \times 1/12 = 0.5$). This gives a standard deviation of $\sqrt{0.5} = 0.7071$. In other words, $S \overset{\text{approx}}{\sim} N(3, 0.7071^2)$.
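The arithmetic above can be reproduced in R. The following is a minimal sketch; the variable names (`A`, `B`, `n`, `mu`, `sigma2`, `mean_S`, `var_S`, `sd_S`) are illustrative choices and do not come from the original slides.

```r
# Theoretical mean and variance of a uniform distribution on [A, B],
# here with A = 0 and B = 1 as in the example
A <- 0
B <- 1
n <- 6                      # number of uniform variables being summed

mu     <- (A + B) / 2       # mean of one uniform variable
sigma2 <- (B - A)^2 / 12    # variance of one uniform variable

mean_S <- n * mu            # E(S) = n * mu
var_S  <- n * sigma2        # var(S) = n * sigma^2
sd_S   <- sqrt(var_S)       # standard deviation of S

print(c(mean = mean_S, var = var_S, sd = round(sd_S, 4)))
```

Changing `A`, `B` or `n` gives the corresponding values for other uniform distributions or other numbers of summed variables.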
Generate 10 000 results in each of six vectors for the uniform distribution on [0,1] in R.
> x1=runif(10000)
> x2=runif(10000)
> x3=runif(10000)
> x4=runif(10000)
> x5=runif(10000)
> x6=runif(10000)
Let $S = X_1 + X_2 + \cdots + X_6$.
> s=x1+x2+x3+x4+x5+x6
> hist(s,nclass=20)
Consider the mean and standard deviation of S.
> mean(s)
[1] 3.002503
> sd(s)
[1] 0.7070773
This agrees with our earlier calculations.
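The same simulation can be written more compactly by placing all the uniform draws in one matrix and summing across rows. This is an alternative sketch, not the method used on the slides; the seed and the names `n_vars`, `n_sims` and `x` are assumptions added for reproducibility and readability.

```r
set.seed(1)        # assumed seed, only so that the run is reproducible

n_vars <- 6        # number of uniform variables in each sum
n_sims <- 10000    # number of simulated values of S

# Each row holds one simulated set of X1, ..., X6
x <- matrix(runif(n_vars * n_sims), nrow = n_sims, ncol = n_vars)

# Row sums give the simulated values of S
s <- rowSums(x)

# Histogram of S; breaks = 20 is treated by R as a suggestion only
hist(s, breaks = 20, main = "Histogram of s", xlab = "s")
```

Written this way, changing `n_vars` is enough to repeat the experiment for a different number of summed variables.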
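One way to see this agreement visually is to draw the histogram on the density scale and overlay the N(3, 0.5) density curve. The sketch below regenerates the sample so that it is self-contained; the seed and the names `mean_err` and `sd_err` are assumptions, so the sample statistics will differ slightly from those printed above.

```r
set.seed(1)  # assumed seed, only so that the run is reproducible

# Sum of six U(0,1) variables, simulated 10 000 times
s <- rowSums(matrix(runif(6 * 10000), ncol = 6))

# Histogram on the density scale so that it is comparable with a density curve
hist(s, breaks = 20, freq = FALSE, main = "Histogram of s", xlab = "s")

# Overlay the normal density with mean 3 and variance 0.5
curve(dnorm(x, mean = 3, sd = sqrt(0.5)), add = TRUE, lwd = 2)

# Differences between the sample statistics and the theoretical values
mean_err <- abs(mean(s) - 3)
sd_err   <- abs(sd(s) - sqrt(0.5))
print(c(mean_err = mean_err, sd_err = sd_err))
```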
One method of examining whether the distribution is approximately normal is to produce a normal Q-Q plot. This is a plot of the sorted values of the vector S (the “data”) against what is in effect an idealised sample of the same size from the N(0,1) distribution.
If the CLT holds good, i.e. if S is approximately normal, then the plot should show an approximate straight line with intercept equal to the mean of S (here 3) and slope equal to the standard deviation of S (here 0.707).
> qqnorm(s)
Slope estimated from the plot: (4.4 − 1.8)/4 = 0.65, i.e. 0.7 to 1 DP
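Instead of reading the slope off the plot by eye, the intercept and slope can be estimated by fitting a least-squares line to the Q-Q points. This is a sketch of one possible check, not the approach taken on the slides; the seed and the names `qq`, `fit` and `est` are assumptions.

```r
set.seed(1)  # assumed seed, only so that the run is reproducible

# Sum of six U(0,1) variables, simulated 10 000 times
s <- rowSums(matrix(runif(6 * 10000), ncol = 6))

# Normal Q-Q plot; qqnorm() returns the plotted coordinates invisibly:
# x = theoretical N(0,1) quantiles, y = sample values
qq <- qqnorm(s)

# Reference line with the theoretical intercept 3 and slope sqrt(0.5)
abline(a = 3, b = sqrt(0.5), lwd = 2)

# Least-squares line through the Q-Q points
fit <- lm(qq$y ~ qq$x)
est <- unname(coef(fit))    # est[1] = intercept, est[2] = slope
print(round(est, 3))
```

If S is approximately normal, the fitted intercept should be close to 3 and the fitted slope close to 0.707.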
From these plots it seems that agreement with the normal distribution is very good, despite the fact that we have only taken n = 6, i.e. the convergence is very rapid!
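To illustrate how quickly the approximation improves, the straightness of the Q-Q plot can be summarised by the correlation between the theoretical quantiles and the sample values, computed for several values of n. This is an illustrative sketch only: the helper `qq_cor`, the seed and the chosen values of n are assumptions, and the Q-Q correlation is just one informal measure of normality.

```r
set.seed(1)  # assumed seed, only so that the run is reproducible

# Hypothetical helper: simulate the sum of n U(0,1) variables n_sims times
# and return the correlation between the Q-Q plot coordinates
qq_cor <- function(n, n_sims = 10000) {
  s  <- rowSums(matrix(runif(n * n_sims), nrow = n_sims, ncol = n))
  qq <- qqnorm(s, plot.it = FALSE)   # coordinates only, no plot drawn
  cor(qq$x, qq$y)
}

n_values <- c(1, 2, 6, 12)
cors <- sapply(n_values, qq_cor)
names(cors) <- paste0("n=", n_values)

# Values closer to 1 indicate a straighter Q-Q plot
print(round(cors, 4))
```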