Statistics and Data Analysis

Statistics and Data Analysis Professor William Greene Stern School of Business Department of IOMS Department of Economics

Statistics and Data Analysis Random Walk Models for Stock Prices

1/30 A Model for Stock Prices • Preliminary: • Consider a sequence of T random outcomes, independent from one to the next, Δ1, Δ2,…, ΔT. (Δ is a standard symbol for “change” which will be appropriate for what we are doing here. And, we’ll use “t” instead of “i” to signify something to do with “time.”) • Δt comes from a normal distribution with mean μ and standard deviation σ.

2/30 Application • Suppose P is sales of a store. The accounting period starts with total sales = 0 • On any given day, sales are random, normally distributed with mean μ and standard deviation σ. For example, mean $100,000 with standard deviation $10,000 • Sales on any given day, day t, are denoted Δt • Δ1 = sales on day 1, • Δ2 = sales on day 2, • Total sales after T days will be Δ1+ Δ2+…+ ΔT • Therefore, each Δt is the change in the total that occurs on day t.

3/30 Using the Central Limit Theorem to Describe the Total • Let PT = Δ1 + Δ2 + … + ΔT be the total of the changes (variables) from times (observations) 1 to T. • The sequence is • P1 = Δ1 • P2 = Δ1 + Δ2 • P3 = Δ1 + Δ2 + Δ3 • And so on… • PT = Δ1 + Δ2 + Δ3 + … + ΔT

4/30 Summing • If the individual Δs are independent and each normally distributed with mean μ and standard deviation σ, then • P1 = Δ1 = Normal [μ, σ] • P2 = Δ1 + Δ2 = Normal [2μ, σ√2] • P3 = Δ1 + Δ2 + Δ3 = Normal [3μ, σ√3] • And so on… so that • PT = Normal [Tμ, σ√T]
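
The summing rule can be checked numerically. The sketch below is only an illustration: it reuses the store example's figures (mean $100,000, standard deviation $10,000), while the horizon T = 25, the number of replications, the seed, and the function names are assumptions made for the example.

```python
import math
import random
import statistics

def total_mean_sd(mu, sigma, T):
    """Theoretical mean and standard deviation of PT = Δ1 + ... + ΔT."""
    return T * mu, sigma * math.sqrt(T)

def simulate_totals(mu, sigma, T, n_reps, seed=0):
    """Draw n_reps independent totals, each the sum of T normal changes."""
    rng = random.Random(seed)
    return [sum(rng.gauss(mu, sigma) for _ in range(T)) for _ in range(n_reps)]

# Store example: daily sales with mean 100,000 and standard deviation 10,000.
mean_T, sd_T = total_mean_sd(100_000, 10_000, 25)
totals = simulate_totals(100_000, 10_000, 25, n_reps=20_000)

print("theory:    ", mean_T, sd_T)
print("simulation:", statistics.mean(totals), statistics.stdev(totals))
```

The simulated mean and standard deviation should land close to Tμ and σ√T, with the gap shrinking as the number of replications grows.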

5/30 Application • Suppose P is accumulated sales of a store. The accounting period starts with total sales = 0 • Δ1 = sales on day 1, • Δ2 = sales on day 2 • Accumulated sales after day 2 = Δ1+ Δ2 • And so on…

6/30 This defines a Random Walk • The sequence is P1 = Δ1, P2 = Δ1 + Δ2, P3 = Δ1 + Δ2 + Δ3, and so on… PT = Δ1 + Δ2 + Δ3 + … + ΔT • It follows that P1 = Δ1, P2 = P1 + Δ2, P3 = P2 + Δ3, and so on… PT = PT-1 + ΔT

7/30 A Model for Stock Prices • Random Walk Model: Today’s price = yesterday’s price + a change that is independent of all previous information. (It’s a model, and a very controversial one at that.) • Start at some known P0 so P1 = P0 + Δ1 and so on. • Assume μ = 0 (no systematic drift in the stock price).

8/30 Random Walk Simulations Pt = Pt-1 + Δt Example: P0= 10, Δt Normal with μ=0, σ=0.02
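
A simulation of this kind might be sketched as follows. P0 = 10, μ = 0 and σ = 0.02 come from the slide; the path length of 250 periods, the seed, and the function name `random_walk` are assumptions made for the example.

```python
import random

def random_walk(p0, mu, sigma, T, seed=None):
    """Return [P0, P1, ..., PT] where Pt = Pt-1 + Δt and Δt is Normal[mu, sigma]."""
    rng = random.Random(seed)
    path = [p0]
    for _ in range(T):
        path.append(path[-1] + rng.gauss(mu, sigma))
    return path

# Slide example: P0 = 10, no drift, sigma = 0.02 (250 periods is an assumption).
path = random_walk(p0=10.0, mu=0.0, sigma=0.02, T=250, seed=1)
print(path[:5])
print("lowest:", min(path), "highest:", max(path))
```

Changing the seed produces a different path from the same model, which is a quick way to see how much variation the model allows.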

9/30 Uncertainty • Expected Price = E[PT] = P0 + Tμ. We have used μ = 0 (no systematic upward or downward drift). • Standard deviation = σ√T reflects uncertainty. • Looking forward from “now” = time t=0, the uncertainty increases the farther out we look to the future.

10/30 Using the Empirical Rule to Formulate an Expected Range

11/30 Application • Using the random walk model, with P0 = $40, say μ =$0.01, σ=$0.28, what is the probability that the stock will exceed $41 after 25 days? • E[P25] = 40 + 25($.01) = $40.25. The standard deviation will be $0.28√25=$1.40.
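
The remaining step is a normal probability calculation with mean $40.25 and standard deviation $1.40. A minimal sketch using Python's standard library (the variable names are illustrative):

```python
import math
from statistics import NormalDist

p0, mu, sigma, T = 40.0, 0.01, 0.28, 25

mean_T = p0 + T * mu            # expected price after 25 days
sd_T = sigma * math.sqrt(T)     # standard deviation after 25 days

# P[P25 > 41] is the upper tail of Normal[mean_T, sd_T] beyond 41.
prob_above_41 = 1.0 - NormalDist(mean_T, sd_T).cdf(41.0)

print(round(mean_T, 2), round(sd_T, 2), round(prob_above_41, 3))
```

Here $41 is about 0.54 standard deviations above the expected price, so the probability comes out a little under 0.30.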

12/30 Prediction Interval • From the normal distribution, P[μt - 1.96σt < X < μt + 1.96σt] = 95% • This range can provide a “prediction interval,” where μt = P0 + tμ and σt = σ√t.
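
As an illustration, the interval can be computed for the earlier application (P0 = $40, μ = $0.01, σ = $0.28, t = 25). The function name `prediction_interval` is an assumption made for the example.

```python
import math

def prediction_interval(p0, mu, sigma, t, z=1.96):
    """95% prediction interval for Pt under the additive (normal) random walk."""
    mean_t = p0 + t * mu
    sd_t = sigma * math.sqrt(t)
    return mean_t - z * sd_t, mean_t + z * sd_t

lower, upper = prediction_interval(p0=40.0, mu=0.01, sigma=0.28, t=25)
print(round(lower, 3), round(upper, 3))
```

With these inputs the interval is 40.25 ± 1.96(1.40), that is, about $37.51 to $42.99.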

13/30 Random Walk Model • Controversial – many assumptions • Normality is inessential – we are summing, so after 25 periods or so, we can invoke the CLT. • The assumption of period to period independence is at least debatable. • The assumption of unchanging mean and variance is certainly debatable. • The additive model allows negative prices. (Ouch!) • The model when applied is usually based on logs and the lognormal model. To be continued …

14/30 Lognormal Random Walk • The lognormal model remedies some of the shortcomings of the linear (normal) model. • Somewhat more realistic. • Equally controversial. • Description follows for those interested.

15/30 Lognormal Variable • If the log of a variable has a normal distribution, then the variable has a lognormal distribution. • Mean = Exp[μ + σ²/2] > Median = Exp[μ]
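
The mean and median formulas can be compared with simulated draws. In this sketch the parameter values μ = 0 and σ = 0.5, the sample size, and the seed are assumptions chosen only for illustration.

```python
import math
import random
import statistics

mu, sigma = 0.0, 0.5          # parameters of the underlying normal (assumed values)
rng = random.Random(0)

# lognormvariate draws X such that log(X) is Normal[mu, sigma].
draws = [rng.lognormvariate(mu, sigma) for _ in range(100_000)]

theory_mean = math.exp(mu + sigma**2 / 2)
theory_median = math.exp(mu)
sample_mean = statistics.mean(draws)
sample_median = statistics.median(draws)

print("mean:  ", theory_mean, sample_mean)
print("median:", theory_median, sample_median)
```

The sample mean sits above the sample median, which is the right skew of the lognormal distribution.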

16/30 Lognormality – Country Per Capita Gross Domestic Product Data

17/30 Lognormality – Earnings in a Large Cross Section

18/30 Lognormal Variable Exhibits Skewness The mean is to the right of the median.

19/30 Lognormal Distribution for Price Changes • Math preliminaries: • (Growth) If price is P0 at time 0 and the price grows by 100Δ% from period 0 to period 1, then the price at period 1 is P0(1 + Δ). For example, P0 = 40; Δ = 0.04 (4% per period); P1 = P0(1 + 0.04). • (Price ratio) If P1 = P0(1 + 0.04) then P1/P0 = (1 + 0.04). • (Math fact) For smallish Δ, log(1 + Δ) ≈ Δ. Example: if Δ = 0.04, log(1 + 0.04) = 0.039221.
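
A quick check of the approximation log(1 + Δ) ≈ Δ for several growth rates. The list of Δ values other than 0.04 is an assumption chosen for illustration.

```python
import math

# Compare log(1 + Δ) with Δ; log1p(x) computes the natural log of 1 + x.
deltas = [0.001, 0.01, 0.04, 0.10, 0.25]
approx_errors = {d: d - math.log1p(d) for d in deltas}

for d in deltas:
    print(f"Δ = {d:<6} log(1+Δ) = {math.log1p(d):.6f}  error = {approx_errors[d]:.6f}")
```

The error is small for per-period changes of a few percent and grows as Δ gets larger.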

20/30 Collecting Math Facts

21/30 Building a Model

22/30 A Second Period

23/30 What Does It Imply?

24/30 Random Walk in Logs

25/30 Lognormal Model for Prices

26/30 Lognormal Random Walk

27/30 Application • Suppose P0 = 40, μ = 0 and σ = 0.02. What is the probability that P25, the price of the stock after 25 days, will exceed 45? • logP25 has mean log40 + 25μ = log40 = 3.6889 and standard deviation σ√25 = 5(0.02) = 0.1. It will be at least approximately normally distributed. • P[P25 > 45] = P[logP25 > log45] = P[logP25 > 3.8066] • P[logP25 > 3.8066] = P[(logP25 - 3.6889)/0.1 > (3.8066 - 3.6889)/0.1] = P[Z > 1.177] = P[Z < -1.177] = 0.119598
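
The same calculation can be reproduced in a few lines. This is a sketch with illustrative variable names; because it does not round the logs to four decimals, its result can differ from the slide's figure in the fourth decimal place.

```python
import math
from statistics import NormalDist

p0, mu, sigma, T = 40.0, 0.0, 0.02, 25

log_mean = math.log(p0) + T * mu      # mean of log P25
log_sd = sigma * math.sqrt(T)         # standard deviation of log P25

# Standardize log(45) and take the upper tail of the standard normal.
z = (math.log(45.0) - log_mean) / log_sd
prob_above_45 = 1.0 - NormalDist().cdf(z)

print(round(log_mean, 4), round(log_sd, 4), round(z, 3), round(prob_above_45, 4))
```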

28/30 Prediction Interval We are 95% certain that logP25 is in the interval logP0 + μ25 - 1.96σ25 to logP0 + μ25 + 1.96σ25. Continue to assume μ = 0 so μ25 = 25(0) = 0 and σ = 0.02 so σ25 = 0.02(√25) = 0.1. Then, the interval is 3.6889 - 1.96(0.1) to 3.6889 + 1.96(0.1), or 3.4929 to 3.8849. This means that we are 95% confident that P25 is in the range e^3.4929 = 32.88 to e^3.8849 = 48.66.
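
A sketch of the same interval in code: form the interval for the log price, then exponentiate both endpoints. The function name `lognormal_prediction_interval` is an assumption made for the example.

```python
import math

def lognormal_prediction_interval(p0, mu, sigma, t, z=1.96):
    """95% prediction interval for Pt when log Pt follows a normal random walk."""
    log_mean = math.log(p0) + t * mu
    log_sd = sigma * math.sqrt(t)
    # Build the interval in logs, then map it back to price units.
    return math.exp(log_mean - z * log_sd), math.exp(log_mean + z * log_sd)

lower, upper = lognormal_prediction_interval(p0=40.0, mu=0.0, sigma=0.02, t=25)
print(round(lower, 2), round(upper, 2))
```

Both endpoints are exponentials, so the lower limit can never fall below zero, whatever the horizon.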

29/30 Observations - 1 • The lognormal model (lognormal random walk) predicts that the price will always take the form PT = P0 exp(Δ1 + Δ2 + … + ΔT). • This will always be positive, so this overcomes the problem of the first model we looked at.

30/30 Observations - 2 • The lognormal model has a quirk of its own. Note that when we formed the prediction interval for P25 based on P0 = 40, the interval is [32.88, 48.66], which has center at 40.77 > 40, even though μ = 0. It looks like free money. • Why does this happen? A feature of the lognormal model is that E[PT] = P0 exp(Tμ + ½Tσ²), which is greater than P0 even if μ = 0. • Philosophically, we can interpret this as the expected return to undertaking risk (compared to no risk – a risk “premium”). • On the other hand, this is a model. It has virtues and flaws. This is one of the flaws.
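
The quantities in this observation can be computed side by side for P0 = 40, μ = 0, σ = 0.02 and T = 25. This is a sketch with illustrative variable names. It uses the lognormal mean formula Exp[μ + σ²/2] applied to log PT, which has mean log P0 + Tμ and variance Tσ².

```python
import math

p0, mu, sigma, T = 40.0, 0.0, 0.02, 25

# Lognormal mean and median of PT.
expected_price = p0 * math.exp(T * mu + 0.5 * T * sigma**2)
median_price = p0 * math.exp(T * mu)

# Midpoint of the 95% prediction interval (a different quantity from the mean).
half_width = 1.96 * sigma * math.sqrt(T)
lower = p0 * math.exp(T * mu - half_width)
upper = p0 * math.exp(T * mu + half_width)
midpoint = (lower + upper) / 2

print(round(expected_price, 4), round(median_price, 4), round(midpoint, 2))
```

With μ = 0 the median stays at P0, while both the expected price and the interval midpoint lie above it. That ordering reflects the right skew of the lognormal distribution.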
