De la Garza Phenomenon. Bikas K Sinha, ISI, Kolkata. RU Workshop: April 18, 2012. Collaborators: N K Mandal & M Pal, Calcutta University.
Nomenclature
• Liski-Mandal-Shah-Sinha (2002): Topics in Optimal Design, Springer-Verlag monograph.
• Pukelsheim (2006): Optimal Design of Experiments. Refers to it as the "Property of Admissibility".
• Khuri-Mukherjee-Sinha-Ghosh (2006): Statistical Science. Refers to it as the "de la Garza Phenomenon".
• Min Yang (2010): Annals of Statistics. Title of the paper: "On the de la Garza Phenomenon".
Motivating Example: First Course in Regression
• X: -3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2
• Y: … (one response per x-value)
• Fit a linear regression equation of Y on X under the usual model assumptions.
• X is transformed to U, which ranges over [-1, 1].
• U: -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
• Motivating question: if we believe in the linear regression model, what good are so many u-values? Why can't we work with exactly two u-values and, at that, possibly with ±1?
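The slide does not say how X is transformed to U. A minimal sketch, assuming the usual affine map that sends the smallest x to -1 and the largest to +1 (the function name `rescale_to_unit_interval` is hypothetical), reproduces the listed u-values to two decimals:

```python
def rescale_to_unit_interval(xs):
    """Affine map sending min(xs) to -1 and max(xs) to +1 (assumed transform)."""
    lo, hi = min(xs), max(xs)
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    return [(x - mid) / half for x in xs]


x = [-3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2]
# Round to two decimals, as on the slide.
u = [round(v, 2) for v in rescale_to_unit_interval(x)]
print(u)
```

Because the map is affine, a linear mean model in x stays linear in u, so nothing is lost by working on [-1, 1].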
Linear Regression Model
• Mean model: $E[Y_x] = \alpha + \beta x$, with homoscedastic errors.
• Given $D_N = [(x_1, n_1); (x_2, n_2); \ldots; (x_k, n_k)]$, with $N = \sum n_i$.
• $\chi$ = space of the regressor X = $[a, b]$, $a < b$.
• WLOG: $a \le x_1 < x_2 < \cdots < x_k \le b$; the x's are all distinct.
• For each i, $n_i \ge 1$, such that $\sum n_i = N$ [given].
• Estimability of $\alpha$ and $\beta$ is ensured iff $k \ge 2$.
• Fitting of the linear regression model: $\hat\beta = b_{yx} = SP_{yx}/SS_{xx}$; $\hat\alpha = \bar{y} - b_{yx}\,\bar{x}$.
• Inference rests on normality of errors, etc.
Motivating Theory: Undergraduate Level
• X: $a \le x_1 < x_2 < \cdots < x_k \le b$ [k > 1, all x's distinct]
• Y: $y_1, y_2, y_3, \ldots, y_k$, the responses on Y
• Assume linear regression of Y on X: $E[Y_x] = \alpha + \beta x$, with the usual conditions on the errors.
• Find the BLUE of the regression coefficient $\beta$.
• Smart student's thought: pairwise unbiased estimators $\hat\beta_{(i,j)} = b_{(i,j)} = (y_i - y_j)/(x_i - x_j)$, $1 \le i < j \le k$.
• So the BLUE can be based on the $\{b_{(i,j)}\}$: $\binom{k}{2}$ pairs. All distinct? Correlated or uncorrelated?
• Basis: $b_{(1,2)}, b_{(1,3)}, \ldots, b_{(1,k)}$. Each is unbiased, but they are jointly correlated estimates, since $y_1$ is involved everywhere.
Formation of BLUE
• Work out the means, variances and covariances of the estimators and start from there to arrive at the BLUE.
• Define $\eta$ as the $(k-1) \times 1$ column vector of the "difference estimators", i.e., $\eta = (b_{(1,2)}, b_{(1,3)}, \ldots, b_{(1,k)})'$, so that $E[\eta] = \beta \mathbf{1}$ and $\mathrm{Disp}(\eta) = \sigma^2 W$, W being a pd matrix.
• Then the BLUE of $\beta$ is $\eta' W^{-1} \mathbf{1} / \mathbf{1}' W^{-1} \mathbf{1}$.
• Show that the above indeed simplifies to $\hat\beta = b = \sum (y_i - \bar{y})(x_i - \bar{x}) / \sum (x_i - \bar{x})^2$.
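The claimed simplification can be checked numerically. The sketch below is one possible check, not the speaker's derivation: it builds W from $\mathrm{Cov}(b_{(1,i)}, b_{(1,j)}) = \sigma^2 (1 + [i = j]) / \{(x_1 - x_i)(x_1 - x_j)\}$, solves $W z = \mathbf{1}$ with a small hand-written Gauss-Jordan routine, and compares with the least-squares slope. The y-values and all function names are hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def blue_from_differences(x, y):
    """eta' W^{-1} 1 / 1' W^{-1} 1 for eta = (b_(1,2), ..., b_(1,k))'."""
    k = len(x)
    d = [x[0] - x[j] for j in range(1, k)]
    eta = [(y[0] - y[j]) / d[j - 1] for j in range(1, k)]
    # W up to the factor sigma^2: 2/d_i^2 on the diagonal, 1/(d_i d_j) elsewhere.
    W = [[(2.0 if i == j else 1.0) / (d[i] * d[j]) for j in range(k - 1)]
         for i in range(k - 1)]
    z = solve(W, [1.0] * (k - 1))  # z = W^{-1} 1
    return sum(zi * e for zi, e in zip(z, eta)) / sum(z)


def ols_slope(x, y):
    """Ordinary least-squares slope SP_yx / SS_xx."""
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - xb) * (b - yb) for a, b in zip(x, y))
            / sum((a - xb) ** 2 for a in x))


x = [-3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2]
y = [1.0, 2.5, 2.0, 4.1, 7.9, 9.2, 12.0]  # hypothetical responses
print(blue_from_differences(x, y), ols_slope(x, y))
```

The two printed slopes should agree up to rounding error, whatever y-values are supplied.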
Smarter move
• $V_1 = [(y_1 - y_2)/\sqrt{2}] \,/\, [(x_1 - x_2)/\sqrt{2}]$
• $V_2 = [(y_1 + y_2 - 2y_3)/\sqrt{6}] \,/\, [(x_1 + x_2 - 2x_3)/\sqrt{6}]$
• …
• $V_{k-1} = [(y_1 + y_2 + \cdots + y_{k-1} - (k-1)y_k)/\sqrt{k(k-1)}] \,/\, [(x_1 + x_2 + \cdots + x_{k-1} - (k-1)x_k)/\sqrt{k(k-1)}]$
• These V's are uncorrelated, because their numerators are orthonormal contrasts of the y's.
• Hence W(V) is a diagonal matrix.
• The derivation of $\hat\beta$ is much easier.
• Claim: same result by a novel derivation, using Helmert's Orthogonal Transformation.
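A short sketch of the Helmert route, under the assumption that each $V_j$ gets the inverse-variance weight $d_j^2$, where $d_j$ is the j-th Helmert contrast of the x's (so the BLUE is $\sum d_j^2 V_j / \sum d_j^2$). The function names and y-values are hypothetical.

```python
import math


def helmert_contrast(v, j):
    """j-th Helmert contrast, j = 1..k-1: (v_1 + ... + v_j - j*v_{j+1}) / sqrt(j(j+1))."""
    return (sum(v[:j]) - j * v[j]) / math.sqrt(j * (j + 1))


def blue_from_helmert(x, y):
    num = den = 0.0
    for j in range(1, len(x)):
        dx, dy = helmert_contrast(x, j), helmert_contrast(y, j)
        # V_j = dy/dx has variance sigma^2/dx^2, so its weight is dx^2.
        # Accumulating dx*dy (= dx^2 * V_j) avoids dividing by a zero contrast.
        num += dx * dy
        den += dx * dx
    return num / den


def ols_slope(x, y):
    """Ordinary least-squares slope, for comparison."""
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - xb) * (b - yb) for a, b in zip(x, y))
            / sum((a - xb) ** 2 for a in x))


x = [-3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2]
y = [1.0, 2.5, 2.0, 4.1, 7.9, 9.2, 12.0]  # hypothetical responses
print(blue_from_helmert(x, y), ols_slope(x, y))
```

No matrix inversion is needed here, which is the simplification the slide points to.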
Motivating Theory: Master Level
• Regression design on X: $(x_1, n_1); (x_2, n_2); \ldots; (x_k, n_k)$ [k > 1]; all x's distinct.
• Y: $\{(y_{1j}); (y_{2j}); \ldots; (y_{kj})\}$, altogether $n = \sum n_i$ observations.
• Assume linear regression of Y on X: $E[Y_x] = \alpha + \beta x$, with the usual conditions on the errors.
• Find the BLUE of the regression coefficient $\beta$.
• Smart student's thought: pairwise unbiased estimators $\hat\beta_{(i,j)} = b_{(i,j)} = (\bar{y}_i - \bar{y}_j)/(x_i - x_j)$, $1 \le i < j \le k$.
• So the BLUE can be based on the $\{b_{(i,j)}\}$. How many? Correlated or uncorrelated?
• Basis: $b_{(1,2)}, b_{(1,3)}, \ldots, b_{(1,k)}$. Each is unbiased, but they are jointly correlated estimates, since $\bar{y}_1$ is involved everywhere.
Motivating Theory: Master Level & Beyond
• Work out the means, variances and covariances of the estimators and start from there to arrive at the BLUE.
• Define $\eta$ as the vector of these "difference estimators", so that $E[\eta] = \beta \mathbf{1}$ and $\mathrm{Disp}(\eta) = \sigma^2 W$. Complicated?
• Then the BLUE of $\beta$ is $\eta' W^{-1} \mathbf{1} / \mathbf{1}' W^{-1} \mathbf{1}$.
• Show that the above indeed simplifies to $\hat\beta = b = \sum n_i (\bar{y}_i - \bar{\bar{y}})(x_i - \bar{x}) / \sum n_i (x_i - \bar{x})^2$.
Smarter move
• $V_1 = [\sqrt{n_1}\,\bar{y}_1 - \sqrt{n_2}\,\bar{y}_2] / […]$
• $V_2 = [\sqrt{n_1}\,\bar{y}_1 + \sqrt{n_2}\,\bar{y}_2 - 2\sqrt{n_3}\,\bar{y}_3] / […]$
• And so on.
• This time, too, the W-matrix becomes a diagonal matrix.
• Tremendous simplification in the formation of $\hat\beta$.
Turn back to the basic question
• X: -3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2
• U: -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
• Motivating question: if you believe in the linear regression model $E[Y_x] = \alpha + \beta x = \delta + \gamma u = E[Y_u]$, what good are so many u-values? Why can't you work with exactly two u-values and, at that, possibly with ±1?
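Fisher Information Matrix
• $I(\theta; D_N) = X'X = \begin{pmatrix} N & T_1 \\ T_1 & T_2 \end{pmatrix}$, a 2 x 2 matrix, where $T_1 = \sum n_i x_i$ and $T_2 = \sum n_i x_i^2$.
• $X_{N \times 2} = [\mathbf{1}_{N \times 1}$, column vector of the $x_i$'s with $n_i$ repeats$]$.
• Averaged Information Matrix per observation: IBAR $= (1/N)\, I(\theta) = \begin{pmatrix} 1 & \mu'_1 \\ \mu'_1 & \mu'_2 \end{pmatrix}$, where $\mu'_1 = \sum n_i x_i / N$ and $\mu'_2 = \sum n_i x_i^2 / N$.
• $I(\theta)$ is a pd matrix iff $k \ge 2$ distinct x's are considered.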
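As a small illustration, IBAR can be computed exactly for an exact design given as (point, count) pairs. The function name `averaged_information` is hypothetical, and the three-point design used is only an example.

```python
from fractions import Fraction


def averaged_information(design):
    """design: list of (x_i, n_i). Returns IBAR = [[1, mu1'], [mu1', mu2']] exactly."""
    N = sum(n for _, n in design)
    t1 = sum(n * Fraction(x) for x, n in design)       # T1 = sum n_i x_i
    t2 = sum(n * Fraction(x) ** 2 for x, n in design)  # T2 = sum n_i x_i^2
    return [[Fraction(1), t1 / N], [t1 / N, t2 / N]]


ibar = averaged_information([(-1, 1), (0, 2), (1, 1)])
print(ibar)
```

Only the first two moments of the design enter IBAR, which is what makes replacing a design by one with fewer support points conceivable at all.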
de la Garza Phenomenon [de la Garza, A. (1954): AMS, i.e. Annals of Mathematical Statistics]
• Research paper [Annals of Statistics]: 2010
• Research paper [Annals of Statistics]: 2009
• Springer-Verlag monograph on optimal designs: 2002
• Wiley book on optimal designs: 2006
• Continuous flow of papers involving linear and non-linear models, with both qualitative and quantitative responses: enormous impact of the de la Garza Phenomenon in optimality studies.
Continuous Design Theory
• Context: linear regression model.
• Space of the regressor: $\chi = [a, b]$, $a < b$.
• $k \ge 2$ distinct x-values in $\chi$ with positive weights $w_1, w_2, \ldots, w_k$ such that $\sum w_i = 1$.
• In applications, we think in terms of N observations, with $N w_i = N_i$ observations taken at $x = x_i$, $i = 1, 2, \ldots, k$. [The choice of N ensures integral values of the $N_i$'s.]
• Version of IBAR $= \begin{pmatrix} 1 & \mu'_1 \\ \mu'_1 & \mu'_2 \end{pmatrix}$, where $\mu'_1 = \sum w_i x_i$ and $\mu'_2 = \sum w_i x_i^2$.
• This is known as the Information Matrix arising out of a Continuous Design, in terms of $\{(x_i, w_i);\ i = 1, 2, \ldots, k\}$.
De la Garza Phenomenon: Continuous Design Theory
• Context: linear regression model with homoscedastic errors.
• Claim 1: Given any continuous regression design $D_{(k, x, w)}$ with k support points in $\chi = [a, b]$, $a \le x_1 < x_2 < \cdots < x_k \le b$, the x's all distinct and with positive weights $w_1, w_2, \ldots, w_k$ [such that $\sum w_i = 1$], whenever $k > 2$ we can find exactly 2 points $x^*$ and $x^{**}$ with suitable weights $p^*$ and $p^{**}$ such that
  (i) $x_1 \le x^* < x^{**} \le x_k$;
  (ii) $p^* + p^{**} = 1$; and
  (iii) IBAR based on $D^*_{[(x^*, p^*); (x^{**}, p^{**})]}$ is identical to IBAR based on $D_{(k, x, w)}$. [Information Equivalence]
Proof of Claim 1
• Recall $\mu'_1 = \sum w_i x_i$ [1st moment] and $\mu'_2 = \sum w_i x_i^2$ [2nd moment].
• Start with IBAR $= \begin{pmatrix} 1 & \mu'_1 \\ \mu'_1 & \mu'_2 \end{pmatrix}$.
• Set IBAR = I*BAR and derive the defining equations:
  $p^* x^* + p^{**} x^{**} = \mu'_1$ … (1)
  $p^* x^{*2} + p^{**} x^{**2} = \mu'_2$ … (2)
• Claim: there is an acceptable solution $[(x^*, p^*); (x^{**}, p^{**})]$ satisfying (1) and (2).
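Proof (contd.)
• WLOG: $x_1 = -1$ and $x_k = +1$.
• Solution set: define $\mu_2 = \mu'_2 - (\mu'_1)^2 > 0$. Then
  $x^* = \mu'_1 \pm \sqrt{p^{**} \mu_2 / p^*}$,
  $x^{**} = \mu'_1 \mp \sqrt{p^* \mu_2 / p^{**}}$.
• Further, for $x^* < x^{**}$, we readily verify that
  $-1 < x^* = \mu'_1 - \sqrt{p^{**} \mu_2 / p^*}$ and $x^* < x^{**} = \mu'_1 + \sqrt{p^* \mu_2 / p^{**}} < 1$
  whenever
  $\dfrac{\mu_2}{\mu_2 + (1 + \mu'_1)^2} < p^* < \dfrac{(1 - \mu'_1)^2}{\mu_2 + (1 - \mu'_1)^2}$.
• Note: LHS < RHS has been verified. The inequality reduces to $\mu'_2 < 1$, which holds because $k > 2$ puts positive weight strictly inside $(-1, 1)$.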
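The construction in the proof can be sketched in a few lines, assuming the design already lives on [-1, 1] with support at both end points. The function name `two_point_equivalent` and the default choice of $p^*$ (midpoint of the admissible interval) are illustrative assumptions; any $p^*$ strictly inside the interval works.

```python
import math


def two_point_equivalent(xs, ws, p_star=None):
    """Return ((x*, p*), (x**, p**)) with the same first two moments as (xs, ws)."""
    m1 = sum(w * x for x, w in zip(xs, ws))
    m2 = sum(w * x * x for x, w in zip(xs, ws))
    mu2 = m2 - m1 ** 2                                  # central second moment
    lo = mu2 / (mu2 + (1 + m1) ** 2)                    # lower bound on p*
    hi = (1 - m1) ** 2 / (mu2 + (1 - m1) ** 2)          # upper bound on p*
    if p_star is None:
        p_star = (lo + hi) / 2                          # arbitrary admissible choice
    if not lo < p_star < hi:
        raise ValueError("p* must lie strictly between %g and %g" % (lo, hi))
    q = 1 - p_star
    x1 = m1 - math.sqrt(q * mu2 / p_star)
    x2 = m1 + math.sqrt(p_star * mu2 / q)
    return (x1, p_star), (x2, q)


(x1, p1), (x2, p2) = two_point_equivalent([-1, 0, 1], [0.2, 0.5, 0.3])
print((x1, p1), (x2, p2))
```

The whole interval of admissible $p^*$ values gives information-equivalent two-point designs, so the equivalent design is far from unique in the continuous setting.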
Statement of Information Equivalence: Polynomial Regression
• Therefore: guaranteed existence of $[(x^*, p^*); (x^{**}, p^{**})]$ with $-1 < x^* < x^{**} < 1$ and $0 < p^* < 1$ such that IBAR = IBAR*.
• The de la Garza Phenomenon applies to the p-th degree polynomial regression model, in terms of Information Equivalence of any k-point supported continuous design ($k > p + 1$) with that of a suitably chosen continuous design supported on exactly $p + 1$ points!
Caratheodory's Theorem
• If $p + 1$ is the number of parameters in a model, one can restrict attention to designs with at most $(p+1)(p+2)/2$ support points.
• Strength: the model specification is the most general.
• Weakness: for the p-th degree polynomial regression model, de la Garza provides a much better result.
• [$p + 1 \ll (p+1)(p+2)/2$, in general terms]
Higher Degree Polynomial Regression
• Yes, the de la Garza Phenomenon holds for higher degree polynomial regressions as well. The proof is a marvellous exercise in matrix theory!
• Equate the given pd matrix $I(D)$ to $I(D^*)$, where $I(D^*) = X^{*\prime} W^* X^*$, with $X^*$ a square matrix and $W^*$ a diagonal matrix.
• The claim is that such $X^*$ and $W^*$ matrices exist with the minimum number of support points. This is the spirit of the de la Garza Phenomenon in terms of Information Equivalence. Information Dominance came much later.
Back to de la Garza Phenomenon: Exact Design Theory [EDT]
• This aspect has somehow been bypassed in the literature: it is difficult to provide a general theory as to the exact sample size for Information Equivalence to work!
• Motivating example: linear regression with 3 points to start with, [-1, 0, 1], so that k = 3 > 2. Under continuous design theory, weights $0 < w_{-1}, w_0, w_{+1} < 1$, with sum 1, are assigned to these points. According to the de la Garza Phenomenon, we can then find one 2-point design, say $[(a, p); (b, q)]$, such that $-1 \le a < b \le 1$, $0 < p < 1$, $q = 1 - p$, and there is Information Equivalence between the two designs!
• What if we are in an exact design scenario with a given total number of observations N, decomposed into $n_-$, $n_0$ and $n_+$ observations assigned to -1, 0 and 1 respectively? Can we now find a solution $[(a, n_a); (b, n_b)]$ satisfying
  (i) $-1 \le a < b \le 1$;
  (ii) $n_a + n_b = N$, both being integers;
  (iii) Information Equivalence?
• Do we need a condition on N at all?
• Crucial observation: not all values of N support information equivalence. A minimum value of N is needed; only then does it work!
EDT: Choice of N
Examples, written as point(number of observations), followed by N and a remark:
• (i) -1(1), 0(1), +1(1): N = 3, not possible
• (ii) -1(2), 0(2), +1(2): N = 6, possible
• (iii) -1(1), 0(2), +1(1): N = 4, possible
• (iv) -1(2), 0(1), +1(1): N = 4, not possible
• (v) -1(4), 0(2), +1(2): N = 8, possible
• (vi) -1(1), 0(3), +1(1): N = 5, possible
• (vii) -1(1), 0(2), +1(4): N = 7, possible
• (viii) -1(1), a(1), +1(1): N = 3, not possible
• (ix) -1(2), a(2), +1(2): N = 6, possible iff $3 - 2\sqrt{3} < a < 2\sqrt{3} - 3$
EDT: General Theory for 3 points with point symmetry
• Consider a general allocation design: $-1\,(n_-)$, $0\,(n_0)$ and $1\,(n_+)$, where each of $n_-$, $n_0$ and $n_+$ is a positive integer and $n_- + n_0 + n_+ = N \ge 3$.
• Once more, we want to replace the above 3-point point-symmetric design by a two-point design of the form $(x, n_x)$ and $(y, n_y)$, so that $n_x + n_y = N$ and, moreover, Information Equivalence holds. That suggests the equations (3) and (4) that follow.
EDT
• $x\,n_x + y\,n_y = n_+ - n_-$ … (3)
• $x^2 n_x + y^2 n_y = n_+ + n_-$ … (4)
• Set $a = n_x$, $b = n_y$, $T_1 = n_+ - n_-$ and $T_2 = n_+ + n_-$ … (5)
• From (3) and (4), in terms of (5), we obtain
  $x = \dfrac{T_1}{a+b} \pm \sqrt{\dfrac{b\,[(a+b)T_2 - T_1^2]}{a\,(a+b)^2}}$,
  $y = \dfrac{T_1}{a+b} \mp \sqrt{\dfrac{a\,[(a+b)T_2 - T_1^2]}{b\,(a+b)^2}}$.
• It can be readily verified that $(a+b)\,T_2 > T_1^2$.
EDT
• Let us choose
  $x = \dfrac{T_1}{a+b} + \sqrt{\dfrac{b\,[(a+b)T_2 - T_1^2]}{a\,(a+b)^2}}$
  and
  $y = \dfrac{T_1}{a+b} - \sqrt{\dfrac{a\,[(a+b)T_2 - T_1^2]}{b\,(a+b)^2}}$,
  so that $y < x$.
• Note that $T_1$ and $T_2$ are both known. We will now sort out values of $n_x$ and $n_y$, subject to $n_x + n_y = N$, so as to satisfy the requirement that $-1 \le y < x \le 1$.
EDT
• First, note that
  (i) $a + b = N$;
  (ii) the expressions for x and y depend on a and b only through a/b or b/a.
• Set $n_-/N = P_-$, $n_0/N = P_0$, $n_+/N = P_+$.
• Conditions: $-1 \le y$ and $x \le 1$.
• Equivalent to:
  $1 + \dfrac{T_1}{a+b} \ge \sqrt{\dfrac{a\,[(a+b)T_2 - T_1^2]}{b\,(a+b)^2}}$
  and
  $1 - \dfrac{T_1}{a+b} \ge \sqrt{\dfrac{b\,[(a+b)T_2 - T_1^2]}{a\,(a+b)^2}}$.
EDT
• Equivalent to:
  $\dfrac{P_0(1 - P_0) + 4P_+P_-}{(2P_- + P_0)^2} \le \dfrac{n_x}{n_y} \le \dfrac{(2P_+ + P_0)^2}{P_0(1 - P_0) + 4P_+P_-}$
• Equivalent to:
  $L = \dfrac{P_0(1 - P_0) + 4P_+P_-}{P_0(1 - P_0) + 4P_+P_- + (2P_- + P_0)^2} \le \dfrac{n_x}{N} \le \dfrac{(2P_+ + P_0)^2}{P_0(1 - P_0) + 4P_+P_- + (2P_+ + P_0)^2} = U$
• Written alternatively as: $N L \le n_x \le N U$.
EDT
• Implication: the choice of N must be such that the interval $[N L, N U]$ includes at least one integer, which can serve as the value of $n_x$.
• A sufficient condition for this to happen is, of course, that the length of the interval, $N(U - L)$, is at least 1.
• Even when the length is less than 1, the interval may still contain an integer, so a choice of $n_x$ can sometimes still be ensured.
• Note: so far, a general treatment of this case [length less than unity] has been eluding us!
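The integer search implied by $N L \le n_x \le N U$ is easy to mechanise. The sketch below assumes the starting design is $[(-1, n_-); (0, n_0); (1, n_+)]$, computes L and U exactly, and lists every admissible $n_x$ together with the points x (the larger one, carrying $n_x$ observations) and y. The function name and the tuple layout of the result are hypothetical.

```python
import math
from fractions import Fraction


def edt_two_point_designs(n_minus, n_zero, n_plus):
    """Return (L, U, [(n_x, x, n_y, y), ...]) for the design on {-1, 0, 1}."""
    N = n_minus + n_zero + n_plus
    Pm, P0, Pp = (Fraction(n, N) for n in (n_minus, n_zero, n_plus))
    v = P0 * (1 - P0) + 4 * Pp * Pm            # variance of the design measure
    L = v / (v + (2 * Pm + P0) ** 2)
    U = (2 * Pp + P0) ** 2 / (v + (2 * Pp + P0) ** 2)
    mean = float(Pp - Pm)                       # T1 / N
    solutions = []
    for nx in range(1, N):
        if L <= Fraction(nx, N) <= U:           # exact comparison, no rounding
            ny = N - nx
            x = mean + math.sqrt(ny * v / nx)
            y = mean - math.sqrt(nx * v / ny)
            solutions.append((nx, x, ny, y))
    return L, U, solutions


L, U, sols = edt_two_point_designs(1, 1, 1)
print(L, U, sols)                               # N = 3: no admissible n_x
print(edt_two_point_designs(2, 2, 2)[2])        # N = 6: n_x = 3
```

Running it on the allocations listed under "Choice of N" is one way to confirm which of them admit a two-point replacement.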
EDT
• (i) $P_0 = P_+ = P_- = 1/3$ [point and mass symmetric design]
• Here we find $L = 2/5$ and $U = 3/5$.
• So, for N = 3, $N L = 6/5$ and $N U = 9/5$, and no integer lies between them. So a 3-point design with point and mass symmetry cannot be replaced by a 2-point design when N = 3.
• Again, for N = 6, we have $N L = 12/5$ and $N U = 18/5$, and these include the integer 3. So there is a solution: $\pm\sqrt{2/3}$, each with 3 observations.
EDT
• For N = 9, we have $N L = 18/5$ and $N U = 27/5$. These include 2 integers: 4 and 5. So we have two solutions:
  $[(-5/\sqrt{30},\ 4);\ (4/\sqrt{30},\ 5)]$
  and
  $[(-4/\sqrt{30},\ 5);\ (5/\sqrt{30},\ 4)]$.
EDT
• (ii) $P_0 = 2/7$, $P_+ = 4/7$ and $P_- = 1/7$, i.e., the initial design has a size which is a multiple of 7, say N = 7k. This design is point-symmetric but mass-asymmetric.
• Explicitly, it is $[(-1, k); (0, 2k); (1, 4k)]$, where k is an integer.
• Note that L and U are independent of k. Computations yield $L = 13/21$ [= 39/63] and $U = 50/63$.
• (a) k = 1: N = 7; $N L = 13/3 < N U = 50/9$: one solution.
  $n_x = 5$, $x = 3/7 + \sqrt{1040}/70$;
  $n_y = 2$, $y = 3/7 - 5\sqrt{1040}/140$.
EDT
• (b) k = 2: N = 14: three solutions.
  $n_x = 9$, $x = 3/7 + \sqrt{520}/42$; $n_y = 5$, $y = 3/7 - 3\sqrt{2080}/140$.
  $n_x = 10$, $x = 3/7 + \sqrt{1040}/70$; $n_y = 4$, $y = 3/7 - \sqrt{260}/14$.
  $n_x = 11$, $x = 3/7 + \sqrt{3432}/154$; $n_y = 3$, $y = 3/7 - \sqrt{3432}/42$.
EDT
• (iii) $P_0 = 3/5$, $P_+ = P_- = 1/5$, i.e., the initial design has a size which is a multiple of 5, say N = 5k. Explicitly, it is $[(-1, k); (0, 3k); (1, k)]$, where k is an integer.
• This design is point and mass symmetric.
• Note that L and U are independent of k. Computations yield $L = 2/7$ and $U = 5/7$.
• k = 1: N = 5, $10/7 \le n_x \le 25/7$: $(n_x, n_y) = (2, 3)$ or $(3, 2)$.
• Solutions:
  $x = 2/\sqrt{15}$ and $y = -3/\sqrt{15}$, with $n_x = 3$ and $n_y = 2$;
  $x = 3/\sqrt{15}$ and $y = -2/\sqrt{15}$, with $n_x = 2$ and $n_y = 3$.
EDT
• k = 2: N = 10, $20/7 \le n_x \le 50/7$: $n_x = 3, 4, 5, 6, 7$.
• Solutions:
  $x = 6/\sqrt{210}$ and $y = -14/\sqrt{210}$ for $(n_x, n_y) = (7, 3)$;
  $x = 14/\sqrt{210}$ and $y = -6/\sqrt{210}$ for $(n_x, n_y) = (3, 7)$;
  $x = 4/\sqrt{60}$ and $y = -6/\sqrt{60}$ for $(n_x, n_y) = (6, 4)$;
  $x = 6/\sqrt{60}$ and $y = -4/\sqrt{60}$ for $(n_x, n_y) = (4, 6)$;
  $x = 2/\sqrt{10}$ and $y = -2/\sqrt{10}$ for $(n_x, n_y) = (5, 5)$.
EDT
• Example of a 3-point asymmetric design: N = 3.
• Consider an asymmetric design $[(-1, 1), (a, 1), (1, 1)]$ with $a \ne 0$. WLOG, we take $a > 0$.
• Consider Information Equivalence with $[(x, 2), (y, 1)]$. Then
  $a = 2x + y$ … (6)
  $2 + a^2 = 2x^2 + y^2$ … (7)
• Eliminating x gives $y = a/3 \pm (2/3)\sqrt{a^2 + 3}$ (with, correspondingly, $x = a/3 \mp (1/3)\sqrt{a^2 + 3}$).
• For $0 < a < 1$, it turns out that $a/3 - (2/3)\sqrt{a^2 + 3} < -1$ and $1 < a/3 + (2/3)\sqrt{a^2 + 3}$, so y falls outside $[-1, 1]$ in either case.
• Hence, N = 3 does not work!
EDT
• For N = 6, naturally, equal allocation of 2 at each of the 3 points will yield the same negative result when we opt for $[(x, 4), (y, 2)]$. It follows that $[(x, 5), (y, 1)]$ also fails to yield any affirmative result.
• For $[(x, 3), (y, 3)]$, we require
  $2a = 3(x + y)$,
  $4 + 2a^2 = 3(x^2 + y^2)$.
• We obtain: $x, y = a/3 \pm (1/3)\sqrt{6 + 2a^2}$.
EDT
• Note: for a = 0, this leads to $x, y = \pm\sqrt{2/3}$. This was discussed earlier.
• The condition $-1 < x < 1$ leads to $0 < a < 2\sqrt{3} - 3$, if $a > 0$.
• This was stated earlier.
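A quick numerical check of the threshold $2\sqrt{3} - 3 \approx 0.464$ for the equal split $[(x, 3), (y, 3)]$, using the closed form $x, y = a/3 \pm (1/3)\sqrt{6 + 2a^2}$. The function name `equal_split_pair` is hypothetical.

```python
import math


def equal_split_pair(a):
    """Points (y, x), y < x, of the design [(x, 3), (y, 3)] matching [(-1, 2), (a, 2), (1, 2)]."""
    r = math.sqrt(6 + 2 * a * a) / 3
    return a / 3 - r, a / 3 + r


a_max = 2 * math.sqrt(3) - 3           # upper limit for a > 0
for a in (0.0, 0.4, a_max, 0.5):
    y, x = equal_split_pair(a)
    print(round(a, 4), round(y, 4), round(x, 4), x < 1)
```

At `a = a_max` the larger point sits at the boundary 1 (up to rounding), and for larger a it leaves the interval.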
EDT
• More examples:
• $[(-1, 1); (0, 2); (1, 1)]$ is equivalent to $[(-1/\sqrt{2},\ 2);\ (1/\sqrt{2},\ 2)]$.
• $[(-1, 2); (0, 1); (1, 1)]$: impossible.
• $[(-1, 4); (0, 2); (1, 2)]$ is equivalent to $[(-1/4 - \sqrt{165}/20,\ 5);\ (-1/4 + \sqrt{165}/12,\ 3)]$.
Turning back to the example
• U: -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
• Under linear regression: does there exist a 2-point Information Equivalent design?
• Computations yield: n = 7, $\mu'_1 = -1/7 = -0.142857$, $\mu'_2 = 4.1516/7$.
• Alternative choice: $-1 < a < 0 < b < 1$, with 4 observations at a and 3 at b, for 7 observations in all.
• $4a + 3b = -1$ and $4a^2 + 3b^2 = 4.1516$.
• $a \approx -0.7982$ and $b \approx 0.7309$: the required solution.
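The two equations can be solved in closed form from the mean and variance of the u-values. A minimal sketch, with the hypothetical helper `two_point_exact` taking the number of observations placed at the lower point:

```python
import math


def two_point_exact(values, n_low):
    """Two points (low, high) carrying n_low and len(values) - n_low observations
    that match the first two moments of `values`."""
    N = len(values)
    n_high = N - n_low
    m = sum(values) / N
    v = sum(t * t for t in values) / N - m * m
    low = m - math.sqrt(n_high * v / n_low)
    high = m + math.sqrt(n_low * v / n_high)
    return low, high


u = [-1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00]
a, b = two_point_exact(u, 4)
print(a, b)
```

Both points fall strictly inside (-1, 1), so for this data set 7 observations are enough for a two-point replacement with a 4 : 3 split.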
Quadratic Regression: Information Equivalence
• Context: quadratic regression model with homoscedastic errors [mean model $E[Y_x] = \alpha + \beta x + \gamma x^2$].
• Claim: Given any continuous regression design $D_{(k, x, w)}$ with k support points in $\chi = [a, b]$, $a \le x_1 < x_2 < \cdots < x_k \le b$, the x's all distinct and with positive weights $w_1, w_2, \ldots, w_k$ [such that $\sum w_i = 1$], whenever $k > 3$ we can find exactly 3 points $x^*$, $x^{**}$ and $x^{***}$ with suitable weights $p^*$, $p^{**}$ and $p^{***}$ such that
  (i) $x_1 \le x^* < x^{**} < x^{***} \le x_k$;
  (ii) $p^* + p^{**} + p^{***} = 1$; and
  (iii) IBAR based on $D^*_{[(x^*, p^*); (x^{**}, p^{**}); (x^{***}, p^{***})]}$ is identical to IBAR based on $D_{(k, x, w)}$. [Information Equivalence]
Quadratic Regression: EDT
• Problem # 1: Given $D_4$: $[(-1, 1); (-a, 1); (a, 1); (1, 1)]$, can we find $[(x, 2); (y, 1); (z, 1)]$ for Information Equivalence, with x, y, z distinct points in $[-1, 1]$?
• Answer: impossible!
• Problem # 2: Given $D_6$: $[(-1, 1); (-0.5, 2); (0.5, 2); (1, 1)]$, can we find $[(-x, f); (0, 6 - 2f); (x, f)]$ for Information Equivalence, with $0 < x < 1$?
• Yes: the unique solution is $x = \sqrt{3}/2$ and $f = 2$.
More on Quadratic Regression: EDT
• Problem # 3: What about $D_{2k+2}$: $[(-1, 1); (-0.5, k); (0.5, k); (1, 1)]$?
• Is there a solution $[(-x, f); (0, 2k + 2 - 2f); (x, f)]$ for some x and f?
• "No" for k = 3 to 7.
• For k = 8: $f = 6$ and $x = 1/\sqrt{2}$!
• More affirmative cases:
  (i) $D_{36}$: $[(-1, 2); (-0.5, 16); (0.5, 16); (1, 2)]$ = $D_{36}$: $[(-1/\sqrt{2}, 12); (0, 12); (1/\sqrt{2}, 12)]$
  (ii) $D_{68}$: $[(-1, 2); (-0.5, 32); (0.5, 32); (1, 2)]$ = $D_{68}$: $[(-\sqrt{2/5}, 25); (0, 18); (\sqrt{2/5}, 25)]$
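For these symmetric designs the odd moments vanish on both sides, so Information Equivalence under the quadratic model reduces to matching $\sum n_i x_i^2$ and $\sum n_i x_i^4$. The sketch below, with the hypothetical function `symmetric_three_point`, solves the two equations exactly and reports a solution only when f is an integer and at least one observation remains at 0.

```python
from fractions import Fraction


def symmetric_three_point(k, ends=1):
    """For [(-1, ends); (-1/2, k); (1/2, k); (1, ends)], return (x^2, f, n_0)
    of an equivalent design [(-x, f); (0, n_0); (x, f)], or None."""
    s2 = 2 * ends + Fraction(k, 2)     # sum n_i x_i^2
    s4 = 2 * ends + Fraction(k, 8)     # sum n_i x_i^4
    x2 = s4 / s2                       # from 2 f x^2 = s2 and 2 f x^4 = s4
    f = s2 / (2 * x2)
    n0 = 2 * k + 2 * ends - 2 * f
    if f.denominator == 1 and n0 >= 1:
        return x2, int(f), int(n0)
    return None


feasible = [k for k in range(2, 9) if symmetric_three_point(k) is not None]
print(feasible)
print(symmetric_three_point(8))
print(symmetric_three_point(32, ends=2))
```

With `ends=1` the count f equals $(4 + k)^2 / (16 + k)$, which explains why so few values of k work.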
Information Domination
• De la Garza Phenomenon: Information Equivalence.
• There is more to it in terms of Information Domination.
• WLOG: $\chi = [-1, 1]$.
• Claim 2: Given $D^* = [(x^*, p^*); (x^{**}, p^{**})]$ with $(x^*, x^{**}) \ne (-1, 1)$, there exists $0 < c < 1$ such that $D_c = [(-1, c); (+1, 1 - c)]$ produces an Information Matrix $I(D_c)$ which dominates $I(D^*)$ in the sense of matrix domination. That is, $I(D_c) - I(D^*)$ is nnd. In a way, $I(D_c)$ dominates $I(D^*)$ in every sense!
• This is the best result one can think of in terms of improving over $I(D^*)$!
Information Domination
• Proof of Claim 2:
• Set $1 - 2c = \mu'_1$ and solve for $c = (1 - \mu'_1)/2$.
• Note that $(x^*, x^{**}) \ne (-1, 1)$, so that $-1 < \mu'_1 < 1$ and hence $0 < c < 1$.
• Next, note that $\mu'_2 < 1$.
• Therefore, $I(D_c) - I(D^*) = \begin{pmatrix} 0 & 0 \\ 0 & 1 - \mu'_2 \end{pmatrix}$, which is nnd.
• Message: push the points to the boundaries!
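The proof can be traced numerically for any two-point design on [-1, 1]. In this sketch the helper names and the example design $[(-0.5, 0.4); (0.8, 0.6)]$ are hypothetical; the difference matrix should have zeros everywhere except a positive lower-right entry.

```python
def info_matrix(design):
    """Averaged information matrix [[1, m1], [m1, m2]] of a design [(x, weight), ...]."""
    m1 = sum(p * x for x, p in design)
    m2 = sum(p * x * x for x, p in design)
    return [[1.0, m1], [m1, m2]]


def dominating_design(design):
    """Return (c, I(D_c) - I(D*)) with D_c = [(-1, c); (+1, 1 - c)] matching the mean of D*."""
    m1 = info_matrix(design)[0][1]
    c = (1 - m1) / 2
    d_c = [(-1.0, c), (1.0, 1 - c)]
    A, B = info_matrix(d_c), info_matrix(design)
    diff = [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]
    return c, diff


c, diff = dominating_design([(-0.5, 0.4), (0.8, 0.6)])
print(c, diff)
```

A 2 x 2 matrix whose only non-zero entry is a positive diagonal element is nnd, which is all the claim requires.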
Quadratic Regression: Information Dominance
• Context: quadratic regression model with homoscedastic errors [mean model $E[Y_x] = \alpha + \beta x + \gamma x^2$].
• Claim: Set $\chi = [-1, 1]$ WLOG. Given any continuous regression design $D^*_{[(x^*, p^*); (x^{**}, p^{**}); (x^{***}, p^{***})]}$ with $-1 < x^* < x^{**} < x^{***} < 1$, there exist proportions p, q and r and a constant c, $-1 < c < 1$, such that the design $D_{[(-1, p); (c, r); (+1, q)]}$ provides Information Dominance over the design $D^*$.
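Sketch of the Proof
• $I = \begin{pmatrix} 1 & \mu'_1 & \mu'_2 \\ \mu'_1 & \mu'_2 & \mu'_3 \\ \mu'_2 & \mu'_3 & \mu'_4 \end{pmatrix}$ for the design D, and $I^*$ is defined likewise from the moments of $D^*$.
• Equate $\mu'_1$, $\mu'_2$ and $\mu'_3$ to those of $I^*$ and solve for p, q, r and c.
• Then show that the fourth moment of D exceeds that of $D^*$, i.e. $\mu'_4 > \mu^{*\prime}_4$, so that $I - I^*$ is nnd.
• For details: Pukelsheim's book.
• Also: Liski et al. monograph [2002], Topics in Optimal Design.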
Binary Response Models
• Impressive literature on optimality issues.
• de la Garza Phenomenon and Information Dominance: recent advances.
• Optimal designs for binary data under logistic regression: Mathew-Sinha (2001), Jour. Stat. Plan. & Inf., 93, 295-307.
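Binary Response Model
• $P[Y_x = 1] = 1/[1 + \exp\{-(\alpha + \beta x)\}]$
• $\{(x_i, n_i)\}$, $i = 1, 2, \ldots, k$: given data.
• Binomial model, log likelihood, differentiation, etc., leading to the Information Matrix.
• Approximate theory: $\{(x_i, p_i)\}$ with $\sum p_i = 1$.
• Set $a_i = \alpha + \beta x_i$ for each i.
• $I(\alpha, \beta) = \begin{pmatrix} \sum p_i \dfrac{e^{-a_i}}{(1 + e^{-a_i})^2} & \sum p_i x_i \dfrac{e^{-a_i}}{(1 + e^{-a_i})^2} \\ \sum p_i x_i \dfrac{e^{-a_i}}{(1 + e^{-a_i})^2} & \sum p_i x_i^2 \dfrac{e^{-a_i}}{(1 + e^{-a_i})^2} \end{pmatrix}$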
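A minimal sketch of this information matrix for a given approximate design, assuming values of $\alpha$ and $\beta$ are supplied (in practice they are unknown, which is what makes such designs locally optimal at best). The function name `logistic_information` is hypothetical.

```python
import math


def logistic_information(design, alpha, beta):
    """I(alpha, beta) for a design [(x_i, p_i), ...] with sum p_i = 1."""
    i00 = i01 = i11 = 0.0
    for x, p in design:
        a = alpha + beta * x
        # exp(-a) / (1 + exp(-a))^2 equals pi * (1 - pi), with pi = P[Y_x = 1].
        w = math.exp(-a) / (1 + math.exp(-a)) ** 2
        i00 += p * w
        i01 += p * w * x
        i11 += p * w * x * x
    return [[i00, i01], [i01, i11]]


info = logistic_information([(-1.0, 0.5), (1.0, 0.5)], alpha=0.0, beta=0.0)
print(info)
```

Unlike the linear model, the weights now depend on the parameters through $a_i$, so the matrix is no longer a plain moment matrix of the design.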