Evaluating Hypotheses

[Read Ch. 5]
[Recommended exercises: 5.2, 5.3, 5.4]

- Sample error, true error
- Confidence intervals for observed hypothesis error
- Estimators
- Binomial distribution, Normal distribution, Central Limit Theorem
- Paired t tests
- Comparing learning methods

(lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill)
Two Definitions of Error

The true error of hypothesis h with respect to target function f and distribution D is the probability that h will misclassify an instance drawn at random according to D:

$error_D(h) \equiv \Pr_{x \in D}[f(x) \neq h(x)]$

The sample error of h with respect to target function f and data sample S is the proportion of examples h misclassifies:

$error_S(h) \equiv \frac{1}{n} \sum_{x \in S} \delta(f(x) \neq h(x))$

where $\delta(f(x) \neq h(x))$ is 1 if $f(x) \neq h(x)$, and 0 otherwise.

How well does $error_S(h)$ estimate $error_D(h)$?
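A minimal sketch (not from the slides) of the sample-error formula in Python; the callables `h` and `f` and the iterable sample `S` are illustrative assumptions:

```python
import numpy as np

def sample_error(h, f, S):
    """Fraction of examples in sample S that hypothesis h misclassifies."""
    X = np.asarray(S)
    predictions = np.array([h(x) for x in X])   # h(x) for each instance
    labels = np.array([f(x) for x in X])        # true target value f(x)
    return float(np.mean(predictions != labels))
```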
Problems Estimating Error

1. Bias: If S is the training set, $error_S(h)$ is optimistically biased:

$bias \equiv E[error_S(h)] - error_D(h)$

For an unbiased estimate, h and S must be chosen independently.

2. Variance: Even with unbiased S, $error_S(h)$ may still vary from $error_D(h)$.
Example

Hypothesis h misclassifies 12 of the 40 examples in S:

$error_S(h) = 12/40 = .30$

What is $error_D(h)$?
Estimators

Experiment:
1. Choose sample S of size n according to distribution D.
2. Measure $error_S(h)$.

$error_S(h)$ is a random variable (i.e., the result of an experiment).

$error_S(h)$ is an unbiased estimator for $error_D(h)$.

Given observed $error_S(h)$, what can we conclude about $error_D(h)$?
Confidence Intervals

If
- S contains n examples, drawn independently of h and of each other
- n ≥ 30

Then
- With approximately 95% probability, $error_D(h)$ lies in the interval

$error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
Confidence Intervals

If
- S contains n examples, drawn independently of h and of each other
- n ≥ 30

Then
- With approximately N% probability, $error_D(h)$ lies in the interval

$error_S(h) \pm z_N \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$

where

N%:  50%   68%   80%   90%   95%   98%   99%
z_N: 0.67  1.00  1.28  1.64  1.96  2.33  2.58
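A small sketch (an illustration, not part of the slides) computing the N% interval above from a sample error and sample size; the function name and the hard-coded z table are assumptions taken from the table on this slide:

```python
import math

# two-sided z_N values from the table above
Z_N = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def error_confidence_interval(error_s, n, confidence=95):
    """Approximate N% confidence interval for error_D(h),
    given sample error error_S(h) and sample size n (n >= 30)."""
    z = Z_N[confidence]
    margin = z * math.sqrt(error_s * (1.0 - error_s) / n)
    return (error_s - margin, error_s + margin)

# e.g. for error_S(h) = 12/40 = .30 with n = 40:
# error_confidence_interval(0.30, 40)  ->  roughly (0.158, 0.442)
```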
$error_S(h)$ is a Random Variable

Rerun the experiment with a different randomly drawn S (of size n).

Probability of observing r misclassified examples:

$P(r) = \frac{n!}{r!(n-r)!} \, error_D(h)^r \, (1 - error_D(h))^{n-r}$

[Figure: Binomial distribution for n = 40, p = 0.3]
Binomial Probability Distribution

$P(r) = \frac{n!}{r!(n-r)!} \, p^r (1-p)^{n-r}$

Probability $P(r)$ of r heads in n coin flips, if $p = \Pr(heads)$.

- Expected, or mean, value of X, $E[X]$, is
  $E[X] = \sum_{i=0}^{n} i P(i) = np$
- Variance of X is
  $Var(X) = E[(X - E[X])^2] = np(1-p)$
- Standard deviation of X, $\sigma_X$, is
  $\sigma_X = \sqrt{E[(X - E[X])^2]} = \sqrt{np(1-p)}$

[Figure: Binomial distribution for n = 40, p = 0.3]
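A quick sketch (not from the slides) checking the mean and variance formulas by simulation, using the figure's values n = 40, p = 0.3; the sample size of 100,000 draws is an arbitrary choice:

```python
import numpy as np

n, p = 40, 0.3
rng = np.random.default_rng(0)

# analytic values from the formulas above
mean, var = n * p, n * p * (1 - p)          # 12.0, 8.4

# simulate many draws of r, the number of "heads" (or misclassified examples)
r = rng.binomial(n, p, size=100_000)
print(mean, var)
print(r.mean(), r.var())                    # close to 12.0 and 8.4
```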
Normal Distribution Approximates Binomial

$error_S(h)$ follows a Binomial distribution, with
- mean $\mu_{error_S(h)} = error_D(h)$
- standard deviation $\sigma_{error_S(h)} = \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$

Approximate this by a Normal distribution with
- mean $\mu_{error_S(h)} = error_D(h)$
- standard deviation $\sigma_{error_S(h)} \approx \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
Normal Probability Distribution

$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$

The probability that X will fall into the interval (a, b) is given by $\int_a^b p(x)\,dx$.

- Expected, or mean, value of X, $E[X]$, is $E[X] = \mu$
- Variance of X is $Var(X) = \sigma^2$
- Standard deviation of X, $\sigma_X$, is $\sigma_X = \sigma$

[Figure: Normal distribution with mean 0, standard deviation 1]
Normal Probability Distribution (continued)

80% of area (probability) lies in $\mu \pm 1.28\sigma$

N% of area (probability) lies in $\mu \pm z_N \sigma$

N%:  50%   68%   80%   90%   95%   98%   99%
z_N: 0.67  1.00  1.28  1.64  1.96  2.33  2.58

[Figure: Normal distribution with mean 0, standard deviation 1]
Confidence Intervals, More Correctly

If
- S contains n examples, drawn independently of h and of each other
- n ≥ 30

Then
- With approximately 95% probability, $error_S(h)$ lies in the interval

$error_D(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$

equivalently, $error_D(h)$ lies in the interval

$error_S(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$

which is approximately

$error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
Central Limit Theorem

Consider a set of independent, identically distributed random variables $Y_1 \ldots Y_n$, all governed by an arbitrary probability distribution with mean $\mu$ and finite variance $\sigma^2$. Define the sample mean

$\bar{Y} \equiv \frac{1}{n} \sum_{i=1}^{n} Y_i$

Central Limit Theorem: As $n \to \infty$, the distribution governing $\bar{Y}$ approaches a Normal distribution, with mean $\mu$ and variance $\frac{\sigma^2}{n}$.
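A small simulation sketch (not part of the slides) of the theorem; the choice of a uniform distribution for the $Y_i$ and of n = 30 is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 30, 10_000

# arbitrary underlying distribution: uniform on [0, 1], with mu = 0.5, sigma^2 = 1/12
samples = rng.uniform(0.0, 1.0, size=(trials, n))
sample_means = samples.mean(axis=1)   # one Y-bar per trial

print(sample_means.mean())   # close to mu = 0.5
print(sample_means.var())    # close to sigma^2 / n = (1/12) / 30
# a histogram of sample_means looks approximately Normal (bell-shaped)
```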
Calculating Confidence Intervals

1. Pick the parameter p to estimate
   - $error_D(h)$
2. Choose an estimator
   - $error_S(h)$
3. Determine the probability distribution that governs the estimator
   - $error_S(h)$ is governed by the Binomial distribution, approximated by the Normal when n ≥ 30
4. Find the interval (L, U) such that N% of the probability mass falls in the interval
   - Use the table of $z_N$ values
Difference Between Hypotheses

Test $h_1$ on sample $S_1$, test $h_2$ on $S_2$.

1. Pick the parameter to estimate

$d \equiv error_D(h_1) - error_D(h_2)$

2. Choose an estimator

$\hat{d} \equiv error_{S_1}(h_1) - error_{S_2}(h_2)$

3. Determine the probability distribution that governs the estimator

$\sigma_{\hat{d}} \approx \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}$

4. Find the interval (L, U) such that N% of the probability mass falls in the interval

$\hat{d} \pm z_N \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}$
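A minimal sketch (an illustration, not from the slides) of this interval; the function name and the default z = 1.96 for a 95% interval are assumptions:

```python
import math

def difference_interval(e1, n1, e2, n2, z=1.96):
    """Approximate N% interval for d = error_D(h1) - error_D(h2),
    given sample errors e1, e2 on independent test sets of sizes n1, n2."""
    d_hat = e1 - e2
    sigma = math.sqrt(e1 * (1 - e1) / n1 + e2 * (1 - e2) / n2)
    return (d_hat - z * sigma, d_hat + z * sigma)

# e.g. difference_interval(0.30, 100, 0.20, 100) gives a 95% interval around 0.10
```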
Paired t Test to Compare $h_A$, $h_B$

1. Partition the data into k disjoint test sets $T_1, T_2, \ldots, T_k$ of equal size, where this size is at least 30.
2. For i from 1 to k, do
   $\delta_i \leftarrow error_{T_i}(h_A) - error_{T_i}(h_B)$
3. Return the value $\bar{\delta}$, where
   $\bar{\delta} \equiv \frac{1}{k} \sum_{i=1}^{k} \delta_i$

N% confidence interval estimate for d:

$\bar{\delta} \pm t_{N,k-1} \, s_{\bar{\delta}}$

$s_{\bar{\delta}} \equiv \sqrt{\frac{1}{k(k-1)} \sum_{i=1}^{k} (\delta_i - \bar{\delta})^2}$

Note: $\delta_i$ approximately Normally distributed.
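A sketch (not from the slides) of the interval computation, assuming the per-partition differences $\delta_i$ have already been measured; it uses SciPy only to look up the two-sided $t_{N,k-1}$ value:

```python
import math
from scipy import stats

def paired_t_interval(deltas, confidence=0.95):
    """N% confidence interval for d from the paired t procedure above.
    deltas[i] = error_{T_i}(h_A) - error_{T_i}(h_B) for k disjoint test sets."""
    k = len(deltas)
    mean = sum(deltas) / k
    s = math.sqrt(sum((d - mean) ** 2 for d in deltas) / (k * (k - 1)))
    t = stats.t.ppf(1 - (1 - confidence) / 2, df=k - 1)   # two-sided t_{N,k-1}
    return (mean - t * s, mean + t * s)

# e.g. paired_t_interval([0.02, -0.01, 0.03, 0.01, 0.02])
```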
Comparing Learning Algorithms $L_A$ and $L_B$

What we'd like to estimate:

$E_{S \subset D}[error_D(L_A(S)) - error_D(L_B(S))]$

where $L(S)$ is the hypothesis output by learner L using training set S,
i.e., the expected difference in true error between hypotheses output by learners $L_A$ and $L_B$, when trained using randomly selected training sets S drawn according to distribution D.

But, given limited data $D_0$, what is a good estimator?
- could partition $D_0$ into training set $S_0$ and test set $T_0$, and measure
  $error_{T_0}(L_A(S_0)) - error_{T_0}(L_B(S_0))$
- even better, repeat this many times and average the results (next slide)
Comparing Learning Algorithms $L_A$ and $L_B$ (continued)

1. Partition data $D_0$ into k disjoint test sets $T_1, T_2, \ldots, T_k$ of equal size, where this size is at least 30.
2. For i from 1 to k, do:
   use $T_i$ for the test set, and the remaining data for training set $S_i$
   $S_i \leftarrow \{D_0 - T_i\}$
   $h_A \leftarrow L_A(S_i)$
   $h_B \leftarrow L_B(S_i)$
   $\delta_i \leftarrow error_{T_i}(h_A) - error_{T_i}(h_B)$
3. Return the value $\bar{\delta}$, where
   $\bar{\delta} \equiv \frac{1}{k} \sum_{i=1}^{k} \delta_i$
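A sketch of this procedure in Python (an illustration, not the slides' own code); the interface assumed here is that each learner is a function taking (X_train, y_train) and returning a prediction function, and that X, y are NumPy arrays:

```python
import numpy as np

def compare_learners(learner_a, learner_b, X, y, k=10, seed=0):
    """Returns delta_bar and the per-fold differences delta_i, where
    delta_i = error_{T_i}(h_A) - error_{T_i}(h_B) over k disjoint test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)          # k disjoint test sets T_i
    deltas = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])  # S_i = D_0 - T_i
        h_a = learner_a(X[train], y[train])
        h_b = learner_b(X[train], y[train])
        err_a = np.mean(h_a(X[test]) != y[test])
        err_b = np.mean(h_b(X[test]) != y[test])
        deltas.append(err_a - err_b)
    return float(np.mean(deltas)), deltas
```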
Comparing Learning Algorithms $L_A$ and $L_B$ (continued)

- Notice we'd like to use the paired t test on $\bar{\delta}$ to obtain a confidence interval,
- but this is not really correct, because the training sets in this algorithm are not independent (they overlap!).
- It is more correct to view the algorithm as producing an estimate of
  $E_{S \subset D_0}[error_D(L_A(S)) - error_D(L_B(S))]$
  instead of
  $E_{S \subset D}[error_D(L_A(S)) - error_D(L_B(S))]$
- but even this approximation is better than no comparison.