
Machine Learning - Chapter 5

Topics: sample error and true error; confidence intervals for observed hypothesis error; estimators; Binomial distribution, Normal distribution, Central Limit Theorem; paired t tests; comparing learning methods.


Presentation Transcript


  1. Evaluating Hypotheses

     [Read Ch. 5]
     [Recommended exercises: 5.2, 5.3, 5.4]

     - Sample error, true error
     - Confidence intervals for observed hypothesis error
     - Estimators
     - Binomial distribution, Normal distribution, Central Limit Theorem
     - Paired t tests
     - Comparing learning methods

     (Lecture slides for the textbook Machine Learning, T. Mitchell, McGraw Hill, 1997.)

  2. Two Definitions of Error

     The true error of hypothesis $h$ with respect to target function $f$ and distribution $D$ is the probability that $h$ will misclassify an instance drawn at random according to $D$:

     $$error_D(h) \equiv \Pr_{x \in D}[f(x) \neq h(x)]$$

     The sample error of $h$ with respect to target function $f$ and data sample $S$ is the proportion of examples $h$ misclassifies:

     $$error_S(h) \equiv \frac{1}{n} \sum_{x \in S} \delta(f(x) \neq h(x))$$

     where $\delta(f(x) \neq h(x))$ is 1 if $f(x) \neq h(x)$, and 0 otherwise.

     How well does $error_S(h)$ estimate $error_D(h)$?
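The sample error definition maps directly onto a few lines of code. The sketch below is illustrative only: `sample_error`, the toy target `f`, the hypothesis `h`, and the sample `S` are hypothetical names and data, not part of the slides.

```python
from typing import Callable, Iterable, TypeVar

X = TypeVar("X")


def sample_error(f: Callable[[X], int], h: Callable[[X], int], sample: Iterable[X]) -> float:
    """Proportion of examples in `sample` on which h disagrees with f."""
    xs = list(sample)
    # Sum delta(f(x) != h(x)) over the sample, then divide by n.
    mistakes = sum(1 for x in xs if f(x) != h(x))
    return mistakes / len(xs)


# Hypothetical target and hypothesis over the integers 0..9.
f = lambda x: int(x >= 5)  # target: label 1 for x >= 5
h = lambda x: int(x >= 7)  # hypothesis: disagrees with f on x = 5 and x = 6
S = range(10)

error_s = sample_error(f, h, S)  # 2 mistakes out of 10 examples
```

The true error cannot be computed this way in practice, because it is defined over the whole distribution $D$ rather than a finite sample.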

  3. Problems Estimating Error

     1. Bias: If $S$ is the training set, $error_S(h)$ is optimistically biased:

        $$bias \equiv E[error_S(h)] - error_D(h)$$

        For an unbiased estimate, $h$ and $S$ must be chosen independently.

     2. Variance: Even with an unbiased $S$, $error_S(h)$ may still vary from $error_D(h)$.

  4. Example

     Hypothesis $h$ misclassifies 12 of the 40 examples in $S$:

     $$error_S(h) = \frac{12}{40} = 0.30$$

     What is $error_D(h)$?

  5. Estimators

     Experiment:

     1. Choose sample $S$ of size $n$ according to distribution $D$.
     2. Measure $error_S(h)$.

     $error_S(h)$ is a random variable (i.e., the result of an experiment).

     $error_S(h)$ is an unbiased estimator for $error_D(h)$.

     Given an observed $error_S(h)$, what can we conclude about $error_D(h)$?

  6. Confidence Intervals

     If

     - $S$ contains $n$ examples, drawn independently of $h$ and each other
     - $n \geq 30$

     Then

     - With approximately 95% probability, $error_D(h)$ lies in the interval

       $$error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$$
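As a worked instance of the 95% interval, the sketch below assumes a hypothesis that misclassifies 12 of 40 test examples (so $error_S(h) = 0.30$ and $n = 40$). The function name `confidence_interval_95` is a hypothetical helper, not something defined in the slides.

```python
import math
from typing import Tuple


def confidence_interval_95(error_s: float, n: int) -> Tuple[float, float]:
    """Approximate 95% interval for error_D(h), given error_S(h) on n independent examples."""
    half_width = 1.96 * math.sqrt(error_s * (1.0 - error_s) / n)
    return error_s - half_width, error_s + half_width


# Assumed measurements: 12 misclassified out of 40.
low, high = confidence_interval_95(12 / 40, 40)
print(f"error_D(h) is roughly in [{low:.3f}, {high:.3f}]")
```

With these numbers the interval is about 0.30 ± 0.14, which shows how wide the uncertainty remains with only 40 test examples.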

  7. Confidence Intervals

     If

     - $S$ contains $n$ examples, drawn independently of $h$ and each other
     - $n \geq 30$

     Then

     - With approximately N% probability, $error_D(h)$ lies in the interval

       $$error_S(h) \pm z_N \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$$

     where

         N%:    50%    68%    80%    90%    95%    98%    99%
         z_N:   0.67   1.00   1.28   1.64   1.96   2.33   2.58

  8. $error_S(h)$ is a Random Variable

     Rerun the experiment with different randomly drawn $S$ (of size $n$).

     Probability of observing $r$ misclassified examples:

     $$P(r) = \frac{n!}{r!(n-r)!} \, error_D(h)^r \, (1 - error_D(h))^{n-r}$$

     [Figure: Binomial distribution for n = 40, p = 0.3; horizontal axis r from 0 to 40, vertical axis P(r).]
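The probability of seeing $r$ misclassifications can be tabulated directly. This is a minimal sketch assuming a true error of 0.3 and samples of size 40 (the values in the figure caption); `binomial_pmf` is a hypothetical helper name.

```python
import math


def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(r): probability of exactly r misclassifications in n independent draws."""
    return math.comb(n, r) * p**r * (1.0 - p) ** (n - r)


n, error_d = 40, 0.3  # assumed sample size and true error
probs = [binomial_pmf(r, n, error_d) for r in range(n + 1)]

# The most probable count sits at the peak of the plotted distribution.
most_likely_r = max(range(n + 1), key=lambda r: probs[r])
```

The probabilities sum to 1, and the most likely outcome is the count closest to $n \cdot error_D(h)$, although neighbouring counts are nearly as probable.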

  9. Binomial Probability Distribution

     $$P(r) = \frac{n!}{r!(n-r)!} \, p^r (1 - p)^{n-r}$$

     Probability $P(r)$ of $r$ heads in $n$ coin flips, if $p = \Pr(heads)$.

     - Expected, or mean value of $X$, $E[X]$, is

       $$E[X] \equiv \sum_{i=0}^{n} i P(i) = np$$

     - Variance of $X$ is

       $$Var(X) \equiv E[(X - E[X])^2] = np(1 - p)$$

     - Standard deviation of $X$, $\sigma_X$, is

       $$\sigma_X \equiv \sqrt{E[(X - E[X])^2]} = \sqrt{np(1 - p)}$$

     [Figure: Binomial distribution for n = 40, p = 0.3; horizontal axis r from 0 to 40, vertical axis P(r).]
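The closed forms $np$ and $np(1-p)$ can be checked numerically by summing over the distribution. The sketch below assumes $n = 40$ and $p = 0.3$; the helper name `binomial_pmf` is hypothetical.

```python
import math


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p**r * (1.0 - p) ** (n - r)


n, p = 40, 0.3  # assumed parameters
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

# E[X] = sum_i i * P(i)
mean = sum(r * pmf[r] for r in range(n + 1))
# Var(X) = E[(X - E[X])^2]
variance = sum((r - mean) ** 2 * pmf[r] for r in range(n + 1))
std_dev = math.sqrt(variance)
```

For these parameters the sums agree with $np = 12$ and $np(1-p) = 8.4$ up to floating-point error.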

  10. Normal Distribution Approximates Binomial

      $error_S(h)$ follows a Binomial distribution, with

      - mean $\mu_{error_S(h)} = error_D(h)$
      - standard deviation

        $$\sigma_{error_S(h)} = \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$$

      Approximate this by a Normal distribution with

      - mean $\mu_{error_S(h)} = error_D(h)$
      - standard deviation

        $$\sigma_{error_S(h)} \approx \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$$
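One way to see how good the Normal approximation is: compute the exact Binomial probability that the sample error falls within 1.96 standard deviations of the true error, and compare it with the nominal 95%. The sketch assumes a true error of 0.3 and $n = 40$; `binomial_pmf` is a hypothetical helper name.

```python
import math


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r misclassifications in n independent draws."""
    return math.comb(n, r) * p**r * (1.0 - p) ** (n - r)


n, error_d = 40, 0.3  # assumed sample size and true error
sigma = math.sqrt(error_d * (1.0 - error_d) / n)  # std. dev. of error_S(h)
low, high = error_d - 1.96 * sigma, error_d + 1.96 * sigma

# Exact Binomial mass of the outcomes r/n that land inside [low, high].
mass = sum(binomial_pmf(r, n, error_d) for r in range(n + 1) if low <= r / n <= high)
```

Because the Binomial is discrete, the exact mass is close to 0.95 rather than equal to it, and the agreement improves as $n$ grows.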

  11. Normal Probability Distribution

      $$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$

      The probability that $X$ will fall into the interval $(a, b)$ is given by

      $$\int_a^b p(x) \, dx$$

      - Expected, or mean value of $X$, $E[X]$, is $E[X] = \mu$
      - Variance of $X$ is $Var(X) = \sigma^2$
      - Standard deviation of $X$, $\sigma_X$, is $\sigma_X = \sigma$

      [Figure: Normal distribution with mean 0, standard deviation 1; horizontal axis from -3 to 3.]

  12. Normal Probability Distribution

      80% of area (probability) lies in $\mu \pm 1.28\sigma$

      N% of area (probability) lies in $\mu \pm z_N \sigma$

          N%:    50%    68%    80%    90%    95%    98%    99%
          z_N:   0.67   1.00   1.28   1.64   1.96   2.33   2.58

      [Figure: Normal density curve; horizontal axis from -3 to 3.]
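The $z_N$ values need not be read from a table; they can be computed from the inverse Normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+); the function name `z_value` is a hypothetical helper.

```python
from statistics import NormalDist


def z_value(confidence: float) -> float:
    """Two-sided z_N: mu +/- z_N * sigma contains `confidence` of the probability mass."""
    # A central region with mass c leaves (1 - c) / 2 in each tail,
    # so the upper edge sits at cumulative probability 0.5 + c / 2.
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


levels = (0.50, 0.68, 0.80, 0.90, 0.95, 0.98, 0.99)
table = {level: round(z_value(level), 2) for level in levels}
```

The computed values match the usual table to two decimals except at 68%, where the result is about 0.99 rather than 1.00, because one standard deviation actually covers closer to 68.3% of the mass.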

  13. Confidence Intervals, More Correctly

      If

      - $S$ contains $n$ examples, drawn independently of $h$ and each other
      - $n \geq 30$

      Then

      - With approximately 95% probability, $error_S(h)$ lies in the interval

        $$error_D(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$$

        equivalently, $error_D(h)$ lies in the interval

        $$error_S(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$$

        which is approximately

        $$error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$$

  14. Central Limit Theorem

      Consider a set of independent, identically distributed random variables $Y_1 \ldots Y_n$, all governed by an arbitrary probability distribution with mean $\mu$ and finite variance $\sigma^2$. Define the sample mean,

      $$\bar{Y} \equiv \frac{1}{n} \sum_{i=1}^{n} Y_i$$

      Central Limit Theorem. As $n \to \infty$, the distribution governing $\bar{Y}$ approaches a Normal distribution, with mean $\mu$ and variance $\frac{\sigma^2}{n}$.
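A small simulation can make the theorem concrete. The sketch below assumes the $Y_i$ are uniform on [0, 1), which has mean 1/2 and variance 1/12; the sample size, trial count, and seed are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n, trials = 30, 20000   # assumed sample size and number of repeated experiments

# Each entry is one sample mean of n uniform draws.
means = [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

observed_mean = sum(means) / trials
observed_var = sum((m - observed_mean) ** 2 for m in means) / trials

predicted_mean = 0.5            # mu of the uniform distribution
predicted_var = (1.0 / 12) / n  # sigma^2 / n
```

The observed mean and variance of the sample means should land close to $\mu$ and $\sigma^2 / n$, even though the underlying distribution is flat rather than bell-shaped.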

  15. Calculating Confidence Intervals

      1. Pick parameter $p$ to estimate
         - $error_D(h)$
      2. Choose an estimator
         - $error_S(h)$
      3. Determine the probability distribution that governs the estimator
         - $error_S(h)$ is governed by a Binomial distribution, approximated by a Normal distribution when $n \geq 30$
      4. Find interval $(L, U)$ such that N% of the probability mass falls in the interval
         - Use the table of $z_N$ values
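The four steps above can be sketched as a single function. This is an illustration under stated assumptions: the name `error_confidence_interval` is hypothetical, and the `Z_N` dictionary holds the standard two-sided Normal values rounded to two decimals.

```python
import math
from typing import Tuple

# Two-sided z_N values for common confidence levels (in percent).
Z_N = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}


def error_confidence_interval(misclassified: int, n: int, confidence_pct: int = 95) -> Tuple[float, float]:
    """N% interval (L, U) for error_D(h), from a count of errors on n independent examples."""
    if n < 30:
        raise ValueError("the Normal approximation assumes n >= 30")
    error_s = misclassified / n                         # step 2: the estimator
    sigma = math.sqrt(error_s * (1.0 - error_s) / n)    # step 3: its approximate std. dev.
    z = Z_N[confidence_pct]                             # step 4: look up z_N
    return error_s - z * sigma, error_s + z * sigma


# Assumed measurements: 12 errors on 40 examples, at two confidence levels.
low_90, high_90 = error_confidence_interval(12, 40, 90)
low_99, high_99 = error_confidence_interval(12, 40, 99)
```

Raising the confidence level widens the interval: the 99% interval contains the 90% interval for the same data.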

  16. Difference Between Hypotheses

      Test $h_1$ on sample $S_1$, test $h_2$ on $S_2$.

      1. Pick parameter to estimate

         $$d \equiv error_D(h_1) - error_D(h_2)$$

      2. Choose an estimator

         $$\hat{d} \equiv error_{S_1}(h_1) - error_{S_2}(h_2)$$

      3. Determine the probability distribution that governs the estimator

         $$\sigma_{\hat{d}} \approx \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}$$

      4. Find interval $(L, U)$ such that N% of the probability mass falls in the interval

         $$\hat{d} \pm z_N \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}$$
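The interval for the difference $d$ follows the same pattern as the single-hypothesis case. In this sketch the function name `difference_interval` and the measurements (sample errors of 0.30 and 0.20 on 100 examples each) are hypothetical.

```python
import math
from typing import Tuple


def difference_interval(error_1: float, n_1: int, error_2: float, n_2: int,
                        z_n: float = 1.96) -> Tuple[float, float]:
    """Approximate N% interval for d = error_D(h1) - error_D(h2)."""
    d_hat = error_1 - error_2
    # The variances of the two independent sample errors add.
    sigma = math.sqrt(error_1 * (1.0 - error_1) / n_1 + error_2 * (1.0 - error_2) / n_2)
    return d_hat - z_n * sigma, d_hat + z_n * sigma


# Assumed measurements for two hypotheses tested on independent samples.
low, high = difference_interval(0.30, 100, 0.20, 100)
```

For these assumed numbers the 95% interval includes 0, so the measurements would not establish at that confidence level that the two hypotheses differ in true error.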

  17. Paired t test to compare $h_A$, $h_B$

      1. Partition data into $k$ disjoint test sets $T_1, T_2, \ldots, T_k$ of equal size, where this size is at least 30.
      2. For $i$ from 1 to $k$, do

         $$\delta_i \leftarrow error_{T_i}(h_A) - error_{T_i}(h_B)$$

      3. Return the value $\bar{\delta}$, where

         $$\bar{\delta} \equiv \frac{1}{k} \sum_{i=1}^{k} \delta_i$$

      N% confidence interval estimate for $d$:

      $$\bar{\delta} \pm t_{N,k-1} \, s_{\bar{\delta}}$$

      $$s_{\bar{\delta}} \equiv \sqrt{\frac{1}{k(k-1)} \sum_{i=1}^{k} (\delta_i - \bar{\delta})^2}$$

      Note: $\delta_i$ is approximately Normally distributed.
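The interval computation in the paired t test is short enough to sketch directly. The function name `paired_t_interval` and the per-test-set differences below are hypothetical; the caller supplies $t_{N,k-1}$ from a t table (2.776 is the standard two-sided 95% value for 4 degrees of freedom).

```python
import math
from typing import Sequence, Tuple


def paired_t_interval(deltas: Sequence[float], t_value: float) -> Tuple[float, float]:
    """Interval mean(delta) +/- t_{N,k-1} * s, where s estimates the std. dev. of the mean."""
    k = len(deltas)
    if k < 2:
        raise ValueError("need at least two test sets")
    mean_delta = sum(deltas) / k
    s = math.sqrt(sum((d - mean_delta) ** 2 for d in deltas) / (k * (k - 1)))
    return mean_delta - t_value * s, mean_delta + t_value * s


# Hypothetical differences error_Ti(h_A) - error_Ti(h_B) over k = 5 test sets.
deltas = [0.02, 0.04, 0.01, 0.03, 0.05]
T_95_4DF = 2.776  # two-sided 95% t value for k - 1 = 4 degrees of freedom
low, high = paired_t_interval(deltas, T_95_4DF)
```

Here the whole interval lies above 0, which for these made-up differences would indicate that $h_A$ has higher true error than $h_B$ at the 95% level.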

  18. Comparing learning algorithms $L_A$ and $L_B$

      What we'd like to estimate:

      $$E_{S \subset D}[error_D(L_A(S)) - error_D(L_B(S))]$$

      where $L(S)$ is the hypothesis output by learner $L$ using training set $S$; i.e., the expected difference in true error between hypotheses output by learners $L_A$ and $L_B$, when trained using randomly selected training sets $S$ drawn according to distribution $D$.

      But, given limited data $D_0$, what is a good estimator?

      - Could partition $D_0$ into training set $S_0$ and test set $T_0$, and measure

        $$error_{T_0}(L_A(S_0)) - error_{T_0}(L_B(S_0))$$

      - Even better, repeat this many times and average the results (next slide).

  19. Comparing learning algorithms $L_A$ and $L_B$

      1. Partition data $D_0$ into $k$ disjoint test sets $T_1, T_2, \ldots, T_k$ of equal size, where this size is at least 30.
      2. For $i$ from 1 to $k$, do (use $T_i$ for the test set, and the remaining data for training set $S_i$):
         - $S_i \leftarrow \{D_0 - T_i\}$
         - $h_A \leftarrow L_A(S_i)$
         - $h_B \leftarrow L_B(S_i)$
         - $\delta_i \leftarrow error_{T_i}(h_A) - error_{T_i}(h_B)$
      3. Return the value $\bar{\delta}$, where

         $$\bar{\delta} \equiv \frac{1}{k} \sum_{i=1}^{k} \delta_i$$
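The procedure above can be sketched as follows. Everything named here is an assumption for illustration: `compare_learners`, the two toy learners, and the synthetic one-dimensional data set are hypothetical and stand in for real learning algorithms and data.

```python
import random
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]                      # (input x, label)
Hypothesis = Callable[[float], int]
Learner = Callable[[Sequence[Example]], Hypothesis]


def sample_error(h: Hypothesis, examples: Sequence[Example]) -> float:
    """Proportion of examples that h misclassifies."""
    return sum(1 for x, y in examples if h(x) != y) / len(examples)


def compare_learners(data: List[Example], learner_a: Learner, learner_b: Learner,
                     k: int) -> Tuple[float, List[float]]:
    """Return the mean of delta_i = error_Ti(h_A) - error_Ti(h_B) and the individual delta_i."""
    fold_size = len(data) // k           # leftover examples, if any, are ignored
    usable = data[: fold_size * k]
    deltas = []
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        test_set = usable[start:stop]                 # T_i
        train_set = usable[:start] + usable[stop:]    # S_i = D_0 - T_i
        h_a = learner_a(train_set)
        h_b = learner_b(train_set)
        deltas.append(sample_error(h_a, test_set) - sample_error(h_b, test_set))
    return sum(deltas) / k, deltas


def majority_learner(train: Sequence[Example]) -> Hypothesis:
    """Toy learner: always predict the most common training label."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label


def midpoint_learner(train: Sequence[Example]) -> Hypothesis:
    """Toy learner: threshold halfway between the two class means (assumes both classes occur)."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    threshold = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0
    return lambda x: 1 if x >= threshold else 0


# Synthetic data: label is 1 exactly when x >= 0.5; shuffled with a fixed seed.
data = [(i / 120, int(i >= 60)) for i in range(120)]
random.Random(0).shuffle(data)
mean_delta, deltas = compare_learners(data, majority_learner, midpoint_learner, k=4)
```

With 120 examples and $k = 4$, each test set has 30 examples, which meets the size requirement in step 1. A positive `mean_delta` indicates that the first learner's hypotheses made more errors on the held-out sets than the second learner's.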

  20. Comparing learning algorithms $L_A$ and $L_B$

      Notice we'd like to use the paired t test on $\bar{\delta}$ to obtain a confidence interval,

      but this is not really correct, because the training sets in this algorithm are not independent (they overlap!).

      It is more correct to view the algorithm as producing an estimate of

      $$E_{S \subset D_0}[error_D(L_A(S)) - error_D(L_B(S))]$$

      instead of

      $$E_{S \subset D}[error_D(L_A(S)) - error_D(L_B(S))]$$

      but even this approximation is better than no comparison.
