
Mining for Patterns Based on Contingency Tables by KL-Miner: First Experience

Mining for Patterns Based on Contingency Tables by KL-Miner: First Experience. Jan Rauch, Milan Šimůnek (PhD student), Václav Lín (student), University of Economics, Prague. … KL-Miner, First Experience. KL-Miner: basic features, application example, implementation principles


Presentation Transcript


  1. Mining for Patterns Based on Contingency Tables by KL-Miner: First Experience. Jan Rauch, Milan Šimůnek (PhD student), Václav Lín (student), University of Economics, Prague

  2. … KL-Miner, First Experience • KL-Miner Basic features • Application example • Implementation principles • Scalability • Concluding remarks FDM 2003

  3. KL-Miner: Data and Patterns. Data: data matrix • Patterns, i.e. KL-hypotheses: R ~ C / γ • row attribute R ∈ {A1, …, AP}, possible values (categories): r1, …, rK • column attribute C ∈ {A1, …, AP}, possible values (categories): c1, …, cL • γ: Boolean attribute derived from the other attributes A1, …, AP • ~: KL quantifier, i.e. a condition imposed on the contingency table of R and C

  4. KL quantifiers. Contingency table of R and C: [table not preserved in the transcript] • Examples of quantifiers: simple aggregate function [formula not preserved]; Kendall's quantifier, e.g. |τ_b| ≥ p

  5. Kendall's quantifier. Kendall's coefficient: τ_b ∈ ⟨−1; 1⟩ • τ_b > 0 … positive ordinal dependence • τ_b < 0 … negative ordinal dependence • τ_b = 0 … ordinal independence • |τ_b| = 1 … C is a function of R • Kendall's quantifier: e.g. |τ_b| ≥ p or |τ_b| ≤ p
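
The slide's formula for Kendall's coefficient did not survive in the transcript. As a minimal sketch, the function below computes τ_b from a K × L contingency table using the standard textbook definition, τ_b = 2(P − Q) / sqrt((n² − Σ row totals²)(n² − Σ column totals²)), where P and Q count concordant and discordant pairs. The function name and the list-of-rows table layout are assumptions made for illustration, not KL-Miner's own code.

```python
from math import sqrt


def kendall_tau_b(table):
    """Kendall's tau-b for a contingency table given as a list of rows.

    Rows and columns are assumed to be listed in increasing category order.
    """
    K, L = len(table), len(table[0])
    n = sum(sum(row) for row in table)

    concordant = 0  # pairs ordered the same way in R and in C
    discordant = 0  # pairs ordered oppositely in R and in C
    for k in range(K):
        for l in range(L):
            for i in range(k + 1, K):  # only cells in strictly lower rows
                for j in range(L):
                    if j > l:
                        concordant += table[k][l] * table[i][j]
                    elif j < l:
                        discordant += table[k][l] * table[i][j]

    row_sq = sum(sum(row) ** 2 for row in table)
    col_sq = sum(sum(table[k][l] for k in range(K)) ** 2 for l in range(L))
    denom = sqrt((n * n - row_sq) * (n * n - col_sq))
    if denom == 0:
        raise ValueError("tau-b is undefined when R or C has a single category")
    return 2 * (concordant - discordant) / denom
```

A purely diagonal table such as `[[5, 0], [0, 5]]` gives τ_b = 1, its mirror image gives −1, and a uniform table gives 0, matching the interpretation listed on the slide.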

  6. KL-Miner application example: STULONG Project, 1419 patients, entry examination. See http://euromise.vse.cz

  7. STULONG attribute examples (1): systolic blood pressure, smoking, group of patients

  8. STULONG attribute examples (2): skinfold above musculus triceps (mm), beer (amount per day). 219 attributes in total, 38 of them ordinal; we use 17 ordinal attributes

  9. Example: analytic question. Are there any ordinal dependencies among the attributes under some conditions? Requirements: at least 50 patients and |τ_b| ≥ 0.75. The relevant conditions are specified on the next two slides.

  10. Example: relevant condition specification (1). Group of patients (normal), Group of patients (risk), … • Beer 10(yes), Beer 12(yes), …, Beer 10(yes) ∧ Beer 12(yes) • Sliding windows …

  11. Example: relevant condition specification (2). Sliding window over the ordered categories 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, ....., 43, 44, 45, 46, 47, 48, 49, 50. The slide repeats this sequence once per window position; the highlighting that marks each window is not preserved in the transcript.
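
To illustrate the sliding-window idea, here is a minimal sketch that enumerates every window of consecutive categories; each window would then define one Boolean condition ("the attribute value lies in this window"). The slide does not state the window length, so `length` is a hypothetical parameter, and the function name is likewise an assumption.

```python
def sliding_windows(categories, length):
    """Return all runs of `length` consecutive categories, left to right."""
    return [
        categories[start:start + length]
        for start in range(len(categories) - length + 1)
    ]


# First few categories from the slide, with an assumed window length of 3.
windows = sliding_windows([4, 5, 6, 7, 9], 3)
print(windows)  # [[4, 5, 6], [5, 6, 7], [6, 7, 9]]
```

Windows follow the order of the category list, not numeric adjacency, so the gap between 7 and 9 in the slide's sequence does not break a window.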

  12. Example: output overview. Run time 2 min 1 sec • 550 310 verifications • 25 hypotheses • hardware: 3.06 GHz, 512 MB DDR SDRAM

  13. Example: output detail (1). τ_b = 0.82 (i.e. strong positive ordinal dependence)

  14. Example: output detail (2). τ_b = 0.78 (i.e. strong positive ordinal dependence)

  15. Implementation principles (1). Attributes are represented by cards of categories, i.e. strings of bits. [Figure: attributes and the cards of categories of A1]

  16. Implementation principles (2). CARD[γ] = bit-string representation of the Boolean attribute γ • CARD[Group of patients (normal) ∧ Beer 10(yes) ∧ Beer 12(yes)] = Group of patients[normal] ∧ Beer 10[yes] ∧ Beer 12[yes] • Count(): number of "1" bits in the given bit string

  17. Implementation principles (3). n1,1 = Count(R[r1] ∧ C[c1] ∧ CARD[γ])
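
The bit-string principle from the last three slides can be sketched as follows. This is an illustration under stated assumptions, not KL-Miner's actual implementation: Python integers stand in for the bit strings, the operator lost between the cards in the transcript is taken to be a bitwise AND, and all function names and the toy data are hypothetical.

```python
def build_cards(column):
    """Map each category to an int whose bit i is 1 iff row i has that category."""
    cards = {}
    for i, value in enumerate(column):
        cards[value] = cards.get(value, 0) | (1 << i)
    return cards


def count(bits):
    """Number of 1 bits in the bit string."""
    return bin(bits).count("1")


def contingency_table(r_cards, r_cats, c_cards, c_cats, condition_card):
    """Cell (k, l) = Count(R[r_k] AND C[c_l] AND CARD[condition])."""
    return [
        [count(r_cards.get(r, 0) & c_cards.get(c, 0) & condition_card) for c in c_cats]
        for r in r_cats
    ]


# Toy data matrix with six rows (invented for illustration only).
pressure = ["low", "low", "high", "high", "low", "high"]
beer = ["no", "no", "yes", "yes", "yes", "no"]
group = ["normal", "normal", "normal", "risk", "normal", "normal"]

condition = build_cards(group)["normal"]  # card of the condition "group = normal"
table = contingency_table(
    build_cards(pressure), ["low", "high"],
    build_cards(beer), ["no", "yes"],
    condition,
)
print(table)  # [[2, 1], [1, 1]]
```

Each cell costs two AND operations and one bit count, regardless of how the condition was composed, which is what makes the representation attractive when very many contingency tables have to be verified.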

  18. Scalability: 75 000 verifications, approximately linear

  19. Concluding remarks • KL-Miner: practically interesting results • Suitable for interactive work • Further quantifiers • Combinations with further mining procedures
