
HASAR : Mining Sequential Association Rules for Atherosclerosis Risk Factor Analysis

HASAR : Mining Sequential Association Rules for Atherosclerosis Risk Factor Analysis. Laurent Brisson, Nicolas Pasquier, Céline Hebert, Martine Collard I3S Laboratory, University of Nice-Sophia Antipolis GREYC Laboratory, University of Caen. Contents. 1. Analytic question & Objectives


Presentation Transcript


  1. HASAR: Mining Sequential Association Rules for Atherosclerosis Risk Factor Analysis Laurent Brisson, Nicolas Pasquier, Céline Hebert, Martine Collard I3S Laboratory, University of Nice-Sophia Antipolis GREYC Laboratory, University of Caen

  2. Contents 1. Analytic question & Objectives 2. Model & Data Preparation 3. Algorithms 4. Results

  3. Analytic Question Are there any differences in the development of risk factors and other characteristics between men of the risk group who came down with the observed cardiovascular diseases and those who stayed healthy?

  4. Objectives • Evolution of risk factors according to behavioural changes • Groups RG versus PG and NG • Healthy patients (NCVD) versus those with cardiovascular diseases (CVD) • Groups based on patient education level and job

  5. Sequential Rules IDE_itemset BEH_time_itemset  RF_time_item • IDE_itemset : static identification attributes • Age of the patient • Educational level of the patient • Alcohol consumption at the beginning of the study

  6. Sequential Rules IDE_itemset  BEH_time_itemset  RF_time_item • BEH_time_itemset : behavioural change attributes • Consumption of cigarettes per day • Physical activity after job • Physical activity in a job • Different kinds of diet • Medicine for cholesterol • Medicine for blood pressure

  7. Sequential Rules IDE_itemset  BEH_time_itemset  RF_time_item • RF_time_item : risk factor change attribute • Cholesterol level • HDL Cholesterol level • LDL Cholesterol level • Triglycerides level • Obesity • …

  8. Model IDE_itemset  BEH_time_itemset  RF_time_item • Action period: contains at least one control • Latency period: a waiting time before observing effects • Observation period: contains exactly one control
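The three periods can be illustrated with a minimal sketch. It assumes control times are plain numbers (for example months since the first examination) and that the three windows follow each other without gaps; the function name `split_controls` and its parameters are hypothetical and do not come from the slides.

```python
def split_controls(control_times, start, action_len, latency_len, observation_len):
    """Assign controls to the action and observation periods.

    Returns (action_controls, observation_control), or None when the
    constraints of the model are not met.
    Assumption: windows are half-open intervals [begin, end).
    """
    action_end = start + action_len
    observation_start = action_end + latency_len   # latency: nothing is read here
    observation_end = observation_start + observation_len

    action = [t for t in control_times if start <= t < action_end]
    observation = [t for t in control_times
                   if observation_start <= t < observation_end]

    # Action period: at least one control. Observation period: exactly one.
    if len(action) < 1 or len(observation) != 1:
        return None
    return action, observation[0]
```

With controls at 0, 12, 24 and 48, an action period of 25, a latency of 12 and an observation period of 24, the first three controls fall in the action period and the control at 48 is the single observation.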

  9. Data Preparation : creation of change variables
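The slide does not show how the change variables are encoded. As one possible reading, the sketch below derives a three-valued label from each pair of consecutive controls; the labels `"increase"`, `"decrease"`, `"stable"` and the `tolerance` parameter are assumptions made for illustration only.

```python
def change_label(before, after, tolerance=0.0):
    """Label the change between two consecutive measurements (assumed encoding)."""
    delta = after - before
    if delta > tolerance:
        return "increase"
    if delta < -tolerance:
        return "decrease"
    return "stable"


def change_variables(values, tolerance=0.0):
    """Turn n measurements of one attribute into n - 1 change labels."""
    return [change_label(a, b, tolerance) for a, b in zip(values, values[1:])]
```

For example, cigarettes per day recorded as 20, 20, 10, 15 over four controls would give stable, decrease, increase.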

  10. Data Preparation : Flattening operation Initial table : 1 row  1 control

  11. Data Preparation : Flattening operation Flattened table : 1 row  1 patient • Columns: static attributes, control 1, …, control n
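The flattening step can be sketched in plain Python as follows. The field names `patient` and `control`, and the `attribute_k` naming of the flattened columns, are hypothetical; the slides only state that the result has one row per patient with the static attributes followed by each control.

```python
def flatten(static_rows, control_rows):
    """Pivot one-row-per-control data into one row per patient.

    static_rows:  {patient_id: {static attribute: value}}
    control_rows: list of dicts with 'patient', 'control' and measured attributes
    """
    flat = {pid: dict(attrs) for pid, attrs in static_rows.items()}
    for row in control_rows:
        target = flat.setdefault(row["patient"], {})
        for key, value in row.items():
            if key not in ("patient", "control"):
                # Suffix each measurement with its control number.
                target[f"{key}_{row['control']}"] = value
    return flat
```

A patient with static attribute `age` and two controls measuring `chol` ends up as a single row with columns `age`, `chol_1` and `chol_2`.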

  12. Evolutionary Approach A Genetic Algorithm searching for temporal rules • Fixed-length chromosome: Identification | Behaviours | Risk factor

  13. Evolutionary Approach A gene for each static identification attribute • Chromosome: IDE1 … IDEj | Behaviours | Risk factor

  14. Evolutionary Approach A gene for each kind of behavioural change • Chromosome: Identification | BEH 1 … BEH k | Risk factor • Action period

  15. Evolutionary Approach One gene to describe a risk factor • Chromosome: Identification | Behaviours | RF i • Action period • Observation period
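A fixed-length chromosome of this shape could be represented as below. This is a sketch under stated assumptions, not the authors' encoding: the use of `None` for a gene that does not take part in the rule, the `p_unused` probability, and the integer period lengths are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Chromosome:
    identification: Tuple[Optional[str], ...]  # one gene per IDE attribute
    behaviours: Tuple[Optional[str], ...]      # one gene per kind of behavioural change
    risk_factor: Tuple[str, str]               # single gene: (risk factor, change)
    action_period: int
    observation_period: int


def random_gene(domain, rng, p_unused=0.5):
    """Pick a value from the domain, or None to leave the attribute out (assumption)."""
    return None if rng.random() < p_unused else rng.choice(domain)


def random_chromosome(ide_domains, beh_domains, rf_domains, max_period, rng):
    """Draw a random individual; the chromosome length is fixed by the domain lists."""
    rf_name = rng.choice(sorted(rf_domains))
    return Chromosome(
        identification=tuple(random_gene(d, rng) for d in ide_domains),
        behaviours=tuple(random_gene(d, rng) for d in beh_domains),
        risk_factor=(rf_name, rng.choice(rf_domains[rf_name])),
        action_period=rng.randint(1, max_period),
        observation_period=rng.randint(1, max_period),
    )
```

Every individual has the same number of genes, so crossover and mutation can operate position by position.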

  16. Evolutionary Approach Fitness function : support * confidence * lift • Chromosome: Identification | Behaviours | RF i • Periods: action, latency, observation
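The slide gives the fitness as a product of three measures without defining them. The sketch below assumes the usual association-rule definitions for a rule A → C over N patients: support = n(A and C) / N, confidence = n(A and C) / n(A), lift = confidence / (n(C) / N). The function name and the choice to return 0 for empty counts are assumptions.

```python
def fitness(n_total, n_antecedent, n_consequent, n_both):
    """support * confidence * lift, computed from raw counts.

    n_antecedent: patients matching the left-hand side of the rule
    n_consequent: patients matching the risk-factor change
    n_both:       patients matching both
    """
    if n_total == 0 or n_antecedent == 0 or n_consequent == 0:
        return 0.0  # rule cannot be evaluated; treated as worthless (assumption)
    support = n_both / n_total
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n_total)
    return support * confidence * lift
```

With 100 patients, 20 matching the antecedent, 25 the consequent and 10 both, support is 0.1, confidence 0.5 and lift 2, giving a fitness of 0.1.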

  17. Genetic Algorithm Optimization • A CLOSE-based approach for initialization • The CLOSE algorithm improves: • extraction efficiency, by reducing the search space (use of generators and frequent closed itemsets) • result relevance, by suppressing redundant rules (bases generation)
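The notion of a closed itemset behind CLOSE can be shown with a small sketch of the closure operator: the closure of an itemset is the intersection of all transactions that contain it, and an itemset is closed when it equals its own closure. This is only the underlying operator, not the full CLOSE algorithm or the initialization procedure used in HASAR; the function names are hypothetical.

```python
def closure(itemset, transactions):
    """Largest itemset shared by every transaction that contains `itemset`."""
    itemset = frozenset(itemset)
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return itemset  # unsupported itemset: returned unchanged by convention
    return frozenset.intersection(*covering)


def is_closed(itemset, transactions):
    """True when no item can be added without losing supporting transactions."""
    return closure(itemset, transactions) == frozenset(itemset)
```

On the transactions {a, b, c}, {a, b, d}, {a, b} and {c, d}, the itemset {a} is not closed because its closure is {a, b}: every patient with a also has b. Here {a} acts as a generator of the closed itemset {a, b}, and any rule built on {a} can be derived from one built on {a, b}, which is how redundant rules are pruned.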

  18. Results : Patient classes comparison • Best rules on PG versus NG and RG

  19. Results : Patient classes comparison • Best rules on CVD versus NCVD

  20. Results : Initialization Methods • Comparison on RG group

  21. Conclusion • Different tendencies among groups • Confirmation of prior medical knowledge • Contradictions with some "assumptions" • Further investigations with assistance of medical experts

  22. Future Research • To analyse relationships between time windows and various risk factors • To develop new evaluation criteria • To integrate physicians’ prior knowledge • To apply the HASAR approach to other temporal datasets
