
Data Mining and Its Application in Medicine

In the name of God. Data Mining and Its Application in Medicine. Student: Babak Razaghi. Student ID: 85233510. Supervisor: Dr. Towhidkhah (seminar for the course Applications of Information Technology in Medicine). Why DATA MINING? Necessity is the mother of invention. Huge amounts of data


Presentation Transcript


  1. In the name of God. Data Mining and Its Application in Medicine. Student: Babak Razaghi. Student ID: 85233510. Supervisor: Dr. Towhidkhah (seminar for the course Applications of Information Technology in Medicine)

  2. Why DATA MINING?
     • Necessity is the mother of invention
     • Huge amounts of data
     • Electronic records of our decisions
       • Choices in the supermarket
       • Financial records
       • Our comings and goings
     • We swipe our way through the world: every swipe is a record in a database
     • Data rich, but information poor
     • Lying hidden in all this data is information!

  3. What is DATA MINING?
     • Extracting or “mining” knowledge from large amounts of data
     • Data-driven discovery and modeling of hidden patterns in large volumes of data
     • Extraction of implicit, previously unknown and unexpected, potentially extremely useful information from data

  4. Data visualization
     [Diagram: large database, data mining, data visualization]
     • Ways of seeing patterns in large data sets
     • Uses the efficiency of human pattern recognition

  5. Terminology
     • Gold Mining
     • Knowledge mining from databases
     • Knowledge extraction
     • Data/pattern analysis
     • Knowledge Discovery in Databases (KDD)

  6. Knowledge Discovery Process
     [Diagram: Raw Data → Integration → Data Warehouse → Selection & Cleaning → Target Data → Transformation → Transformed Data → Data Mining → Patterns and Rules → Interpretation & Evaluation → Knowledge → Understanding]
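
The stages named on the Knowledge Discovery Process slide (selection and cleaning, transformation, data mining) can be sketched as a chain of small functions. This is only an illustrative sketch: the records, the field names, and the helper functions are hypothetical, and the blood-pressure cut-off of 91 is borrowed from the classification example on slide 13.

```python
# Hypothetical raw records; field names and values are invented for illustration.
raw_data = [
    {"age": 70, "systolic_bp": 85, "died_early": True},
    {"age": 45, "systolic_bp": 120, "died_early": False},
    {"age": 66, "systolic_bp": None, "died_early": True},  # missing measurement
    {"age": 58, "systolic_bp": 110, "died_early": False},
]

def select_and_clean(records):
    """Selection & cleaning: keep only complete records (the target data)."""
    return [r for r in records if all(v is not None for v in r.values())]

def transform(records):
    """Transformation: derive a binary attribute from the raw measurement."""
    return [
        {"low_bp": r["systolic_bp"] <= 91, "died_early": r["died_early"]}
        for r in records
    ]

def mine(records):
    """Data mining: early-death rate for each value of the derived attribute."""
    rates = {}
    for flag in (True, False):
        group = [r["died_early"] for r in records if r["low_bp"] == flag]
        if group:
            rates[flag] = sum(group) / len(group)
    return rates

target = select_and_clean(raw_data)
transformed = transform(target)
patterns = mine(transformed)
print(patterns)
```

Interpretation and evaluation, the last stage in the diagram, is deliberately left out: with three usable records the "pattern" found here says nothing reliable about real patients.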

  7. Data Mining Central Quest
     Find true patterns and avoid overfitting (false patterns due to randomness)
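
A toy illustration of overfitting, under assumptions that are not in the slides: the labels are coin flips (so there is no true pattern to find), and the "model" is a lookup table that memorizes every training record. The names `make_rows`, `predict`, and `accuracy` are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_rows(n, start):
    """Rows are (record_id, label); labels are coin flips, so no true pattern exists."""
    return [(start + i, rng.randint(0, 1)) for i in range(n)]

train = make_rows(50, start=0)
test = make_rows(50, start=1000)  # record ids never seen during training

# The "model": memorize every training row in a lookup table.
memory = dict(train)
# Fallback for unseen records: the majority label in the training data.
majority = int(2 * sum(label for _, label in train) >= len(train))

def predict(record_id):
    return memory.get(record_id, majority)

def accuracy(rows):
    return sum(predict(rid) == label for rid, label in rows) / len(rows)

train_acc = accuracy(train)
test_acc = accuracy(test)
print(f"training accuracy: {train_acc:.2f}")
print(f"test accuracy:     {test_acc:.2f}")
```

Training accuracy is perfect by construction, because the table reproduces exactly what it memorized. On the unseen records the model can only guess, so its accuracy should sit near chance. The gap between the two numbers is the false pattern the slide warns about.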

  8. Major Data Mining Tasks
     • Classification: predicting an item class
     • Clustering: finding clusters in data
     • Associations: e.g. A & B & C occur frequently
     • Visualization: to facilitate human discovery
     • Summarization: describing a group
     • Estimation: predicting a continuous value
     • Deviation Detection: finding changes
     • Link Analysis: finding relationships
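
As one concrete case from the task list, the Associations entry ("A & B & C occur frequently") can be sketched by counting how often each item combination appears. The transactions, the 0.6 support threshold, and the function name `frequent_itemsets` are all assumptions made for this example; this is brute-force counting, not an optimized algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; the item names A-D are placeholders.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D"},
    {"A", "B"},
    {"B", "C"},
    {"A", "B", "C"},
]

def frequent_itemsets(transactions, size, min_support):
    """Return itemsets of the given size whose support (the fraction of
    transactions containing them) is at least min_support."""
    counts = Counter()
    for t in transactions:
        for itemset in combinations(sorted(t), size):
            counts[itemset] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

result = frequent_itemsets(transactions, size=3, min_support=0.6)
print(result)  # {('A', 'B', 'C'): 0.6}
```

Enumerating every combination is only feasible for tiny examples. That is the "computationally expensive to investigate all possibilities" problem that practical association miners work around by pruning.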

  9. DATA MINING CHALLENGES
     • Computationally expensive to investigate all possibilities
     • Dealing with noise, missing information, and errors in data
     • Choosing appropriate attributes/input representation
     • Finding the minimal attribute space
     • Finding adequate evaluation function(s)
     • Extracting meaningful information
     • Avoiding overfitting

  10. Data Mining Software
      • INSIGHTFUL MINER
      • Angoss Knowledge ACCESS
      • ARMiner
      • Eudaptics Viscovery
      • Goal TV
      • MDR
      • Viscovery SOMine
      • SPSS

  11. DATA MINING APPLICATIONS
      • Science: Chemistry, Physics
      • Bioscience
        • Sequence-based analysis
        • Protein structure and function prediction
        • Protein family classification
        • Microarray gene expression
      • Financial Industry: banks, businesses, e-commerce
      • Stock and investment analysis
      • Pharmaceutical companies
      • Health care
      • Sports and Entertainment

  12. Clinical Data Mining processes
      • Digital format for all pertinent data
      • Create structure
      • Obtain coded information
      • Natural language understanding
      • Create a widely accessible repository

  13. Classification example for Medical Diagnosis and Prognosis: Heart Disease
      Minimum systolic blood pressure over a 24-hour period following admission to the hospital
      • <= 91: Class 2: Early death
      • > 91: Age of patient
        • <= 62.5: Class 1: Survivors
        • > 62.5: Was there sinus tachycardia?
          • YES: Class 2: Early death
          • NO: Class 1: Survivors
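
The decision tree on this slide can be written as a few nested tests. The sketch below assumes the flattened diagram reads as follows: blood pressure at or below 91 leads straight to "Early death"; otherwise patients aged 62.5 or younger are "Survivors"; older patients are split on sinus tachycardia. The function and parameter names are hypothetical, and the code is an illustration of the slide, not a clinical tool.

```python
def classify_patient(min_systolic_bp, age, sinus_tachycardia):
    """Follow the slide's tree from the root to a leaf and return the class label."""
    # Root: minimum systolic blood pressure in the first 24 hours after admission.
    if min_systolic_bp <= 91:
        return "Class 2: Early death"
    # Second level: age of the patient.
    if age <= 62.5:
        return "Class 1: Survivors"
    # Third level: presence of sinus tachycardia.
    if sinus_tachycardia:
        return "Class 2: Early death"
    return "Class 1: Survivors"

print(classify_patient(min_systolic_bp=85, age=50, sinus_tachycardia=False))
print(classify_patient(min_systolic_bp=110, age=70, sinus_tachycardia=True))
print(classify_patient(min_systolic_bp=110, age=70, sinus_tachycardia=False))
```

Each path from the root to a leaf is a rule a clinician can read directly, which is one reason trees are popular where models must be explainable.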

  14. Genome, DNA & Gene Expression
      • An organism’s genome is the “program” for making the organism, encoded in DNA
      • Human DNA has about 30,000-35,000 genes
      • A gene is a segment of DNA that specifies how to make a protein
      • Cells are different because of differential gene expression
      • About 40% of human genes are expressed at one time
      • Microarray devices measure gene expression

  15. Microarray Raw Image
      [Figure: scanner, enlarged section of raw image, and the resulting raw data]
      Gene              Value
      D26528_at         193
      D26561_cds1_at    -70
      D26561_cds2_at    144
      D26561_cds3_at    33
      D26579_at         318
      D26598_at         1764
      D26599_at         1537
      D26600_at         1204
      D28114_at         707

  16. Microarray Potential Applications
      • New and better molecular diagnostics
      • New molecular targets for therapy
        • few new drugs, large pipeline, …
      • Outcome depends on genetic signature
        • best treatment?
      • Fundamental Biological Discovery
        • finding and refining biological pathways
      • Personalized medicine ?!

  17. Microarray Data Mining Challenges
      • Avoiding false positives, due to
        • too few records (samples), usually < 100
        • too many columns (genes), usually > 1,000
      • Model needs to be robust in presence of noise
      • For reliability need large gene sets; for diagnostics or drug targets, need small gene sets
      • Estimate class probability
      • Model needs to be explainable to biologists
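
A short calculation shows why "too many columns" produces false positives. The assumptions here are not in the slides: each gene is tested independently at a significance level of 0.05, and no gene truly differs between the classes. The figure of 1,000 genes comes from the slide; the variable names are hypothetical.

```python
alpha = 0.05     # assumed per-gene significance level
n_genes = 1000   # lower bound on the number of genes, as given on the slide

# Expected number of genes flagged "significant" purely by chance.
expected_false_positives = alpha * n_genes

# Probability that at least one gene is flagged by chance.
p_at_least_one = 1 - (1 - alpha) ** n_genes

# Bonferroni correction: a stricter per-gene threshold that keeps the
# chance of any false positive across all genes at or below alpha.
bonferroni_alpha = alpha / n_genes

print(f"expected false positives: {expected_false_positives:.0f}")
print(f"P(at least one false positive): {p_at_least_one:.6f}")
print(f"Bonferroni per-gene threshold: {bonferroni_alpha:.0e}")
```

Under these assumptions about 50 genes would look significant with no real signal at all. With fewer than 100 samples there is little data left to tell those chance hits from genuine ones.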

  18. Initial query page

  19. Clusters matching query results

  20. Display of cluster

  21. Data Mining Software Guide

  22. Conclusion
      • Discover useful relationships in data
      • Discover information otherwise overlooked
      • Provide intelligence to improve various phases
      • Intellectual property
      • Competitive advantages:
      • Getting more out of your data
      • Finding other relevant information faster
      • Exploratory, hypothesis-generating analyses
      • Increase productivity: less time and money spent

  23. Thank You All razaghi.b@gmail.com
