1 / 20

HLTHINFO 730 Healthcare Decision Support Systems Lecture 6: Decision Trees

HLTHINFO 730 Healthcare Decision Support Systems Lecture 6: Decision Trees. Lecturer: Prof Jim Warren. Decision Trees. Essentially flowcharts A natural order of ‘micro decisions’ (Boolean – yes/no decisions) to reach a conclusion In simplest form all you need is A start (marked with an oval)

shanon
Download Presentation

HLTHINFO 730 Healthcare Decision Support Systems Lecture 6: Decision Trees

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. HLTHINFO 730Healthcare Decision Support SystemsLecture 6: Decision Trees Lecturer: Prof Jim Warren

  2. Decision Trees • Essentially flowcharts • A natural order of ‘micro decisions’ (Boolean – yes/no decisions) to reach a conclusion • In simplest form all you need is • A start (marked with an oval) • A cascade of Boolean decisions (each with exactly outbound branches) • A set of decision nodes (marked with ovals) and representing all the ‘leaves’ of the decision tree (no outbound branches)

  3. Example • Consider this fragment of the ‘Prostate Cancer Workup (Evaluation)’ decision tree from http://www.nccn.org/patients/patient_gls/_english/_prostate/contents.asp# The page also shows supporting text: “Additional testing is recommended for men expected to live 5 or more years or who have symptoms from the cancer. For example, if the tumor is T1 or T2, a bone scan is recommended if the PSA level is greater than 20 or if the Gleason score is greater than 8. A bone scan is also recommended if the man has any symptoms, or the cancer is growing outside the prostate (T3 or T4). A CT or MRI of the pelvis is recommended when the tumor is T1 or T2 and there is a 7% or greater chance of lymph node spread based on the Partin tables, or the tumor is growing outside the prostate (T3 or T4).”

  4. KE problems for flowchart • The natural language may pack a lot in • E.g., “any one of the following” • Even harder if they say “two or more of the following” which implies they mean to compute some score and then ask if it’s >=2 • Incompleteness • There are logically possible (and, worse, physically possible) cases that aren’t handled • The ‘for example’ in the text is a worry • Inconsistency • Are we trying to reach one decision (which test) or a set of decisions • 1) whether to do a bone scan • 2) whether to do a ‘CT or MRI’

  5. Let’s try it anyway • What’s said for ‘staging workup’ looks like this Legend S2 = step 2: ‘CT or MRI of pelvisBS = Bone scanS3 = step 3: ‘All others: no additional testing’LNS = ‘7% or greater chance of lymph node spread based on the Partin tables’ Please don’t decide your father’s Prostate follow-up from this! It’s unverified, and I don’t think a tumour can be ‘T1 or T2’ and ALSO ‘T3 or T4’ (but that’s what it says!) T1 or T2 Y N T3 or T4 PSA>20 BS S3 S2 Symptoms BS T3 or T4 LNS BS S3 S2

  6. Decision Tables • As you can see from the Prostate example, a flowchart can get huge • We can pack more into a smaller space if we relinquish some control on indicating the order of microdecisions • A decision table has • One row per ‘rule’ • One column per decision variable • An additional column for the decision to take when that rule evaluates to true

  7. Decision Table example d= doesn’t matter (True or False) From van Bemmel & Musen, Ch 15

  8. Flowcharts v. Tables • Decision table is not as natural as a flowchart • But we’ve seen, a ‘real’ (complete and consistent) flowchart ends up very large (or representing a very small decision) • Decision table gets us close to production rule representation • Good as design specification to take to an expert system shell • Completeness is more evident with a flowchart • Decision table could allow for multiple rules to simultaneously evaluate to true • Messy on a flowchart (need multiple charts, or terminals that include every possible combination of decision outcomes) • Applying either in practice requires KE in a broad sense • E.g., may need to reformulate the goals of the guideline

  9. On to production rule systems • In a production rule system we have decision-table-like rule, but also the decision outcomes can feed back to the decision variables • Evaluating some special decision rule (or rules) is then the goal for the decision process • The other rules are intermediary, and might be part of the explanation of how externally-derived decision variables were used to reach a goal decision • The inference engine of the expert system shell chooses how to reach the goal • i.e., with backward chaining, or forward chaining • Possibly with some direction from a User Interface (UI) manager component (e.g., we might group sets of variables for input into forms as a web page)

  10. Boolean Algebra • To formulate flow chart decisions and (especially) decision table rows, can help to have mastered Boolean Algebra • Basic operators • NOT – if A was true, NOT A is false • AND – A AND B is only true if both A and B are true • OR – A OR B is true if either A, or B, or both are true (aka inclusive or) • This is not the place for a course on Boolean algebra, but a few ideas will help…

  11. Notation • Alas there are a lot of ways the operators are written • NOT A might appear as A, ~A, A′ or ¬A • A AND B might appear as A.B, A·B, A^B or simply AB • A OR B might appear as A+B or AvB • We can use parentheses like in normal algebra • C(A+B) means the expression is True if and only if C is true AND either B is true OR C is true (or both) • It’s equivalent to CA + CB (C-AND-A or C-AND-B, evaluate AND before OR) • So AND is a bit like multiplication, whereas OR is a bit like addition • 1 + 1 ≥ 1 1 + 0 ≥ 1 (inclusive OR) • 1 x 1 ≥ 1 1 x 0 ≥ 1 (logical AND)

  12. Think! • If you just keep your head and focus on the meaning in the clinical domain, you can usually find the Boolean expression you need • Be sure to be precise • “NOT (x>43)” is “x is NOT GREATER than 43” is “x<=43” (get your equals in the right place!) (with this advice, I won’t teach you De Morgan’s Law, truth tables, or Karnaugh maps, but feel free to look them up – they all Google well)

  13. Venn diagrams • Visual representations of membership in sets • Can be very useful to decide what Boolean expression you need • Say A is the set of everything with two legs and B the set of everythingthat flies • A^B would be true for a parrot • A would be true for a human,B would be false • B would be true for a mosquito,A would be false A: 2 legs B: can fly Mossie Human Parrot

  14. Decision Tree Induction • An alternative to knowledge engineering a decision tree is to turn the task over to a machine learning algorithm • The decision tree can be ‘induced’ (or inducted) from a sufficiently large set of example • The ID3 algorithm is the classic for inducing a decision tree using Information Theory • If I have 50 examples where the patients survived and 50 where they didn’t I have total (1.0) entropy and zero information • Given a set of potential decision attributes I can try to create more order (less entropy, more information) in the data

  15. Example: Induced Decision Tree Of course they go and use ovals for listing the decision variables, put the test criteria on the arcs and put ‘leaf’ decisions in rectangles – notations vary; get used to it! From Chen et al, Complete Blood Count as a Surrogate CD4 Marker for HIV Monitoring in Resource-limited Settings, 10th Conf on Retroviruses and Opportunistic Infection, 2003.

  16. Using Entropy measures in ID3 • For a decision node S with pp positive example (e.g., surviving patients) and pn negataive example • Entropy(S) = - pplog2 pp – pnlog2 pn • So with 15 survivors out of 25 patients • Entropy(S) = - (15/25) log2 (15/25) - (10/25) log2 (10/25) = 0.970 • I want to select a Boolean attribute A that splits S such that the two subsets are as ordered as possible, usually written…

  17. ID3 continued • So if I have 20 available Boolean decision variables • I try splitting my cases, S, according to each, until I find the variable that gives the most Gain • I repeat this on each sub-tree until either every node if perfect (all survivors, or all deaths) or I run out of attributes • If my variables aren’t Boolean, then I have more work to do • Actually, the Gain equation works fine if the attribute is multi-valued (Day of Week would be OK, I just have a 7-way split in my tree) • For continuous values I have to ‘discretize’ – make one or more split points • e.g., SBP<140? – now I’ve made continuous-valued blood pressure into a Boolean • Can be done based on knowledge (e.g., clinical significance), or handed to an algorithm to search for the max Gain See http://dms.irb.hr/tutorial/tut_dtrees.php

  18. Tools • You don’t find ‘pure’ ID3 too much • Other algorithms in a similar spirit to search for are C4.5 and Adaboost • Tools • Matlab implements decision tree induction • Weka toolkit (from Waikato Uni) has a variety of Java tools for machine learning • Try Pierre Geurts’ online decision tree induction applet, e.g., for ‘animal descriptions’ from http://www.montefiore.ulg.ac.be/~geurts/dtapplet/dtexplication.html#online

  19. I de-selected ‘backbone’ from the available decision attributes, hit New Tree, then Build, and hit Zoom+ a couple times (note that the attribute order in the database effects how the decision nodes end up phrased)

  20. Summary • Decision trees are a basic design-level knowledge representation technique for ‘logical’ (rule based, Boolean-predicate-driven) decisions • Decision tables let you compactly compile a host of decisions on a fixed set of decision variables • These take you very close to the representation needed to encode production rules for an inference engine • Rule induction from data provides an alternative to conventional Knowledge Engineering • Computer figures out rules that fit past decisions instead of you pursuing experts to ask them what rules they use

More Related