Explore data mining and action rule discovery for understanding treatment factors, patient emotions, and predicting tinnitus treatment success. Develop a decision support system for effective tinnitus care.
MINING FOR KNOWLEDGE TO BUILD DECISION SUPPORT SYSTEM FOR DIAGNOSIS AND TREATMENT OF TINNITUS Pamela L. Thompson & Zbigniew W. Ras University of North Carolina at Charlotte College of Computing and Informatics
Research partially supported by the Project ME913 of the Ministry of Education, Youth, and Sports of the Czech Republic
Introduction • Methodology • Domain Knowledge • Data Collection • Data Preparation • New Feature Construction • Advanced Clustering Techniques for Temporal Feature Extraction • Mining the Data: Unclustered and Clustered Data • Action Rules • Contributions • Future Research • Questions Topics
Neil Young, Barbra Streisand, Pete Townshend, William Shatner, David Letterman, Paul Shaffer, Steve Martin, Ronald Reagan, Neve Campbell, Jeff Beck, Burt Reynolds, Sting, Eric Clapton, Thomas Edison, Peter Jennings, Dwight D. Eisenhower, Cher, Phil Collins, Vincent Van Gogh, Ludwig Van Beethoven, Charles Darwin, . . . Introduction
OUR APPROACH: We apply data mining and action rule discovery to the TRT patient databases. THE RESEARCH QUESTION: Can data mining and action rule discovery help us understand the relationships among treatment factors, measurements, and patient emotions, so that we can better understand tinnitus treatment and gain new knowledge for predicting treatment success? THE KNOWLEDGE GAINED will provide the design foundations of a decision support system to improve the effectiveness of TRT-based tinnitus treatment. Introduction
CONTRIBUTIONS: • A new knowledge discovery approach which can be used to build a decision support system for supporting tinnitus treatment • New temporal, emotional, and text features related to tinnitus evaluation and treatment, along with an evaluation of their contribution to learning the tinnitus problem • A new clustering approach for grouping similar visit sequences for tinnitus patients • The first application of Action Rule Discovery to the tinnitus problem, including the application of LISp-Miner and a new frequent-sets-based action rule generator (MARDs) • The first application and evaluation of new emotion-centered temporal features integrated with the emotion-valence plane used in music emotion classification research Introduction
TRT includes • DIAGNOSIS • Preliminary medical examination • Completion of initial interview questionnaire • Audiological testing • TREATMENT • Counseling • Sound Habituation Therapy • Exposure to a different stimulus to reduce emotional reaction • Visit questionnaire (THI) • Secondary questionnaire (TFI) in the new dataset • Instrument tracking (instruments can be table top or in ear, different manufacturers) • Continued audiological tests Methodology: Domain Knowledge
Tinnitus Retraining Therapy • Neurophysiological Model • Focuses on the physiological aspects of nervous system function • TRT “cures” tinnitus by • Working with the association between • Limbic nervous system (fear, thirst, hunger, joy, happiness) • Autonomic nervous system (breathing, heart rate) • Involvement of the limbic nervous system worsens tinnitus symptoms Methodology: Domain Knowledge
Original Dataset • 555 patients • Relational • 11 tables • New Dataset • 758 patients • Relational • Secondary questionnaire (TFI) answers are added to the new dataset • TFI - Tinnitus Functional Index Methodology: Database Features
Initial Interview form provides the basis for the Patient/Doctor Treatment Category, 0 to 4 (stored in the Questionnaires tables): 0 – low tinnitus only: counseling; 1 – high tinnitus: sound generators set at mixing point; 2 – high tinnitus with (subjective) hearing loss: hearing aid; 3 – hyperacusis: sound generators set above the threshold of hearing; 4 – persistent hyperacusis: sound generators set at the threshold with a very slow increase of sound level. The category varies as treatment progresses and is stored as C (first visit) and CC (last visit). Methodology: Database Features
Tinnitus Handicap Inventory • Questionnaire, forms Neumann-Q Table • Function, Emotion, Catastrophic Scores • Total Score (sum) • THI • 0 to 16: slight severity • 18 to 36: mild • 38 to 56: moderate • 58 to 76: severe • 78 to 100: catastrophic Methodology: Database Features
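A minimal sketch of how the THI total score could be mapped onto these severity bands (the function name and the handling of odd scores between the published band edges are illustrative assumptions):

```python
def thi_severity(total_score: int) -> str:
    """Map a THI total score (0-100) to the severity band listed on this slide.

    THI item scores are 0/2/4, so totals are even; the cut points below
    follow the slide: 0-16 slight, 18-36 mild, 38-56 moderate,
    58-76 severe, 78-100 catastrophic.
    """
    if not 0 <= total_score <= 100:
        raise ValueError("THI total score must be between 0 and 100")
    if total_score <= 16:
        return "slight"
    if total_score <= 36:
        return "mild"
    if total_score <= 56:
        return "moderate"
    if total_score <= 76:
        return "severe"
    return "catastrophic"

# Example: a patient with a total THI score of 42
print(thi_severity(42))  # -> "moderate"
```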
Tinnitus Functional Index • In the new dataset but only for some patients • Cognitive and emotional questions • Scale of 0 to 10 and some % • Includes questions related to • Anxious/worried • Bothered/upset • Depressed Methodology: Database Features
Audiological Features • Standard Deviation of Audiological Testing related to LDLs • LDL (Loudness Discomfort Level) – a measure of decreased sound tolerance, as indicated by • Hyperacusis (discomfort from sound) • Misophonia (dislike of sound) • Phonophobia (fear of sound) Methodology: ETL
THI - Tinnitus Handicap Inventory Discretization of attributes • mainly based on domain/expert knowledge • the total score is discretized to form the decision attribute: • a (good) to e (bad) and other variations Methodology: ETL
Data Preparation for Mining Work with: • Missing values (sparse data) • Problems with primary keys • Temporal information – related to visits, needs to be tied to PATIENT for some mining operations Methodology: ETL
Data Transformation – ORIGINAL DATABASE • Flattened File in original database - one tuple per patient with additional features added • Patterns • Text • Statistical • Temporal • Decision Feature – discretized THI total score • Clustered patient databases (by similar visit patterns) with new additional features • Coefficients, angles • Data Transformation – NEW DATABASE • Clustered patient records (by similar visit patterns) • Boolean decision features plus TFI [Tinnitus Functional Index] features added (features in new dataset)
Feature Development for Categorical Data • Treatment Category and Instruments • MFP – Most Frequent Pattern (Value) • FP/LP – First Pattern, Last Pattern (Value) • Used for: • Instrument • Treatment Category • Tinnitus Problem New Feature Construction: Categorical Features
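A hedged pandas sketch of how per-patient MFP, FP, and LP values might be derived from visit-level records; the column names (`patient_id`, `visit_date`, `instrument`) are illustrative assumptions, not the actual database schema:

```python
import pandas as pd

# Illustrative visit-level data; the real tables and columns differ.
visits = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "visit_date": pd.to_datetime(
        ["2005-01-10", "2005-03-02", "2005-06-15", "2005-02-01", "2005-04-20"]),
    "instrument": ["BTE", "BTE", "ITE", "ITE", "ITE"],
})

visits = visits.sort_values(["patient_id", "visit_date"])
per_patient = visits.groupby("patient_id")["instrument"].agg(
    MFP=lambda s: s.mode().iloc[0],  # most frequent value across visits
    FP="first",                      # value at the first visit
    LP="last",                       # value at the last visit
)
print(per_patient)
```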
Text Mining • Text fields • Demographic, Miscellaneous, Medication tables • Categories may indicate the cause of tinnitus for a patient: Stress, Noise, Medical New Feature Construction: Text Features
Statistical • From Audiological Features over visits • Standard Deviation • Average Methodology: ETL
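A brief sketch, under the same assumed column names as the earlier pandas example, of deriving per-patient average and standard deviation of an audiological measurement such as the LDL across visits:

```python
import pandas as pd

audiology = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "ldl": [95.0, 90.0, 88.0, 102.0, 100.0],  # illustrative LDL values (dB)
})

# One row per patient: average and standard deviation over that patient's visits.
stats = audiology.groupby("patient_id")["ldl"].agg(ldl_avg="mean", ldl_std="std")
print(stats)
```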
Temporal Feature Development and Extraction • Extract features that describe the situation of the patient based on behavior of attributes over time • Temporal patterns may better express treatment process than static features • New temporal features: • Sound level centroid, sound level spread, recovery rate New Feature Construction: Temporal Features
New Temporal Features • Sound Level Centroid • T – total number of visits per patient (here T = 3) • V – sound level feature (e.g., an LDL measurement) measured at each visit, with values V(1), V(2), V(3) • C = [ (1/3)·V(1) + (2/3)·V(2) + (3/3)·V(3) ] / [ V(1) + V(2) + V(3) ] New Feature Construction: Temporal Features
New Temporal Features • Sound Level Spread • Spread = sqrt( [ V(1)·(1/3 − C)² + V(2)·(2/3 − C)² + V(3)·(3/3 − C)² ] / [ V(1) + V(2) + V(3) ] ) • C – sound level centroid; V – sound level feature; T – number of visits New Feature Construction: Temporal Features
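A small sketch computing both temporal features for one patient's visit sequence, following the formulas above; variable names are illustrative:

```python
import math

def sound_level_centroid(v):
    """Centroid of a sound-level feature over T visits:
    sum_t (t/T) * V(t) divided by sum_t V(t), visits numbered 1..T."""
    T = len(v)
    num = sum((t / T) * v[t - 1] for t in range(1, T + 1))
    return num / sum(v)

def sound_level_spread(v):
    """Spread around the centroid:
    sqrt( sum_t V(t) * (t/T - C)^2 / sum_t V(t) )."""
    T = len(v)
    c = sound_level_centroid(v)
    num = sum(v[t - 1] * (t / T - c) ** 2 for t in range(1, T + 1))
    return math.sqrt(num / sum(v))

ldl_per_visit = [95.0, 90.0, 88.0]          # illustrative 3-visit sequence
print(sound_level_centroid(ldl_per_visit))  # ~0.66
print(sound_level_spread(ldl_per_visit))    # spread around that centroid
```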
New Temporal Features • Recovery Rate • V = Total Score from the THI, recorded at each visit • V0 = first score (higher than Vk when the patient improves) • Vk = best (minimum) score in the vector • Tk = date of the best score New Feature Construction: Temporal Features
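The slide lists the ingredients (V0, Vk, Tk) but not the formula, so the sketch below only assumes a plausible definition: the drop from the first score to the best score, divided by the elapsed time until that best score. Treat it as an illustration, not the slide's exact feature.

```python
from datetime import date

def recovery_rate(scores, dates):
    """ASSUMED recovery-rate feature: (V0 - Vk) / days until the best score.
    The exact formula on the original slide is not reproduced here."""
    v0 = scores[0]
    k = min(range(len(scores)), key=lambda i: scores[i])  # index of best (min) score
    vk, tk = scores[k], dates[k]
    days = (tk - dates[0]).days or 1  # avoid division by zero if best is first
    return (v0 - vk) / days

thi_scores = [58, 44, 30]  # illustrative THI totals over three visits
visit_dates = [date(2005, 1, 10), date(2005, 3, 2), date(2005, 6, 15)]
print(recovery_rate(thi_scores, visit_dates))
```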
Creation of 8 new decision attributes based on different discretizations of Total Score from Tinnitus Handicap Inventory. New Feature Construction: Decision Feature
Initial Experiments and Results • WEKA • J48 (C4.5 Decision Tree Learner) • 253 patients, 126 attributes: Experiment 1 • Investigate treatment factors and recovery • 229 patients, 16 attributes: Experiment 2 • Investigate audiological features and recovery Data Mining: Unclustered Data
In Search for Optimal Classifiers • WEKA • J48 (C4.5 Decision Tree Learner) • Random Forest • Multilayer Perceptron Data Mining: Unclustered Data
Initial Experiments and Results • Experiment #1: • (Category of treatment = C1) AND (R50 > 12.5) AND (R3 <= 15) ==> improvement is neutral • Support 10, accuracy 90.9%: if the treatment category chosen by the patient is C1, the R50 parameter is above 12.5, and the average of R3 is less than or equal to 15, then the recovery is neutral. • (Category of treatment = C2) ==> good • Support 44, accuracy 74.6%: if the category of treatment chosen by the patient is C2, then improvement is good. • (Category of treatment = C3) AND (Model = BTE) ==> good • Support 17, accuracy 100.0%. • Experiment #2: • 40 > Lr50 > 19 ==> has tinnitus all of the time • Support 27, accuracy 100.0%: if Lr50 is in the range 19 to 40, then the patient has tinnitus all of the time, although the tinnitus may not be a major problem. "From Mining Tinnitus Database to Tinnitus Decision-Support System, Initial Study", P. Thompson, X. Zhang, W. Jiang, Z.W. Ras, in the Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT 2007), IEEE Computer Society, San Jose, Calif., 2007, 203-206. Data Mining: Unclustered Data
Additional Experiments and Results • Seven more experiments using 8 new decision attributes • 253 patients, variations of 126 attributes • Goal of exploring treatment factors and recovery using discretized total score • WEKA • J48, Random Forest, Multilayer Perceptron Data Mining: Unclustered Data
Additional Experiments and Results • Seven Experiments: • 1) Original data with Standard Deviations and Averages from Audiological features • 2) Original data with Standard Deviations, Averages, Sound level centroid and sound level spread (Sound) only • 3) Original Data with Standard Deviations, Averages, and Text • 4) Original Data Standard Deviations, Averages, Text and Sound • 5) Original Data with Text • 6) Original Data with Sound • 7) Original Data with Sound, Text, and Recovery Rate Data Mining: Unclustered Data
Best Results Data Mining: Unclustered Data
Top Classification Results for all 8 decision variables Sound Level Centroid, Sound Level Spread, Recovery Rate Data Mining: Unclustered Data
SUMMARY – Mining unclustered data from the Original Database: the addition of the new temporal features improves the confidence of the original classification. Of particular interest are the new Sound and Recovery Rate features – these have value for Decision Support System (DSS) implementation. WEKA J48 appears to be the best classifier. Summary Data Mining: Unclustered Data
Continuing the Search for Optimal Classifiers • Transformation to Visit Structure • Creating Clustered-Driven Databases for Mining • Adding New Features Data Mining: Clustered Data
Clustering for the purpose of Temporal Feature Extraction Data Mining: Clustered Data
If we have two patients denoted by p, q, then the visits of patient p are represented by a vector vp = [v1, v2, …, vn] and the vector vq = [w1, w2, …, wm] represents the visits of patient q. • If n ≤ m, then the distance(p, q) between p and q, and the distance(q, p) between q and p, are defined as • distance(p, q) = distance(q, p) = Σ i=1..n |vi − wJ(i)| • where [wJ(1), wJ(2), …, wJ(n)] is the n-element subsequence of [w1, w2, …, wm] for which this sum of distances is minimal over all n-element subsequences of [w1, w2, …, wm]. By |vi − wJ(i)| we mean the absolute value of [vi − wJ(i)]. Clustering Techniques for Temporal Feature Extraction
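A sketch of this distance for two visit vectors, assuming each visit is summarized by a single scalar; for multidimensional visit features, `abs(v - w)` would be replaced by an appropriate per-visit distance:

```python
from itertools import combinations

def visit_distance(vp, wq):
    """distance(p, q) = distance(q, p): the minimal sum of |v_i - w_J(i)|
    over all order-preserving n-element subsequences of the longer visit
    vector, where n is the length of the shorter one."""
    short, long_ = (vp, wq) if len(vp) <= len(wq) else (wq, vp)
    n = len(short)
    best = float("inf")
    for idx in combinations(range(len(long_)), n):  # order-preserving subsequences
        total = sum(abs(short[i] - long_[j]) for i, j in enumerate(idx))
        best = min(best, total)
    return best

# Patient p has 3 visits, patient q has 4; the best 3-visit alignment is used.
print(visit_distance([58, 44, 30], [60, 50, 40, 28]))  # -> 8
```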
The ultimate goal of constructing tolerance classes is to identify the right groups of patients for which useful temporal features can be built and used to extend the original (or current) database. We construct a collection of databases Dp, where p is a patient and Dp is a database representing the patients in the tolerance class generated by p. Two groups of databases, for three-visit and four-visit sequences, were constructed. Clustering Techniques for Temporal Feature Extraction
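A brief sketch of how a tolerance class around a patient p might be formed, reusing the `visit_distance` function from the sketch above; the idea of thresholding the distance and the threshold `eps` are assumptions, since the slide does not give the exact tolerance relation:

```python
def tolerance_class(p_id, visit_vectors, distance, eps):
    """Patients whose visit-sequence distance to patient p_id is within eps.
    `visit_vectors` maps patient id -> list of per-visit values;
    `distance` is a pairwise visit-sequence distance (e.g. visit_distance)."""
    vp = visit_vectors[p_id]
    return {q for q, vq in visit_vectors.items() if distance(vp, vq) <= eps}

patients = {"p": [58, 44, 30], "q": [60, 50, 40, 28], "r": [20, 18, 16]}
print(tolerance_class("p", patients, visit_distance, eps=10))  # 'p' and 'q'
```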
Coefficients and Angles Feature Construction for Dp where p is a patient with 4 visits: Clustering Techniques for Temporal Feature Extraction
Quadratic Equation Based New Features Clustering Techniques
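The exact definitions of the coefficient and angle features are not reproduced on these lines, so the sketch below is only one plausible reading under stated assumptions: fit a quadratic to a patient's per-visit values and use the fitted coefficients, plus the slope angles between consecutive visits, as new features.

```python
import math
import numpy as np

def quadratic_features(values):
    """ILLUSTRATIVE coefficient/angle features for one patient's visit sequence:
    fit v(t) ~ a*t^2 + b*t + c over visit indices t = 1..T, and compute the
    angle of each segment joining consecutive visits (one time unit apart)."""
    t = np.arange(1, len(values) + 1)
    a, b, c = np.polyfit(t, values, deg=2)
    angles = [math.degrees(math.atan2(values[i + 1] - values[i], 1.0))
              for i in range(len(values) - 1)]
    return {"a": a, "b": b, "c": c, "angles": angles}

print(quadratic_features([58, 44, 34, 30]))  # an illustrative 4-visit THI trajectory
```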
Clustering Process Resulted in two classes of viable datasets for mining: • Three-visit datasets (14 total) • Four-visit datasets (5 total) Data Mining: Clustered Data
In order to test the classifiers with the clustered data, WEKA with J48, Random Forest, and Multilayer Perceptron (Neural Network) was used on the following: • 1) Datasets with standard deviations and averages, • 2) Datasets with coefficients and text, • 3) Datasets with coefficients and angles, • 4) Datasets with coefficients only, • 5) Datasets with angles only, • 6) Datasets with angles and text, • 7) Datasets with angles, coefficients and text. Data Mining: Clustered Data
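The experiments themselves were run in WEKA; a rough scikit-learn analogue of the same three-classifier comparison (J48's C4.5 algorithm roughly corresponds to a decision tree learner) might look like the sketch below, with placeholder data standing in for one clustered dataset's engineered features and discretized decision attribute:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: X = engineered features, y = discretized THI-based decision.
X, y = make_classification(n_samples=200, n_features=20, n_classes=3,
                           n_informative=5, random_state=0)

classifiers = {
    "decision tree (~J48)": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "multilayer perceptron": MLPClassifier(max_iter=2000, random_state=0),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```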