Identification of Transgender Patients using Deep Recurrent Neural Networks

Learn about the application of deep recurrent neural networks for the identification and classification of transgender patients using electronic health record (EHR) data. This presentation discusses the challenges faced in healthcare for transgender patients, the use of recurrent neural networks, word embeddings, and the implementation and performance of the classifier.

Presentation Transcript


  1. Deep recurrent neural networks identify transgender patients. Oral Presentations – Methods for Identification, Classification, and Association using EHR Data (S23). Joseph D. Romano, MPhil, Columbia University. Twitter: #AMIA2017 #S23

  2. Disclosure • I have no relevant relationships with commercial interests to disclose. AMIA 2017 | amia.org

  3. Learning Objectives
     After participating in this session the learner should be better able to:
     • Conceptualize a recurrent neural network text classifier, and see how it can be applied to transgender patient classification.
     • Understand the need for data-driven methods to improve healthcare for transgender patients.
     • Understand that deep learning models do not address the ethical issues presented by tasks such as transgender status classification.

  4. The transgender health crisis
     • Transgender individuals experience unique health disparities
     • Lacking adequate subpopulation research due to historical stigmatization
     • Health care professionals often untrained in LGBT health
     • Specific physical and psychological comorbidities more common
     • It is challenging to identify retrospective transgender cohorts
     • ‘Transgender’ often not coded in health information systems
     • Fear of stigmatization may lead to lack of disclosure
     • Increased privacy concerns, particularly regarding EHR data
     Institute of Medicine (US). National Academies Press (US);2011. (PMID: 22013611)

  5. Recurrent Neural Networks
     • Accepts an ordered sequence as input
     • In our case, a sequence of embedded words
     • Returns a sequence as output
     • For sequence classification, discard all but the last item in the output sequence
     http://colah.github.io/posts/2015-08-Understanding-LSTMs/
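
The sequence-in, last-output-only idea on this slide can be sketched in a few lines. This is a minimal illustration, not the model from the talk: it uses a plain (Elman) RNN rather than an LSTM, toy dimensions, and random weights, and every name in it (`rnn_forward`, `classify`, `rand_matrix`) is hypothetical.

```python
import math
import random


def rnn_forward(sequence, w_in, w_rec, bias):
    """Run a vanilla RNN over a sequence of input vectors.

    Returns one hidden-state vector per input step.
    """
    hidden_size = len(bias)
    h = [0.0] * hidden_size
    outputs = []
    for x in sequence:
        new_h = []
        for j in range(hidden_size):
            total = bias[j]
            total += sum(w_in[j][i] * x[i] for i in range(len(x)))
            total += sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            new_h.append(math.tanh(total))
        h = new_h
        outputs.append(h)
    return outputs


def classify(sequence, w_in, w_rec, bias, w_out, b_out):
    """Discard all but the last RNN output, then map it to a probability."""
    last = rnn_forward(sequence, w_in, w_rec, bias)[-1]
    logit = b_out + sum(w * v for w, v in zip(w_out, last))
    return 1.0 / (1.0 + math.exp(-logit))


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
EMBED_DIM, HIDDEN = 4, 3  # toy sizes, not the ones used in the talk

w_in = rand_matrix(HIDDEN, EMBED_DIM)
w_rec = rand_matrix(HIDDEN, HIDDEN)
bias = [0.0] * HIDDEN
w_out = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
b_out = 0.0

note = rand_matrix(6, EMBED_DIM)  # stand-in for six embedded words
states = rnn_forward(note, w_in, w_rec, bias)
prob = classify(note, w_in, w_rec, bias, w_out, b_out)
print(len(states), round(prob, 3))
```

The network emits one state per word, but only the final state reaches the output unit, which is what makes this a sequence classifier rather than a sequence labeler.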

  6. Vectorizing words via embedding. Mikolov, T et al. NIPS. 2013;23:3111-3119.

  7. Note classification pipeline

  8. Implementation
     • LSTM network written in Keras (TensorFlow back-end)
     • Embedding layer → LSTM layer → Fully connected layer
     • Embedding dimensionality: 64
     • LSTM output dimensionality: 100
     • Activation functions:
     • LSTM layer: Hard sigmoid
     • Fully connected layer: Sigmoid
     • 578,101 free parameters
     • Trained on CentOS Linux server with 4x Nvidia Tesla P100 GPU Accelerators
     • 14,336 total CUDA cores
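
The parameter count on this slide can be checked with some arithmetic. The sketch below assumes a standard LSTM layout (four gates, each with input weights, recurrent weights, and one bias vector) and a single sigmoid output unit. The vocabulary size of 8,000 is an assumption: the slide does not state it, but it is the value that makes the three layers sum to the reported total. The helper names are hypothetical.

```python
def embedding_params(vocab_size, embed_dim):
    """One embed_dim-sized vector per vocabulary entry."""
    return vocab_size * embed_dim


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (input_dim * units + units * units + units)


def dense_params(input_dim, units):
    """Weights plus one bias per output unit."""
    return input_dim * units + units


EMBED_DIM = 64      # from the slide
LSTM_UNITS = 100    # from the slide
VOCAB_SIZE = 8000   # assumption: not stated on the slide

total = (
    embedding_params(VOCAB_SIZE, EMBED_DIM)
    + lstm_params(EMBED_DIM, LSTM_UNITS)
    + dense_params(LSTM_UNITS, 1)
)
print(total)  # 578101
```

Under these assumptions the embedding layer accounts for 512,000 of the parameters, the LSTM for 66,000, and the output unit for 101.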

  9. Implementation [Figure; labels: Targets, Inputs]

  10. Results: Cohort and note characteristics
     • EHR Cohort
     • Cases: 39 manually-identified transgender patients
     • Controls: 400 randomly selected patients with clinical notes
     • Free-text clinical notes
     • Obtained all notes for included patients
     • Tokenized; removed numbers, proper nouns, punctuation
     • Left-pad/truncate notes to 1000 words
     • 33/67% train-test split
     • Each patient's notes in either train or test set, never both
     • Train word2vec embeddings on entire set of notes
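
Two of the preprocessing steps listed here, fixed-length notes and a patient-level split, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the padding token, the function names, and the sample notes are hypothetical. The slide also does not say which end of an over-long note is dropped (the sketch keeps the final tokens) or which side of the 33/67% split is the training set (the sketch takes the test fraction as a parameter).

```python
import random

PAD = "<pad>"  # hypothetical padding token, not named on the slide


def pad_or_truncate(tokens, length=1000, pad=PAD):
    """Left-pad short notes; cut long ones down to `length` tokens."""
    if len(tokens) >= length:
        # Assumption: keep the final `length` tokens of an over-long note.
        return tokens[-length:]
    return [pad] * (length - len(tokens)) + tokens


def split_by_patient(notes, test_fraction, seed=0):
    """Split (patient_id, tokens) pairs so no patient spans both sets."""
    patient_ids = sorted({pid for pid, _ in notes})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = round(len(patient_ids) * test_fraction)
    held_out = set(patient_ids[:n_test])
    train = [note for note in notes if note[0] not in held_out]
    test = [note for note in notes if note[0] in held_out]
    return train, test


# Made-up notes for three made-up patients.
notes = [
    ("patient_a", ["seen", "in", "clinic"]),
    ("patient_a", ["follow", "up", "visit"]),
    ("patient_b", ["admitted", "overnight"]),
    ("patient_c", ["routine", "exam"]),
]

train, test = split_by_patient(notes, test_fraction=1 / 3)
train_ids = {pid for pid, _ in train}
test_ids = {pid for pid, _ in test}
padded = pad_or_truncate(["routine", "exam"], length=5)

print(padded)
print(train_ids.isdisjoint(test_ids))
```

Splitting on patient identifiers rather than on individual notes is what keeps every note from one patient on the same side of the split.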

  11. Results: Classifier performance

  12. Results: Word embeddings

  13. Results: Accuracy and training loss [Figure: accuracy and loss by training epoch, epochs 1 to 5]

  14. Comparison to stroke classification [Figure; label: Acute ischemic stroke]

  15. Limitations and future improvements
     • We need far more data!
     • 37 patients so far; we must be overfitting
     • How do we find more patients? The grep approach is primitive
     • Leverage emerging techniques to extract knowledge from the learned networks
     • Neural networks are hard to introspect; no “beta coefficient” equivalent
     • Eventually, incorporate into clinical decision support
     • See a clinical note, evaluate, trigger alert if likely transgender

  16. Application to multiple institutions
     • Does our model translate to other hospital systems?
     • If not, how about the word embeddings?
     • Major opportunity to improve training data size issues
     • Different institutions/EHR systems implement gender differently
     • NYP/Weill Cornell Medical Center: patient-reported gender with transgender options
     • Stuck in IRB purgatory
     • Use cutting-edge techniques to advance privacy guarantees
     • Generative Adversarial Networks and/or Variational Autoencoders
     • Differential privacy analysis

  17. Ethical considerations
     • Essential to address the ethical concerns associated with automated identification of transgender patients
     • Misuse could lead to patient discrimination
     • Reidentification of training patients may be possible
     • Gender is complicated, and imposing labels on patients may be counterproductive
     • See S57: Oral Presentations, first presentation:
     • “The Use of Informatics to Reduce Disparities in Transgender Health” (Kenrick Cato, PhD, RN)
     • 8:30 AM-8:48 AM; Tuesday (Fairchild)
     Cato, K et al. J Empir Res Hum Res Ethics. 2016;11(3):214-219.

  18. Acknowledgements • Tatonetti Lab • Nicholas Tatonetti, PhD* • Rami Vanguri, PhD* • Kayla Quinnies, PhD • Theresa Koleck, PhD • Yun Hao • Phyllis Thangaraj • Alexandre Yahi • Fernanda Polubriaginof, MD • Nick Giangreco • Jenna Kefeli • Jing Ai • Katie LaRow • Kenrick Cato, PhD* *Coauthors

  19. AMIA is the professional home for more than 5,400 informatics professionals, representing frontline clinicians, researchers, public health experts and educators who bring meaning to data, manage information and generate new knowledge across the research and healthcare enterprise.

  20. Thank you! Email me at: jdr2160@cumc.columbia.edu
