NVIDIA BioBERT, an optimized version of BioBERT, was created specifically for the biomedical and clinical domains, giving this community easy access to state-of-the-art NLP models.
BioBERT for NLU
Build Cutting-Edge Biomedical & Clinical NLU Models
TRENDS IN NLP & SPEECH
NLP’s ImageNet Moment Has Arrived

UNSTRUCTURED & UNTAPPED
Textual data is still largely not utilized in healthcare, despite its value.

GROWTH OF MULTI-MODAL DATASETS
EHR data, PubMed literature, clinical notes, imaging, devices, patient communications, social media.

DRAMATICALLY IMPROVING ALGORITHMS
Transformer and its derivatives, such as BERT and XLNet, produce game-changing performance improvements.

DOMAIN SPECIFIC BEATS GENERIC
BioBERT beats BERT on biomedical tasks. ClinicalBERT beats BioBERT on clinical tasks.

CONVERSATIONAL AI NEEDS LARGE MODELS
Pre-train a very large language model once and fine-tune it many times for different use cases.

LOWER BARRIER TO ENTRY
You don’t need a PhD in ML to do industrial-strength NLP.
USE CASES IN HEALTHCARE

Text Classification
• Sentiment Analysis
• Intent Classification
• Message Triaging
• Claims Processing

Named Entity Recognition
• Information Extraction
• Features in ML models
• Knowledge Graphs
• Automatic Weak Labeling
• De-identification

Question-Answer
• Answer questions posed in natural language
• Chatbots

Text Summarization
• Summarize physician notes, radiology reports, etc.

Speech Recognition
• Call Center optimization
• Voice commands

Machine Translation
• Patient Engagement
• Published Literature
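As an illustration of the Named Entity Recognition use case: token-classification models are commonly trained to emit one tag per token, and downstream steps such as information extraction or de-identification need those tags grouped into entity spans. The sketch below assumes the widely used BIO tagging scheme; the function name, example sentence, and entity labels (PROBLEM, DRUG) are hypothetical and do not come from any particular BioBERT checkpoint.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group token-level BIO tags into (entity_text, entity_type) pairs."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def close_open_entity() -> None:
        # Emit the entity collected so far, if any, and reset the state.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            close_open_entity()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity: close and skip.
            close_open_entity()

    close_open_entity()
    return spans


# Hypothetical sentence and tags, for illustration only.
tokens = ["Patient", "denies", "chest", "pain", "after", "taking", "aspirin", "."]
tags = ["O", "O", "B-PROBLEM", "I-PROBLEM", "O", "O", "B-DRUG", "O"]
entities = bio_to_spans(tokens, tags)
```

In this sketch an I- tag that does not continue an open entity of the same type is dropped rather than promoted to a new entity; that is one possible design choice, and a real pipeline may prefer the opposite.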
RACE TO CONVERSATIONAL AI
Exceeding Human Level Performance
[Figure: GLUE Leaderboard timeline from 2017 to 2019 ("Today"). Entries shown: Google (Transformer), Google (BERT), Facebook (RoBERTa), Microsoft (MT-DNN), Baidu (ERNIE), Alibaba (Enriched BERT base), Uber (Plato).]
DOMAIN SPECIFIC BEATS GENERIC

BioBERT
• Pre-trained on top of BERT using PubMed data
• Beats BERT on biomedical tasks

Clinical BERT(s)
• Pre-trained on top of BioBERT using clinical notes
• Beats BioBERT on clinical tasks
TRAIN USING NGC
Optimized, Scalable & Easy to Use
• Convenient scripts for pre-training & fine-tuning
• Optimized Docker images for TensorFlow
• Automatic Mixed Precision for up to 3x speedup
• Scale out for pre-training & fine-tuning
https://ngc.nvidia.com/catalog/model-scripts/nvidia:biobert_for_tensorflow
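One way Automatic Mixed Precision can be switched on, sketched under assumptions: NVIDIA's TensorFlow 1.x containers document an environment variable, TF_ENABLE_AUTO_MIXED_PRECISION, that requests the mixed-precision graph rewrite. Whether the NGC BioBERT scripts rely on this variable or expose their own switch is not stated on the slide, so check the scripts themselves. The helper name enable_amp is hypothetical.

```python
import os
from typing import MutableMapping, Optional

# Assumption: the training job runs in an NVIDIA TensorFlow 1.x container
# that honors this variable. Other setups may ignore it.
AMP_ENV_VAR = "TF_ENABLE_AUTO_MIXED_PRECISION"


def enable_amp(env: Optional[MutableMapping[str, str]] = None) -> MutableMapping[str, str]:
    """Set the AMP variable to "1" in `env` (default: the process environment)."""
    target = os.environ if env is None else env
    target[AMP_ENV_VAR] = "1"
    return target


# Demonstrated on a plain dict so the real environment is left untouched.
# Call enable_amp() with no argument, before TensorFlow builds the training
# graph, to affect the current process.
settings = enable_amp({})
```

Setting the variable changes nothing by itself outside a runtime that reads it; the "up to 3x" figure on the slide is NVIDIA's claim for its own optimized setup.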
TRAIN USING NGC
Optimized, Scalable & Easy to Use
For comparison, the BioBERT paper reported 10+ days (240+ hours) to train on an 8x 32 GB V100 system.
https://news.developer.nvidia.com/biobert-optimized/
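The comparison above is simple arithmetic: 10 days is 240 hours, and a speedup is the baseline time divided by the new time. A minimal sketch follows. The 48-hour run time is a made-up placeholder, not a reported result, and the function names are hypothetical.

```python
HOURS_PER_DAY = 24


def days_to_hours(days: float) -> float:
    """Convert a training time in days to hours."""
    return days * HOURS_PER_DAY


def speedup(baseline_hours: float, measured_hours: float) -> float:
    """Return how many times faster the measured run is than the baseline."""
    if measured_hours <= 0:
        raise ValueError("measured_hours must be positive")
    return baseline_hours / measured_hours


# Lower bound quoted on the slide: 10+ days on an 8x 32 GB V100 system.
baseline_hours = days_to_hours(10)

# Placeholder value for illustration only; substitute your own measurement.
hypothetical_run_hours = 48.0
ratio = speedup(baseline_hours, hypothetical_run_hours)
```

Because the paper's figure is a lower bound ("10+ days"), any ratio computed this way is also a lower bound on the speedup.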