Logic form identification of medical clinical trials
Clint Tustison
Introduction
• The what…
  • Identify and extract logic forms from medical clinical trials (in)eligibility criteria
• The why…
  • Understand the data
  • Match up the information with other data, i.e., patients’ medical records
• The how…
  • Syntactic parser
  • Cognitive modeling architecture
Process
[Pipeline diagram with boxes: Clinical Trials (input), Text processing, Syntactic Parser, Cognitive modeling engine, Predicate Calculus, Post-processing (output)]
Input
• ClinicalTrials.gov
  • Sponsored by NIH and other federal agencies, private industry
  • 8,800 current trials online
  • 3,000,000 page views per month
  • Purpose, eligibility, location, more info.
Text processing
• Convert trials to .xml format

<criteria trial="http://www.clinicaltrials.gov/ct/show/NCT00055250">
  <criterion>
    <text>Eligibility</text>
    <text>Criteria</text>
    <text>Inclusion Criteria:</text>
    <text>Adenocarcinoma of the pancreas</text>
  </criterion>
  .
  .
</criteria>
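The conversion step could be sketched roughly as follows. This is a minimal sketch and not the author's implementation: the helper name `criteria_to_xml` and the input shape (a trial URL plus a list of text lines) are assumptions; only the element and attribute names are taken from the slide.

```python
import xml.etree.ElementTree as ET


def criteria_to_xml(trial_url, lines):
    """Wrap each non-empty criteria line in a <text> element (hypothetical helper)."""
    root = ET.Element("criteria", {"trial": trial_url})
    criterion = ET.SubElement(root, "criterion")
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            ET.SubElement(criterion, "text").text = line
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


xml_out = criteria_to_xml(
    "http://www.clinicaltrials.gov/ct/show/NCT00055250",
    ["Inclusion Criteria:", "", "Adenocarcinoma of the pancreas"],
)
```

Using an XML library rather than string concatenation means characters such as `&` and `<` in the criteria text are escaped automatically.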
Process: Input
[Pipeline diagram with boxes: Clinical Trials (input), Syntactic Parser, Cognitive modeling engine, Predicate Calculus, Post-processing (output)]
Example input sentence:
A criterion equals adenocarcinoma of the pancreas.
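The example sentence suggests that each bare criterion phrase is placed in a fixed sentence frame before parsing. A minimal sketch of that framing step is shown here; the function name `frame_criterion` and the lowercasing of the first letter are assumptions based only on the example sentence.

```python
def frame_criterion(text):
    """Turn a bare criterion phrase into a full sentence for the parser (hypothetical helper)."""
    phrase = text.strip().rstrip(".")
    # Lowercase only the first character so acronyms later in the phrase survive.
    phrase = phrase[:1].lower() + phrase[1:]
    return f"A criterion equals {phrase}."


sentence = frame_criterion("Adenocarcinoma of the pancreas")
```

Giving the parser a complete sentence, rather than a noun phrase fragment, lets it produce an ordinary subject-verb-object parse.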
Syntactic parser
• Link-Grammar Parser
• Characteristics
  • Syntactic dependency parse
  • Constraints for determining grammaticality
  • Links give clues on how to process constituents
• Benefits
  • Written in C, so very fast
  • Robust: able to process input with spelling errors
  • Free: http://www.link.cs.cmu.edu/link
  • Can be easily integrated with other applications
Process: Syntactic Parser
A criterion equals adenocarcinoma of the pancreas.

Syntactic parser output:

    +--------------------------------Xp--------------------------------+
    +-----Wd-----+                                    +----Js----+     |
    |     +--Ds--+----Ss----+------Os-----+-----Mp----+  +---Ds--+     |
    |     |      |          |             |           |  |       |     |
LEFT-WALL a criterion.n equals.v adenocarcinoma[?].n of the pancreas.n .
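To illustrate how links "give clues" for later processing, the parse above can be held as a list of labeled links and queried by link type. The sketch below transcribes the links by hand rather than calling the parser; the `Link` type and the helper functions are assumptions made for illustration. It relies only on the standard Link Grammar meanings of S (subject to verb) and O (verb to object).

```python
from typing import NamedTuple


class Link(NamedTuple):
    label: str
    left: str
    right: str


# Links transcribed by hand from the parse diagram (not produced by calling the parser).
LINKS = [
    Link("Xp", "LEFT-WALL", "."),
    Link("Wd", "LEFT-WALL", "criterion.n"),
    Link("Ds", "a", "criterion.n"),
    Link("Ss", "criterion.n", "equals.v"),
    Link("Os", "equals.v", "adenocarcinoma[?].n"),
    Link("Mp", "adenocarcinoma[?].n", "of"),
    Link("Js", "of", "pancreas.n"),
    Link("Ds", "the", "pancreas.n"),
]


def links_of_type(links, prefix):
    """Return the links whose label starts with the given major link type."""
    return [link for link in links if link.label.startswith(prefix)]


def subject_verb_object(links):
    """Join an S link and an O link that share the same verb."""
    for s in links_of_type(links, "S"):
        for o in links_of_type(links, "O"):
            if o.left == s.right:
                return s.left, s.right, o.right
    return None


svo = subject_verb_object(LINKS)
```

The same pattern extends to the M and J links, which together attach the prepositional phrase "of the pancreas" to "adenocarcinoma".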
Intelligent Processing
• Soar Architecture
  • Model and theory of cognition used in AI programming
  • Translates syntactic parse to logic output by reading links
• Benefits
  • Goal-directed problem solving
  • Agent-based architecture
  • Ability to learn
  • Proven in multiple applications
    • Natural Language-Soar
    • Tactical Air-Soar
    • NASA Test Director-Soar
Process: Intelligent processing

(M1 ^idea N5 ^idea N4 ^idea N3 ^idea N2)
(N5 ^annotation feat-dumped ^annotation seq-dumped ^annotation seq-prep ^aug N4 ^nuc pancreas ^wcount 7)
(N4 ^annotation seq-dumped ^annotation seq-prep ^aug N3 ^nuc adenocarcinoma ^of N5 ^wcount 4)
(N3 ^ext N2 ^int N4 ^nuc equals ^wcount 3)
(N2 ^annotation feat-dumped ^annotation seq-dumped ^annotation seq-prep ^aug N3 ^nuc criterion ^wcount 2)
Tools: Representation
• Predicate Logic
  • Formal properties, allow for wide range of applications, usable crosslinguistically
  • Vocabulary, syntax, semantics
  • First-order: quantification over individuals (FOPC)
  • Higher-order: quantification over relations, etc.
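As an illustration of the first-order case, the example criterion used in these slides could be written as the formula below. Reading the node names as existentially quantified variables is an assumption made here for illustration; the slides leave the quantifiers implicit.

```latex
\exists x\, \exists y\; \bigl( \mathrm{adenocarcinoma}(x) \land \mathrm{pancreas}(y) \land \mathrm{of}(x, y) \bigr)
```

A higher-order formula would additionally quantify over a relation symbol, for example $\exists R\; R(x, y)$.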
Process: Logic Output
[Pipeline diagram with boxes: Clinical Trials (input), Syntactic Parser, Cognitive modeling engine, Predicate Calculus, Post-processing (output)]
Input sentence:
A criterion equals adenocarcinoma of the pancreas.
Logic output:
criterion(N2) & adenocarcinoma(N4) & pancreas(N5) & equals(N2,N4) & of(N4,N5).
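The mapping from the engine's node structure to the logic output can be sketched as below. This is not the Soar implementation: the dictionary layout, the choice to treat `^ext`/`^int` as the two arguments of a binary predicate, and the `RELATION_ATTRS` set are all assumptions inferred from the example dump and its output.

```python
# Node attributes transcribed from the working-memory dump; the dict layout is an assumption.
NODES = {
    "N2": {"nuc": "criterion"},
    "N3": {"nuc": "equals", "ext": "N2", "int": "N4"},
    "N4": {"nuc": "adenocarcinoma", "of": "N5"},
    "N5": {"nuc": "pancreas"},
}

# Attributes read as binary relations between two nodes (assumption).
RELATION_ATTRS = {"of"}


def to_predicates(nodes):
    """Render nodes as a conjunction: unary predicates first, then binary ones."""
    unary, binary = [], []
    for name in sorted(nodes):
        attrs = nodes[name]
        if "ext" in attrs and "int" in attrs:
            # A node with external and internal arguments becomes a binary predicate.
            binary.append(f"{attrs['nuc']}({attrs['ext']},{attrs['int']})")
        else:
            unary.append(f"{attrs['nuc']}({name})")
        for rel in sorted(RELATION_ATTRS.intersection(attrs)):
            binary.append(f"{rel}({name},{attrs[rel]})")
    return " & ".join(unary + binary) + "."


logic = to_predicates(NODES)
```

With the nodes above, the sketch reproduces the conjunction shown on the slide.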
Post-processing
• Prolog axioms
  • Remove elements not included in the language of the criterion (the added "criterion" and "equals" terms).
  • Format elements needed in the output (ampersands).

% reduce(Z, Y): Z is the predicate list Y without criterion(C) and equals(C, P).
reduce(Z, Y) :-
    member(Criterm, Y),
    functor(Criterm, criterion, 1),
    arg(1, Criterm, Critvar),
    member(Predterm, Y),
    functor(Predterm, _Xterm, 1),
    arg(1, Predterm, Predvar),
    member(Equalsterm, Y),
    functor(Equalsterm, equals, 2),
    arg(1, Equalsterm, Critvar),
    arg(2, Equalsterm, Predvar),
    delete(Y, Criterm, Z2),
    delete(Z2, Equalsterm, Z).

• Turns the previous statement:
  criterion(N2) & adenocarcinoma(N4) & pancreas(N5) & equals(N2,N4) & of(N4,N5).
• Into:
  adenocarcinoma(N4) & pancreas(N5) & of(N4,N5).
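For readers less familiar with Prolog, the same reduction can be sketched in Python. This is an illustrative equivalent rather than the author's code: predicates are represented as `(name, args)` tuples, and the names `reduce_criterion` and `render` are hypothetical.

```python
def reduce_criterion(preds):
    """Drop criterion(X) and equals(X, Y), keeping the rest of the conjunction."""
    crit_vars = {
        args[0] for name, args in preds if name == "criterion" and len(args) == 1
    }
    drop = set()
    for name, args in preds:
        if name == "equals" and len(args) == 2 and args[0] in crit_vars:
            drop.add((name, args))
            drop.add(("criterion", (args[0],)))
    return [p for p in preds if p not in drop]


def render(preds):
    """Format predicates as an ampersand-joined conjunction ending in a period."""
    return " & ".join(f"{name}({','.join(args)})" for name, args in preds) + "."


PREDS = [
    ("criterion", ("N2",)),
    ("adenocarcinoma", ("N4",)),
    ("pancreas", ("N5",)),
    ("equals", ("N2", "N4")),
    ("of", ("N4", "N5")),
]

reduced = render(reduce_criterion(PREDS))
```

The `criterion` term is removed only when a matching `equals` term is present, which mirrors the Prolog rule's requirement that both terms be found in the list.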
Output

<criteria trial="http://www.clinicaltrials.gov/ct/show/NCT00055250">
  <criterion>
    <text>Eligibility</text>
    <text>Criteria</text>
    <text>Inclusion Criteria:</text>
    <text val="1">Adenocarcinoma of the pancreas</text>
    <pred val="1">pancreas(N5) & adenocarcinoma(N4) & of(N4,N5).</pred>
  </criterion>
  .
  .
</criteria>
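A downstream consumer could pair each criterion's text with its predicate form through the shared `val` attribute. The sketch below is one way to do that with Python's standard library; the function name `pair_text_with_preds` is hypothetical, and the sample assumes the ampersands are written as `&amp;` so that a standard XML parser accepts the file.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the slide; "&" is escaped as "&amp;" (assumption) so the XML parses.
SAMPLE = """<criteria trial="http://www.clinicaltrials.gov/ct/show/NCT00055250">
  <criterion>
    <text>Inclusion Criteria:</text>
    <text val="1">Adenocarcinoma of the pancreas</text>
    <pred val="1">pancreas(N5) &amp; adenocarcinoma(N4) &amp; of(N4,N5).</pred>
  </criterion>
</criteria>"""


def pair_text_with_preds(xml_string):
    """Return (text, predicate) pairs that share the same val attribute."""
    root = ET.fromstring(xml_string)
    pairs = []
    for criterion in root.iter("criterion"):
        texts = {
            t.get("val"): t.text
            for t in criterion.findall("text")
            if t.get("val") is not None
        }
        for pred in criterion.findall("pred"):
            val = pred.get("val")
            if val in texts:
                pairs.append((texts[val], pred.text))
    return pairs


pairs = pair_text_with_preds(SAMPLE)
```

Header lines such as "Inclusion Criteria:" carry no `val` attribute and are therefore skipped.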
Results/Conclusion
• Data can be matched up with patients’ medical records to determine if they meet criteria posted in the clinical trial.
• Disadvantages
  • Grammar is difficult to write
  • Only one parsed output per utterance
• Advantages
  • Fast
  • Robust
  • Implementation in other languages
  • Can be easily integrated with other applications/corpora