UIMA SHARP 4 - NLP May 25, 2010
Outline • UIMA Terminology (not just TLAs) • Parts of a UIMA pipeline • Running a pipeline • Viewing annotations • Creating a new annotator
UIMA terminology • CAS, XCAS, JCas, View • Analysis Engine (AE) / Annotator • Aggregate Analysis Engine • XML output: XCAS, XMI • Type System, JCasGen • CAS Visual Debugger (CVD) • CPE (Collection Processing Engine)
UIMA and Eclipse • The UIMA plugin for Eclipse requires EMF (Eclipse Modeling Framework) • The UIMA plugin provides visual editors for descriptors • An “update site” exists for installing the plugin
UIMA Pipeline Flow • Collection Reader • (CAS Initializer - deprecated) • Analysis Engine (AE) / Annotator • CAS Consumer
Pipeline Example (UIMA term: example) • Collection Reader: read files from a dir • Analysis Engine: sentence annotator • Analysis Engine: tokenizer annotator • CAS Consumer: output tokens to a DB
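The flow in this example can be modeled in a few lines of plain Java. This is a conceptual sketch only: none of the types below (`Document`, `Span`, `Annotator`, `PipelineSketch`) are UIMA classes, the regular expressions are naive, and an in-memory list stands in for both the directory of files and the database. It is meant to show how each stage reads what the previous stage left in a shared container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Conceptual model of the slide's pipeline. Hypothetical types, not the UIMA API. */
class PipelineSketch {

    /** A typed region of the text, loosely analogous to an annotation. */
    static final class Span {
        final String type;
        final int begin;
        final int end;
        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Holds the text plus everything the annotators add; stands in for the CAS. */
    static final class Document {
        final String text;
        final List<Span> spans = new ArrayList<>();
        Document(String text) { this.text = text; }
        String covered(Span s) { return text.substring(s.begin, s.end); }
    }

    /** Stands in for an Analysis Engine. */
    interface Annotator {
        void process(Document doc);
    }

    /** Naive sentence splitter: a run of text ending at '.', '!' or '?'. */
    static final class SentenceAnnotator implements Annotator {
        private static final Pattern SENTENCE = Pattern.compile("\\S[^.!?]*[.!?]?");
        public void process(Document doc) {
            Matcher m = SENTENCE.matcher(doc.text);
            while (m.find()) {
                doc.spans.add(new Span("Sentence", m.start(), m.end()));
            }
        }
    }

    /** Tokenizes inside each Sentence span, so it depends on the previous stage. */
    static final class TokenAnnotator implements Annotator {
        private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");
        public void process(Document doc) {
            // Iterate over a copy because new spans are added during the loop.
            for (Span s : new ArrayList<>(doc.spans)) {
                if (!s.type.equals("Sentence")) continue;
                Matcher m = TOKEN.matcher(doc.text).region(s.begin, s.end);
                while (m.find()) {
                    doc.spans.add(new Span("Token", m.start(), m.end()));
                }
            }
        }
    }

    /** Reader -> annotators in order -> consumer (a list instead of a DB). */
    static List<String> run(List<String> inputs) {
        List<Annotator> engines = Arrays.asList(new SentenceAnnotator(), new TokenAnnotator());
        List<String> sink = new ArrayList<>();
        for (String text : inputs) {                 // "collection reader"
            Document doc = new Document(text);
            for (Annotator a : engines) {            // "analysis engines", in order
                a.process(doc);
            }
            for (Span s : doc.spans) {               // "CAS consumer"
                if (s.type.equals("Token")) sink.add(doc.covered(s));
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("Family history of hyperlipidemia.")));
    }
}
```

The order of the engines matters here in the same way it does in an aggregate: the tokenizer finds nothing unless the sentence annotator has already run.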
Options for running UIMA tools • Tools: CPE Configurator, CVD • Options: command line scripts/.bat files, or run within Eclipse
Tying together a UIMA pipeline • Type System: defines the data types passed along • CAS (Common Analysis Structure): container for the data
Tying together a UIMA pipeline • CPE descriptor: selects the parts (Collection Reader, Analysis Engine(s), CAS Consumer) • Aggregate Analysis Engine: multiple Analysis Engines and their order
Options for running a pipeline • CVD GUI: single aggregate Analysis Engine, no Collection Reader • CPE GUI • From your own Java application: parse a CpeDescription, produce a CollectionProcessingEngine from it, and invoke its process() method (see "2.3. Running a CPE from Your Own Java Application" in the UIMA documentation)
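The last option can be sketched as follows. This is a minimal sketch, not code from the talk: it assumes the UIMA core jar is on the classpath, the class name `RunCpe` is hypothetical, and the path to the CPE descriptor is taken from the first command-line argument.

```java
import org.apache.uima.UIMAFramework;
import org.apache.uima.cas.CAS;
import org.apache.uima.collection.CollectionProcessingEngine;
import org.apache.uima.collection.EntityProcessStatus;
import org.apache.uima.collection.StatusCallbackListener;
import org.apache.uima.collection.metadata.CpeDescription;
import org.apache.uima.util.XMLInputSource;

/** Hypothetical launcher: runs the CPE described by the descriptor in args[0]. */
public class RunCpe {
    public static void main(String[] args) throws Exception {
        // Parse the CPE descriptor (reader, analysis engines, consumers).
        CpeDescription cpeDesc = UIMAFramework.getXMLParser()
                .parseCpeDescription(new XMLInputSource(args[0]));

        // Instantiate the engine from the parsed description.
        CollectionProcessingEngine cpe =
                UIMAFramework.produceCollectionProcessingEngine(cpeDesc);

        // Report progress and errors; most callbacks are left empty in this sketch.
        cpe.addStatusCallbackListener(new StatusCallbackListener() {
            public void initializationComplete() { }
            public void batchProcessComplete() { }
            public void paused() { }
            public void resumed() { }
            public void aborted() { System.err.println("CPE aborted"); }
            public void collectionProcessComplete() {
                System.out.println("Collection processing complete");
            }
            public void entityProcessComplete(CAS cas, EntityProcessStatus status) {
                if (status.isException()) {
                    System.err.println("Error in document: " + status.getExceptions());
                }
            }
        });

        // Starts processing on the CPE's own threads and returns immediately.
        cpe.process();
    }
}
```

Because `process()` returns before the collection is finished, the listener is the place to learn that the run has completed or that a document failed.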
Example: Running a pipeline • Running cTAKES within Eclipse using a CPE • Use run configuration UIMA_CPE_GUI--clinical_documents_pipeline • CPE: test1.xml from clinical documents pipeline\desc\collection_processing_engine
Options for viewing annotations • CVD • Annotation viewer • XML viewer • Text editor
Example: Viewing annotations • Viewing annotations using the CVD: • Load the Type System • Load the XCAS or XMI
Example: Running an AE in CVD • Using CVD to run an Analysis Engine: • No Collection Reader • Single Analysis Engine (can be an aggregate) • No CAS Consumer • Just paste or type in the text to process, e.g. "Family history of hyperlipidemia."
Creating a New Annotator • Create Java project • Right click -> Add UIMA Nature • Add UIMA jars to .classpath (Build Path) • Create Analysis Engine (AE) descriptor • Add types to AE descriptor, or optionally create separate Type System descriptor • Write code!
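For the "Write code!" step, a minimal annotator might look like the sketch below. It is an illustration under stated assumptions, not the annotator from the talk: the class name `WordAnnotator` is hypothetical, the UIMA core jar must be on the classpath, and it uses the built-in `Annotation` type so that no custom Type System or JCasGen run is needed. A real annotator would normally create instances of its own JCasGen-generated types.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;

/** Hypothetical example: marks every run of word characters with a plain Annotation. */
public class WordAnnotator extends JCasAnnotator_ImplBase {

    private static final Pattern WORD = Pattern.compile("\\w+");

    @Override
    public void process(JCas jcas) throws AnalysisEngineProcessException {
        String text = jcas.getDocumentText();
        if (text == null) {
            return; // nothing to annotate in this view
        }
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            // Create an annotation over the match and add it to the CAS indexes,
            // which makes it visible to later engines and to tools such as CVD.
            Annotation word = new Annotation(jcas, m.start(), m.end());
            word.addToIndexes();
        }
    }
}
```

The Analysis Engine descriptor created in the earlier step would then name this class as its annotator implementation.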
Example: Creating a PEAR file • Right click -> Add UIMA Nature • Right click -> Generate Pear • Select Analysis Engine descriptor • Select OS and JDK • Modify Properties if needed • Select what to include
Example: Modifying a parameter UIMA’s descriptor editors allow you to modify most parameters without looking at the XML itself.
Links • Getting started with UIMA http://uima.apache.org/doc-uima-annotator.html • UIMA Update site for use in Eclipse http://www.apache.org/dist/incubator/uima/eclipse-update-site/
Email address masanz.james@mayo.edu