Improving ACE Performance

Edward Loper, Seth Kulick

The ACE Task • Example: "John met his son at the beach." (Figure: "John" is labeled Person (NE), "his son" Person (Nom), and "the beach" Location (NE); the relations marked are Soc and At.) • Detect and classify entities • Person, Geo-Political Entity, Organization, Facility, Location • Entity Types: • Named Entities: Cisco, George Washington • Nominals: a large crowd, a quaint library • Pronouns: he, it • Detect relations between entities • At, Role, Near, Social, Part

The U. Penn ACE System • A rapidly developed IE system • Built using TIDES-PennTools • Pipelined Architecture • Easy to construct from existing components • Easy to plug in new components • Statistical Components • Require less hand-tuning • Easy to improve with new training data

(Figure: system pipeline) Input File → Tokenizing/Preprocessing → NE Tagging → Parsing → Nominal Tagging → Relation Extraction → Coreference → Output File
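A pipelined design of this kind can be sketched as a list of stages that each take a document and return an enriched copy. This is only an illustration: the stage functions `tokenize` and `tag_capitalized` and the dictionary-based document are hypothetical stand-ins, not the TIDES-PennTools interfaces.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]
Stage = Callable[[Document], Document]

def tokenize(doc: Document) -> Document:
    """Toy tokenizer: split on whitespace and separate sentence-final periods."""
    out = dict(doc)
    out["tokens"] = str(doc["text"]).replace(".", " .").split()
    return out

def tag_capitalized(doc: Document) -> Document:
    """Toy stand-in for an NE tagger: treat capitalized tokens as names."""
    out = dict(doc)
    out["entities"] = [t for t in doc["tokens"] if t[0].isupper()]
    return out

def run_pipeline(stages: List[Stage], doc: Document) -> Document:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        doc = stage(doc)
    return doc

result = run_pipeline(
    [tokenize, tag_capitalized],
    {"text": "John met his son at the beach."},
)
print(result["entities"])
```

Because every stage shares one signature, swapping in a new component amounts to replacing one entry in the list.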

Improving the ACE System • Improve Pipeline Components • Add new features to existing models • Replace Pipeline Components • New machine learning techniques • Generate New Training Data • Active learning (WordFreak) • Improve the Architecture • Wide Pipeline architecture
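One common way to choose sentences for annotation in active learning is uncertainty sampling: label the examples on which the current model is least sure. The sketch below assumes that criterion and uses made-up probabilities; the slides do not say which selection strategy is used with WordFreak, and `select_for_annotation` is a hypothetical helper.

```python
import math
from typing import List, Sequence, Tuple

def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(
    predictions: List[Tuple[str, Sequence[float]]], n: int
) -> List[str]:
    """Return the n items whose predicted distributions are most uncertain."""
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [item for item, _ in ranked[:n]]

# Hypothetical model outputs: (sentence id, class probabilities).
predictions = [
    ("sent-a", [0.9, 0.1]),
    ("sent-b", [0.5, 0.5]),
    ("sent-c", [0.7, 0.3]),
]
to_label = select_for_annotation(predictions, 2)
print(to_label)
```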

Improving Components • Use more informative features • Use features based on richer annotation • PropBank roles • Use PropBank roles as features to improve relation detection. • SuperTAGs • Use supertags instead of part-of-speech tags to improve the detection and classification of named entities and nominals.
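As a rough illustration of the PropBank idea, the sketch below builds a feature dictionary for a pair of mentions and adds role-based features when both mentions fill a role of the same predicate. The function `relation_features`, the feature names, and the role assignments for the example sentence are all assumptions made for illustration, not the system's actual feature set.

```python
from typing import Dict, Optional

def relation_features(
    m1: Dict[str, str],
    m2: Dict[str, str],
    roles: Dict[str, str],
    predicate: Optional[str],
) -> Dict[str, float]:
    """Binary features for a candidate relation between two mentions."""
    # Baseline feature: the pair of entity types.
    feats = {f"types={m1['type']}|{m2['type']}": 1.0}
    role1, role2 = roles.get(m1["text"]), roles.get(m2["text"])
    # Role features fire only when both mentions are arguments of the predicate.
    if predicate and role1 and role2:
        feats[f"roles={role1}|{role2}"] = 1.0
        feats[f"pred={predicate}"] = 1.0
    return feats

# Hypothetical role labels for "John met his son at the beach."
john = {"text": "John", "type": "Person"}
son = {"text": "his son", "type": "Person"}
roles = {"John": "ARG0", "his son": "ARG1", "the beach": "ARGM-LOC"}
features = relation_features(john, son, roles, "met")
print(sorted(features))
```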

Improving the Architecture • Disadvantages of a simple pipelined architecture: • Interaction between stages is limited • If one stage produces incorrect output, later stages can’t recover. • Wide Pipeline architecture: each component generates multiple weighted outputs. • Increased interaction between stages • Later stages can re-rank the earlier outputs. • We have built a prototype wide pipeline system • NE Classification only
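The wide pipeline idea can be sketched as follows: each stage returns several weighted analyses, and the scores are multiplied along the way so that a later stage can overturn an earlier stage's first choice. The stage functions, weights, and beam size are invented for illustration and do not describe the prototype itself.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[float, Dict[str, str]]
WideStage = Callable[[Dict[str, str]], List[Hypothesis]]

def widen(hyps: List[Hypothesis], stage: WideStage, beam: int = 3) -> List[Hypothesis]:
    """Expand every hypothesis with the stage's weighted outputs, keep the best few."""
    expanded = []
    for score, analysis in hyps:
        for weight, new_analysis in stage(analysis):
            expanded.append((score * weight, new_analysis))
    expanded.sort(key=lambda h: h[0], reverse=True)
    return expanded[:beam]

def ne_stage(analysis: Dict[str, str]) -> List[Hypothesis]:
    # Hypothetical NE classifier: slightly prefers Person for this mention.
    return [(0.6, {**analysis, "ne": "Person"}), (0.4, {**analysis, "ne": "GPE"})]

def relation_stage(analysis: Dict[str, str]) -> List[Hypothesis]:
    # Hypothetical relation model: an At relation is far more plausible for a GPE.
    weight = 0.9 if analysis["ne"] == "GPE" else 0.2
    return [(weight, {**analysis, "relation": "At"})]

hyps: List[Hypothesis] = [(1.0, {"mention": "Washington"})]
for stage in (ne_stage, relation_stage):
    hyps = widen(hyps, stage)
best_score, best = hyps[0]
print(best["ne"], round(best_score, 2))
```

A simple pipeline would have committed to Person after the first stage; here the relation stage's evidence re-ranks the two NE hypotheses and GPE comes out on top.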

Replacing Components • Using improved ML algorithms, can we get better results with less training data? • Ryan McDonald implemented an NE tagger using Conditional Random Fields (CRF). • Outperforms our system’s Maxent NE tagger. • Experiment: Integrating the CRF tagger • Replace the Maxent NE tagger with a CRF tagger. • Exclude BBN training data (about 1/3 of the data) • Evaluate the changes in overall system performance
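A linear-chain CRF scores whole tag sequences rather than choosing each tag in isolation, and the best sequence is usually found with Viterbi decoding. The sketch below shows that decoding step only, with hand-picked scores; it is not the CRF tagger used in the experiment, and the tag set and score tables are assumptions.

```python
from typing import Dict, List, Tuple

def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under additive scores."""
    best = [{t: emissions[0][t] for t in tags}]
    back: List[Dict[str, str]] = []
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            # Best previous tag for reaching `cur` at position i.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[i][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

tags = ["O", "PER"]
# Made-up scores for the two tokens of "George Washington".
emissions = [{"O": 0.0, "PER": 2.0}, {"O": 1.0, "PER": 0.5}]
transitions = {("O", "O"): 0.5, ("O", "PER"): 0.0, ("PER", "O"): 0.0, ("PER", "PER"): 1.5}
decoded = viterbi(emissions, transitions, tags)
print(decoded)
```

Taken alone, the second token's scores favor O, but the PER-to-PER transition score makes PER PER the best sequence overall.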

Integrating CRF: Results • The CRF tagger significantly improves NE detection, giving a higher entity score. • Better NE detection allows the system to find more relations, giving a higher relation score. (Figure: two charts, Entity Scores and Relation Scores, each comparing Maxent, Maxent + BBN, and CRF.)

Conclusions • The architecture of the ACE System allows for: • Rapid improvement • Concurrent development • We are working to improve the system… • By improving the existing components. • By adding more sophisticated components. • By improving our training data with active learning. • By improving the basic system architecture.
