GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level. 黄子贤 2019/6/4. Scenario-based question answering. Patient summary in the medical domain. Legal domain. High-school geography exams.
GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level
黄子贤, 2019/6/4
Scenario-based question answering
• Patient summaries in the medical domain
• Legal domain
• High-school geography exams
Data: over 6,000 scenarios and 13,000 questions crawled from Gaokao and mock tests
Deduplication Method
[Figure: Question 2 and Question 3, each with Options A to D, are compared in two steps: 1. Alignment, 2. Matching. The diagram shows rows of 0/1 indicators over the four options and, for each question, fields labeled scenario, question, option_A, and option_D.]
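The two steps named on this slide (alignment, then matching) can be read roughly as follows. This is a minimal sketch under stated assumptions, not the authors' method: the `Item` type, the use of `difflib.SequenceMatcher` as the similarity function, the averaging of field similarities, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import permutations


@dataclass
class Item:
    """Hypothetical container for one crawled question."""
    scenario: str
    question: str
    options: tuple  # four option strings, in the order A, B, C, D


def sim(a, b):
    # Character-level similarity in [0, 1]; an assumed stand-in metric.
    return SequenceMatcher(None, a, b).ratio()


def align_options(x, y):
    """Step 1 (alignment): choose the permutation p that maximizes the
    total similarity between x.options[i] and y.options[p[i]]."""
    indices = range(len(y.options))
    return max(
        permutations(indices),
        key=lambda p: sum(sim(x.options[i], y.options[j]) for i, j in enumerate(p)),
    )


def match_score(x, y):
    """Step 2 (matching): average the similarities of the scenario,
    the question, and the aligned option pairs."""
    p = align_options(x, y)
    parts = [sim(x.scenario, y.scenario), sim(x.question, y.question)]
    parts += [sim(x.options[i], y.options[j]) for i, j in enumerate(p)]
    return sum(parts) / len(parts)


def is_duplicate(x, y, threshold=0.9):
    # The threshold is an assumption, not a value from the slides.
    return match_score(x, y) >= threshold
```

With four options there are only 24 permutations, so the exhaustive search in `align_options` stays cheap; two copies of the same question with shuffled options then score as duplicates regardless of option order.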
Deduplication Experiments
Training: 1,000 labeled pairs; accuracy on the test set: 95.3%
Verification
• Manually checked 100 pairs of scenarios predicted to be duplicates: 100% correct
• Randomly sampled 50 scenarios and manually checked the 10 top-ranked scenarios: 6% wrong
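Annotation
1. Select a category
2. Select a template belonging to that category
3. Fill in the template
4. Fill in free text
• Annotation proceeded iteratively
• 22 categories in the final scheme
• 81 templates in total
• 11% of the images fit no category and were left unannotated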
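The four annotation steps on this slide can be pictured as filling a slot-based template and then appending free text. The sketch below only illustrates that workflow: the `Template` class, the `annotate` helper, and the example category and pattern are hypothetical and are not taken from the 22 categories or 81 templates of the dataset.

```python
from dataclasses import dataclass
from string import Formatter


@dataclass
class Template:
    """Hypothetical template: a category plus a pattern with named slots."""
    category: str
    pattern: str

    def slots(self):
        # Collect the named fields of the pattern, in order of appearance.
        return [name for _, name, _, _ in Formatter().parse(self.pattern) if name]

    def fill(self, **values):
        # Step 3: fill in the template; refuse to proceed if a slot is missing.
        missing = [s for s in self.slots() if s not in values]
        if missing:
            raise ValueError(f"missing slots: {missing}")
        return self.pattern.format(**values)


def annotate(template, values, free_text=""):
    # Step 4: append optional free text after the filled template.
    sentence = template.fill(**values)
    return f"{sentence} {free_text}".strip()


# Steps 1 and 2: pick a category and one of its templates (made-up example).
example = Template("map", "The {feature} lies {direction} of {landmark}.")
```

Keeping the slots explicit makes an incomplete annotation fail early instead of producing a sentence with a hole in it.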
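Annotation Audit
Annotations rated below 3 in any dimension are excluded.
Quality Experiment
• 100 question groups, 195 sub-questions in total
• Answered from annotations only: 132 correct, accuracy 67.7%
• Answered from images: 149 correct, accuracy 76.4%
• Answered incorrectly from annotations but correctly after viewing the image: 18
Dataset Size
1,981 scenarios and 4,112 multiple-choice questions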
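The two accuracy figures on this slide follow directly from the raw counts (132 and 149 correct out of 195 sub-questions), and the audit rule is a simple threshold over rating dimensions. A short sketch of both checks; the dimension names passed to `passes_audit` are hypothetical, since the slide does not list them.

```python
TOTAL_QUESTIONS = 195          # sub-questions in the quality experiment
CORRECT_ANNOTATION_ONLY = 132  # answered correctly from annotations alone
CORRECT_WITH_IMAGE = 149       # answered correctly from the images


def accuracy_pct(correct, total):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)


def passes_audit(ratings, minimum=3):
    """Audit rule from the slide: keep an annotation only if every
    dimension is rated at least `minimum`."""
    return all(score >= minimum for score in ratings.values())


annotation_accuracy = accuracy_pct(CORRECT_ANNOTATION_ONLY, TOTAL_QUESTIONS)
image_accuracy = accuracy_pct(CORRECT_WITH_IMAGE, TOTAL_QUESTIONS)
```

Here `annotation_accuracy` and `image_accuracy` reproduce the slide's 67.7% and 76.4%.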
Benchmark Results
Background knowledge
Training Corpus
• XNLI
• LCQMC
• DuReader