Automatic Readability Evaluation Using a Neural Network Vivaek Shivakumar October 29, 2009
Background and Purpose
• Readability: how difficult a text is to read and comprehend; used in educational settings and for grade-level reading evaluation
• Traditional readability formulas
  • Invented in the 20th century, before the computer age
  • Use primitive surface linguistic features
  • Still widely used, even in computer applications
  • e.g., Flesch-Kincaid Grade Level = 0.39 (words per sentence) + 11.8 (syllables per word) - 15.59
  • Dale-Chall Raw Score = 0.1579 %DW + 0.0496 AvSL + 3.6365 (%DW: percentage of difficult words; AvSL: average sentence length in words)
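As a minimal sketch, the two traditional formulas can be computed from raw counts as shown below. The Flesch-Kincaid coefficients are the standard published ones (an assumption here, since the slide's own formula did not survive); the Dale-Chall expression is taken exactly as written on the slide. The function and parameter names are illustrative only.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid Grade Level from raw counts (standard published coefficients)."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def dale_chall_raw(percent_difficult_words, avg_sentence_length):
    """Dale-Chall raw score, using the expression as it appears on the slide."""
    return 0.1579 * percent_difficult_words + 0.0496 * avg_sentence_length + 3.6365


# Made-up counts, for illustration only: 100 words, 5 sentences, 150 syllables.
fk = flesch_kincaid_grade(total_words=100, total_sentences=5, total_syllables=150)
dc = dale_chall_raw(percent_difficult_words=10.0, avg_sentence_length=20.0)
print(round(fk, 2), round(dc, 4))
```

Both formulas depend only on counts of words, sentences, syllables, and listed "difficult" words, which is why the slide describes their inputs as surface features.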
Background and Purpose
• A realistic measure of readability takes into account
  • Surface features (e.g., syllables per word, average sentence length)
  • Syntactic features (sentence structure, e.g., number of subordinate clauses)
    • Parse tree size (e.g., Feng, 2009)
  • Semantic features (meanings, e.g., lexical density)
  • *Pragmatics (context); outside the scope of this project
Background and Purpose
• Goal: create a model that scores the readability of a text more accurately than the traditional formulas, using more sophisticated techniques
• Machine learning, e.g., neural networks, can build such a model with textual features as inputs
• Supervised: passages from state grade-level standards assessment tests serve as the training set
Development
• Neural network
  • Still to be implemented
  • Will be supervised; training set: reading passages from state and national grade-level assessments
  • The known grade levels “teach” the model, making it more accurate
  • The neural network readability model should reflect the relationships among the different inputs used
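Since the network is still to be implemented, the following is only a rough sketch of what supervised training could look like: a one-hidden-layer regression network trained with plain stochastic gradient descent, in pure Python. The architecture, hyperparameters, feature scaling, and the four training examples are all assumptions made up for illustration; none of them come from the project.

```python
import math
import random


def train_mlp(samples, targets, hidden=4, epochs=1000, lr=0.05, seed=0):
    """Train a tiny one-hidden-layer network (tanh units, linear output) with SGD.

    Returns (predict, history): a prediction function and the mean squared
    error measured after each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hi for w, hi in zip(w2, h)) + b2

    history = []
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            h, y = forward(x)
            err = y - t  # gradient of 0.5 * (y - t)^2 with respect to y
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        mse = sum((forward(x)[1] - t) ** 2
                  for x, t in zip(samples, targets)) / len(samples)
        history.append(mse)

    def predict(x):
        return forward(x)[1]

    return predict, history


# Made-up examples: [syllables per word, words per sentence / 20] -> grade / 12.
samples = [[1.2, 0.4], [1.3, 0.6], [1.5, 0.8], [1.7, 1.1]]
targets = [2 / 12, 5 / 12, 8 / 12, 11 / 12]
predict, history = train_mlp(samples, targets)
print(history[0], history[-1])
```

Scaling both the inputs and the grade-level targets to roughly the 0 to 1 range is a design choice that keeps plain SGD stable; a predicted value is multiplied by 12 to read it as a grade level.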
Development
• Possible criteria/features to be used as inputs
  • Average word length in syllables
  • Average sentence length in words
  • Average size of each sentence's parse/dependency tree
  • Lexical density (index based on the frequency of words in the text compared with their frequency in English in general)
  • Common/uncommon words
  • Other syntactic features, such as the presence of certain dependency types
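Two of the listed inputs, average sentence length and the share of uncommon words, can be sketched as below. The sentence splitter and tokenizer are deliberately naive, and `COMMON_WORDS` is a tiny hypothetical stand-in for a real common-word list; both are assumptions for illustration, not the project's implementation.

```python
import re

# Hypothetical stand-in for a real list of common English words.
COMMON_WORDS = frozenset({"the", "a", "cat", "sat", "on", "mat", "it", "was", "is"})


def surface_features(text, common_words=COMMON_WORDS):
    """Return average sentence length (in words) and percentage of uncommon words."""
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    uncommon = [w for w in words if w not in common_words]
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "percent_uncommon": 100.0 * len(uncommon) / len(words),
    }


features = surface_features("The cat sat on the mat. It was ostentatious.")
print(features)
```

A dictionary of named features like this could later be flattened into the input vector of the neural network.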
Development
• Surface feature statistics (e.g., word/sentence lengths) and percentage of uncommon words*
  • Trivial to implement
  • *Not finished
• Parse/dependency trees
  • Using the Stanford Parser (or another parser, if faster)
  • The parser's output is analyzed from an easy-to-read format
• Neural network
  • Not trivial to implement; the bulk of development
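The slide does not say which output format is analyzed. Assuming it is a Penn Treebank-style bracketed parse (a format the Stanford Parser can print), a tree's size and depth could be measured with a sketch like the following. The function name and the definition of "size" as constituents plus words are assumptions.

```python
import re


def tree_stats(bracketed):
    """Count nodes, leaves, and depth of a bracketed parse such as "(S (NP ...))"."""
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed)
    constituents = leaves = depth = max_depth = 0
    prev = None
    for tok in tokens:
        if tok == "(":
            constituents += 1  # every "(" opens one labeled constituent
            depth += 1
            max_depth = max(max_depth, depth)
        elif tok == ")":
            depth -= 1
        elif prev != "(":
            leaves += 1  # a token not directly after "(" is a word, not a label
        prev = tok
    return {"nodes": constituents + leaves, "leaves": leaves, "depth": max_depth}


stats = tree_stats("(ROOT (S (NP (DT The) (NN cat)) (VP (VBD sat))))")
print(stats)
```

Averaging `nodes` over all sentences of a text would give one candidate "average parse tree size" feature.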
Development
• Example of a problem in working with natural language: irregular syllable boundaries
• Implementation used to count syllables:
  • Each group of consecutive vowels (a, e, i, o, u) counts as one syllable, with the following exceptions:
    • Final -ES, -ED, and -E are not counted as syllables (except -LE, which is).
    • The letter “y” is a vowel unless it starts a word or follows another vowel.
    • Any word of three letters or fewer counts as one syllable.
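The rules above can be sketched in Python as follows. This is one reading of the slide, not the author's code: the rules are applied literally (so the -LE exception does not extend to -LES), and the floor of one syllable for words with no vowel group is an added assumption.

```python
def count_syllables(word):
    """Heuristic syllable count following the vowel-group rules on the slide."""
    word = word.lower()
    if len(word) <= 3:
        return 1  # any word of three letters or fewer counts as one syllable

    # Mark each letter as vowel or not; "y" is a vowel unless it starts
    # the word or follows another vowel.
    is_vowel = []
    for i, ch in enumerate(word):
        if ch in "aeiou":
            is_vowel.append(True)
        elif ch == "y":
            is_vowel.append(i > 0 and not is_vowel[i - 1])
        else:
            is_vowel.append(False)

    # Final -ES, -ED, and -E are not counted, except for final -LE.
    if word.endswith(("es", "ed")):
        is_vowel[-2] = False
    elif word.endswith("e") and not word.endswith("le"):
        is_vowel[-1] = False

    # Each run of consecutive vowels counts as one syllable.
    count = 0
    prev = False
    for v in is_vowel:
        if v and not prev:
            count += 1
        prev = v
    return max(1, count)  # assumption: never report zero syllables


for w in ["table", "make", "jumped", "readability", "yellow", "boxes"]:
    print(w, count_syllables(w))
```

The word "boxes" shows the kind of irregularity the slide mentions: the final -ES rule gives one syllable, although the word is pronounced with two.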
Preliminary Testing
• Evaluate three readability formulas against “actual” grade levels; do the same for dependency/parse tree sizes
• Investigate whether there is a relationship and, if so, how strong it is
• Texts used: the same 92 texts, at various grade levels, as in the neural network training set
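The slides do not name the statistic used to measure the strength of the relationship. One common choice, assumed here, is the Pearson correlation coefficient between a formula's scores and the actual grade levels. The numbers below are invented for illustration and are not the project's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented example: actual grade levels vs. scores from some formula.
actual_grades = [3, 5, 6, 8, 10]
formula_scores = [4.1, 4.8, 7.2, 7.9, 11.3]
print(pearson_r(actual_grades, formula_scores))
```

A value near 1 would indicate a strong linear association, while a value near 0 would indicate little or none.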
Analysis of Preliminary Results
• Dependency tree and parse tree sizes are very closely linearly associated
  • It makes sense to use only one of the two in the neural network
• All three readability formulas show some association with grade level: surface features are useful, but not sufficient alone
• None is consistent: deviation is high, so all are unreliable
Expected Results
• Ideally, the neural network learns to evaluate the U.S. grade level of a given text with significantly greater accuracy and precision than the existing formulas