Linguistic Complexity as an Indicator of Writing Quality
This work explores correlations between measures of linguistic complexity and writing quality. In particular, measures of lexical and syntactic complexity are compared with the grades awarded to 1,782 persuasive essays written by eighth-grade students. The essays were partitioned into three groups according to grade in order to identify salient linguistic features of low-, medium-, and high-proficiency writers. The paper provides detailed definitions of twenty-one measures of linguistic complexity and presents a statistical analysis of their predictive power as indicators of writing quality. A number of insights were derived from this analysis. For example, sentence-based complexity measures were found to lack robustness on low-proficiency essays because they rely on the correct use of punctuation. The paper concludes that both lexical and syntactic complexity can serve as powerful indicators of writing quality, with one measure of lexical complexity accounting for up to twenty-three percent of the variation in essay grade. In terms of application, automatically calculable correlates of writing quality can be used to create personalised pedagogical feedback, devise writing strategies, and provide instantaneous automated essay grades.
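The following is a minimal illustrative sketch, not the paper's implementation. It shows one common lexical measure (type-token ratio), one sentence-based measure (mean sentence length), and the share of variance explained by a single predictor (the squared Pearson correlation). The function names, the tokenisation rules, and the choice of these two measures are assumptions made for illustration; the abstract does not name the twenty-one measures it defines.

```python
import re

def type_token_ratio(text):
    """Distinct word forms divided by total words (an assumed lexical measure)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_sentence_length(text):
    """Average words per sentence, splitting on ., ! and ? (an assumed rule)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)

def r_squared(xs, ys):
    """Squared Pearson correlation: share of variance in ys explained by xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)

# The same words with and without punctuation give very different
# sentence-based values, while the lexical measure is unchanged.
punctuated = "I like dogs. They are fun. We play outside."
unpunctuated = "I like dogs they are fun we play outside"
print(mean_sentence_length(punctuated), mean_sentence_length(unpunctuated))
print(type_token_ratio(punctuated), type_token_ratio(unpunctuated))
```

In this toy case, dropping the full stops triples the measured sentence length even though the wording is identical, which is the kind of sensitivity to punctuation the abstract describes. Under this reading, an `r_squared` value of 0.23 between a measure and essay grade would correspond to the "twenty-three percent" figure.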