It’s complicated: The relationship between lexis, syntax and proficiency

Discourse and Interaction


This paper explores the relationship between lexical and syntactic complexity measures and proficiency in L2 English argumentative essays written by L1 Czech high school students. Syntactic complexity is generally understood as referring to the “range and sophistication” (Ortega 2015) of grammatical constructions, whereas lexical complexity can refer to the range and frequency of the words used. The research used 100 essays written by final-year high school students. Lexical complexity was analysed using the Lexical Complexity Analyzer (Ai & Lu 2010, Lu 2012), and syntactic complexity using the L2 Syntactic Complexity Analyzer (Lu 2010, 2014) and the framework of hypothesised developmental stages for complexity proposed by Biber et al. (2011). Although a large number of measures failed to produce any significant patterns, positive correlations were found between lexical diversity measures and vocabulary scores. Similarly, Mean Length of Clause (MLC) and Complex Nominals per Clause (CN/C) showed weak positive associations with grammar scores, as did Stage 5 of the developmental stages. The findings offer insight into the kinds of complexity features that can be given more focus during instruction and underscore the potential of these measures as determinants of proficiency.

syntactic complexity, lexical complexity, writing proficiency
Author biography

Christopher Williams

Christopher Williams is a doctoral student and language centre instructor at the Faculty of Education, Masaryk University, Brno. His dissertation focuses on the role of syntactic and lexical complexity in the argumentative essays of high school students.


Beers, S. and Nagy, W. (2009) ‘Syntactic complexity as a predictor of adolescent writing quality: Which measures? Which genre?’ Reading and Writing 22(2), 185-200.
Bi, P. and Jiang, J. (2020) ‘Syntactic complexity in assessing young adolescent EFL learners’ writings: Syntactic elaboration and diversity.’ System 91, 102248.
Biber, D., Gray, B. and Poonpon, K. (2011) ‘Should we use characteristics of conversation to measure grammatical complexity in L2 writing development?’ TESOL Quarterly 45(1), 5-35.
Biber, D., Gray, B. and Staples, S. (2016) ‘Predicting patterns of grammatical complexity across language exam task types and proficiency levels.’ Applied Linguistics 37(5), 639-668.
Biber, D., Gray, B., Staples, S. and Egbert, J. (2020) ‘Investigating grammatical complexity in L2 English writing research: Linguistic description versus predictive measurement.’ Journal of English for Academic Purposes 46, 100869.
Bulté, B. and Housen, A. (2014) ‘Conceptualizing and measuring short-term changes in L2 writing complexity.’ Journal of Second Language Writing 26, 42-65.
Carroll, J. B. (1964) Language and Thought. Englewood Cliffs: Prentice-Hall.
Casal, E. J. and Lee, J. J. (2019) ‘Syntactic complexity and writing quality in assessed first-year L2 writing.’ Journal of Second Language Writing 44, 51-62.
Chen, H., Xu, J. and He, B. (2014) ‘Automated essay scoring by capturing relative writing quality.’ The Computer Journal 57(9), 1318-1330.
Crossley, S. A., Salsbury, T., McNamara, D. S. and Jarvis, S. (2011) ‘Predicting lexical proficiency in language learner texts using computational indices.’ Language Testing 28(4), 561-580.
Crossley, S. A., Cai, Z. and McNamara, D. (2012) ‘Syntagmatic, paradigmatic, and automatic n-gram approaches to assessing essay quality.’ In: McCarthy, P. M. and Youngblood, G. M. (eds) Proceedings of the 25th International Florida Artificial Intelligence Research Society (FLAIRS) Conference. AAAI Press. 214-219.
Crossley, S. A. and McNamara, D. S. (2014) ‘Does writing development equal writing quality? A computational investigation of syntactic complexity in L2 learners.’ Journal of Second Language Writing 28, 66-79.
Crossley, S. A. (2020) ‘Linguistic features in writing quality and development: An overview.’ Journal of Writing Research 11(3), 415-443.
Daller, H., Van Hout, R. and Treffers-Daller, J. (2003) ‘Lexical richness in the spontaneous speech of bilinguals.’ Applied Linguistics 24(2), 197-222.
Davies, M. (2008) The Corpus of Contemporary American English: 425 million words, 1990–present.
Grant, L. and Ginther, A. (2000) ‘Using computer-tagged linguistic features to describe L2 writing differences.’ Journal of Second Language Writing 9(2), 123-145.
Guiraud, P. (1960) Problèmes et Méthodes de la Statistique Linguistique. Paris: Presses universitaires de France.
Housen, A., Kuiken, F. and Vedder, I. (2012) ‘Complexity, accuracy and fluency: Definitions, measurement and research.’ In: Housen, A., Kuiken, F. and Vedder, I. (eds) Dimensions of L2 Performance and Proficiency: Complexity, Accuracy and Fluency in SLA Language Learning & Language Teaching. Amsterdam and Philadelphia: John Benjamins. 1-20.
Jarvis, S., Grant, L., Bikowski, D. and Ferris, D. (2003) ‘Exploring multiple profiles of highly rated learner compositions.’ Journal of Second Language Writing 12, 377-403.
Johansson, V. (2008) ‘Lexical diversity and lexical density in speech and writing: A developmental perspective.’ Working Papers 53, 61-79.
Johnson, M. D., Acevedo, A. and Mercado, L. (2013) ‘What vocabulary should we teach?: Lexical frequency profiles and lexical diversity in second language writing.’ Writing and Pedagogy 5(1), 83-103.
Johnson, M. D., Acevedo, A. and Mercado, L. (2016) ‘Vocabulary knowledge and vocabulary use in second language writing.’ TESOL Journal 7(3), 700-715.
Kim, J. (2014) ‘Predicting L2 writing proficiency using linguistic complexity measures: A corpus-based study.’ English Teaching 69(4), 27-51.
Klein, D. and Manning, C. D. (2003) ‘Fast exact inference with a factored model for natural language parsing.’ In: Becker, S., Thrun, S. and Obermayer, K. (eds) Advances in Neural Information Processing Systems 15. Cambridge: MIT Press. 3-10.
Kyle, K. (2016) Measuring Syntactic Development in L2 Writing: Fine Grained Indices of Syntactic Complexity and Usage-based Indices of Syntactic Sophistication. (Doctoral dissertation). Online document. Retrieved on 23 November 2023 <>.
Kyle, K. and Crossley, S. A. (2018) ‘Measuring syntactic complexity in L2 writing using fine-grained clausal and phrasal indices.’ The Modern Language Journal 102(2), 333-349.
Khushik, G. A. and Huhta, A. (2019) ‘Investigating syntactic complexity in EFL learners’ writing across Common European Framework of Reference levels A1, A2, and B1.’ Applied Linguistics 41(4), 506-532.
Lee, C., Ge, H. and Chung, E. (2021) ‘What linguistic features distinguish and predict L2 writing quality? A study of examination scripts written by adolescent Chinese learners of English in Hong Kong.’ System 97, 102461.
Lu, X. (2010) ‘Automatic analysis of syntactic complexity in second language writing.’ International Journal of Corpus Linguistics 15, 474-496.
Lu, X. (2011) ‘A corpus‐based evaluation of syntactic complexity measures as indices of college‐level ESL writers’ language development.’ TESOL Quarterly 45(1), 36-62.
Lu, X. (2012) ‘The relationship of lexical richness to the quality of ESL learners’ oral narratives.’ The Modern Language Journal 96(2), 190-208.
Lu, X. and Ai, H. (2015) ‘Syntactic complexity in college-level English writing: Differences among writers with diverse L1 backgrounds.’ Journal of Second Language Writing 29, 16-27.
Lu, X. (2017) ‘Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment.’ Language Testing 34(4), 493-511.
Mazgutova, D. and Kormos, J. (2015) ‘Syntactic and lexical development in an intensive English for Academic Purposes programme.’ Journal of Second Language Writing 29, 3-15.
McNamara, D. S., Crossley, S. A. and Roscoe, R. (2013) ‘Natural language processing in an intelligent writing strategy tutoring system.’ Behavior Research Methods 45, 499-515.
Nation, I. (2006) ‘How large a vocabulary is needed for reading and listening?’ Canadian Modern Language Review 63(1), 59-82.
Ortega, L. (2015) ‘Syntactic complexity in L2 writing: Progress and expansion.’ Journal of Second Language Writing 29, 82-94.
Paquot, M. (2018) ‘Phraseological competence: A missing component in university entrance language tests? Insights from a study of EFL learners’ use of statistical collocations.’ Language Assessment Quarterly 15(1), 29-43.
Qin, W. and Uccelli, P. (2016) ‘Same language, different functions: A cross-genre analysis of Chinese EFL learners’ writing performance.’ Journal of Second Language Writing 33, 3-17.
Shadloo, F., Ahmadi, H. S. and Ghonsooly, B. (2019) ‘Exploring syntactic complexity and its relationship with writing quality in EFL argumentative essays.’ Topics in Linguistics 20(1), 68-81.
Torruella, J. and Capsada, R. (2013) ‘Lexical statistics and typological structures: A measure of lexical richness.’ Procedia – Social and Behavioral Sciences 95, 447-454.
Vermeer, A. (2000) ‘Coming to grips with lexical richness in spontaneous speech data.’ Language Testing 17(1), 65-83.
Wolfe-Quintero, K., Inagaki, S. and Kim, H. Y. (1998) Second Language Development in Writing: Measures of Fluency, Accuracy, and Complexity. Honolulu, HI: University of Hawaii, Second Language Teaching & Curriculum Center.
Yang, W., Lu, X. and Weigle, S. (2015) ‘Different topics, different discourse: Relationships among writing topic, measures of syntactic complexity, and judgments of writing quality.’ Journal of Second Language Writing 28, 53-67.
Yoon, H. J. and Polio, C. (2016) ‘The linguistic development of students of English as a second language in two written genres.’ TESOL Quarterly 51(2), 275-301.
Yoon, H. (2017) ‘Linguistic complexity in L2 writing revisited: Issues of topic, proficiency, and construct multidimensionality.’ System 66, 130-141.


