Findings indicate that corpus tools can help language learners correct errors, but not all types of errors are corrected effectively.

Corpus linguistics is a relatively recent science. A corpus (from the Latin for “body”) is a collection of texts that is analyzed with software. Researchers search corpora for specific words or example sentences, and the search results help them better understand how the words are used or how the sentences are constructed. Although some people might consider corpora primarily a research tool, research suggests that they might also serve as an editing tool for non-native English speakers.
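To make the idea of searching a corpus concrete, here is a minimal sketch of a keyword-in-context search, the kind of lookup that corpus tools typically offer. It is an illustration only: the `concordance` function, the `width` parameter, and the three-sentence `toy_corpus` are invented for this example and do not come from the study or from any particular corpus tool.

```python
import re

def concordance(corpus, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in corpus:
        for match in pattern.finditer(text):
            # Keep up to `width` characters on each side of the match.
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the keywords line up in a column.
            lines.append(f"{left:>{width}}[{match.group(0)}]{right}")
    return lines

# Invented sentences standing in for a real corpus (assumption for illustration).
toy_corpus = [
    "We present an analysis of learner errors in academic writing.",
    "The analysis shows that context matters for word choice.",
    "Researchers analyze corpora with dedicated software.",
]

hits = concordance(toy_corpus, "analysis")
for line in hits:
    print(line)
```

Reading the matches side by side shows which words tend to surround the keyword, which is the kind of evidence a writer can use to check a doubtful phrase.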


In the study “How Useful Are Corpus Tools for Error Correction? Insights from Learner Data,” researchers Natalia Dolgova and Charles Mueller (2019) gathered data from 175 English-language learning graduate students enrolled in a large US university. The participants received basic training in corpus tools and showed proficiency in using them. Participants were then asked to use the corpus tools to correct errors in their own research papers, which had already been graded. Although errors were marked on the graded papers, the marks did not suggest corrections, so participants had to determine how to correct each error themselves. Researchers gathered 304 corrections from the participants.

“The errors that were easiest to correct were those that appeared verbatim in the corpus and for which a more careful consideration of context was unnecessary.”

—Dolgova and Mueller (2019)

Of the 304 corrections, 166 (55%) were local grammatical errors (e.g., preposition collocation, part of speech choice), 41 (13%) were global grammatical errors (e.g., word order, clausal level chunks), and 97 (32%) were register errors. (Register refers to a subset of language, such as spoken English, informal English, or written English. In this study, participants corrected the register errors by changing word choice appropriate for a more academic register.)

Researchers found that not all the corrections made by the students were accurate. Many of the corrections contained nonstandard forms and failed to address the actual error. In one example, a participant replaced the phrase “present an analyze” with “present an investigate,” assuming that the error involved word choice instead of word inflection; the appropriate correction would have been “present an analysis” (Dolgova and Mueller 2019, 104).

Researchers sorted corrections according to whether or not the correction was native-like (i.e., would be used by a native English speaker). Among corrections of local errors, 97 (58%) were native-like and 69 (42%) were non-native-like. Among corrections of global errors, 31 (76%) were native-like and 10 (24%) were non-native-like. Among corrections of register errors, 84 (87%) were native-like and 13 (13%) were non-native-like. Researchers applied a chi-square test to the data and concluded that the relationship between error type and success was significant: participants made proportionally fewer native-like corrections for local errors and proportionally more native-like corrections for register errors.
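For readers who want to see how such a test works on these counts, the sketch below runs a Pearson chi-square test of independence on the 3 × 2 table of error type by correction outcome. This is an assumption about the form of the test; the summary here does not describe the researchers' exact procedure, and the variable and function names are chosen only for illustration.

```python
import math

# Counts reported above, as (native-like, non-native-like) per error type.
observed = {
    "local": (97, 69),
    "global": (31, 10),
    "register": (84, 13),
}

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count if error type and outcome were unrelated.
            expected = row_totals[r] * col_totals[c] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof

stat, dof = chi_square(observed)

# With exactly 2 degrees of freedom, the chi-square upper-tail
# probability has the closed form exp(-x / 2).
assert dof == 2
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.2f}, df = {dof}, p = {p_value:.2g}")
```

On these counts the statistic comes out at roughly 23.8 with 2 degrees of freedom, and the p-value falls well below 0.001, which is consistent with the conclusion that success varied by error type. The largest contributions come from the local and register rows, matching the pattern described above.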


Through qualitative analysis, the researchers determined that “the errors that were easiest to correct were those that appeared verbatim in the corpus and for which a more careful consideration of context was unnecessary” (Dolgova and Mueller 2019, 105). “Context” here refers to the broader meaning of the corpus excerpts, not just the keyword or keywords alone. Global, local, and register errors all varied in how much consideration of context they required. Although local errors might seem the most intuitive kind for learners to correct with a corpus, participants were weakest in this category (Dolgova and Mueller 2019, 106). This finding is a necessary caution to language teachers and learners: while a growing literature supports the use of corpus tools to assist language learning, corpora cannot substitute for human judgment. Understanding the context of errors and applying corpus tools correctly will best help English language learners correct errors.

To learn more about how corpus tools assist in error correction, read the full article:

Dolgova, Natalia, and Charles Mueller. 2019. “How Useful Are Corpus Tools for Error Correction? Insights from Learner Data.” Journal of English for Academic Purposes 39: 97-108.

—Skyler Garrett, Editing Research


Find more research

Survey results showed that most participants found corpus tools useful for learning vocabulary (Dolgova and Mueller 2019, 105). Learn more about language learners’ vocabulary by reading George Higginbotham and Jacqui Reid’s (2019) article: Higginbotham, George, and Jacqui Reid. “The Lexical Sophistication of Second Language Learners’ Academic Essays.” Journal of English for Academic Purposes 37 (2019): 127-140.

To learn about using corpora in editing, check out Brady Davis’s article: “How to Use Corpora to Edit Technical Articles Effectively and Accurately.”