Longitudinal studies of language errors based on a German-language learner corpus
DOI: 10.23951/2307-6127-2024-1-54-64
In the age of digitalization and the active spread of corpus technologies in linguistic education, specialists in linguodidactics are constantly discovering new opportunities for working with big data. One relatively new phenomenon in Russian education is the collection of corpora of student texts written in a foreign language. The potential of such a corpus for linguodidactic research depends primarily on the duration of data collection and on the annotation the corpus contains. The article focuses on PACT (Petrozavodsk Annotated Corpus of Texts), a corpus of German-language student texts, and on a longitudinal study of the types of linguistic errors students make over five years of studying German. The result of the research is statistics for 90 classes of errors, divided into 7 major groups (grammar, vocabulary, orthography, punctuation, discourse, omissions and superfluous elements), together with the dynamics of these statistics over the 5 years of German language study. A comparison of the most frequent errors made by 1st-year and 5th-year students shows that the areas causing the most problems throughout all years of study are lexeme selection, orthography, omissions in the text, punctuation and inverted word order. By the end of the course, problems with indefinite articles, adjective and noun declension, plural formation and noun gender give way to other issues, such as superfluous elements in the text, logic, word order in subordinate clauses and stylistic errors.
Keywords: learner corpus, German as a foreign language, language errors, educational data mining
Issue: 1, 2024
Series of issue: Issue 1
Rubric: LINGUISTIC EDUCATION
Pages: 54–64