CORPUS LINGUISTICS: THEORY, METHODOLOGY, AND APPLICATION IN THE ANALYSIS OF NUSANTARA FOLKTALES (Case Study: The Legend of Malin Kundang from West Sumatra)
Keywords: corpus linguistics, Nusantara folktales, computational analysis, Legend of Malin Kundang, cultural preservation
Abstract
Corpus linguistics offers an empirical framework for analyzing language through large-scale digital
text collections, enabling both quantitative and qualitative exploration of linguistic patterns. This study
investigates how corpus linguistics theory and methodology can be applied to the analysis of Nusantara
folktales, with the Legend of Malin Kundang serving as a representative case study. Through a
systematic review of corpus linguistics scholarship and a descriptive corpus analysis using AntConc
software, this research integrates theoretical and methodological insights to examine the narrative,
lexical, and semantic dimensions of the text. The findings reveal that corpus analysis not only uncovers
patterns of lexical frequency, collocation, and semantic association but also highlights how linguistic
choices encode moral and cultural values central to Nusantara oral traditions. Dominant keywords
such as ibu, Malin, kapal, batu, and kutuk form a recurrent semantic network that reflects the thematic
interplay between filial disobedience, divine retribution, and social morality. These results
demonstrate the potential of corpus linguistics to enhance the interpretation of folklore narratives and
contribute to the development of a replicable methodological model for computational preservation
and cultural documentation of Indonesia’s oral heritage.
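
As a rough illustration of the two quantitative measures named above, lexical frequency and collocation, the following minimal Python sketch counts word frequencies and the words that co-occur with a node word inside a fixed window. It is not the AntConc procedure used in this study. The sample sentences are invented for demonstration and are not drawn from the study's corpus, and the function names (tokenize, collocates) and the window size are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def collocates(tokens, node, span=4):
    """Count words occurring within `span` tokens left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Left context plus right context, excluding the node itself.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Invented toy sentences for illustration only, NOT the study's corpus.
sample = (
    "Malin meninggalkan ibu dan naik kapal. "
    "Ibu menunggu Malin di pantai. "
    "Kutuk ibu mengubah Malin menjadi batu."
)

tokens = tokenize(sample)
freq = Counter(tokens)                      # raw lexical frequency
ibu_collocates = collocates(tokens, "ibu", span=2)

print(freq.most_common(5))
print(ibu_collocates.most_common(5))
```

Raw co-occurrence counts like these are only a starting point. Corpus tools typically also report an association statistic, which this sketch omits.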
References
Adamou, E. (2019). Corpus linguistic methods. In J. Darquennes, J. Salmons, & W. Vandenbussche
(Eds.), Language contact. Boston & Berlin: Mouton de Gruyter.
Anthony, L. (2005). AntConc: Design and development of a freeware corpus analysis toolkit for the
technical writing classroom. 2005 IEEE International Professional Communication Conference
Proceedings.
Biber, D., & Conrad, S. (2019). Register, genre, and style. Cambridge: Cambridge University Press.
Biber, D., & Reppen, R. (Eds.). (2015). The Cambridge handbook of English corpus linguistics.
Cambridge: Cambridge University Press.
Hoey, M. (2005). Lexical priming: A new theory of words and language. London: Routledge.
Kilgarriff, A. (2001). Comparing corpora. International Journal of Corpus Linguistics, 6(1), 97-133.
McEnery, T., & Hardie, A. (2012). Corpus linguistics: Method, theory and practice. Cambridge:
Cambridge University Press.
McEnery, T., & Wilson, A. (2001). Corpus linguistics: An introduction (2nd ed.). Edinburgh:
Edinburgh University Press.
Muiser, I., Theune, M., et al. (2012). Cleaning up and standardizing a folktale corpus for humanities
research. ARCH 2012 Conference Proceedings.
Rühlemann, C. (2013). Narrative in English conversation: A corpus analysis of storytelling.
Cambridge: Cambridge University Press.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Sinclair, J. (2004). Trust the text: Language, corpus and discourse. London: Routledge.
Stubbs, M. (2001). Words and phrases: Corpus studies of lexical semantics. Oxford: Blackwell.
Tangherlini, T. R. (2016). Big folklore: A special issue on computational folkloristics. Journal of
American Folklore, 129(511), 5-13.
Tognini-Bonelli, E. (2001). Corpus linguistics at work. Amsterdam: John Benjamins.
Wallis, S., & Nelson, G. (2001). Knowledge discovery in grammatically analysed corpora. Data
Mining and Knowledge Discovery, 5(4), 305-335.




