CORPUS LINGUISTICS: THEORY, METHODOLOGY, AND APPLICATION IN THE ANALYSIS OF NUSANTARA FOLKTALES (Case Study: The Legend of Malin Kundang from West Sumatra)

Authors

  • Niken Ardila Rehiraky, Bali Business School

Keywords:

corpus linguistics, Nusantara folktales, computational analysis, Legend of Malin Kundang, cultural preservation

Abstract

Corpus linguistics offers an empirical framework for analyzing language through large-scale digital 
text collections, enabling both quantitative and qualitative exploration of linguistic patterns. This study 
investigates how corpus linguistics theory and methodology can be applied to the analysis of Nusantara 
folktales, with the Legend of Malin Kundang serving as a representative case study. Through a 
systematic review of corpus linguistics scholarship and a descriptive corpus analysis using AntConc 
software, this research integrates theoretical and methodological insights to examine the narrative, 
lexical, and semantic dimensions of the text. The findings reveal that corpus analysis not only uncovers 
patterns of lexical frequency, collocation, and semantic association but also highlights how linguistic 
choices encode moral and cultural values central to Nusantara oral traditions. Dominant keywords 
such as ibu, Malin, kapal, batu, and kutuk form a recurrent semantic network that reflects the thematic 
interplay between filial disobedience, divine retribution, and social morality. These results 
demonstrate the potential of corpus linguistics to enhance the interpretation of folklore narratives and 
contribute to the development of a replicable methodological model for computational preservation 
and cultural documentation of Indonesia’s oral heritage.
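
As an illustration of the two measures named in the abstract, lexical frequency and collocation, the following minimal Python sketch counts word frequencies and window-based collocates for a node word. It is not AntConc's implementation and not the study's procedure: the helper names (`tokenize`, `word_frequencies`, `collocates`), the window size of 2, and the sample sentences are assumptions made purely for illustration, and the sample text is invented rather than drawn from the study's corpus. The collocate figures are raw co-occurrence counts, not a statistical association score.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(tokens):
    """Return a Counter mapping each token to its raw frequency."""
    return Counter(tokens)


def collocates(tokens, node, window=2):
    """Count tokens found within `window` positions of each `node` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node occurrence itself
                counts[tokens[j]] += 1
    return counts


# Invented sample text (hypothetical, not from the study's corpus).
sample = "Malin naik kapal. Ibu menunggu Malin. Ibu mengutuk Malin menjadi batu."

tokens = tokenize(sample)
freq = word_frequencies(tokens)
ibu_collocates = collocates(tokens, "ibu", window=2)

print(freq.most_common(3))
print(ibu_collocates.most_common(3))
```

On a real corpus the same counts would normally be filtered with a stopword list and ranked by an association measure, so that frequent function words do not dominate the collocate list.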

Published

2025-11-28

Section

Articles