Paiwan Stories
Overview
The Paiwan Stories corpus is a small collection of children’s stories told in Eastern Paiwan. Drawn from three storybooks, these narratives provide valuable cultural and linguistic insights into the Paiwan language as it is used in accessible, community-oriented contexts. Their inclusion in FormosanBank broadens the range of available materials, offering a glimpse into traditional knowledge, everyday life, and childhood learning in a Paiwan-speaking environment.
Source Materials
giling, gesi & giling tjaiwan (2020) vuvu 的寶物 / kavatjes ni vuvu
giling, gesi & giling tjaiwan (2021) dingding蝸牛
giling, gesi & giling tjaiwan (2022) maljialjian a qaciljay
Corpus Processing
The Paiwan Stories were integrated into FormosanBank’s standardized XML format to ensure consistency and ease of access.
Processing Notes
Manual Conversion: Because the texts are few and narratively complex, and because the source material was available only as PDFs, which are difficult to parse, the XML was created by hand rather than through automated scripts.
Cleaning and Standardization: No separate cleaning pass was necessary, because the texts were already standardized during manual conversion.
Minimal Editing: Orthographic and formatting adjustments were kept to a minimum, preserving the authors’ original representations of the language. Light cleaning, such as removing empty elements and standardizing punctuation, was applied to maintain data consistency.
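The light cleaning described above (removing empty elements and standardizing punctuation) could be sketched roughly as follows. This is only an illustration, not the procedure actually used for this corpus, which was prepared by hand. The element names (TEXT, S, FORM, TRANSL), the sample content, and the punctuation mapping are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names (TEXT, S, FORM, TRANSL) are assumed
# for illustration and are not taken from the corpus documentation.
SAMPLE = """<TEXT>
  <S id="1"><FORM>kavatjes ni vuvu”</FORM><TRANSL></TRANSL></S>
  <S id="2"><FORM></FORM></S>
</TEXT>"""

# Assumed mapping: curly double quotes become straight double quotes.
PUNCT = str.maketrans({"“": '"', "”": '"'})


def is_empty(elem):
    """True if the element has no children and no non-whitespace text."""
    return len(elem) == 0 and not (elem.text or "").strip()


def light_clean(root):
    """Standardize punctuation, then remove empty elements bottom-up."""
    for elem in root.iter():
        if elem.text:
            elem.text = elem.text.translate(PUNCT)
    # Repeat until stable: removing a child can leave its parent empty.
    changed = True
    while changed:
        changed = False
        for parent in list(root.iter()):
            for child in list(parent):
                if is_empty(child):
                    parent.remove(child)
                    changed = True
    return root


if __name__ == "__main__":
    cleaned = light_clean(ET.fromstring(SAMPLE))
    print(ET.tostring(cleaned, encoding="unicode"))
```

Note that this sketch treats an element as empty even if it carries attributes; a real pipeline might want to keep such elements.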
Access Details
The repository containing the original stories, the FormosanBank XML corpus, and the code to reconstruct the corpus can be found here.
Copyrights
The stories can be used under a Creative Commons CC BY-NC license.
Citation
In accordance with our Terms of Use, if you use this corpus or any product derived from this corpus in any publication, you must cite both FormosanBank and:
Juan, T. F., & Ruan, X. (2024). Corpus of Paiwan stories [Electronic resource].
Last updated