FormosanBank

Welcome


Last updated 3 months ago

Welcome to FormosanBank, a large-scale data-driven project dedicated to the preservation and revitalization of the Indigenous Formosan languages of Taiwan. These languages, which form a significant part of the Austronesian language family, are endangered, with some facing the risk of extinction. Our mission is to create a comprehensive, machine-readable corpus of these languages to support linguistic research, language education, and revitalization efforts.

Here you'll find a description of the corpus, collected and processed across the 16 official Formosan languages, which includes over 8 million tokens and over 730 hours of audio (a detailed breakdown is available). You'll also find a detailed description of how the data is structured and the various ways to access it. You can also access the GitHub repository, which holds all the work and data related to FormosanBank, and the Hugging Face organization, which hosts all the audio files.

The large-scale nature of FormosanBank would not have been possible without the collaborative efforts of numerous individuals and organizations.

Principal Investigators

Advisory Board

  • Xuan Ruan

  • Ūi-iū Kán

And our many contributors.

Joshua Hartshorne
Emily Prud'hommeaux
Li-May Sung
Chuan-Jie Lin
Damián Blasi
Yuyang Liu