The Abkhaz National Corpus

The Abkhaz National Corpus is a comprehensive, open, grammatically annotated text corpus that makes the Abkhaz language accessible to scientific investigation from various perspectives (linguistics, literary studies, history, political and social sciences, etc.). It also serves as a means for the long-term preservation of Abkhaz-language documents in digital form.

The corpus now comprises more than 10 million words and is continuously being extended.

The corpus is being developed as part of the PALAG project, which started in 2015. The project partners are the Institute of Empirical Linguistics at Goethe University Frankfurt, the Centre for Civil Integration and Inter-Ethnic Relations (CCIIR), and the organization Business Women of Abkhazia.

The project is funded by the United States Agency for International Development (USAID).

Getting started

You can find a gentle introduction to working with the Abkhaz National Corpus on the page Using the corpus. More detailed information is available on the Documentation page.


The corpus material is available under a CLARIN PUB (CLARIN_PUB-BY-NC-ND) license. For more information see //

Design & implementation: Paul Meurer, Uni Research Computing, 2018