The Abkhaz National Corpus

The Abkhaz National Corpus is a comprehensive, open, grammatically annotated text corpus that makes the Abkhaz language accessible to scientific investigation from various perspectives (linguistics, literary studies, history, political and social sciences, etc.). It also serves as a means for the long-term preservation of Abkhaz-language documents in digital form.

The corpus now comprises more than 10 million words and is continuously being extended.

The corpus was initially developed in the PALAG project, which ran from 2015 to 2017. Project partners were the Goethe University Frankfurt, Institute of Empirical Linguistics, the Centre for Civil Integration and Inter-Ethnic Relations (CCIIR), the organization Business Women of Abkhazia, and the Clarino Bergen Centre.

The project was funded by the United States Agency for International Development (USAID).

Getting started

You can find a gentle introduction to working with the Abkhaz National Corpus on the page Using the corpus. More detailed information is available on the Documentation page.


The corpus material is available under a CLARIN PUB (CLARIN_PUB-BY-NC-ND) license. For more information see //

Design & implementation: Paul Meurer, University of Bergen, Clarino Centre, 2019