Treebank Selection

Select a set of treebanks to work with.
Languages: All · Abaza (0/3) · Abkhazian (0/3) · Afrikaans (1/7) · Akkadian (2/10) · Akuntsu (1/4) · Albanian (1/3) · Amharic (1/6) · Ancient Greek (to 1453) (4/27) · Ancient Hebrew (0/3) · Apurinã (1/4) · Arabic (4/25) · Armenian (1/9) · Assyrian Neo-Aramaic (1/5) · Azerbaijani (0/2) · Bambara (1/6) · Basque (2/12) · Bavarian (0/2) · Beja (1/3) · Belarusian (1/7) · Bengali (0/3) · Bhojpuri (1/5) · Borôro (0/3) · Breton (1/6) · Bulgarian (2/13) · Buriat (1/7) · Catalan (2/10) · Cebuano (0/3) · Chinese (7/43) · Chukot (1/4) · Church Slavic (2/14) · Classical Armenian (0/3) · Coptic (1/8) · Croatian (2/12) · Czech (8/48) · Danish (2/14) · Dutch (4/22) · Egyptian (Ancient) (0/2) · Emerillon (0/3) · English (12/80) · Erzya (1/6) · Estonian (3/18) · Faroese (2/11) · Finnish (6/37) · French (10/54) · Galician (4/21) · Gbaya (Central African Republic) (0/1) · Georgian (0/13) · German (5/42) · Gheg Albanian (0/3) · Gothic (2/12) · Guajajára (1/4) · Gujarati (0/2) · Gweno (0/1) · Haitian (0/2) · Hausa (0/4) · Hebrew (2/16) · Hindi (3/18) · Hittite (0/3) · Hungarian (2/16) · Icelandic (3/18) · Indonesian (4/24) · Irish (1/19) · Italian (9/57) · Jamamadí (0/3) · Japanese (5/34) · Javanese (0/3) · K'iche' (1/4) · Kangri (1/4) · Karelian (1/5) · Karo (Ethiopia) (0/3) · Kazakh (2/10) · Khunsari (1/4) · Kirghiz (0/5) · Komi (2/12) · Komi-Permyak (1/5) · Korean (3/20) · Latgalian (0/2) · Latin (8/47) · Latvian (2/13) · Ligurian (0/3) · Literary Chinese (0/5) · Lithuanian (2/12) · Livvi (1/5) · Low German (1/4) · Luxembourgish (0/2) · Macedonian (0/2) · Makuráp (1/4) · Malayalam (0/3) · Maltese (1/6) · Manx (1/4) · Marathi (1/7) · Mbyá Guaraní (2/10) · Middle French (ca. 1400-1600) (0/2) · Modern Greek (1453-) (2/19) · Moksha (1/5) · Mundurukú (1/4) · (0/1) · (0/2) · (0/1) · (0/1) · (0/1) · (0/1) · Nayini (1/4) · Neapolitan (0/3) · Nhengatu (0/3) · Nigerian Pidgin (1/6) · Northern Kurdish (1/7) · Northern Sami (16/32) · Norwegian (0/5) · Norwegian Bokmål (17/66) · Norwegian Nynorsk (3/28) · Old English (ca. 450-1100) (0/5) · Old French (842-ca. 1400) (1/7) · Old Irish (to 900) (0/6) · Old Norse (0/7) · Old Russian (2/34) · Old Turkish (1/4) · Paraguayan Guaraní (0/3) · Paumarí (0/2) · Pech (0/1) · Persian (3/16) · Phrygian (0/1) · Polish (4/46) · Pomak (0/3) · Portuguese (5/42) · Pushto (0/1) · Qafar (0/3) · Romanian (5/29) · Russian (6/38) · Sanskrit (2/12) · Saya (0/3) · Scottish Gaelic (1/5) · Serbian (1/7) · Sinhala (0/3) · Skolt Sami (1/5) · Slovak (2/9) · Slovenian (5/22) · Sonha (1/4) · South Levantine Arabic (1/4) · Spanish (5/31) · Spanish Sign Language (0/1) · Swedish (5/31) · Swedish Sign Language (1/8) · Swiss German (1/5) · Tagalog (2/10) · Tamil (2/16) · Tatar (0/3) · Telugu (1/9) · Thai (1/6) · Tswana (0/2) · Tupinambá (1/4) · Turkish (10/54) · Uighur (2/9) · Ukrainian (2/8) · Umbrian (0/3) · Upper Sorbian (1/7) · Urdu (2/10) · Urubú-Kaapor (1/4) · Uzbek (0/1) · Veps (0/2) · Vietnamese (2/11) · Warlpiri (1/6) · Welsh (1/5) · Western Armenian (1/4) · Western Frisian (1/4) · Wolof (4/8) · Xavánte (0/3) · Xibe (0/3) · Yakut (0/3) · Yoruba (1/6) · Yue Chinese (1/7) · Yupik (1/4) · Zacatlán-Ahuacatlán-Tepetzintla Nahuatl (0/3)
Treebank Collections: All · Acquis (1/7) · Alpino (1) · BulTreeBank (0/1) · CLARIN-PL (0/5) · DELPH-IN (0/2) · GEGO (2/4) · GNC (2) · GeoGram (4) · HunGram (0/4) · ISWOC (3/9) · JOS (0/1) · Menotec (0/7) · Mercurius (0/1) · NAOB (0/15) · NDT (0/6) · NorGram (0/58) · NorGramBank (0/40) · Ordbøkene (0/8) · POLFIE (0/23) · PROIEL (4/10) · PaHC (1/2) · ParGram (4/11) · ParTMA (3/15) · Sami-open (0/15) · Sami-restricted (0/7) · Sofie (2/9) · TIGER (0/3) · TOROT (0/22) · Universal Dependencies 1.1 (2/19) · Universal Dependencies 1.2 (9/36) · Universal Dependencies 1.3 (13/53) · Universal Dependencies 1.4 (16/63) · Universal Dependencies 2.0 (13/63) · Universal Dependencies 2.1 (24/103) · Universal Dependencies 2.12 (54/245) · Universal Dependencies 2.14 (64/283) · Universal Dependencies 2.15 (65/297) · Universal Dependencies 2.3 (27/130) · Universal Dependencies 2.5 (29/157) · Universal Dependencies 2.8 (47/200) · WolGram (0/3) · XPar (1/2)
Treebank Types: All · lfg (0/129) · constituency (0/19) · constituency-alpino (0/1) · dependency (0/48) · dependency-cg (60/1691) · dependency-tuebadz (0/1) · hpsg (0/2)
Click on a treebank name below to proceed. All selected treebanks will be available for viewing and searching.
Name                            Collection                  Type           Sentences  Words      Indexed  License      Downloads
Total                                                                      107 329    1 656 012
Afrikaans (afr)                                                            0          0
Albanian (sqi)                                                             0          0
Ancient Greek (to 1453) (grc)                                              30 516     389 990
  grc-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  12 613     164 652    no       unspecified  no
  grc-ud-proiel-2.0-dep         Universal Dependencies 2.0  dependency-cg  17 903     225 338    no       unspecified  no
Arabic (ara)                                                               6 984      202 789
  ara-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  6 984      202 789    no       unspecified  no
Beja (bej)                                                                 0          0
Chinese (zho)                                                              4 497      98 602
  zho-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  4 497      98 602     no       unspecified  no
Dutch (nld)                                                                19 891     256 029
  nld-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  13 050     177 157    no       unspecified  no
  nld-ud-lassy-small-2.0-dep    Universal Dependencies 2.0  dependency-cg  6 841      78 872     no       unspecified  no
Gothic (got)                                                               4 372      45 142
  got-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  4 372      45 142     no       unspecified  no
Icelandic (isl)                                                            0          0
Indonesian (ind)                                                           5 036      95 252
  ind-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  5 036      95 252     no       unspecified  no
Modern Greek (1453-) (ell)                                                 2 065      46 415
  ell-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  2 065      46 415     no       unspecified  no
Mundurukú (myu)                                                            0          0
Nayini (nyq)                                                               0          0
Portuguese (por)                                                           19 765     407 188
  por-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  8 891      173 562    no       unspecified  no
  por-ud-br-2.0-dep             Universal Dependencies 2.0  dependency-cg  10 874     233 626    no       unspecified  no
Slovak (slk)                                                               9 543      76 346
  slk-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  9 543      76 346     no       unspecified  no
Sonha (soi)                                                                0          0
Swedish Sign Language (swl)                                                0          0
Tamil (tam)                                                                0          0
Telugu (tel)                                                               0          0
Tupinambá (tpn)                                                            0          0
Turkish (tur)                                                              4 660      38 259
  tur-ud-2.0-dep                Universal Dependencies 2.0  dependency-cg  4 660      38 259     no       unspecified  no
Yoruba (yor)                                                               0          0
Yupik (ypk)                                                                0          0