Treebank Selection

Select a set of treebanks to work with.
Languages: All · Abaza (0/3) · Abkhazian (0/3) · Afrikaans (2/7) · Akkadian (1/10) · Akuntsu (0/4) · Albanian (0/3) · Amharic (1/6) · Ancient Greek (to 1453) (8/27) · Ancient Hebrew (0/3) · Apurinã (0/4) · Arabic (8/25) · Armenian (1/9) · Assyrian Neo-Aramaic (1/5) · Azerbaijani (0/2) · Bambara (1/6) · Basque (5/12) · Bavarian (0/2) · Beja (0/3) · Belarusian (2/7) · Bengali (0/3) · Bhojpuri (1/5) · Borôro (0/3) · Breton (1/6) · Bulgarian (5/13) · Buriat (2/7) · Catalan (4/10) · Cebuano (0/3) · Chinese (12/43) · Chukot (0/4) · Church Slavic (4/14) · Classical Armenian (0/3) · Coptic (3/8) · Croatian (5/12) · Czech (17/48) · Danish (5/14) · Dutch (8/22) · Egyptian (Ancient) (0/2) · Emerillon (0/3) · English (19/80) · Erzya (1/6) · Estonian (5/18) · Faroese (1/11) · Finnish (12/37) · French (15/54) · Galician (7/21) · Gbaya (Central African Republic) (0/1) · Georgian (0/13) · German (10/42) · Gheg Albanian (0/3) · Gothic (4/12) · Guajajára (0/4) · Gujarati (0/2) · Gweno (0/1) · Haitian (0/2) · Hausa (0/4) · Hebrew (5/16) · Hindi (6/18) · Hittite (0/3) · Hungarian (5/16) · Icelandic (0/18) · Indonesian (6/24) · Irish (5/19) · Italian (13/57) · Jamamadí (0/3) · Japanese (7/34) · Javanese (0/3) · K'iche' (0/4) · Kangri (0/4) · Karelian (1/5) · Karo (Ethiopia) (0/3) · Kazakh (4/10) · Khunsari (0/4) · Kirghiz (0/5) · Komi (2/12) · Komi-Permyak (1/5) · Korean (4/20) · Latgalian (0/2) · Latin (12/47) · Latvian (4/13) · Ligurian (0/3) · Literary Chinese (0/5) · Lithuanian (3/12) · Livvi (1/5) · Low German (0/4) · Luxembourgish (0/2) · Macedonian (0/2) · Makuráp (0/4) · Malayalam (0/3) · Maltese (1/6) · Manx (0/4) · Marathi (2/7) · Mbyá Guaraní (2/10) · Middle French (ca. 1400-1600) (0/2) · Modern Greek (1453-) (5/19) · Moksha (1/5) · Mundurukú (0/4) · (0/1) · (0/2) · (0/1) · (0/1) · (0/1) · (0/1) · Nayini (0/4) · Neapolitan (0/3) · Nhengatu (0/3) · Nigerian Pidgin (1/6) · Northern Kurdish (2/7) · Northern Sami (2/32) · Norwegian (5) · Norwegian Bokmål (36/66) · Norwegian Nynorsk (10/28) · Old English (ca. 450-1100) (5) · Old French (842-ca. 1400) (2/7) · Old Irish (to 900) (0/6) · Old Norse (0/7) · Old Russian (0/34) · Old Turkish (0/4) · Paraguayan Guaraní (0/3) · Paumarí (0/2) · Pech (0/1) · Persian (5/16) · Phrygian (0/1) · Polish (7/46) · Pomak (0/3) · Portuguese (14/42) · Pushto (0/1) · Qafar (0/3) · Romanian (7/29) · Russian (13/38) · Sanskrit (3/12) · Saya (0/3) · Scottish Gaelic (1/5) · Serbian (2/7) · Sinhala (0/3) · Skolt Sami (1/5) · Slovak (3/9) · Slovenian (8/22) · Sonha (0/4) · South Levantine Arabic (0/4) · Spanish (11/31) · Spanish Sign Language (0/1) · Swedish (11/31) · Swedish Sign Language (3/8) · Swiss German (1/5) · Tagalog (1/10) · Tamil (4/16) · Tatar (0/3) · Telugu (2/9) · Thai (1/6) · Tswana (0/2) · Tupinambá (0/4) · Turkish (7/54) · Uighur (3/9) · Ukrainian (3/8) · Umbrian (0/3) · Upper Sorbian (2/7) · Urdu (2/10) · Urubú-Kaapor (0/4) · Uzbek (0/1) · Veps (0/2) · Vietnamese (3/11) · Warlpiri (1/6) · Welsh (1/5) · Western Armenian (0/4) · Western Frisian (0/4) · Wolof (1/8) · Xavánte (0/3) · Xibe (0/3) · Yakut (0/3) · Yoruba (1/6) · Yue Chinese (2/7) · Yupik (0/4) · Zacatlán-Ahuacatlán-Tepetzintla Nahuatl (0/3)
Treebank Collections: All · Acquis (0/7) · Alpino (0/1) · BulTreeBank (1) · CLARIN-PL (0/5) · DELPH-IN (0/2) · GEGO (0/4) · GNC (0/2) · GeoGram (0/4) · HunGram (0/4) · ISWOC (3/9) · JOS (0/1) · Menotec (0/7) · Mercurius (0/1) · NAOB (0/15) · NDT (0/6) · NorGram (0/58) · NorGramBank (0/40) · Ordbøkene (0/8) · POLFIE (0/23) · PROIEL (1/10) · PaHC (0/2) · ParGram (1/11) · ParTMA (0/15) · Sami-open (0/15) · Sami-restricted (0/7) · Sofie (0/9) · TIGER (0/3) · TOROT (0/22) · Universal Dependencies 1.1 (1/19) · Universal Dependencies 1.2 (4/36) · Universal Dependencies 1.3 (5/53) · Universal Dependencies 1.4 (7/63) · Universal Dependencies 2.0 (6/63) · Universal Dependencies 2.1 (10/103) · Universal Dependencies 2.12 (15/245) · Universal Dependencies 2.14 (19/283) · Universal Dependencies 2.15 (20/297) · Universal Dependencies 2.3 (11/130) · Universal Dependencies 2.5 (12/157) · Universal Dependencies 2.8 (13/200) · WolGram (0/3) · XPar (0/2)
Treebank Types: All · lfg (0/129) · constituency (0/19) · constituency-alpino (0/1) · dependency (3/48) · dependency-cg (35/1691) · dependency-tuebadz (0/1) · hpsg (0/2)
Show only Parallel Treebanks

Click on a treebank name below to proceed. All selected treebanks will be available for viewing and searching.
Name | Collection | Type | Sentences | Words | Indexed | License | Downloads
Total: 248 711 sentences, 4 293 618 words

Bulgarian (bul): 53 957 sentences, 650 868 words
bul-ud-1.3-dep | Universal Dependencies 1.3 | dependency-cg | 11 138 | 135 149 | no | unspecified | no
bul-ud-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 11 138 | 135 149 | no | unspecified | no
bul-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 11 138 | 135 157 | no | unspecified | no
bul-ud-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 11 138 | 135 157 | no | (Accepted) | no
bul-ud-dep | Universal Dependencies 1.1 | dependency-cg | 9 405 | 110 256 | yes | unspecified | no

Gothic (got): 21 701 sentences, 222 928 words
got-ud-1.3-dep | Universal Dependencies 1.3 | dependency-cg | 5 450 | 56 134 | no | unspecified | no
got-ud-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 5 450 | 56 134 | no | unspecified | no
got-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 5 400 | 55 324 | no | unspecified | no
got-ud-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 5 401 | 55 336 | no | (Accepted) | no

Lithuanian (lit): 4 168 sentences, 66 985 words
lit-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 263 | 4 383 | no | unspecified | no
lit-ud-alksnis-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 3 642 | 58 219 | no | (Accepted) | no
lit-ud-hse-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 263 | 4 383 | no | (Accepted) | no

Portuguese (por): 99 158 sentences, 2 050 569 words
por-cge1-dep | ISWOC | dependency | 745 | 10 806 | yes | CC-BY-NC-SA | no
por-cge2-dep | ISWOC | dependency | 641 | 9 886 | yes | CC-BY-NC-SA | no
por-coutdec-v-8-dep | ISWOC | dependency | 641 | 12 598 | yes | CC-BY-NC-SA | no
por-ud-1.3-dep | Universal Dependencies 1.3 | dependency-cg | 9 359 | 197 002 | no | unspecified | no
por-ud-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 9 359 | 180 793 | no | unspecified | no
por-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 9 368 | 181 705 | no | unspecified | no
por-ud-bosque-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 9 368 | 198 451 | no | unspecified | no
por-ud-bosque-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 9 365 | 181 704 | no | (Accepted) | no
por-ud-br-1.3-dep | Universal Dependencies 1.3 | dependency-cg | 12 078 | 260 126 | no | unspecified | no
por-ud-br-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 12 078 | 260 126 | no | unspecified | no
por-ud-br-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 12 078 | 259 281 | no | unspecified | no
por-ud-gsd-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 12 078 | 259 263 | no | (Accepted) | no
por-ud-pud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 1 000 | 19 414 | no | unspecified | no
por-ud-pud-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 1 000 | 19 414 | no | (Accepted) | no

Romanian (ron): 52 452 sentences, 998 841 words
ron-ud-1.3-dep | Universal Dependencies 1.3 | dependency-cg | 6 347 | 126 348 | no | unspecified | no
ron-ud-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 9 523 | 190 955 | no | unspecified | no
ron-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 9 524 | 190 950 | no | unspecified | no
ron-ud-nonstandard-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 1 200 | 17 521 | no | unspecified | no
ron-ud-nonstandard-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 15 843 | 269 531 | no | (Accepted) | no
ron-ud-rrt-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 9 524 | 190 949 | no | (Accepted) | no
ron-ud-simonero-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 491 | 12 587 | no | (Accepted) | no

Serbian (srp): 8 275 sentences, 162 126 words
srp-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 3 891 | 76 354 | no | unspecified | no
srp-ud-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 4 384 | 85 772 | no | (Accepted) | no

Vietnamese (vie): 9 000 sentences, 141 301 words
vie-ud-1.4-dep | Universal Dependencies 1.4 | dependency-cg | 3 000 | 47 101 | no | unspecified | no
vie-ud-2.1-dep | Universal Dependencies 2.1 | dependency-cg | 3 000 | 47 100 | no | unspecified | no
vie-ud-2.5-dep | Universal Dependencies 2.5 | dependency-cg | 3 000 | 47 100 | no | (Accepted) | no