INESS
Treebank Selection

Select a set of treebanks to work with.
Languages: All · Afrikaans (0/1) · Ancient Greek (to 1453) (2/13) · Arabic (1/7) · Basque (1/6) · Belarusian (0/1) · Bulgarian (1/7) · Buriat (0/1) · Catalan (1/4) · Chinese (1/7) · Church Slavic (1/8) · Classical Armenian (0/1) · Coptic (0/2) · Croatian (1/7) · Czech (3/16) · Danish (1/7) · Dutch (2/9) · English (3/21) · Estonian (1/6) · Finnish (2/13) · French (3/12) · Galician (2/7) · Georgian (0/5) · German (1/16) · Gothic (1/6) · Hebrew (1/6) · Hindi (1/6) · Hungarian (1/8) · Icelandic (0/1) · Indonesian (1/8) · Irish (1/6) · Italian (2/11) · Japanese (1/4) · Kazakh (1/4) · Korean (0/1) · Latin (3/19) · Latvian (1/4) · Lithuanian (0/1) · Marathi (0/1) · Modern Greek (1453-) (1/7) · (0/1) · Northern Kurdish (0/1) · Northern Sami (0/16) · Norwegian Bokmål (1/9) · Norwegian Nynorsk (1/3) · Old English (ca. 450-1100) (0/5) · Old French (842-ca. 1400) (0/1) · Old Norse (0/4) · Old Russian (0/20) · Persian (1/6) · Polish (1/11) · Portuguese (2/15) · Romanian (1/6) · Russian (2/10) · Sanskrit (0/2) · Serbian (0/1) · Slovak (1/3) · Slovenian (2/10) · Spanish (2/10) · Swedish (2/12) · Swedish Sign Language (0/2) · Tamil (0/4) · Telugu (0/1) · Turkish (1/6) · Uighur (1/3) · Ukrainian (1/3) · Upper Sorbian (0/1) · Urdu (1/4) · Vietnamese (1/3) · Wolof (0/1) · Yue Chinese (0/1)
Treebank Collections: All · BulTreeBank (1) · CLARIN-PL (0/3) · GEGO (0/4) · GeoGram (0/2) · HunGram (0/2) · ISWOC (5/9) · JOS (0/1) · Menotec (4) · Mercurius (0/1) · NorGram (0/3) · POLFIE (0/5) · PROIEL (7/10) · ParGram (0/12) · ParTMA (0/14) · Sami-open (15) · Sofie (0/9) · TOROT (22) · TiGer (0/3) · Universal Dependencies 1.1 (11/19) · Universal Dependencies 1.2 (20/36) · Universal Dependencies 1.3 (28/53) · Universal Dependencies 1.4 (34/63) · Universal Dependencies 2.0 (34/63) · Universal Dependencies 2.1 (61/103) · WolGram (0/1)
Treebank Types: All · lfg (1/3) · constituency (0/13) · dependency (0/45) · dependency-cg (34/354)

Click on a treebank name below to proceed. All selected treebanks will be available for viewing and searching.
| Name | Collection | Type | Sentences | Words | Indexed | License | Downloads |
|---|---|---|---|---|---|---|---|
| Total | | | 233 116 | 3 593 785 | | | |
| Arabic (ara) | | | 6 984 | 202 789 | | | |
| ara-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 6 984 | 202 789 | yes | unspecified | no |
| Bulgarian (bul) | | | 10 022 | 121 588 | | | |
| bul-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 10 022 | 121 588 | yes | unspecified | no |
| Church Slavic (chu) | | | 5 196 | 47 532 | | | |
| chu-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 5 196 | 47 532 | yes | unspecified | no |
| English (eng) | | | 19 785 | 298 898 | | | |
| eng-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 14 545 | 205 541 | yes | unspecified | no |
| eng-ud-lin-es-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 3 650 | 59 406 | yes | unspecified | no |
| eng-ud-par-tut-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 1 590 | 33 951 | yes | unspecified | no |
| Basque (eus) | | | 7 194 | 81 539 | | | |
| eus-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 7 194 | 81 539 | yes | unspecified | no |
| Persian (fas) | | | 5 397 | 128 862 | | | |
| fas-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 5 397 | 128 862 | yes | unspecified | no |
| Finnish (fin) | | | 30 437 | 279 001 | | | |
| fin-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 13 581 | 155 823 | yes | unspecified | no |
| fin-ud-ftb-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 16 856 | 123 178 | yes | unspecified | no |
| French (fra) | | | 19 294 | 406 826 | | | |
| fra-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 16 031 | 338 773 | yes | unspecified | no |
| fra-ud-par-tut-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 620 | 15 680 | yes | unspecified | no |
| fra-ud-sequoia-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 2 643 | 52 373 | yes | unspecified | no |
| Irish (gle) | | | 566 | 12 582 | | | |
| gle-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 566 | 12 582 | yes | unspecified | no |
| Galician (glg) | | | 3 739 | 111 658 | | | |
| glg-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 3 139 | 99 018 | yes | unspecified | no |
| glg-ud-treegal-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 600 | 12 640 | yes | unspecified | no |
| Gothic (got) | | | 4 372 | 45 142 | | | |
| got-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 4 372 | 45 142 | yes | unspecified | no |
| Croatian (hrv) | | | 8 289 | 162 417 | | | |
| hrv-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 8 289 | 162 417 | yes | unspecified | no |
| Indonesian (ind) | | | 5 036 | 95 252 | | | |
| ind-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 5 036 | 95 252 | yes | unspecified | no |
| Japanese (jpn) | | | 7 675 | 171 410 | | | |
| jpn-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 7 675 | 171 410 | yes | unspecified | no |
| Kazakh (kaz) | | | 31 | 417 | | | |
| kaz-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 31 | 417 | yes | unspecified | no |
| Latin (lat) | | | 33 166 | 417 156 | | | |
| lat-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 1 334 | 15 582 | yes | unspecified | no |
| lat-ud-ittb-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 16 508 | 242 167 | yes | unspecified | no |
| lat-ud-proiel-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 15 324 | 159 407 | yes | unspecified | no |
| Latvian (lav) | | | 3 054 | 37 074 | | | |
| lav-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 3 054 | 37 074 | yes | unspecified | no |
| Norwegian Nynorsk (nno) | | | 16 064 | 247 972 | | | |
| nno-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 16 064 | 247 972 | yes | unspecified | no |
| Norwegian Bokmål (nob) | | | 18 106 | 249 060 | | | |
| nob-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 18 106 | 249 060 | yes | unspecified | no |
| Romanian (ron) | | | 8 795 | 176 676 | | | |
| ron-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 8 795 | 176 676 | yes | unspecified | no |
| Swedish (swe) | | | 8 457 | 127 116 | | | |
| swe-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 4 807 | 69 066 | yes | unspecified | no |
| swe-ud-lin-es-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 3 650 | 58 050 | yes | unspecified | no |
| Turkish (tur) | | | 4 660 | 38 259 | | | |
| tur-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 4 660 | 38 259 | yes | unspecified | no |
| Uighur (uig) | | | 100 | 1 568 | | | |
| uig-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 100 | 1 568 | yes | unspecified | no |
| Vietnamese (vie) | | | 2 200 | 34 389 | | | |
| vie-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 2 200 | 34 389 | yes | unspecified | no |
| Chinese (zho) | | | 4 497 | 98 602 | | | |
| zho-ud-2.0-dep | Universal Dependencies 2.0 | dependency-cg | 4 497 | 98 602 | yes | unspecified | no |
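
In the listing above, each language row carries the summed sentence and word counts of the treebanks listed under it. The following is a minimal sketch of that arithmetic, not part of INESS itself. It assumes the rows have been copied by hand from the listing, that counts use a space as thousands separator, and that the language code is the part of the treebank name before the first hyphen. The names `ROWS`, `parse_count` and `language_totals` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict

# A few rows copied from the listing: (treebank name, sentences, words).
# Counts keep the page's space-separated thousands formatting.
ROWS = [
    ("eng-ud-2.0-dep", "14 545", "205 541"),
    ("eng-ud-lin-es-2.0-dep", "3 650", "59 406"),
    ("eng-ud-par-tut-2.0-dep", "1 590", "33 951"),
    ("swe-ud-2.0-dep", "4 807", "69 066"),
    ("swe-ud-lin-es-2.0-dep", "3 650", "58 050"),
]


def parse_count(text: str) -> int:
    """Turn a count such as '14 545' into the integer 14545."""
    # split() drops any whitespace used as a thousands separator.
    return int("".join(text.split()))


def language_totals(rows):
    """Sum sentences and words per language code.

    Assumption: the language code is the part of the treebank name
    before the first hyphen (e.g. 'eng' in 'eng-ud-2.0-dep').
    """
    totals = defaultdict(lambda: [0, 0])
    for name, sentences, words in rows:
        code = name.split("-", 1)[0]
        totals[code][0] += parse_count(sentences)
        totals[code][1] += parse_count(words)
    return {code: tuple(pair) for code, pair in totals.items()}


if __name__ == "__main__":
    for code, (sentences, words) in language_totals(ROWS).items():
        print(f"{code}: {sentences} sentences, {words} words")
```

With the five rows shown, the sums come out to 19785 sentences and 298898 words for `eng`, and 8457 sentences and 127116 words for `swe`, which agrees with the English and Swedish rows in the listing.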

Design & implementation: Paul Meurer, CLARINO Bergen Centre, 2019