INESS :: Treebank Selection

Select a set of treebanks to work with.

Click on a treebank name below to proceed. All selected treebanks will be available for viewing and searching.

Selected	Name	Collection	Type	Sentences	Words	Indexed	Description	License	Downloads
Total				15 767 321	218 271 868
	Hungarian (hun)			4 627	87 791
	hun-pargram (aligned)	hun-pargram (aligned)	HunGram, ParGram	lfg	49	281	yes	The ParGram collection is a collection of parallel treebanks covering a set of chosen syntactic constructions. The ParGram collection is a collaborative effort of the ParGram project, along with the ParSem project, by research groups in industrial and academic institutions around the world. The aim of ParGram is to produce wide-coverage grammars for a variety of languages. These are written collaboratively within the linguistic framework of LFG (Lexical Functional Grammar) and with a commonly agreed-upon set of grammatical features. The XLE (Xerox Linguistic Environment) is used as a development platform. ParSem develops semantic structures based on the ParGram syntactic structures. Most of the ParSem systems use the XLE’s XFR system. Regular semiannual meetings are held to bring together the various research groups involved in ParGram and ParSem.	CC-BY	no
	hun-partma	hun-partma	HunGram, ParTMA	lfg	45	156	yes	The ParTMA collection is a collaborative effort by research groups in academic institutions around the world. The aim of ParTMA is to produce parallel treebanks that cover constructions relevant for the semantics of Tense, Mood and Aspect. The treebank sentences are analyzed with the grammars in the ParGram project.	CC-BY	no
	hun-pron	hun-pron	HunGram	lfg	83	261	no		(Accepted)	no
	hun-ud-2.0-dep	hun-ud-2.0-dep	Universal Dependencies 2.0	dependency-cg	1 351	27 551	no	The “Universal Dependencies 2.0” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-1983). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.0 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	hun-ud-2.1-dep	hun-ud-2.1-dep	Universal Dependencies 2.1	dependency-cg	1 800	36 644	no	The “Universal Dependencies 2.1” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-2515). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.1 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	hun-ud-dep	hun-ud-dep	Universal Dependencies 1.1	dependency-cg	1 299	22 898	yes	The "Universal Dependencies 1.1 - Hungarian" is part of the Universal Dependencies 1.1 collection, which is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/LRT-1478). The individual treebanks have individual licenses, and the specific license and conditions of use for each treebank are given in the joint license "Universal Dependencies v1.1 License Agreement". What all the licenses have in common is that they are in the public domain (some Creative Commons licenses, some GPL licenses). Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008). This is the second release of UD Treebanks; a newer version 1.3 is also available.	unspecified	no
	Latin (lat)			84 159	1 029 383
	lat-caes-gal-dep	lat-caes-gal-dep	PROIEL	dependency	1 447	26 707	yes	Julius Caesar’s account of the Gallic war, written 58–49 BC. Edition: T. Rice Holmes (1914): C. Iuli Commentarii Rerum in Gallia Gestarum VII A. Hirti Commentarius VII. Oxford: Oxford University Press. Electronic edition: Gregory Crane: De bello gallico. Perseus Digital Library. Tufts University, Medford MA.	CC-BY-NC-SA	no
	lat-cic-att-dep	lat-cic-att-dep	PROIEL	dependency	3 596	40 183	yes	A collection of letters from Marcus Tullius Cicero to Titus Pomponius Atticus, written 68–44 BC. Edition: L. C. Purser (1901): Epistulae ad Atticum. Oxford: Oxford University Press. Electronic edition: Gregory Crane: Letters to Atticus. Perseus Digital Library, Tufts University, Medford MA.	CC-BY-NC-SA	no
	lat-latin-nt-dep	lat-latin-nt-dep	PROIEL	dependency	9 036	80 186	yes	Late fourth-century Latin New Testament translation. Electronic text: Perseus Project: Vulgate, Perseus_text_1999.02.0060.xml. Tufts University, Medford MA.	CC-BY-NC-SA	no
	lat-per-aeth-dep	lat-per-aeth-dep	PROIEL	dependency	921	17 525	yes	Fourth-century account of a pilgrimage to the Holy Land. Edition: Wilhelm Heraeus (1908): Silviae vel potius Aetheriae peregrinatio. Heidelberg: Carl Winter. Electronic edition: Itinerarium vel peregrinatio ad loca sancta. Bibliotheca Augustana.	CC-BY-NC-SA	no
	lat-ud-2.0-dep	lat-ud-2.0-dep	Universal Dependencies 2.0	dependency-cg	1 334	15 582	no	The “Universal Dependencies 2.0” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-1983). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.0 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	lat-ud-2.1-dep	lat-ud-2.1-dep	Universal Dependencies 2.1	dependency-cg	2 273	24 742	no	The “Universal Dependencies 2.1” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-2515). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.1 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	lat-ud-ittb-2.0-dep	lat-ud-ittb-2.0-dep	Universal Dependencies 2.0	dependency-cg	16 508	242 167	no	The “Universal Dependencies 2.0” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-1983). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.0 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	lat-ud-ittb-2.1-dep	lat-ud-ittb-2.1-dep	Universal Dependencies 2.1	dependency-cg	17 258	251 317	no	The “Universal Dependencies 2.1” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-2515). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.1 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	lat-ud-proiel-2.0-dep	lat-ud-proiel-2.0-dep	Universal Dependencies 2.0	dependency-cg	15 324	159 407	no	The “Universal Dependencies 2.0” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-1983). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.0 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	lat-ud-proiel-2.1-dep	lat-ud-proiel-2.1-dep	Universal Dependencies 2.1	dependency-cg	16 462	171 567	no	The “Universal Dependencies 2.1” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-2515). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.1 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	Norwegian Bokmål (nob)			15 659 831	216 984 121
	nob-avis	nob-avis	NorGram, NorGramBank	lfg	246 397	3 157 550	yes	The "NorGramBank – Newspaper text (years 2012, 2013) in Norwegian Bokmål from the Norwegian Newspaper Corpus" treebank is a syntactically annotated corpus based on data taken from the years 2012 and 2013 from the Norwegian Newspaper Corpus (NCC). This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 246397 sentences, 3157558 words and 1543 documents. Note that the available treebank contains only those newspaper articles from 2012 and 2013 that have been manually preprocessed; see details elsewhere in the metadata.	CC-BY	no
	nob-child	nob-child	NorGram, NorGramBank	lfg	389 557	4 110 961	yes	The "NorGramBank children’s fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 389564 sentences, 4111213 words and 155 documents. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-fn	nob-fn	NorGram, NorGramBank	lfg	489 341	8 321 494	yes	The "NorGram Non-fiction text in Norwegian Bokmål from Forskning.no" treebank is a syntactically annotated corpus based on data taken from the Norwegian popular science website Forskning.no. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 489341 sentences, 8321480 words and 13243 documents.	CLARIN_RES-DEP	no
	nob-jrc-acquis (aligned)	nob-jrc-acquis (aligned)	Acquis, NorGram	lfg	101	1 862	yes	The Norwegian part of the META-NORD Acquis Parallel Treebank.	CC-BY	no
	nob-lbk-av	nob-lbk-av	NorGram, NorGramBank	lfg	1 336	18 971	yes	The "NorGramBank Newspaper text in Norwegian Bokmål from the LBK" treebank is a syntactically annotated corpus based on data taken from the Norwegian reference corpus for Norwegian Bokmål, Leksikografisk Bokmålskorpus (LBK). This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 173914 sentences, 2661597 words and 599 documents.	CLARIN_ACA-NC	no
	nob-lbk-sa	nob-lbk-sa	NorGram, NorGramBank	lfg	189 137	2 911 173	yes	The "NorGramBank non-fiction text in Norwegian Bokmål from the LBK" treebank is a syntactically annotated corpus based on data taken from the Norwegian reference corpus for Norwegian Bokmål, Leksikografisk Bokmålskorpus (LBK). This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 173914 sentences, 2661597 words and 599 documents.	CLARIN_ACA-NC	no
	nob-lbk-tv	nob-lbk-tv	NorGram, NorGramBank	lfg	18 043	127 844	yes	The "NorGramBank television subtitles in Norwegian Bokmål from LBK" treebank is a syntactically annotated corpus based on data taken from the Norwegian reference corpus for Norwegian Bokmål, Leksikografisk Bokmålskorpus (LBK). This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 18043 sentences, 127844 words and 16 documents.	CLARIN_ACA	no
	nob-naob	nob-naob	NAOB, NorGram, NorGramBank	lfg	678 773	9 885 233	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_1	nob-naob_1	NAOB, NorGram, NorGramBank	lfg	621 622	9 207 093	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_2	nob-naob_2	NAOB, NorGram, NorGramBank	lfg	999 909	15 414 738	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_3	nob-naob_3	NAOB, NorGram, NorGramBank	lfg	1 045 847	16 020 711	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_4	nob-naob_4	NAOB, NorGram, NorGramBank	lfg	1 077 587	16 643 145	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_5	nob-naob_5	NAOB, NorGram, NorGramBank	lfg	953 413	14 093 832	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_6	nob-naob_6	NAOB, NorGram, NorGramBank	lfg	942 721	12 090 392	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_7	nob-naob_7	NAOB, NorGram, NorGramBank	lfg	1 063 592	14 424 554	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_8	nob-naob_8	NAOB, NorGram, NorGramBank	lfg	438 743	6 708 000	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob_9	nob-naob_9	NAOB, NorGram, NorGramBank	lfg	73 041	1 434 798	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob-dan	nob-naob-dan	NAOB	lfg	941 702	14 205 518	no	The "NorGramBank fiction in older Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob-dan_1	nob-naob-dan_1	NAOB	lfg	1 104 420	16 080 811	no	The "NorGramBank fiction in older Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-naob-dan_2	nob-naob-dan_2	NAOB	lfg	312 743	4 399 173	no	The "NorGramBank fiction in older Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-ndt-lfg	nob-ndt-lfg	NDT, NorGram, NorGramBank	lfg	20 045	276 943	yes	The treebank "NorGram NDT in LFG in Norwegian Bokmål (derivate from the Norwegian Dependency Treebank)" is based on the text material in the Norwegian Dependency Treebank (NDT), available from Språkbanken at the National Library of Norway. The sentences have been parsed and disambiguated in the Norwegian LFG treebank using the NorGram LFG grammar.	CC-BY	no
	nob-newspaper	nob-newspaper	NorGram	lfg	6 323	79 597	yes	The "NorGram Newspaper text (30 documents from the years 2006 - 2009) in Norwegian Bokmål from the Norwegian Newspaper Corpus" treebank is a syntactically annotated corpus based on 30 documents taken from the years 2006 - 2009 from the Norwegian Newspaper Corpus (NCC). This treebank is part of the INESS NorGramBank collection (see URL in metadata). Note that the available treebank contains only those newspaper articles from 2012 and 2013 that have been manually preprocessed; see details elsewhere in the metadata.	CC-BY	no
	nob-novel	nob-novel	NorGram, NorGramBank	lfg	271 366	3 111 321	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-novel_1	nob-novel_1	NorGram, NorGramBank	lfg	406 280	4 369 998	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-novel_2	nob-novel_2	NorGram, NorGramBank	lfg	498 130	5 529 358	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-novel_3	nob-novel_3	NorGram, NorGramBank	lfg	441 721	4 639 538	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-novel_4	nob-novel_4	NorGram, NorGramBank	lfg	467 551	5 184 128	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-novel_5	nob-novel_5	NorGram, NorGramBank	lfg	443 891	4 817 445	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of the INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc.) before syntactic parsing.	CLARIN_ACA	no
	nob-novel_6	nob-novel_6	NorGram, NorGramBank	lfg	395 700	5 121 558	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc) before syntactic parsing. [less] The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on … [more]	CLARIN_ACA	no
	nob-novel_7	nob-novel_7	NorGram, NorGramBank	lfg	221 444	3 230 446	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc) before syntactic parsing. [less] The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on … [more]	CLARIN_ACA	no
	nob-novel_8	nob-novel_8	NAOB, NorGram, NorGramBank	lfg	570 543	7 201 100	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc) before syntactic parsing. [less] The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on … [more]	CLARIN_ACA	no
	nob-novel_9	nob-novel_9	NAOB, NorGram, NorGramBank	lfg	265 790	3 298 570	yes	The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on data taken from bokhylla.no at the National Library of Norway. This treebank is part of INESS NorGramBank collection (see URL in metadata). As of October 2015, the treebank comprises 2 469 916 sentences and 26 903 637 words. The source text was OCR-read by the National Library of Norway; INESS has preprocessed the source text semi-automatically with regard to OCR errors (misinterpreted letters etc) before syntactic parsing. [less] The "NorGramBank fiction in Norwegian Bokmål" treebank is a syntactically annotated corpus based on … [more]	CLARIN_ACA	no
	nob-nrk	nob-nrk	NorGram, NorGramBank	lfg	3 267	45 428	yes	The «Corona texts from NRK» treebank is a syntactically annotated corpus. It is based on data transcribed from the two newscasts Dagsrevyen and Supernytt produced by the Norwegian Broadcasting Corporation (NRK).	CC-BY	no
	nob-pargram (aligned)	nob-pargram (aligned)	NorGram, ParGram	lfg	112	603	yes	The ParGram collection is a collection of parallel treebanks covering a set of chosen syntactic constructions. The ParGram collection is a collaborative effort of the ParGram project, along with the ParSem project, by researcher groups in industrial and academic institutions around the world. The aim of ParGram is to produce wide coverage grammars for a variety of languages. These are written collaboratively within the linguistic framework of LFG (Lexical Functional Grammar) and with a commonly-agreed-upon set of grammatical features. The XLE (Xerox Linguistic Environment) is used as a development platform. ParSem develops semantic structures based on the ParGram syntactic structures. Most of the ParSem systems use the XLE’s XFR system. Regular semiannual meetings are being held to bring together the various research groups involved in ParGram and ParSem.	CC-BY	no
	nob-partma (aligned)	nob-partma (aligned)	NorGram, ParTMA	lfg	46	285	yes	The ParTMA collection is a collaborative effort by researcher groups in academic institutions around the world. The aim of ParTMA is to produce parallel treebanks that cover constructions relevant for the semantics of Tense, Mode and Aspect. The treebank sentences are analyzed with the grammars in the ParGram project.	CC-BY	no
	nob-sofie (aligned)	nob-sofie (aligned)	NorGram, NorGramBank	lfg	1 151	15 224	yes	The INESS Sofie Norwegian Treebank. The treebank is a syntactically annotated corpus based on the first chapters of the novel “Sofies verden” by Jostein Gaarder, published by Aschehoug forlag. The sentence analyses were produced by INESS for the META-NORD project, whose goal was to promote the accessibility of existing treebanks for the languages in the project. The corpus is automatically analyzed with the NorGram LFG grammar and all analyses are manually verified.	unspecified	no
	nob-sofie-lfg (aligned)	nob-sofie-lfg (aligned)	NorGram, Sofie	lfg	250	3 119	yes	The Norwegian part of the META-NORD Sofie Parallel Treebank, a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” (Sophie's World) by Jostein Gaarder, published by Aschehoug forlag. The treebank consists of grammatical annotations of extracts from the original and was created by the INESS project for META-NORD. For more information, see the metadata description of the META-NORD Sofie Parallel Treebank.	unspecified	no
	nob-ndt-dep	nob-ndt-dep	NDT	dependency-cg	20 045	276 789	yes	The treebank "Norwegian Dependency Treebank in Norwegian Bokmål (copy @ INESS)" is a syntactically annotated corpus, created by the National Library of Norway. The copy in INESS allows for searches in this treebank using the INESS search system. The original is downloadable at Språkbanken.	CC-BY	no
	nob-ud-2.0-dep	nob-ud-2.0-dep	Universal Dependencies 2.0	dependency-cg	18 106	249 060	no	The “Universal Dependencies 2.0” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-1983). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.0 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	nob-ud-2.1-dep	nob-ud-2.1-dep	Universal Dependencies 2.1	dependency-cg	20 045	275 758	no	The “Universal Dependencies 2.1” collection is searchable at the INESS portal; to read further details about the original collection as a whole, or about individual treebanks in the collections, we refer to the original, which is located at the LINDAT/CLARIN Centre for Language Research Infrastructure (http://hdl.handle.net/11234/1-2515). The individual treebanks have individual licenses, which are available through the joint license “Universal Dependencies v2.1 License Agreement”. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).	unspecified	no
	Old Russian (orv)			18 704	170 573
	orv-afnik-dep	orv-afnik-dep	TOROT	dependency	889	6 471	yes	The treebank "orv-afnik-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-avv-dep	orv-avv-dep	TOROT	dependency	3 238	22 180	yes	The treebank "orv-avv-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-const-dep	orv-const-dep	TOROT	dependency	755	8 920	yes	The treebank "orv-const-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-domo-dep	orv-domo-dep	TOROT	dependency	1 902	22 262	yes	The treebank "orv-domo-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-drac-dep	orv-drac-dep	TOROT	dependency	288	2 438	yes	The treebank "orv-drac-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-kiev-hyp-dep	orv-kiev-hyp-dep	TOROT	dependency	57	530	yes	The treebank "orv-kiev-hyp-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-lav-dep	orv-lav-dep	TOROT	dependency	7 128	52 316	yes	The treebank "orv-lav-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-luk-koloc-dep	orv-luk-koloc-dep	TOROT	dependency	91	872	yes	The treebank "orv-luk-koloc-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-mst-dep	orv-mst-dep	TOROT	dependency	8	157	yes	The treebank "orv-mst-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-novgorod-jaroslav-dep	orv-novgorod-jaroslav-dep	TOROT	dependency	30	410	yes	The treebank "orv-novgorod-jaroslav-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-pskov-dep	orv-pskov-dep	TOROT	dependency	201	2 301	yes	The treebank "orv-pskov-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-pskov-ivan-dep	orv-pskov-ivan-dep	TOROT	dependency	26	331	yes	The treebank "orv-pskov-ivan-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-riga-goth-dep	orv-riga-goth-dep	TOROT	dependency	111	1 499	yes	The treebank "orv-riga-goth-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-rig-smol1281-dep	orv-rig-smol1281-dep	TOROT	dependency	13	167	yes	The treebank "orv-rig-smol1281-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-rusprav-dep	orv-rusprav-dep	TOROT	dependency	421	3 930	yes	The treebank "orv-rusprav-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-sergrad-dep	orv-sergrad-dep	TOROT	dependency	1 441	19 905	yes	The treebank "orv-sergrad-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-smol-pol-lit-dep	orv-smol-pol-lit-dep	TOROT	dependency	23	335	yes	The treebank "orv-smol-pol-lit-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-usp-sbor-dep	orv-usp-sbor-dep	TOROT	dependency	2 043	24 927	yes	The treebank "orv-usp-sbor-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-ust-vlad-dep	orv-ust-vlad-dep	TOROT	dependency	30	481	yes	The treebank "orv-ust-vlad-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no
	orv-varlaam-dep	orv-varlaam-dep	TOROT	dependency	9	141	yes	The treebank "orv-varlaam-dep" is part of the TOROT treebank collection. The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes texts in Old Church Slavonic, Old Russian and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. The treebank is an expansion of the Slavic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.	CC-BY-NC-SA	no

Design & implementation: Paul Meurer, CLARINO Bergen Centre, 2025