Project description

Language infrastructures will not just be repositories, but virtual laboratories.

Treebanks are databases of detailed grammatical analyses of language data. They are useful in the study of linguistic phenomena and are important tools in research and development on a wide range of language understanding applications. Parallel treebanks link analyzed sentences to their translations in other languages and are therefore useful in translation studies and in research and development on high-quality machine translation. Parallel treebanks are constructed by aligning monolingual treebanks.

INESS, the Norwegian Infrastructure for the Exploration of Syntax and Semantics, aims to provide an eScience laboratory for linguistic research. The infrastructure is based on treebanks, which are databases of syntactically and semantically annotated sentences. Such databases are indispensable sources for developing quantitative models of language competence and language processes, based on statistical and machine learning methods. Developing such models is an important step towards deeper insights into the actual use of linguistic constructions and towards the next generation of language technology applications for real-life needs, such as information retrieval and multilingual technologies.

Past and present project participants are Victoria Rosén (project leader), Helge Julius Jakhelln Dyvik, Paul Meurer, Koenraad De Smedt, Petter Haugereid, Gyri Smørdal Losnegaard, Gunn Inger Lyse, Martha Thunes, Sindre Sørensen, Olav Smørholm and Marius Bakke.

INESS is a project supported by the Research Council of Norway under grant 195323/V30 of the Infrastruktur program, and by the University of Bergen. The consortium consists of the University of Bergen and Uni Research, with partners. The construction phase runs from April 1, 2010 until June 30, 2017. INESS cooperates closely with CLARINO and aims to be compatible with CLARIN standards.

INESS has cooperated with the META-NORD project (2011-2013), supported by the EU under CIP-ICT-PSP grant agreement no. 270899.