Publications

Attribution/Acknowledgements

If you use any INESS services in your research, for instance, to search or annotate treebanks, we request an acknowledgement to INESS, mentioning the webpage http://clarino.uib.no/iness and the following reference in your publications:

Victoria Rosén, Koenraad De Smedt, Paul Meurer, and Helge Dyvik. 2012. An open infrastructure for advanced treebanking. In Jan Hajič, Koenraad De Smedt, Marko Tadić, and António Branco (eds.), META-RESEARCH Workshop on Advanced Treebanking at LREC2012, pages 22–29, Istanbul, Turkey, May 2012. European Language Resources Association (ELRA).

You should also acknowledge the creators of the treebanks which you use in your research, and if possible, refer to a treebank with its persistent identifier. For general information on how to cite linguistic data, see the following:

Helene N. Andreassen, Andrea L. Berez-Kroeker, Lauren B. Collister, Philipp Conzett, Christopher Cox, Koenraad De Smedt, and Bradley McDonnell. 2020. Tromsø Recommendations for Citation of Research Data in Linguistics. RDA Linguistics Data Interest Group.

Rosén, Victoria, Helge Dyvik, Paul Meurer, Koenraad De Smedt, Miriam Butt, and Ida Toivonen. 2020. Creating and Exploring LFG Treebanks. In Proceedings of the LFG’20 Conference, pages 328–348. Stanford, CA: CSLI Publications.

Meurer, Paul. 2020. Designing Efficient Algorithms for Querying Large Corpora. Oslo Studies in Language 11 (2): 283–302.

Meurer, Paul, Victoria Rosén, and Koenraad De Smedt. 2020. Interactive Visualizations in INESS. In Miriam Butt, Annette Hautli-Janisz, and Verena Lyding (eds.) LingVis: Visual Analytics for Linguistics, pages 55–85. Stanford, California: CSLI Publications / University of Chicago Press.

Helge Dyvik, Gyri Smørdal Losnegaard & Victoria Rosén. 2019. Multiword expressions in an LFG grammar for Norwegian. In: Yannick Parmentier & Jakub Waszczuk (eds.), Representation and parsing of multiword expressions: Current trends (Phraseology and Multiword Expressions, Vol. 2), pages 69–108. Berlin: Language Science Press.

Victoria Rosén, Helge J. Jakhelln Dyvik, Paul Meurer, and Koenraad De Smedt. 2017. Exploring Treebanks with INESS Search. In Proceedings of the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), NEALT Proceedings Series 29, Linköping Electronic Conference Proceedings 131: 326–329.

Meurer, Paul. 2017. From LFG Structures to Dependency Relations. In Victoria Rosén and Koenraad De Smedt (eds.), The Very Model of a Modern Linguist — in Honor of Helge Dyvik, Bergen Language and Linguistics Studies (BeLLS) 8: 183–201.

Victoria Rosén and Kaja Borthen. 2017. Norwegian Bare Singulars Revisited. In Victoria Rosén and Koenraad De Smedt (eds.), The Very Model of a Modern Linguist — in Honor of Helge Dyvik, Bergen Language and Linguistics Studies (BeLLS) 8: 220–240.

Victoria Rosén, Martha Thunes, Petter Haugereid, Gyri Smørdal Losnegaard, Helge Dyvik, Paul Meurer, Gunn Inger Lyse, and Koenraad De Smedt. 2016. The enrichment of lexical resources through incremental parsebanking. Language Resources and Evaluation 50(2), pages 291–319.

Helge Dyvik, Paul Meurer, Victoria Rosén, Koenraad De Smedt, Petter Haugereid, Gyri Smørdal Losnegaard, Gunn Inger Lyse, and Martha Thunes. 2016. NorGramBank: A ‘Deep’ Treebank for Norwegian. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk and Stelios Piperidis (eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 3555–3562, Portorož, Slovenia. ELRA.

Victoria Rosén, Koenraad De Smedt, Gyri Smørdal Losnegaard, Eduard Bejček, Agata Savary, and Petya Osenova. 2016. MWEs in treebanks: From survey to guidelines. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk and Stelios Piperidis, editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 2323–2330, Portorož, Slovenia. ELRA.

Paul Meurer, Victoria Rosén, and Koenraad De Smedt. 2016. Interactive Visualizations in the INESS Treebanking Infrastructure. In Annette Hautli-Janisz and Verena Lyding (eds.), Proceedings of the LREC'16 workshop VisLR II: Visualization as Added Value in the Development, Use and Evaluation of Language Resources, pages 1–7. Portorož, Slovenia. ELRA.

De Smedt, Koenraad, Gunn Inger Lyse Samdal, Rune Kyrkjebø, Hemed Ali Hemed Al Ruwehy, Øyvind Liland Gjesdal, Victoria Rosén, and Paul Meurer. 2016. The CLARINO Bergen Centre: Development and Deployment. In Selected Papers from the CLARIN Annual Conference 2015, October 14–16, 2015, Wrocław, Poland, 1–12. Linköping Electronic Conference Proceedings. Linköping University Electronic Press.

Koenraad De Smedt, Victoria Rosén, and Paul Meurer. 2015. Studying consistency in UD treebanks with INESS-Search. In Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, and Adam Przepiórkowski (eds.), Proceedings of the Fourteenth Workshop on Treebanks and Linguistic Theories (TLT14), pages 258–267, Warsaw, Poland. Institute of Computer Science, Polish Academy of Sciences.

Victoria Rosén, Gyri Losnegaard, Koenraad De Smedt, Eduard Bejček, Agata Savary, Adam Przepiórkowski, Petya Osenova and Verginica Barbu Mititelu. 2015. A survey of multiword expressions in treebanks. In Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, and Adam Przepiórkowski, editors, Proceedings of the Fourteenth Workshop on Treebanks and Linguistic Theories (TLT14), pages 179–193, Warsaw, Poland. Institute of Computer Science, Polish Academy of Sciences.

Victoria Rosén, Petter Haugereid, Martha Thunes, Gyri Smørdal Losnegaard, Helge Dyvik, and Paul Meurer. 2014. The interplay between lexical and syntactic resources in incremental parsebanking. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk and Stelios Piperidis (eds.), Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pages 1617–1624, Reykjavik, Iceland, May 2014. European Language Resources Association (ELRA).

Cheikh M. Bamba Dione. 2014. LFG parse disambiguation for Wolof. Journal of Language Modelling Vol. 2, No. 1, pages 105–165.

Cheikh M. Bamba Dione. 2014. Pruning the search space of the Wolof LFG grammar using a probabilistic and a constraint grammar parser. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk, and Stelios Piperidis (eds.), Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pages 2863–2870, Reykjavik, Iceland, May 2014. European Language Resources Association (ELRA).

Victoria Rosén. 2014. Språkteknologiens behov for leksikalsk informasjon. In Ruth Vatvedt Fjeld and Marit Hovdenak, editors, Nordiske studier i leksikografi 12, Rapport fra Konferanse om leksikografi i Norden, Oslo, 13.-16. august 2013, pages 13–41, Novus forlag.

Paul Meurer, Helge Dyvik, Victoria Rosén, Koenraad De Smedt, Gunn Inger Lyse, Gyri Smørdal Losnegaard, and Martha Thunes. 2013. The INESS treebanking infrastructure. In Stephan Oepen, Kristin Hagen, and Janne Bondi Johannessen, editors, Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22–24, 2013, Oslo University, Norway. NEALT Proceedings Series 16, number 85 in Linköping Electronic Conference Proceedings, pages 453–458. Linköping University Electronic Press.

Helge Dyvik, Martha Thunes, Petter Haugereid, Victoria Rosén, Paul Meurer, Koenraad De Smedt, and Gyri Smørdal Losnegaard. 2013. Studying interannotator agreement in discriminant-based parsebanking. In Sandra Kübler, Petya Osenova, and Martin Volk, editors, Proceedings of the Twelfth Workshop on Treebanks and Linguistic Theories (TLT12), pages 37–48. Bulgarian Academy of Sciences.

Sebastian Sulger, Miriam Butt, Tracy Holloway King, Paul Meurer, Tibor Laczkó, György Rákosi, Cheikh Bamba Dione, Helge Dyvik, Victoria Rosén, Koenraad De Smedt, Agnieszka Patejuk, Özlem Çetinoglu, I Wayan Arka, and Meladel Mistica. 2013. ParGramBank: The ParGram parallel treebank. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Vol. 1, pages 550–560, Sofia, Bulgaria, August 2013. Association for Computational Linguistics.

Gyri Smørdal Losnegaard, Gunn Inger Lyse, Anje Müller Gjesdal, Koenraad De Smedt, Paul Meurer, and Victoria Rosén. 2013. Linking Northern European infrastructures for improving the accessibility and documentation of complex resources. In Koenraad De Smedt, Lars Borin, Krister Lindén, Bente Maegaard, Eiríkur Rögnvaldsson, and Kadri Vider (eds.), Proceedings of the workshop on Nordic language research infrastructure at NODALIDA 2013, number 89 in Linköping Electronic Conference Proceedings. Linköping University Electronic Press.

Inguna Skadiņa, Andrejs Vasiļjevs, Lars Borin, Krister Lindén, Gyri Losnegaard, Sussi Olsen, Bolette Sandford Pedersen, Roberts Rozis, and Koenraad De Smedt. 2013. Baltic and Nordic parts of the European linguistic infrastructure. In Stephan Oepen, Kristin Hagen, and Janne Bondi Johannessen (eds.), Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22–24, 2013, Oslo University, Norway. NEALT Proceedings Series 16, number 85 in Linköping Electronic Conference Proceedings, pages 195–211. Linköping University Electronic Press.

Victoria Rosén, Paul Meurer, Gyri Smørdal Losnegaard, Gunn Inger Lyse, Koenraad De Smedt, Martha Thunes, and Helge Dyvik. 2012. An integrated web-based treebank annotation system. In Iris Hendrickx, Sandra Kübler, and Kiril Simov (eds.), Proceedings of the Eleventh International Workshop on Treebanks and Linguistic Theories (TLT11), pages 157–167. Lisbon, Portugal: Edições Colibri.

Paul Meurer. 2012. INESS-Search: A search system for LFG (and other) treebanks. In Miriam Butt and Tracy Holloway King (eds.), Proceedings of the LFG ’12 Conference, pages 404–421, Stanford, CA: CSLI Publications.

Gyri Smørdal Losnegaard, Gunn Inger Lyse, Martha Thunes, Victoria Rosén, Koenraad De Smedt, Helge Dyvik, and Paul Meurer. 2012. What we have learned from Sofie: Extending lexical and grammatical coverage in an LFG parsebank. In Jan Hajič, Koenraad De Smedt, Marko Tadić, and António Branco (eds.), META-RESEARCH Workshop on Advanced Treebanking at LREC2012, pages 69–76, Istanbul, Turkey.

Victoria Rosén. 2012. Exploring corpora through syntactic annotation. In Gisle Andersen (ed.), Exploring Newspaper Language: Using the web to create and investigate a large corpus of modern Norwegian (Studies in Corpus Linguistics, Vol. 49), pages 67–78. John Benjamins, Amsterdam/Philadelphia.

Andrejs Vasiļjevs, Markus Forsberg, Tatiana Gornostay, Dorte Haltrup Hansen, Kristín Jóhannsdóttir, Gunn Lyse, Krister Lindén, Lene Offersgaard, Sussi Olsen, Bolette Pedersen, Eiríkur Rögnvaldsson, Inguna Skadiņa, Koenraad De Smedt, Ville Oksanen, and Roberts Rozis. 2012. Creation of an open shared language resource repository in the Nordic and Baltic countries. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), pages 1076–1083, Istanbul, Turkey, May 2012. European Language Resources Association (ELRA).

Victoria Rosén and Koenraad De Smedt. 2010. Syntactic annotation of learner corpora. In Hilde Johansen, Anne Golden, Jon Erik Hagen, and Ann-Kristin Helland, editors, Systematisk, variert, men ikke tilfeldig: Antologi om norsk som andrespråk i anledning Kari Tenfjords 60-årsdag, pages 120–132. Novus forlag.

Publication from the XPAR project

Helge Dyvik, Paul Meurer, Victoria Rosén, and Koenraad De Smedt. 2009. Linguistically motivated parallel parsebanks. In Marco Passarotti, Adam Przepiórkowski, Sabine Raynaud, and Frank Van Eynde, editors, Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories, pages 71–82, Milan, Italy. EDUCatt.

Publications from the TREPIL project

Victoria Rosén, Paul Meurer, and Koenraad De Smedt. 2009. LFG Parsebanker: A toolkit for building and searching a treebank as a parsed corpus. In Frank Van Eynde, Anette Frank, Gertjan van Noord, and Koenraad De Smedt, editors, Proceedings of the Seventh International Workshop on Treebanks and Linguistic Theories (TLT7), pages 127–133, Utrecht. LOT.

Paul Meurer. 2009. A Computational Grammar for Georgian. In P. Bosch, D. Gabelaia, & J. Lang (Eds.), Logic, Language, and Computation. TbiLLC 2007. Lecture Notes in Computer Science, Vol. 5422, pages 1–15. Springer.

Cahill, Aoife, John T. Maxwell III, Paul Meurer, Christian Rohrer, and Victoria Rosén. 2008. Speeding up LFG Parsing Using C-Structure Pruning. In Coling 2008: Proceedings of the Workshop on Grammar Engineering Across Frameworks, 33–40. Manchester, UK.

Victoria Rosén. 2008. Mot en trebank for talespråk. In Janne Bondi Johannessen and Kristin Hagen, editors, Språk i Oslo. Ny forskning omkring talespråk, pages 214–225. Novus forlag, Oslo.

Victoria Rosén, Paul Meurer, and Koenraad De Smedt. 2007. Designing and Implementing Discriminants for LFG Grammars. In Tracy Holloway King and Miriam Butt, editors, The Proceedings of the LFG ’07 Conference, pages 397–417. CSLI Publications, Stanford.

Victoria Rosén and Koenraad De Smedt. 2007. Theoretically motivated treebank coverage. In Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA-2007), pages 152–159. Tartu University Library, Tartu.

Victoria Rosén, Koenraad De Smedt, and Paul Meurer. 2006. Towards a toolkit linking treebanking to grammar development. In Proceedings of the Fifth Workshop on Treebanks and Linguistic Theories, pages 55–66.

Victoria Rosén, Koenraad De Smedt, Helge Dyvik, and Paul Meurer. 2005. TREPIL: Developing methods and tools for multilevel treebank construction. In Montserrat Civit, Sandra Kübler, and Ma. Antònia Martí, editors, Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005), pages 161–172.

Victoria Rosén, Paul Meurer, and Koenraad De Smedt. 2005. Constructing a parsed corpus with a large LFG grammar. In Proceedings of LFG’05, pages 371–387. CSLI Publications.

Joakim Nivre, Koenraad De Smedt, and Martin Volk. 2005. Treebanking in Northern Europe: A white paper. In Henrik Holmboe, editor, Nordisk Sprogteknologi 2004. Årbog for Nordisk Sprogteknologisk Forskningsprogram 2000-2004, pages 97–112. Museum Tusculanums Forlag, Copenhagen.

Publications from the NorGram project

Miriam Butt, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi, and Christian Rohrer. 2002. The Parallel Grammar project. In John Carroll, Nelleke Oostdijk, and Richard Sutcliffe, editors, Proceedings of the Workshop on Grammar Engineering and Evaluation at the 19th International Conference on Computational Linguistics (COLING), Taipei, Taiwan, 2002, pages 1–7. Association for Computational Linguistics.

Helge Dyvik. 2000. Nødvendige noder i norsk: Grunntrekk i en leksikalsk-funksjonell beskrivelse av norsk syntaks [Necessary nodes in Norwegian: Basic properties of a lexical-functional description of Norwegian syntax]. In Øivin Andersen, Kjersti Fløttum, and Torodd Kinn, editors, Menneske, språk og fellesskap, pages 25–45. Novus forlag, Oslo.