DSD-Annotated Corpora

Title | Authors | Languages covered
TED Multilingual Discourse Bank | Deniz Zeyrek, Amalia Mendes, Sam Gibbon, Yulia Grishina, Maciej Ogrodniczuk, Murathan Kurfalı | English, German, Polish, Portuguese, Russian, Turkish
ANNODIS | Stergos D. Afantenos, Nicholas Asher, Farah Benamara, Myriam Bras, Cécile Fabre, Lydia-Mai Ho-Dac, Anne Le Draoulec, Philippe Muller, Marie-Paule Péry-Woodley, Laurent Prévot, Josette Rebeyrolle, Ludovic Tanguy, Marianne Vergez-Couret, Laure Vieu | French
Catalan Discourse GraphBank | Toni Badia, Roser Saurí, Teresa Suñol | Catalan
Contexts of subordination | Jyrki Kalliokoski, Ilona Herlin, Maria Viluna, Tomi Visakko | Finnish
Corpus for the Analysis of German-English Contrasts in Cohesion | Ekaterina Lapshinova-Koltunski, Kerstin Kunz, Katrin Menzel, Erich Steiner, Jose Manuel Martinez Martinez, Stefania Degaetano-Ortlieb, Marilisa Amoia | English, German
CSTNews Corpus | Thiago Pardo, Lucia Castro, Erick Maziero, Vinicius Uzêda, Pedro Balage | Brazilian Portuguese
DiscAn - Towards a Discourse Annotation system for Dutch language corpora | Ted Sanders, Kirsten Vis, Daan Broeder | Dutch
Disco-SPICE | Ines Rehbein, Merel Scholman, Vera Demberg | English
DisFrEn | Ludivine Crible | English, French
Finnish PropBank | Haverinen, K.; Laippala, V.; Kohonen, S.; Missilä, A.; Nyblom, J.; Ojala, S.; Viljanen, T.; Salakoski, T. & Ginter, F. | Finnish
French Discourse Treebank1 | Laurence Danlos, Margot Colinet, Jacques Steinlin | French
Greek Sentiment Corpus - Oral | Giouli, V., Fotopoulou, A., Mouka, E., Saridakis, I. | Greek
HuComTech Multimodal Corpus | László Hunyadi with the Department of General and Applied Linguistics, University of Debrecen | Hungarian
Louvain Corpus of Annotated Speech - French | Liesbeth Degand, Anne Catherine Simon, Laurence J. Martin, Noalig Tanguy, Thomas Van Damme | French
LUNA corpus | Sara Tonelli, Rashmi Prasad, Giuseppe Riccardi, Aravind Joshi | Italian
Penn Discourse TreeBank 2.0 | Rashmi Prasad, Aravind Joshi, Eleni Miltsakaki, Alan Lee, Nikhil Dinesh, Livio Robaldo, Geraud Campion, Bonnie Webber | English
Potsdam Commentary Corpus 2.0 | Arne Neumann, Manfred Stede | German
Prague Dependency Treebank 3.0 | Eduard Bejček, Eva Hajičová, Jan Hajič, Pavlína Jínová, Václava Kettnerová, Veronika Kolářová, Marie Mikulová, Jiří Mírovský, Anna Nedoluzhko, Jarmila Panevová, Lucie Poláková, Magda Ševčíková, Jan Štěpánek, Šárka Zikánová | Czech
Prague Discourse Treebank 2.0 | Rysová Magdaléna, Synková Pavlína, Mírovský Jiří, Hajičová Eva, Nedoluzhko Anna, Ocelák Radek, Pergler Jiří, Poláková Lucie, Scheller Veronika, Zdeňková Jana, Zikánová Šárka | Czech
RST Signalling Corpus | Debopam Das, Maite Taboada | English
RST Spanish Treebank | Iria da Cunha, Juan-Manuel Torres-Moreno, Gerardo Sierra | Spanish
STAC - Linguistic Corpus | Nicholas Asher, J. Hunter, M. Morey, F. Benamara, S. Afantenos | English
STAC - Situated corpus | Nicholas Asher, J. Hunter, M. Morey, F. Benamara, S. Afantenos | English
The Basque RST Treebank | Mikel Iruskieta, Oier Lopez de Lacalle, Esther Miranda, Kike Fernandez, Maxux Aranzabe, Itziar Gonzalez, Mikel Lersundi, Arantza Diaz de Ilarraza | Basque
The Haifa Corpus of Spoken Hebrew | Yael Maschler | Hebrew
The Multilingual RST Treebank | Mikel Iruskieta, Iria da Cunha, Maite Taboada | English, Spanish, Basque
The MULTINOT Corpus | Julia Lavid | English, Spanish
Turkish Discourse Bank | Deniz Zeyrek, Ruket Çakıcı, Ümit Deniz Turan, Işın Demirşahin, Ayışığı Sevdik Çallı, Hale Ögel Balaban | Turkish
Val.Es.Co. 2.0 Corpus | Adrián Cabedo, Salvador Pons | Spanish