Catalan Discourse GraphBank

Corpus acronym: 
CatDiG
Developer: 
Pompeu Fabra University
Authors: 
Toni Badia, Roser Saurí, Teresa Suñol
Contact person(s): 
Roser Saurí
Availability: 
The distribution procedure is currently being negotiated. Please contact the authors by email.
Languages covered: 
Catalan
Available translations: 
N/A
Corpus size (hours): 
N/A
Corpus size (documents): 
127
Corpus size (sentences): 
N/A
Corpus size (tokens): 
48,410
Corpus size (other): 
N/A
Mode: 
written
Genre: 
journalistic
fiction
Genre (detailed): 
Newspaper news, novels.
Register: 
semi-formal
formal
Text type: 
narrative
expository
Years of the data origin: 
1998-2003
Document structure: 
sentence boundaries
Unit of segmentation used: 
N/A
Tools for annotation: 
For delimiting discourse segments: Brandeis Annotation Tool (BAT, http://www.timeml.org/site/bat/). For setting relations among discourse segments: Cytoscape (http://www.cytoscape.org)
Tools for browsing: 
N/A
Tools for querying: 
N/A
Types of DSDs annotated: 
Explicit and implicit. Inter- and intra-sentential.
Number of DSD instances: 
N/A
Method of annotation: 
Manual
Style/theory of annotation: 
A combination of SDRT and GraphBank.
Format: 
XGMML (eXtensible Graph Markup and Modelling Language), XML.
Version number, release date: 

1.0 (in preparation)

Previous versions and their release dates: 

N/A

Citation (text format): 

Badia, T., R. Saurí, T. Suñol (2015). The Catalan Discourse GraphBank. Proceedings of the TextLink First Action Conference. Louvain-La-Neuve, Belgium. 26-28 January 2015.

Citation (bibTeX format): 

@InProceedings{badiaEtAl_2015,
  author    = {Toni Badia and Roser Saur\'{\i} and Teresa Su\~{n}ol},
  title     = {The Catalan Discourse GraphBank},
  booktitle = {Proceedings of the TextLink First Action Conference},
  address   = {Louvain-La-Neuve, Belgium},
  year      = {2015}
}

Notes: 

N/A

Further info about the discourse relations: 
information about arguments of each relation is available
senses/semantic labels are annotated for the relations
Other annotation layers: 
sentence morphosyntax, parse structure
anaphora (coreference, bridging)
argument structure and thematic roles, semantic classes of verbs, types of deverbal nouns, WordNet synsets for nouns, named entities