Potsdam Commentary Corpus 2.0

Corpus acronym: 
PCC 2.0
Developer: 
Applied Computational Linguistics, University of Potsdam (Germany)
Authors: 
Arne Neumann, Manfred Stede
Contact person(s): 
Manfred Stede
Availability: 
Download from webpage
Languages covered: 
German
Corpus size (documents): 
175
Corpus size (sentences): 
ca. 1,600
Corpus size (tokens): 
ca. 32,000
Mode: 
written
Genre: 
journalistic
Genre (detailed): 
newspaper commentary (op-ed)
Register: 
formal
Register (2): 
non-spontaneous
Text type: 
argumentative
Years of the data origin: 
2000-2003
Document structure: 
Headings, paragraph boundaries
Tools for annotation: 
MMAX2, Annotate, Conano, RSTTool
Tools for browsing: 
ANNIS
Tools for querying: 
ANNIS
Types of DSDs annotated: 
explicit connectives, both inter-sentential and intra-sentential
Number of DSD instances: 
1100
Method of annotation: 
manual
Style/theory of annotation: 
RST-style, PDTB-style
Format: 
Various XML formats
Version number, release date: 

2.0, October 2014

Previous versions and their release dates: 

PCC 1.0, 2003

Citation (text format): 

M. Stede, A. Neumann. Potsdam Commentary Corpus 2.0: Annotation for Discourse Research. In: Proc. of LREC, Reykjavik, 2014.

Citation (bibTeX format): 

@inproceedings{StedeNeumann:14,
  author    = {Manfred Stede and Arne Neumann},
  title     = {Potsdam Commentary Corpus 2.0: Annotation for Discourse Research},
  booktitle = {Proc. of the International Conference on Language Resources and Evaluation (LREC)},
  year      = {2014},
  address   = {Reykjavik},
  pages     = {925--929},
  standort  = {PDF}
}

Notes: 
Further info about the discourse relations: 
information about arguments of each relation is available
senses/semantic labels are annotated for the relations
Other annotation layers: 
sentence morphosyntax, parse structure
anaphora (coreference, bridging)
rhetorical structure (RST)