COST (European Cooperation in Science and Technology) is a pan-European intergovernmental framework. Its mission is to enable break-through scientific and technological developments leading to new concepts and products and thereby contribute to strengthening Europe’s research and innovation capacities.

ISCH COST Action IS1312

Penn Discourse TreeBank 2.0

Submitted by Anonymous (not verified) on Tue, 20/10/2015 - 11:57

Corpus acronym:

PDTB 2.0

Developer:

University of Pennsylvania, School of Computer & Information Science

Authors:

Rashmi Prasad, Aravind Joshi, Eleni Miltsakaki, Alan Lee, Nikhil Dinesh, Livio Robaldo, Geraud Campion, Bonnie Webber

Contact person(s):

Bonnie Webber

Contact person e-mail(s):

bonnie.webber@ed.ac.uk

Project URL:

http://www.seas.upenn.edu/~pdtb

Availability:

Contact the Linguistic Data Consortium, http://www.ldc.upenn.edu

Languages covered:

English

Available translations:

The Penn TreeBank corpus (over whose raw text the PDTB has been annotated) has been translated into Czech; the translation is available as the PCEDT.

Corpus size (documents):

~2400 documents

Corpus size (tokens):

~1M words, ~40K annotation tokens

Mode:

written

Genre:

journalistic

Genre (detailed):

Corpus contains news, essays, reviews, letters to the editor, errata

formal

non-spontaneous

Text type:

narrative

expository

Years of the data origin:

1989

Document structure:

Raw text preserves sentence and paragraph boundaries; PDTB 2.0 recovers divisions between letters and between "news summaries" found in the original WSJ documents

Tools for annotation:

Annotation tool available at http://www.seas.upenn.edu/~pdtb

Tools for browsing:

Browser available at http://www.seas.upenn.edu/~pdtb

Types of DSDs annotated:

+ explicit inter-sentential discourse connectives and alternative lexicalizations of discourse connectives
+ explicit intra-sentential discourse connectives
+ implicit discourse relations between adjacent sentences within the same paragraph

Number of DSD instances:

~40K

Method of annotation:

Manual, with manual adjudication

Style/theory of annotation:

PDTB style

Annotation manual URL:

http://www.seas.upenn.edu/~pdtb/PDTBAPI/pdtb-annotation-manual.pdf

Format:

Either pipe-delimited fields or multi-line format
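As a rough illustration of working with the pipe-delimited variant, the sketch below splits each record on the "|" character and tallies records by their first field. The sample records are made up, and both the field order and the use of the first field as a relation type are assumptions made for illustration; the actual column layout should be taken from the annotation manual.

```python
from collections import Counter
from typing import Iterable, List


def split_pipe_record(line: str) -> List[str]:
    """Split one pipe-delimited record into fields, keeping empty fields."""
    return line.rstrip("\r\n").split("|")


def count_first_field(lines: Iterable[str]) -> Counter:
    """Count records by the value of their first field, skipping blank lines."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue
        fields = split_pipe_record(line)
        counts[fields[0]] += 1
    return counts


# Made-up sample records, NOT real PDTB data; the field layout is hypothetical.
sample = [
    "Explicit|wsj_0000|but||Comparison.Contrast\n",
    "Implicit|wsj_0000||because|Contingency.Cause\n",
    "\n",
    "Explicit|wsj_0001|and||Expansion.Conjunction\n",
]

counts = count_first_field(sample)
print(counts["Explicit"], counts["Implicit"])
```

Splitting with `str.split("|")` rather than a whitespace-based split keeps empty fields in place, so field positions stay aligned even when a record leaves some fields blank.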

Version number, release date:

PDTB 2.0, released 2008

Previous versions and their release dates:

PDTB 1.0, March 2006

Citation (text format):

Rashmi Prasad, Nikhil Dinesh, Alan Lee, Eleni Miltsakaki, Livio Robaldo, Aravind Joshi and Bonnie Webber. The Penn Discourse Treebank 2.0. Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC). Marrakech, Morocco, 2008.

Rashmi Prasad, Bonnie Webber and Aravind Joshi. Reflections on the Penn Discourse TreeBank, Comparable Corpora and Complementary Annotation. Computational Linguistics 40(4), December 2014.

Citation (bibTeX format):

@inproceedings{prasad08,
  author = "Rashmi Prasad and Nikhil Dinesh and Alan Lee and Eleni Miltsakaki and Livio Robaldo and Aravind Joshi and Bonnie Webber",
  title = "{The Penn Discourse TreeBank 2.0}",
  booktitle = "{Proceedings, 6th International Conference on Language Resources and Evaluation}",
  address = "Marrakech, Morocco",
  year = "2008",
  pages = "2961--2968"
}

@article{prasad-etal14,
  author = {Rashmi Prasad and Bonnie Webber and Aravind Joshi},
  year = {2014},
  title = {Reflections on the Penn Discourse TreeBank, Comparable Corpora and Complementary Annotation},
  journal = {Computational Linguistics},
  volume = {40(4)},
  pages = {921-950},
  doi = {10.1162/COLI_a_00204}
}

Notes:

Further info about the discourse relations:

information about arguments of each relation is available

senses/semantic labels are annotated for the relations

Other annotation layers:

sentence morphosyntax, parse structure

anaphora (coreference, bridging)

semantic roles (all of these layers are distributed separately, as the Penn TreeBank, PropBank, and OntoNotes)
