The overarching aim of the Textlink Action is to unify (scattered) linguistic resources on discourse structure and to build systems searchable by form and meaning, enabling cross-linguistic investigations. As discussed at the “Portal Use Case Focus Meeting” in Edinburgh in February 2017, a group of researchers within Textlink has taken the initiative to develop a multilingual, cross-linguistic corpus of TED talks (TED-MDB), in which TED talk transcripts are annotated in the PDTB style. The resource currently includes annotations for six languages (English, European Portuguese, Polish, German, Russian and Turkish) and is intended to be extended to new languages, with richer annotations covering the aspects of spoken language that are a component of the TED talks.
To discuss issues related to cross-lingual discourse-level annotation, we will hold a meeting on “Annotation of Discourse Relational Devices (DRDs): Multilingual and Multimodal Challenges” in Madrid (Spain) on 12-14 June 2017.
The meeting has three main aims: a) to discuss, plan and facilitate the extension of TED-MDB to additional languages; b) to consider complementary aspects of the annotation of DRDs in spoken and written language, in both multilingual and multigenre contexts; c) to explore complementarities between multilingual and multimodal annotation using the TED talks as our testbed.
The meeting has a very practical focus: we will provide extensive hands-on experience with the multilingual and multimodal annotation of DRDs in the TED talks, reviewing the methodologies used in the annotation of different languages and the annotation proposals for spoken DRDs in different genres.
More information is available on the workshop website.