WG1: Resources

This WG is responsible for assembling a list of existing corpora in the various languages, checking copyright issues, and developing a systematic description of each corpus in the form of standardised metadata (to support interoperable search and facilitate comparisons between languages, genres, modes, …). The web portal administered by the Action will collect information on these corpora and assign appropriate metadata to each, i.e. information on the language data the corpus contains. Links to the DSD-Annotated Corpora and DSD Inventories can be found in the Resources section.


Dr Jiří Mirovsky, Dr Amalia Mendes

  • Updating the list of discourse-annotated corpora
  • Designing a standardised metadata set
  • Applying the metadata to each of the corpora
  • Extending the list of corpora to additional data sets
  • Common metadata set 
  • Lexicons of DRDs that have been collected for various languages will be harmonised to make them interoperable. This will lead to the first comprehensive multilingual overview of DRD-relevant resources on a European scale.