COST (European Cooperation in Science and Technology) is a pan-European intergovernmental framework. Its mission is to enable break-through scientific and technological developments leading to new concepts and products and thereby contribute to strengthening Europe’s research and innovation capacities.

ISCH COST Action IS1312

DisFrEn

Submitted by Anonymous (not verified) on Tue, 20/10/2015 - 11:57

Corpus acronym:

DisFrEn

Developer:

Université catholique de Louvain

Authors:

Ludivine Crible

Contact person(s):

Ludivine Crible

Contact person e-mail(s):

ludivine.crible@uclouvain.be

Project URL:

http://uclouvain.be/435162.html

Availability:

annotation complete; extracted annotations available for the database

Languages covered:

English, French

Available translations:

the corpus is comparable, not translated

Corpus size (hours):

Corpus size (documents):

111

Corpus size (tokens):

161700

Mode:

spoken

Genre:

journalistic

science

interactional (social networks, sms, everyday conversation, etc.)

Genre (detailed):

interview (face-to-face and radio), conversation, phone calls, news broadcast, political speech, classroom lessons, sports commentaries

casual

semi-formal

formal

spontaneous

semi-spontaneous

non-spontaneous

Years of the data origin:

1991-2010

Unit of segmentation used:

word

Tools for annotation:

EXMARaLDA (Partitur Editor)

Tools for browsing:

EXMARaLDA (Exakt)

Tools for querying:

EXMARaLDA (Exakt)

Types of DSDs annotated:

explicit only; relational and non-relational (e.g. "because" and "well"); each takes scope over at least one unit that is equal to or bigger than a clause (excludes intra-sentential conjunctions)

Number of DSD instances:

8743

Method of annotation:

manual

Style/theory of annotation:

PDTB style

Format:

.exb (EXMARaLDA), Praat TextGrid, XML ...

Version number, release date:

will be version 1.0, finished by end of 2015

Previous versions and their release dates:

Citation (text format):

Crible, L. 2017. "Discourse markers and (dis)fluency across registers: A contrastive usage-based study in English and French". PhD thesis, Université catholique de Louvain.

Citation (bibTeX format):

@phdthesis{crib17, author = "Crible, Ludivine", title = "Discourse markers and (dis)fluency across registers: A contrastive usage-based study in English and French", school = "Université catholique de Louvain", year = "2017"}

Notes:

The transcription and audio files are not mine (they come either from freely available corpora or from corpora available by agreement). Most of them underwent technical processing to be homogenized into the same format; some of them were sound-aligned. All annotations are mine.

Audio/video annotation:

alignment of the audio/video to the transcriptions

Further info about the discourse relations:

senses/semantic labels are annotated for the relations

Other annotation layers:

syntactic position of the discourse markers; POS; disfluency markers
