Louvain Corpus of Annotated Speech - French

Corpus acronym: 
LOCAS-F
Developer: 
University of Louvain, Valibel - Discourse and Variation
Authors: 
Liesbeth Degand, Anne-Catherine Simon, Laurence J. Martin, Noalig Tanguy, Thomas Van Damme
Contact person(s): 
Liesbeth Degand
Availability: 
N/A
Languages covered: 
French
Available translations: 
N/A
Corpus size (hours): 
3:11
Corpus size (documents): 
N/A
Corpus size (sentences): 
N/A
Corpus size (tokens): 
36,912
Corpus size (other): 
N/A
Mode: 
spoken
Genre: 
journalistic
fiction
interactional (social networks, SMS, everyday conversation, etc.)
Genre (detailed): 
interview, narrative, radio news, political speech, academic speech
Register: 
casual
semi-formal
formal
Register (2): 
spontaneous
semi-spontaneous
non-spontaneous
Years of the data origin: 
2010
Document structure: 
N/A
Unit of segmentation used: 
Basic Discourse Unit, dependency clause, major intonation unit
Tools for annotation: 
Praat
Tools for browsing: 
Praaline
Tools for querying: 
Praaline
Types of DSDs annotated: 
Intersentential DMs (only DMs, including connectives, outside dependency clause), only explicit
Number of DSD instances: 
1334
Method of annotation: 
manual
Style/theory of annotation: 
N/A
Format: 
txt
Version number, release date: 

Version 1.0, October 2014

Previous versions and their release dates: 

N/A

Citation (text format): 

Degand, Liesbeth, Laurence J. Martin, and Anne-Catherine Simon. 2014. “Unités discursives de base et leur périphérie gauche dans LOCAS-F, un corpus oral multigenres annoté.” In CMLF 2014 - 4ème Congrès Mondial de Linguistique Française 2014, edited by EDP Sciences. Berlin, Allemagne.
Degand, L., A. C. Simon, N. Tanguy, T. Van Damme (2014). Initiating a discourse unit in spoken French: Prosodic and syntactic features of the left periphery. In Pons Bordería, Salvador (ed.): Discourse Segmentation in Romance Languages. [Pragmatics and Beyond New Series 250], Amsterdam: John Benjamins, 243-273.

Citation (bibTeX format): 

N/A

Notes: 

Ongoing work:
- adding spontaneous face-to-face data
- semantically labelling DMs

Audio/video annotation: 
alignment of the audio/video to the transcriptions
annotation of prosody
Other annotation layers: 
intonation/prosody
sentence morphosyntax, parse structure