The Haifa Corpus of Spoken Hebrew

Corpus acronym: 
N/A
Developer: 
University of Haifa, Israel, Department of Hebrew Language
Authors: 
Yael Maschler
Contact person(s): 
Yael Maschler
Availability: 
by request
Languages covered: 
Hebrew
Available translations: 
N/A
Corpus size (hours): 
Over 17.5 hours
Corpus size (documents): 
325 audio files + 325 Word files
Corpus size (sentences): 
N/A
Corpus size (tokens): 
N/A
Corpus size (other): 
N/A
Mode: 
spoken
Genre: 
interactional (social networks, sms, everyday conversation, etc.)
radio phone-in programs
Genre (detailed): 
face-to-face conversation, political radio phone-ins
Register: 
casual
Register (2): 
spontaneous
Text type: 
narrative
argumentative
Years of the data origin: 
1993-2014 (and continuing into the present)
Document structure: 
Prosody: intonation unit boundaries, intonation contour type, length of pauses, primary and secondary stress, etc.
Unit of segmentation used: 
intonation unit
Tools for annotation: 
manual annotation, occasional use of PRAAT
Tools for browsing: 
Any web browser
Tools for querying: 
SketchEngine (for part of the corpus)
Types of DSDs annotated: 
All inter-sentential explicit discourse markers (textual, interpersonal, and cognitive) in approximately 40 minutes of the 17.5 hours have been identified manually (but not annotated in the corpus itself).
Number of DSD instances: 
574 tokens, 92 types in 40 minutes (out of the 17.5 hours)
Method of annotation: 
manual
Style/theory of annotation: 
Transcription conventions, University of California at Santa Barbara Linguistics Department (Du Bois, forthcoming).
Format: 
Word files; XML format available for the majority of the data.
Version number, release date: 
N/A
Previous versions and their release dates: 
N/A
Citation (text format): 
Maschler, Yael. 2014. The Haifa Corpus of Spoken Hebrew. http://weblx2.haifa.ac.il/~corpus/corpus_website/
Citation (BibTeX format): 
@misc{maschler2014haifa,
  author = {Maschler, Yael},
  title  = {The Haifa Corpus of Spoken Hebrew},
  year   = {2014},
  url    = {http://weblx2.haifa.ac.il/~corpus/corpus_website/}
}
Notes: 
Audio/video annotation: 
annotation of prosody
Other annotation layers: 
intonation/prosody