Polytonia
Author(s) - Piet Mertens
Publication year - 2021
Publication title - Journal of Speech Sciences
Language(s) - English
Resource type - Journals
ISSN - 2236-9740
DOI - 10.20396/joss.v4i2.15053
Subject(s) - annotation , pitch contour , transcription (linguistics) , speech recognition , syllabic verse , computer science , segmentation , pitch detection algorithm , range (aeronautics) , syllable , artificial intelligence , speech processing , pattern recognition (psychology) , linguistics , philosophy , materials science , composite material
This paper first proposes a labeling scheme for tonal aspects of speech and then describes an automatic annotation system that uses this transcription. The fine-grained transcription provides labels indicating the pitch level and pitch movement of individual syllables. Of the five pitch levels, three (low, mid, high) are defined on the basis of pitch changes in the local context and two (bottom, top) are defined relative to the boundaries of the speaker’s global pitch range. For pitch movements, both simple and compound, the transcription indicates direction (rise, fall, level) and size, using size categories (pitch intervals) adjusted relative to the speaker’s pitch range. The automatic tonal annotation system combines several processing steps: segmentation into syllable peaks, pause detection, pitch stylization, pitch range estimation, classification of the intra-syllabic pitch contour, and pitch level assignment. It uses a dedicated rule-based procedure which, unlike commonly used supervised learning techniques, does not require a labeled corpus for training the model. The paper also includes a preliminary evaluation of the annotation system on a reference corpus of nearly 14 minutes of spontaneous speech in French and Dutch, in order to quantify the annotation errors. The results, expressed in terms of the standard measures of precision, recall, accuracy and F-measure, are encouraging. For the pitch levels low, mid and high, the F-measure ranges from 0.815 to 0.946, and for pitch movements it ranges from 0.708 to 1. Given additional modules for the detection of prominence and prosodic boundaries, the resulting annotation may serve as input for a phonological annotation.
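The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a one-vs-rest computation per label over syllables aligned between a reference and an automatic annotation; the abstract does not state how the paper computes its scores, so this is only the textbook definition. The function name `per_label_scores`, the label strings and the toy data are hypothetical and are not taken from the paper.

```python
def per_label_scores(reference, predicted):
    """Per-label precision, recall, accuracy and F-measure (one-vs-rest).

    `reference` and `predicted` are equal-length sequences of labels,
    one label per syllable. Illustrative only; not the paper's code.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    n = len(reference)
    pairs = list(zip(reference, predicted))
    scores = {}
    for label in sorted(set(reference) | set(predicted)):
        tp = sum(1 for r, p in pairs if r == label and p == label)
        fp = sum(1 for r, p in pairs if r != label and p == label)
        fn = sum(1 for r, p in pairs if r == label and p != label)
        tn = n - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        accuracy = (tp + tn) / n if n else 0.0
        scores[label] = {
            "precision": precision,
            "recall": recall,
            "accuracy": accuracy,
            "f_measure": f_measure,
        }
    return scores


# Toy data (made up): L = low, M = mid, H = high.
reference = ["L", "M", "H", "M", "L", "H"]
predicted = ["L", "M", "M", "M", "L", "H"]
scores = per_label_scores(reference, predicted)
for label, values in scores.items():
    print(label, {name: round(v, 3) for name, v in values.items()})
```

In this toy case one high syllable is mislabeled as mid, which lowers recall for H and precision for M while leaving L unaffected.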
