Pitch Determination of Speech Signals: Algorithms and Devices by Wolfgang Hess
Author(s) - Wolfgang Hess, Douglas O’Shaughnessy
Publication year - 1984
Publication title - The Journal of the Acoustical Society of America
Language(s) - English
Resource type - Journals
eISSN - 1520-8524
pISSN - 0001-4966
DOI - 10.1121/1.391349
Subject(s) - computer science, algorithm, speech recognition, acoustics, physics
1. Introduction.- 1.1 Voice Source Parameter Measurement and the Speech Signal.- 1.2 A Short Look at the Areas of Application.- 1.3 Organization of the Book.
2. Basic Terminology. A Short Introduction to Digital Signal Processing.- 2.1 The Simplified Model of Speech Excitation.- 2.2 Digital Signal Processing 1: Signal Representation.- 2.3 Digital Signal Processing 2: Filters.- 2.4 Time-Variant Systems. The Principle of Short-Term Analysis.- 2.5 Definition of the Task. The Linear Model of Speech Production.- 2.6 A First Categorization of Pitch Determination Algorithms (PDAs).
3. The Human Voice Source.- 3.1 Mechanism of Sound Generation at the Larynx.- 3.2 Operational Modes of the Larynx. Registers.- 3.3 The Glottal Source (Excitation) Signal.- 3.4 The Influence of the Vocal Tract Upon Voice Source Parameters.- 3.5 The Voiceless and the Transient Sources.
4. Measuring Range, Accuracy, Pitch Perception.- 4.1 The Range of Fundamental Frequency.- 4.2 Pitch Perception. Toward a Redefinition of the Task.- 4.2.1 Pitch Perception: Spectral and Virtual Pitch.- 4.2.2 Toward a Redefinition of the Task.- 4.2.3 Difference Limens for Fundamental-Frequency Change.- 4.3 Measurement Accuracy.- 4.4 Representation of the Pitch Information in the Signal.- 4.5 Calibration and Performance Evaluation of a PDA.
5. Manual and Instrumental Pitch Determination, Voicing Determination.- 5.1 Manual Pitch Determination.- 5.1.1 Time-Domain Manual Pitch Determination.- 5.1.2 Frequency-Domain Manual Pitch Determination.- 5.2 Pitch Determination Instruments (PDIs).- 5.2.1 Clinical Methods for Larynx Inspection.- 5.2.2 Mechanic PDIs.- 5.2.3 Electric PDIs.- 5.2.4 Ultrasonic PDIs.- 5.2.5 Photoelectric PDIs (Transillumination of the Glottis).- 5.2.6 Comparative Evaluation of PDIs.- 5.3 Voicing Determination - Selected Examples.- 5.3.1 Voicing Determination: Parameters.- 5.3.2 Voicing Determination - Simple Voicing Determination Algorithms (VDAs) Combined VDA-PDA Systems.- 5.3.3 Multiparameter VDAs. Voicing Determination by Means of Pattern Recognition Methods.- 5.3.4 Summary and Conclusions.
6. Time-Domain Pitch Determination.- 6.1 Pitch Determination by Fundamental-Harmonic Extraction.- 6.1.1 The Basic Extractor.- 6.1.2 The Simplest Pitch Determination Device - Low-Pass Filter and Zero (or Threshold) Crossings Analysis Basic Extractor.- 6.1.3 Enhancement of the First Harmonic by Nonlinear Means.- 6.1.4 Manual Preset and Tunable (Adaptive) Filters.- 6.2 The Other Extreme - Temporal Structure Analysis.- 6.2.1 Envelope Modeling - the Analog Approach.- 6.2.2 Simple Peak Detector and Global Correction.- 6.2.3 Zero Crossings and Excursion Cycles.- 6.2.4 Mixed-Feature Algorithms.- 6.2.5 Other PDAs That Investigate the Temporal Structure of the Signal.- 6.3 The Intermediate Device: Temporal Structure Transformation and Simplification.- 6.3.1 Temporal Structure Simplification by Inverse Filtering.- 6.3.2 The Discontinuity in the Excitation Signal: Event Detection.- 6.4 Parallel Processing in Fundamental Period Determination. Multichannel PDAs.- 6.4.1 PDAs with Multichannel Preprocessor Filters.- 6.4.2 PDAs with Several Channels Applying Different Extraction Principles.- 6.5 Special-Purpose (High-Accuracy) Time-Domain PDAs.- 6.5.1 Glottal Inverse Filtering.- 6.5.2 Determining the Instant of Glottal Closure.- 6.6 The Postprocessor.- 6.6.1 Time-to-Frequency Conversion Display.- 6.6.2 f0 Determination With Basic Extractor Omitted.- 6.6.3 Global Error Correction Routines.- 6.6.4 Smoothing Pitch Contours.- 6.7 Final Comments.
7. Design and Implementation of a Time-Domain PDA for Undistorted and Band-Limited Signals.- 7.1 The Linear Algorithm.- 7.1.1 Prefiltering.- 7.1.2 Measurement and Suppression of F1.- 7.1.3 The Basic Extractor.- 7.1.4 Problems with the Formant F2. Implementation of a Multiple Two-Pulse Filter (TPF).- 7.1.5 Phase Relations and Starting Point of the Period.- 7.1.6 Performance of the Algorithm with Respect to Linear Distortions, Especially to Band Limitations.- 7.2 Band-Limited Signals in Time-Domain PDAs.- 7.2.1 Concept of the Universal PDA.- 7.2.2 Once More: Use of Nonlinear Distortion in Time-Domain PDAs.- 7.3 An Experimental Study Towards a Universal Time-Domain PDA Applying a Nonlinear Function and a Threshold Analysis Basic Extractor.- 7.3.1 Setup of the Experiment.- 7.3.2 Relative Amplitude and Enhancement of First Harmonic.- 7.4 Toward a Choice of Optimal Nonlinear Functions.- 7.4.1 Selection with Respect to Phase Distortions.- 7.4.2 Selection with Respect to Amplitude Characteristics.- 7.4.3 Selection with Respect to the Sequence of Processing.- 7.5 Implementation of a Three-Channel PDA with Nonlinear Processing.- 7.5.1 Selection of Nonlinear Functions.- 7.5.2 Determination of the Parameter for the Comb Filter.- 7.5.3 Threshold Function in the Basic Extractor.- 7.5.4 Selection of the Most Likely Channel in the Basic Extractor.
8. Short-Term Analysis Pitch Determination.- 8.1 The Short-Term Transformation and Its Consequences.- 8.2 Autocorrelation Pitch Determination.- 8.2.1 The Autocorrelation Function and Its Relation to the Power Spectrum.- 8.2.2 Analog Realizations.- 8.2.3 "Ordinary" Autocorrelation PDAs.- 8.2.4 Autocorrelation PDAs with Nonlinear Preprocessing.- 8.2.5 Autocorrelation PDAs with Linear Adaptive Preprocessing.- 8.3 "Anticorrelation" Pitch Determination: Average Magnitude Difference Function, Distance and Dissimilarity Measures, and Other Nonstationary Short-Term Analysis PDAs.- 8.3.1 Average Magnitude Difference Function (AMDF).- 8.3.2 Generalized Distance Functions.- 8.3.3 Nonstationary Short-Term Analysis and Incremental Time-Domain PDAs.- 8.4 Multiple Spectral Transform ("Cepstrum") Pitch Determination.- 8.4.1 The More General Aspect: Deconvolution.- 8.4.2 Cepstrum Pitch Determination.- 8.5 Frequency-Domain PDAs.- 8.5.1 Spectral Compression: Frequency and Period Histogram Product Spectrum.- 8.5.2 Harmonic Matching. Psychoacoustic PDAs.- 8.5.3 Determination of f0 from the Distance of Adjacent Spectral Peaks.- 8.5.4 The Fast Fourier Transform, Spectral Resolution, and the Computing Effort.- 8.6 Maximum-Likelihood (Least-Squares) Pitch Determination.- 8.6.1 The Least-Squares Algorithm.- 8.6.2 A Multichannel Solution.- 8.6.3 Computing Complexity, Relation to Comb Filters, Simplified Realizations.- 8.7 Summary and Conclusions.
9. General Discussion: Summary, Error Analysis, Applications.- 9.1 A Short Survey of the Principal Methods of Pitch Determination.- 9.1.1 Categorization of PDAs and Definitions of Pitch.- 9.1.2 The Basic Extractor.- 9.1.3 The Postprocessor.- 9.1.4 Methods of Preprocessing.- 9.1.5 The Impact of Technology on the Design of PDAs and the Question of Computing Effort.- 9.2 Calibration, Search for Standards.- 9.2.1 Data Acquisition.- 9.2.2 Creating the Standard Pitch Contour Manually, Automatically, and by an Interactive PDA.- 9.2.3 Creating a Standard Contour by Means of a PDI.- 9.3 Performance Evaluation of PDAs.- 9.3.1 Comparative Performance Evaluation of PDAs: Some Examples from the Literature.- 9.3.2 Methods of Error Analysis.- 9.4 A Closer Look at the Applications.- 9.4.1 Has the Problem Been Solved?.- 9.4.2 Application in Phonetics, Linguistics, and Musicology.- 9.4.3 Application in Education and in Pathology.- 9.4.4 The "Technical" Application: Speech Communication.- 9.4.5 A Way Around the Problem in Speech Communication: Voice-Excited and Residual-Excited Vocoding (Baseband Coding).- 9.5 Possible Paths Towards a General Solution.
Appendix A. Experimental Data on the Behavior of Nonlinear Functions in Time-Domain Pitch Determination Algorithms.- A.1 The Data Base of the Investigation.- A.2 Examples for the Behavior of the Nonlinear Functions.- A.3 Relative Amplitude RA1 and Enhancement RE1 of the First Harmonic.- A.4 Relative Amplitude RASM of Spurious Maximum and Autocorrelation Threshold.- A.5 Processing Sequence, Preemphasis, Phase, Band Limitation.- A.6 Optimal Performance of Nonlinear Functions.- A.7 Performance of the Comb Filters.
Appendix B. Original Text of the Quotations in Foreign Languages Throughout This Book.
List of Abbreviations.
Author and Subject Index.