<title>Speech-property-based FEC for Internet telephony applications</title>
Author(s) -
H. Sanneck,
Nguyen Tuong Long Le
Publication year - 1999
Publication title -
Proceedings of SPIE, the International Society for Optical Engineering / Proceedings of SPIE
Language(s) - English
Resource type - Conference proceedings
SCImago Journal Rank - 0.192
H-Index - 176
eISSN - 1996-756X
pISSN - 0277-786X
DOI - 10.1117/12.373533
Subject(s) - computer science , codec , encoder , redundancy (engineering) , speech recognition , speech coding , voice activity detection , decoding methods , filter (signal processing) , speech processing , telecommunications , computer vision , operating system
Recent research has examined how to protect a real-time speech signal transmitted over an unreliable packet-switched network like the Internet by means of open-loop error control. This work has covered the type of Forward Error Correction (generic or voice-specific), the protocol support needed, and adaptivity to the current network congestion state. However, the sender does not take into account that some segments of the signal are essential to the speech quality, while others can be extrapolated at the receiver from data received earlier in the event of a packet loss. This is especially true for modern frame-based codecs like G.729 and G.723.1, which contain an internal loss concealment algorithm. Thus, the sender consumes additional bandwidth and aggravates congestion in the Internet by sending unnecessary redundancy. In this paper we first analyze the concealment performance of the G.729 decoder. We find that the loss of unvoiced frames can be concealed well. The loss of voiced frames is also concealed well once the decoder has obtained sufficient information on them. However, the decoder fails to conceal the loss of voiced frames at an unvoiced/voiced transition because it extrapolates internal state (filter coefficients and excitation) for an unvoiced sound. Moreover, once the decoder has failed to build the appropriate linear prediction synthesis filter, it takes a long time for the decoder to resynchronize with the encoder. Using this result, we then develop a new FEC scheme to support frame-based codecs, which adapts the amount of added redundancy to the properties of the speech signal. Objective quality measures (ITU P.861A and EMBSD) show that our speech property-based FEC (SPB-FEC) scheme achieves almost the same speech quality as current FEC schemes while approximately halving the amount of redundant data needed to adequately protect the voice flow.
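The idea of matching redundancy to speech properties can be illustrated with a minimal sketch. It is not the authors' SPB-FEC algorithm, whose details the abstract does not give. It assumes a per-frame voiced/unvoiced flag is already available at the sender, and it protects only the first few voiced frames after each unvoiced/voiced transition, the case the abstract identifies as poorly concealed. The names `select_protected_frames`, `redundancy_ratio`, and the parameter `onset_frames` are hypothetical.

```python
from typing import List


def select_protected_frames(voiced: List[bool], onset_frames: int = 3) -> List[bool]:
    """Mark frames that would carry redundancy in this illustrative scheme.

    Assumption (not from the paper): only the first `onset_frames` voiced
    frames after each unvoiced-to-voiced transition are protected.
    """
    protect = [False] * len(voiced)
    remaining = 0          # protected frames still owed to the current onset
    prev_voiced = False    # assume the stream starts unvoiced or silent
    for i, is_voiced in enumerate(voiced):
        if is_voiced and not prev_voiced:
            remaining = onset_frames   # a new unvoiced/voiced transition
        if is_voiced and remaining > 0:
            protect[i] = True
            remaining -= 1
        if not is_voiced:
            remaining = 0              # the onset ended early
        prev_voiced = is_voiced
    return protect


def redundancy_ratio(protect: List[bool]) -> float:
    """Fraction of frames that carry redundancy."""
    return sum(protect) / len(protect) if protect else 0.0


# Hypothetical voicing pattern: two voiced segments separated by unvoiced frames.
voicing = [False, False, True, True, True, True, True, False, True, True]
protected = select_protected_frames(voicing, onset_frames=3)
ratio = redundancy_ratio(protected)
```

A scheme that protects every frame would have a ratio of 1.0 on the same input. How many onset frames need protection, and how the redundancy is encoded, are design parameters that this sketch leaves open.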