Posted on 2014-10-22, 10:20. Authored by Essa Jafer, Abdulhussain E. Mahdi
This paper proposes a new wavelet-based algorithm for voiced/unvoiced classification of speech
segments. The classification process is based on: 1) statistical analysis of the energy-frequency distribution of
the speech signal using wavelet transform, and 2) estimation of the short-time zero-crossing rate of the signal.
First, the ratio of the average energy in the low-frequency wavelet subbands to that of the highest-frequency
wavelet subband is computed for each time segment of the pre-emphasised speech using a 4-level dyadic
wavelet transform, and compared to a pre-determined threshold. Next, the zero-crossing
rate of the segment is measured and compared to a threshold determined by a continually updated
median of the zero-crossing rates of the speech signal. An experimentally verified criterion based on the results
of the above two comparison processes is then applied to obtain the classification decision. The performance
of the algorithm has been evaluated on speech data taken from the TIMIT database, and is shown to
yield high classification accuracy and robustness to additive noise.
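
The two-stage procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the wavelet, the pre-emphasis coefficient, the frame length, the threshold values, or the exact decision criterion, so the Haar wavelet, `alpha=0.97`, `frame_len=256`, `ratio_threshold=1.0`, `zcr_scale=1.5`, the treatment of all subbands below the highest one as "low-frequency", and the simple AND rule are all assumptions made here for illustration. All function names are hypothetical.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value.
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - alpha * x[:-1])

def haar_dwt_bands(x, levels=4):
    # Dyadic Haar decomposition (assumed wavelet).
    # Returns [d1, d2, ..., dL, aL], with d1 the highest-frequency subband.
    a = np.asarray(x, dtype=float)
    bands = []
    for _ in range(levels):
        if len(a) % 2:
            a = np.append(a, a[-1])  # pad odd-length input by repeating the last sample
        even, odd = a[0::2], a[1::2]
        bands.append((even - odd) / np.sqrt(2))  # detail coefficients
        a = (even + odd) / np.sqrt(2)            # approximation coefficients
    bands.append(a)
    return bands

def energy_ratio(segment, levels=4, eps=1e-12):
    # Average energy of the lower subbands divided by that of the highest subband.
    # Treating every band except d1 as "low-frequency" is an assumption.
    bands = haar_dwt_bands(segment, levels)
    high = np.mean(bands[0] ** 2)
    low = np.mean([np.mean(b ** 2) for b in bands[1:]])
    return low / (high + eps)

def zero_crossing_rate(segment):
    # Fraction of adjacent sample pairs whose signs differ (zeros count as positive).
    s = np.sign(np.asarray(segment, dtype=float))
    s[s == 0] = 1
    return np.mean(s[1:] != s[:-1])

def classify_segments(signal, frame_len=256, levels=4,
                      ratio_threshold=1.0, zcr_scale=1.5):
    # Label each non-overlapping frame as "voiced" or "unvoiced".
    # The ZCR threshold is a scaled running median of the ZCRs seen so far;
    # the scale factor and the AND decision rule are placeholders.
    x = pre_emphasis(signal)
    zcr_history, labels = [], []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        seg = x[start:start + frame_len]
        ratio = energy_ratio(seg, levels)
        zcr = zero_crossing_rate(seg)
        zcr_history.append(zcr)
        zcr_threshold = zcr_scale * np.median(zcr_history)
        voiced = ratio > ratio_threshold and zcr <= zcr_threshold
        labels.append("voiced" if voiced else "unvoiced")
    return labels
```

In this sketch a frame is labelled voiced only when its energy is concentrated in the lower subbands and its zero-crossing rate does not exceed the median-based threshold. The published criterion was tuned experimentally and may combine the two tests differently.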
Publication
Open Electrical and Electronic Engineering Journal, 2, pp. 8-13