posted on 2016-07-14, 12:38 authored by Abdulhussain E. Mahdi, Dorel Picovici
This paper describes a newly developed output-based method for non-intrusive evaluation of
speech quality of voice communication systems, and evaluates its performance. The method, which uses only
the output of the system, is based on measuring perceptually motivated objective auditory distances between
the voiced parts of the speech signal whose quality is to be evaluated and appropriately matching reference vectors
extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of
perceptually-based parametric vectors extracted from a database of clean speech signals. The auditory distance
measures are then mapped into equivalent subjective scores, expressed as Mean Opinion Scores (MOS),
using regression. The required clustering and matching processes are achieved by using an efficient neural
network based data mining technique known as the Self-Organizing Map. Perceptual, speaker-independent
parametric representation of the speech is achieved by using Perceptual Linear Prediction (PLP) and Bark Spectrum
analysis. Reported evaluation results show that the proposed system is robust against speaker, utterance and
distortion variations, and outperforms the ITU-T P.862 Perceptual Evaluation of Speech Quality (PESQ) for
cases of speech degraded by channel impairments.
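
The pipeline outlined in the abstract (codebook formation with a Self-Organizing Map, distance to the best-matching reference vector, regression to MOS) can be illustrated with a minimal sketch. This is not the authors' implementation. The one-dimensional map, the number of units, the learning-rate and neighbourhood schedules, the Euclidean distance, and the first-order regression are all assumptions made here for illustration, and the input frames are expected to be feature vectors (for example PLP or Bark-spectrum parameters) computed elsewhere.

```python
import numpy as np


def train_som(data, n_units=16, n_epochs=20, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map; returns a codebook (n_units, dim).

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    # Initialise the units from randomly chosen training vectors.
    codebook = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    positions = np.arange(n_units)
    sigma0 = n_units / 2.0
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit: the codebook vector closest to x.
            bmu = np.argmin(np.sum((codebook - x) ** 2, axis=1))
            # Gaussian neighbourhood on the map grid around the BMU.
            h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (x - codebook)
    return codebook


def mean_distance(frames, codebook):
    """Average Euclidean distance from each frame to its best-matching unit."""
    d = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return float(d.min(axis=1).mean())


def fit_mos_mapping(distances, mos):
    """Least-squares line (slope, intercept) mapping distance to MOS."""
    slope, intercept = np.polyfit(distances, mos, 1)
    return float(slope), float(intercept)
```

In use, `train_som` would be run once on vectors from clean speech, `mean_distance` would be computed for each degraded utterance, and `fit_mos_mapping` would be fitted on utterances with known subjective scores. A predicted score for a new utterance is then `slope * distance + intercept`.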
History
Publication
WSEAS Transactions on Acoustics and Music: 1 (3), pp. 139-144.
Publisher
World Scientific and Engineering Academy and Society (WSEAS)