Higher-order spectral (HOS) techniques, such as the
bispectrum, offer robustness to Gaussian noise and the ability to
recover phase information. However, their drawbacks, such as
the high variance of estimates and the need for long data records,
have limited their use in conventional speech processing
applications. Because all existing inverse filtering approaches to
glottal pulse estimation use second-order statistics, it is of interest to
explore the potential of HOS in this area. Using the theory of
HOS factorization and the linear bispectrum, it is shown how
voiced speech can be modelled as a system driven by non-Gaussian
coloured noise. The linear bispectrum approach can be used to
obtain alternative glottal pulse and vocal tract estimates in
hybrid Iterative Adaptive Inverse Filtering (hIAIF), and the
results are compared with those of traditional IAIF. Finally, a new
technique that involves joint estimation of the glottal pulse and
vocal tract, followed by inverse filtering, is presented. This new
technique shows good preliminary results and is much simpler
than previous techniques.
Publication
ISCA Tutorial and Workshop on Nonlinear Speech Processing - Nolisp'03