The evaluation of three related techniques for the statistical analysis of clipped speech

Abstract

Techniques are described for the statistical analysis of clipped speech in terms of the time intervals between the zero crossings of the speech waveform or its time derivative. The only statistic of this kind which had been used prior to this study was the time interval histogram. This suffers from great variability and does not have any perceptual evidence to support its use. The present studies have highlighted some causes of its limitations, especially the perturbations due to the pitch of a voiced sound. Its usefulness as a discriminator of the continuant phonemes of the English language has been shown to be fairly restricted.
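
As an illustration of the time interval histogram described above, the following is a minimal sketch in Python, not the procedure used in the study. The function names, the bin width and the use of sample counts as the unit of time are all assumptions made for the example.

```python
import math


def zero_crossing_intervals(samples):
    """Return the intervals, in samples, between successive zero crossings.

    Only the sign of each sample is used, which is the information that
    survives infinite clipping of the waveform.
    """
    crossings = []
    for i in range(1, len(samples)):
        # A change of sign between consecutive samples marks a zero crossing.
        if (samples[i - 1] >= 0) != (samples[i] >= 0):
            crossings.append(i)
    return [b - a for a, b in zip(crossings, crossings[1:])]


def interval_histogram(intervals, bin_width, n_bins):
    """Count intervals per bin; intervals beyond the range go in the last bin."""
    counts = [0] * n_bins
    for t in intervals:
        counts[min(t // bin_width, n_bins - 1)] += 1
    return counts


# Hypothetical test signal: a 100 Hz sine sampled at 8000 Hz for 0.1 s.
# The small phase offset keeps samples away from exact zeros.
signal = [math.sin(2 * math.pi * 100 * n / 8000 + 0.1) for n in range(800)]
intervals = zero_crossing_intervals(signal)
histogram = interval_histogram(intervals, bin_width=10, n_bins=8)
```

For the time derivative of the waveform, the same functions could be applied to the first difference of the samples. Real speech would spread the counts over many bins, and a voiced sound would add intervals that depend on the pitch period.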

An analysis of the second order, or digram, statistics of the time intervals is described. The use of this for discrimination of speech sounds has some perceptual support and has been found to yield some interesting differences in patterns between various speech sounds. A real-time visible speech display based on this statistical measure has been developed employing simple analogue circuitry. A wide range of samples of speech sounds has been examined using this novel method.
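
A digram statistic of this kind can be sketched as a count of adjacent interval pairs. The example below assumes that "digram" means the joint distribution of each interval with the one that follows it; the function name and the binning scheme are hypothetical, and the intervals are taken to be integer sample counts.

```python
from collections import Counter


def interval_digram(intervals, bin_width, n_bins):
    """Count pairs of successive binned intervals.

    Returns a Counter keyed by (bin of interval k, bin of interval k+1).
    Intervals beyond the range are placed in the last bin.
    """
    bins = [min(t // bin_width, n_bins - 1) for t in intervals]
    return Counter(zip(bins, bins[1:]))


# Hypothetical interval sequence alternating between short and long values.
pairs = interval_digram([10, 30, 10, 30, 10], bin_width=10, n_bins=8)
# pairs == Counter({(1, 3): 2, (3, 1): 2})
```

Plotting one member of each pair against the other gives a two-dimensional pattern, which is one way a display of such a measure might be arranged. A histogram of the same sequence would record only that two interval lengths occur, not that they alternate.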

A technique of pitch-synchronous analysis of the time intervals has been developed to facilitate selective rejection of noise during voiced speech. This method has been found to reduce the effects of conventional noise and of pitch perturbations in the time interval statistics. A similar technique has been developed for use with the real-time visible speech display.

A PDP-8 computer was programmed to make quantitative measurements on the time interval statistics of vowel sounds, in order that the relative discriminability of the sounds analysed by the three techniques of histogram, digram, and pitch-synchronous analysis of the time intervals could be assessed.

The results of this analysis have shown that, for a given speaker and a limited set of sample utterances, greater discrimination between vowels can be achieved using digram rather than histogram statistics. No significant difference in discriminative power was found between histogram and digram analysis when the set of utterances was not restricted. Pitch-synchronous analysis has been found to reduce the dependence of the statistics on pitch but to give no corresponding overall increase in vowel discrimination.

It is concluded that these time domain techniques can form a useful component of the analysis needed for automatic speech recognition, but that other types of signal processing will be required in parallel if identification is to be reliable.

Publicly Available Date Mar 28, 2024
