Timbre Perception of Time-Varying Signals
Every auditory event provides an information-rich signal to the brain. The signal carries perceptual attributes such as pitch, loudness, and timbre, as well as conceptual attributes such as location, emotion, and meaning. In the present work we examine the timbre perception of time-varying signals in particular. While the timbre of a stationary signal is already perceptually complex, the timbre of a time-varying signal introduces an evolving pattern that adds to its multi-dimensionality. To characterize timbre, we conduct psychoacoustic perception tests with normal-hearing human subjects. We focus on time-varying synthetic speech signals (the approach can be extended to music) because listeners are perceptually consistent with speech. Also, we can parametrically control the timbre and pitch glides using linear time-varying models. To quantify the timbre change in time-varying signals, we define the just noticeable difference (JND) of timbre using diphthongs synthesized with a time-varying formant-frequency model. The diphthong JND is defined as a two-dimensional contour on the plane of percentage change of the formant frequencies of the terminal vowels. Thus, we simplify the perceptual probing to a lower-dimensional (2-D) space, even for a diphthong, which is multi-parametric. We also study the impact of pitch glide on the timbre JND of the diphthong and observe that the timbre JND is influenced by the occurrence of a pitch glide. Focusing on the magnitude of perceptual timbre change, we design a MUSHRA-like listening test using the vowel continuum in the formant-frequency space. We provide explicit reference anchors at 0% and 100%, thus quantifying the perceptual timbre change on a 1-D scale. We also propose an objective measure of timbre change and observe good correlation between the objective measure and the subjective human responses of percentage timbre change.
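As a rough illustration of the stimulus space described above, the sketch below builds a diphthong formant trajectory by linear interpolation between two terminal vowels and applies a percentage change to each terminal vowel, giving one point on the 2-D perturbation plane. The formant values, frame count, function names, and the uniform scaling of all formants are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

# Illustrative formant frequencies in Hz (assumed values, not from the study).
VOWEL_A = np.array([700.0, 1200.0, 2500.0])
VOWEL_I = np.array([300.0, 2300.0, 3000.0])


def perturb(formants, percent):
    """Scale every formant frequency by the given percentage change."""
    return formants * (1.0 + percent / 100.0)


def formant_track(start, end, n_frames):
    """Linearly interpolate formants from the start vowel to the end vowel.

    Returns an array of shape (n_frames, number_of_formants).
    """
    w = np.linspace(0.0, 1.0, n_frames)[:, None]
    return (1.0 - w) * start + w * end


# Reference diphthong /ai/ and a test diphthong at the (hypothetical) point
# (+4 %, -2 %) on the plane of percentage change of the terminal vowels.
reference = formant_track(VOWEL_A, VOWEL_I, 50)
test = formant_track(perturb(VOWEL_A, 4.0), perturb(VOWEL_I, -2.0), 50)
```

Sweeping the two percentages over a grid and recording, for each direction, the smallest perturbation that listeners can distinguish from the reference would trace out a contour of the kind the JND definition refers to.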
Using the above experimental methodology, we studied the influence of pitch shift on timbre perception and observed that the perceptual timbre change increases with change in pitch. We used vowels and diphthongs with five types of pitch glide: (i) constant pitch, (ii) 3-semitone linearly up, (iii) 3-semitone linearly down, (iv) V-like pitch glide, and (v) hat-like pitch glide. The present study shows that timbre change can be measured on a 1-D scale if the perturbation is along one dimension. We observe that for bright vowels (/a/ and /i/), a linearly decreasing pitch glide (dull pitch glide) causes more timbre change than a linearly increasing pitch glide (bright pitch glide). For dull vowels (/u/), the reverse holds. To summarize, incongruent pitch glides cause more perceptual timbre change than congruent pitch glides. (A congruent pitch glide is a bright pitch glide in a bright vowel or a dull pitch glide in a dull vowel; an incongruent pitch glide is a bright pitch glide in a dull vowel or a dull pitch glide in a bright vowel.) Experiments with quadratic pitch glides show that the decay portion of the pitch glide affects timbre perception more than the attack portion in short-duration signals with little or no sustained part. In the case of time-varying timbre, bright diphthongs show patterns similar to bright vowels. Also, for bright diphthongs (/ai/), the perceived timbre change is greatest with a decreasing pitch glide (dull pitch glide). We also observed that listeners perceive more timbre change with constant pitch than with pitch glides that are congruent with the timbre or with pitch glides that change quadratically. The main conclusion of this study is that pitch and timbre do interact and that incongruent pitch glides cause more timbre change than congruent pitch glides. In the case of quadratic pitch glides, listener perception of vowels is influenced more by the decay than by the attack of the pitch glide in short-duration signals.
In the case of time-varying timbre also, incongruent pitch glides cause the most timbre change, followed by constant pitch. For congruent pitch glides and quadratic pitch glides in time-varying timbre, listeners perceive less timbre change than in the other conditions.
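The five pitch-glide types used in these experiments can be sketched as normalized contours, as below. This is a minimal illustration under stated assumptions: the glide is taken to be linear in semitones, the V-like and hat-like glides are modeled as parabolas with a 3-semitone excursion, and the starting pitch of 120 Hz, the frame count, and the function name `pitch_glide` are hypothetical choices, not values from the study.

```python
import numpy as np


def pitch_glide(kind, f0_start=120.0, depth_semitones=3.0, n_frames=100):
    """Return a pitch contour in Hz for one of five glide types.

    Assumptions (not from the study): glides are defined in semitones
    relative to f0_start, and the V-like and hat-like glides are parabolic.
    """
    t = np.linspace(0.0, 1.0, n_frames)
    bump = 1.0 - (2.0 * t - 1.0) ** 2  # 0 at both ends, 1 at the midpoint
    shapes = {
        "constant": np.zeros_like(t),
        "up": t,        # linearly increasing ("bright") glide
        "down": -t,     # linearly decreasing ("dull") glide
        "v": -bump,     # falls to the midpoint, then rises back
        "hat": bump,    # rises to the midpoint, then falls back
    }
    semitones = depth_semitones * shapes[kind]
    return f0_start * 2.0 ** (semitones / 12.0)


contours = {k: pitch_glide(k) for k in ("constant", "up", "down", "v", "hat")}
```

In this sketch the "up" contour ends three semitones above the starting pitch and the "down" contour three semitones below it, while the V-like and hat-like contours return to the starting pitch at the end of the signal.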