This is a step-by-step demonstration of how to modify the melodic and temporal patterns of speech in the Praat software. It accompanies a paper written by Alice Henderson and Radek Skarnitzl, soon to be published in Language Learning & Technology (“A Better Me”: Using Acoustically Modified Learner Voices As Models).
Pronunciation instructors can use this procedure to create auditory feedback for their students, helping them realize what they need to change in their pronunciation.
You can use the videos below, or you can refer to a cheatsheet describing the main steps.
The first step is to simplify the contour corresponding to the fundamental frequency (f0), because a simplified contour is easier to edit. This is shown in the following video.
In the second step, we can create the desired melodic contour, as described in the following video.
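To make the arithmetic behind moving a pitch point concrete, here is a minimal Python sketch. It is not Praat's implementation and not part of the procedure in the videos; the contour values and function names are hypothetical. It only illustrates the standard relation that a shift of 12 semitones doubles the frequency.

```python
import math


def shift_semitones(f0_hz: float, semitones: float) -> float:
    """Return f0 shifted by a number of semitones (12 semitones = 1 octave)."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def hz_to_semitones(f0_hz: float, ref_hz: float) -> float:
    """Return the distance of f0 from a reference frequency, in semitones."""
    return 12.0 * math.log2(f0_hz / ref_hz)


# Hypothetical simplified contour: (time in seconds, f0 in Hz).
contour = [(0.10, 180.0), (0.35, 200.0), (0.60, 170.0)]

# Raise only the middle point (for example a pitch peak) by 4 semitones,
# leaving the other points where they are.
peak_index = 1
modified = [
    (t, shift_semitones(f, 4.0) if i == peak_index else f)
    for i, (t, f) in enumerate(contour)
]
```

Thinking in semitones rather than hertz can be useful because the same semitone step sounds roughly equally large for low and high voices.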
If you only want to change the duration of a few sounds, you can skip most of this first preparatory step. If you plan to change a lot of sounds, however, it will speed things up later if you prepare the file in the way shown in the following video. What we need to do is “fix” the original duration around the sounds (mostly vowels) that we will subsequently be changing. In the next stage, we will create duration “dents”, lengthening or shortening sounds within the limits we have fixed.
The lengthening and shortening of individual sounds is shown in the last video. The first stage will help us to create “dents” without affecting the duration of those sounds we want to keep at the relative duration of 1. At the end, this video also shows how to change the global speech rate.
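The effect of “fixing” and “dents” on the overall duration can be sketched in a few lines of Python. This is an illustration only, under the assumption that relative duration is interpolated linearly between points and stays constant before the first and after the last point; the point times, values, and function names are hypothetical.

```python
def relative_duration(points, t):
    """Interpolate the relative duration at time t.

    `points` is a list of (time, relative duration) pairs sorted by time,
    with distinct times. Outside the points the value is held constant.
    """
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def new_duration(points, start, end):
    """Integrate the relative duration over [start, end].

    The trapezoid rule is exact here because the curve is piecewise linear
    and every point time inside the interval is used as a breakpoint.
    """
    times = sorted({start, end, *(t for t, _ in points if start < t < end)})
    total = 0.0
    for a, b in zip(times, times[1:]):
        mean = 0.5 * (relative_duration(points, a) + relative_duration(points, b))
        total += mean * (b - a)
    return total


# A hypothetical vowel from 0.300 s to 0.400 s in a 1-second recording.
# The outer points "fix" the relative duration at 1; the inner points
# create a "dent" that lengthens the vowel to 1.5 times its duration.
dent = [(0.300, 1.0), (0.301, 1.5), (0.399, 1.5), (0.400, 1.0)]
lengthened = new_duration(dent, 0.0, 1.0)

# A global speech rate change: one value below 1 speeds everything up.
faster = new_duration([(0.0, 0.9)], 0.0, 1.0)
```

In this sketch the recording grows from 1 s to roughly 1.05 s, because only the vowel between the fixed points is stretched, while a single value of 0.9 shortens the whole recording to 0.9 s.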
When you have created the manipulation, you can create a sound object from it in two ways: in the Manipulation window, by clicking File – Publish resynthesis, or in the Objects window, by selecting the Manipulation object and clicking Get resynthesis (overlap-add). Then save the sound by clicking Save – Save as WAV file. As mentioned above, you can also save the Manipulation file, so that you can return to it later, by clicking Save – Save as binary file.
When performing manipulations like this, we should be aware of their limitations. These concern especially recordings with poorer signal quality – for example with relatively strong background noise. Praat needs to estimate the fundamental frequency correctly to be able to manipulate it, and noise may prevent this. Correct estimation of f0 may also be a problem in voices which have aperiodic portions. Apart from pathological voices, this happens in speakers with creaky phonation (also called vocal fry).
Good luck with your manipulations!