Using Praat for Pitch Analysis of the Thai Language


This article was originally posted on


What is “Praat”?…

Praat is Dutch for “talk”. It’s a program used for voice analysis. It’s very powerful and has a lot of very advanced functions, of which I will only discuss the most basic one: obtaining the pitch from an audio fragment.

What is the “Pitch”?…

First I would like to talk about “pure tones”. A “pure tone” is a sound wave that consists of one single frequency. It’s the kind of sound you hear from a tuning fork. If you plotted the wave as a graph in time or in space, both graphs would be sine waves.


Our voice is not a pure tone. If you analyse a voice, you will see that it consists of several sine waves with different frequencies (tone heights). Each frequency has a different amplitude (strength) and phase (starting point). All these waves are produced by our voice at the same time.
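To make the idea of “several sine waves at the same time” a bit more concrete, here is a minimal sketch in Python. The three components (120, 240 and 360 Hz), their amplitudes and phases, and the sample rate are values I made up for illustration; a real voice has many more components and they change all the time.

```python
import math

def voice_like_sample(t, components):
    """Value at time t (in seconds) of a sum of sine waves.

    components: list of (frequency_hz, amplitude, phase_radians) tuples.
    """
    return sum(a * math.sin(2 * math.pi * f * t + p) for f, a, p in components)

# Hypothetical components: a 120 Hz wave plus two weaker, higher ones.
components = [
    (120.0, 1.00, 0.0),
    (240.0, 0.50, math.pi / 4),
    (360.0, 0.25, math.pi / 2),
]

sample_rate = 8000  # samples per second (assumed)
# One tenth of a second of "voice", sample by sample.
signal = [voice_like_sample(n / sample_rate, components) for n in range(sample_rate // 10)]
```

Each tuple is one sine wave with its own frequency, strength and starting point, and the signal is simply their sum.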



The “pitch” in phonetics is the frequency (or tone height) of the lowest-frequency wave in our voice, also called the fundamental frequency. It’s like the basic “hum” of our voice. The pitch is what we would define as the “tone” in Thai.
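If you are curious how a computer could find that lowest frequency, the sketch below shows one simple idea: shift the signal against itself and look for the shift at which it matches itself best. This is a heavily simplified autocorrelation approach and is not meant to reproduce what Praat does internally. The test signal, the 75 to 400 Hz search range and the function name are my own assumptions.

```python
import math

def estimate_pitch_hz(samples, sample_rate, min_hz=75.0, max_hz=400.0):
    """Estimate the pitch as the repetition rate of the signal.

    Tries every delay (lag) that corresponds to a frequency between
    min_hz and max_hz and keeps the lag at which the signal is most
    similar to a delayed copy of itself.
    """
    shortest_lag = int(sample_rate / max_hz)
    longest_lag = int(sample_rate / min_hz)
    best_lag, best_score = shortest_lag, float("-inf")
    for lag in range(shortest_lag, longest_lag + 1):
        overlap = len(samples) - lag
        # Average product of the signal and its copy delayed by `lag` samples.
        score = sum(samples[n] * samples[n + lag] for n in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

sample_rate = 8000  # samples per second (assumed)
# Hypothetical "voice": a 125 Hz hum plus weaker waves at 250 and 375 Hz.
parts = [(125.0, 1.0), (250.0, 0.5), (375.0, 0.25)]
samples = [
    sum(a * math.sin(2 * math.pi * f * n / sample_rate) for f, a in parts)
    for n in range(800)
]

pitch = estimate_pitch_hz(samples, sample_rate)
print(round(pitch), "Hz")
```

The estimate should land close to the 125 Hz hum rather than on one of the higher components, because the whole pattern only repeats once per cycle of the lowest wave.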

The purpose of this document is to show you how you can visualize the pitch. This can help you to analyze and improve your own pronunciation, or to recognize a tone that you can’t identify by listening.

The program we’re going to use can display the pitch of the audio (in time). The result will look like this. The blue line represents the pitch.


What do the tones in Thai look like?…

Basically they look like this:

  • The mid tone is constant (there might be a slight drop at the end).
  • The low tone starts low and might even go a little bit lower.
  • The falling tone starts high and drops significantly.
  • The high tone starts high and goes even higher.
  • The rising tone starts low and rises significantly.
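The five shapes above can be turned into a rough rule of thumb. The sketch below is only an illustration of the list: the function name, the thresholds and the example contours are all my own invention, and it only looks at the first and last pitch value of a syllable. Real speech is much messier, so don’t expect this to work on actual recordings.

```python
def describe_contour(pitch_hz, speaker_low_hz, speaker_high_hz):
    """Guess a tone name from where a pitch contour starts and ends.

    pitch_hz: pitch values (Hz) over one syllable, in time order.
    speaker_low_hz / speaker_high_hz: the speaker's usual pitch range.
    All thresholds below are arbitrary, illustrative choices.
    """
    span = speaker_high_hz - speaker_low_hz
    start = (pitch_hz[0] - speaker_low_hz) / span   # 0 = bottom of range, 1 = top
    end = (pitch_hz[-1] - speaker_low_hz) / span
    change = end - start

    if change <= -0.4:
        return "falling"   # starts high and drops significantly
    if change >= 0.4 and start < 0.5:
        return "rising"    # starts low and rises significantly
    if start >= 0.6:
        return "high"      # starts high, stays there or goes a bit higher
    if start <= 0.35:
        return "low"       # starts low, may go a little lower
    return "mid"           # roughly constant in the middle of the range

# Made-up contours for a speaker whose voice ranges from 100 to 200 Hz.
examples = {
    "mid": [150, 150, 148],
    "low": [120, 115, 110],
    "falling": [190, 170, 120],
    "high": [170, 180, 195],
    "rising": [110, 130, 180],
}
for name, contour in examples.items():
    print(name, "->", describe_contour(contour, 100, 200))
```

Expressing the pitch as a fraction of the speaker’s own range is a design choice: “high” and “low” only mean something relative to a particular voice.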


Installing Praat…

You can download and install Praat from the “Praat: doing phonetics by computer” website. It’s available for Linux (32/64-bit), Mac OS X, FreeBSD and Windows.

Using Praat…

Once you start Praat you get two windows: the Objects window and the Picture window. You’ll only need the Objects window. The Picture window allows you to draw and manipulate the pictures Praat generates.

From the Objects window menu choose “Open – Open long sound file …” and select the audio fragment you want to analyse. This can be any file: a recording you want to analyse or a recording of your own voice. If possible, always save your audio fragment as a “.wav” file and not as “.mp3”, because an “.mp3” file can cause a tiny time offset between the graphs and the actual audio.
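Before opening a recording it can be handy to check that it really is a “.wav” file and how long it is. The sketch below uses Python’s standard `wave` module. To keep it self-contained it first writes a half-second test tone to a temporary file; the file name `praat_demo_tone.wav` and the helper `wav_info` are just names I picked for this example.

```python
import math
import os
import struct
import tempfile
import wave

def wav_info(path):
    """Return channel count, sample rate and duration of a .wav file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": wav.getnframes() / rate,
        }

# Write a half-second, 150 Hz test tone so the example has a file to read.
path = os.path.join(tempfile.gettempdir(), "praat_demo_tone.wav")
rate = 16000  # samples per second (assumed)
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)   # mono
    wav.setsampwidth(2)   # 16-bit samples
    wav.setframerate(rate)
    tone = (int(20000 * math.sin(2 * math.pi * 150 * n / rate)) for n in range(rate // 2))
    wav.writeframes(b"".join(struct.pack("<h", s) for s in tone))

info = wav_info(path)
print(info)
os.remove(path)  # clean up the temporary file
```

With your own recording you would skip the part that writes the tone and call `wav_info` on your file instead.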

In the Objects window, select your audio fragment (1. LongSound tones in this case) and click on “View”.


Now a new window will appear.

When you click on the play buttons directly under the spectrogram/pitch part, you can play the audio to the left or right of the cursor. When you make a selection, the audio is split into 3 parts: the part before your selection, the selection itself, and the part after it. There will then be 3 play buttons.


You can use the “in” button (zooms in), “out” button (zooms out), “sel” button (zooms to the selection) and “all” button (zooms out to show everything) below the spectrogram, together with the play buttons and the scroll bar, to go to any part that might interest you.

Note that the pitch and spectrogram will only be displayed when the visible part of the audio fragment is shorter than 10 seconds.

By clicking on the frequency number on the right side of the spectrogram you can zoom the frequency scale in and out.


The first part of the picture above looks like a mid-tone. After that we see a low tone, a falling tone, a high tone and a rising tone.


The yellow curve in the diagram represents the intensity.

The high tone might look a bit strange to you. That’s because the big jump at the end has a very low intensity (volume) and can be ignored. To show the intensity, choose “Intensity – Show Intensity” from the menu. The yellow curve represents the intensity.
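The same “ignore the quiet parts” rule can be written down in a few lines. In the sketch below the frames (time, pitch, intensity) and the 50 dB threshold are made-up numbers, not values taken from Praat; the last frame imitates the big jump at the end of the high tone.

```python
def drop_quiet_frames(frames, min_intensity_db):
    """Keep only (time_s, pitch_hz) pairs whose intensity reaches the threshold."""
    return [
        (time_s, pitch_hz)
        for time_s, pitch_hz, intensity_db in frames
        if intensity_db >= min_intensity_db
    ]

# Hypothetical frames: (time in seconds, pitch in Hz, intensity in dB).
frames = [
    (0.00, 180.0, 68.0),
    (0.05, 190.0, 70.0),
    (0.10, 205.0, 66.0),
    (0.15, 310.0, 41.0),  # sudden jump in pitch, but barely audible
]

loud_enough = drop_quiet_frames(frames, min_intensity_db=50.0)
print(loud_enough)
```

What is left is the part of the contour that you can actually hear, which is the part that matters for the tone.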

How to see the difference between aspirated and unaspirated sounds?…

The difference between an aspirated sound such as the พ in พา and an unaspirated sound like the ป in ปา is the voice onset time (VOT). That is the time between the start of the syllable and the first occurrence of the voiced vowel. For aspirated sounds the VOT is much longer. Usually the rising part of the yellow intensity line indicates the beginning of the syllable, while the start of the blue pitch line indicates the start of the voicing. Voicing is a vibration of the vocal cords. It’s much easier to detect a pitch in voiced sounds than in sounds that are made with the mouth alone. That’s why the blue pitch line starts at the voiced vowel า. The next picture shows the voice onset time in the word ปา. It’s only about 18 ms.


This picture shows the voice onset time in the word พา. พ is an aspirated consonant. The voice onset time here is 78 ms, which is significantly more than that of the unaspirated consonant. You should play and listen to the selections to make sure they don’t include any part of the vowel.
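The voice onset time is simply the difference between two points in time that you read off the Praat window. As a small worked example: the start and voicing times below are hypothetical values that I chose so that they reproduce the 18 ms and 78 ms from the two pictures, and the function name is my own.

```python
def voice_onset_time_ms(syllable_start_s, voicing_start_s):
    """Voice onset time in milliseconds, from two times given in seconds."""
    return round((voicing_start_s - syllable_start_s) * 1000.0, 1)

# Hypothetical readings: where the intensity starts to rise (syllable start)
# and where the blue pitch line begins (voicing start).
vot_unaspirated = voice_onset_time_ms(0.412, 0.430)  # ปา
vot_aspirated = voice_onset_time_ms(0.350, 0.428)    # พา

print("ปา:", vot_unaspirated, "ms")
print("พา:", vot_aspirated, "ms")
print("difference:", round(vot_aspirated - vot_unaspirated, 1), "ms")
```

In Praat itself you get the same number by selecting the stretch between the two points and reading the duration of the selection.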


PS. Note that the time scales of the two pictures are not the same.

6 thoughts on “Using Praat for Pitch Analysis of the Thai Language”

  1. I’ve noticed that too. But when I looked at whether a curve is convex or concave, it began to make more sense to me. The falling tone in Thai is concave down and decreasing. The rising tone is concave up and increasing. I think whether the curve is increasing or decreasing is not the most important factor determining the tone. The most important factor is the concavity. So an increasing curve that is concave down might in some contexts be a falling tone.
    And then there’s another weird thing: if a word starts with a non-sonorant sound, the graph of the whole syllable is representative of the tone. But if the syllable starts with a sonorant, that sonorant often results in a rising part at the beginning of the syllable, as if your vocal cords are speeding up to the correct frequency. We would have to ignore that part and look only at the vowel part.
    As said, it’s more or less a theoretical tool, used for advanced analysis; it might not be very practical.
    Also, tones in Thai seem to depend on many factors, such as syllable stress and the previous and next syllables, so not only on the tone rules. I think the tool as such works correctly, but Thai is just much more complex than we assume.

  2. Thanks again for the manual, I’ve just used it to look at Isaan tones. Your walk-through was a great help to get started with Praat. I’ve looked at a number of recordings and I have the impression that if you know the tone then you can recognise it with a bit of imagination, but if you want to go the other way, i.e., determine the tone from the pitch, it’s very difficult if not impossible for individual words. I see a lot of variation for one and the same tone (even the same word pronounced several times under different circumstances), and similar pitch contours for words which clearly have different tones. It’s a bit messy. Have you had any further insights into how to use Praat to analyse tones?

  3. Thanks for the comprehensive answer! It looks indeed more like a tool for professional linguists than for us language learners, but it’s good to have it in the tool kit. I hope I’ll find the time to play around with it, it looks like a fun thing to do (at least once 555+) Thanks again for sharing this idea!

  4. I am not really a believer in computer voice analysis. While computer voice synthesis is very good, and Google speaks clearer Thai than most farang, it’s very different with voice analysis. Humans are able to have a meaningful conversation in a busy shopping mall where the background noise is so loud that a computer wouldn’t know where to start looking/listening. A computer still needs a quiet room and a good microphone, it needs to be trained for certain voices, and the results are still very poor. Humans have a remarkable built-in noise filter, recognize regional pronunciation differences, and can analyse the sound based not only on what is actually heard, but also on context.
    As long as we don’t have all the necessary algorithms for a decent voice analysis, using computer based voice analysis as an important study tool is probably out of the question.

  5. Hi Andrej.
    I don’t use it in my studies. I only use it in case I am not sure about the tone that was used in an audio fragment. It’s more like a theoretical thing.
    The results are not consistent, meaning that if you analyse the tones of a Thai speaking Thai, you’ll see that the tones very often are different from what you would expect from the tone rules. Both syllable stress and the tone of the previous syllable seem to influence the tone of a syllable. And another remarkable thing I’ve seen is that very often not the slope, but the change of the slope defines the actual tone. The change of the slope is what we would call the second derivative, that is, whether the curve is concave up or concave down. I think the second derivative is even more defining for the tone than the actual slope.

  6. This is pretty cool, thanks for sharing! I’ve got two questions: (1) How consistent is the pitch analysis, i.e., if you look at, say, falling tones in a number of words, how consistent is the pattern you see in Praat? (2) What do you personally use this for? Do you incorporate it into your studies, or is it more a fun thing to look at and play around with?

