October 19, 2011

Mandarin Chinese tones – sound only approach

By Vladimir Skultety M.A., B.A.

I would like to talk about and build on a concept I wrote about in my earlier posts: developing a system in which students remember Mandarin words without consciously knowing what tones or tonal combinations are in them, and pronounce them correctly with less effort.

As the topic is quite complex, I would first like to go back to 2 earlier articles I wrote about tones and develop the thought from there.

Post from 11.30.2011 (edited):

When I first came to Taiwan, I remember being tired after even a 10-15 minute Mandarin conversation. I was unable to use the words I had learned effortlessly, even after I had used them a hundred times in conversation practice. Each time I wanted to use these words I had to make at least some effort to recall them and constantly think of the tones, which was very tiring.

Then one time I mispronounced the word 比利時 and a friend of mine corrected me. I pronounced it again correctly, and from then on I somehow used the word effortlessly, without having to think about its tones and without tiring myself trying to say it. I started to wonder how this had happened. I thought that maybe it was because I had never seen the pinyin or the characters for this word. I was familiar with it because I had heard it before. When I mispronounced it, my friend corrected me, and because my friend’s exaggerated pronunciation contrasted with my own, the word instantly stuck in my head. To this day I don’t know, and I don’t want to know, what the tones in the word 比利時 are. This was the first time I had ever learned a Chinese word so effortlessly and so quickly, and I started to think that this could support my theory that Chinese, at the beginner stages, should be learned using a sound-only approach.

In another post from 3.8.2011 I wrote (edited):


Tones are another problem, mainly because of the way they are explained and taught to us Western students. Tones that can change the meaning of a syllable are something we are not very used to. We do have tones in our Western languages, but they rarely change the meaning of a syllable, whereas in Chinese this is a permanent feature. When it comes to tones, more than ever I hate myself for studying Chinese the way I did and the way I was instructed. There are 4 tones in Mandarin, but I think it is a mistake to tell students this fact at the beginning of their studies. Chinese people themselves do not know that they have 4 tones in their language and yet speak Mandarin perfectly well. I don’t understand why we, the students of Mandarin, should know this.
Tones are very important of course, but the way they are explained to students is far too academic and not practical at all. In virtually every Mandarin class today (the ones I have heard of at least) tones are explained and taught in a very scientific way, which might be suitable for some, but is very tiring and energy consuming in the long run. There is simply too much detail in these explanations and, as my friend Luca Lampariello said, it is as if you tried to learn Italian by learning the pronunciation and intonation of individual syllables first and then tried to put them together into an entire sentence later.
A minimal necessary explanation, in my opinion, should be that the pitch of Mandarin syllables is very important and that students should do their best to produce them based on what they hear, because they might not be understood otherwise. They should try not to look at tonal graphs, charts, numbers or explanations, or ask about the number of tones. All of this information put together is too complex for the brain to process in the short amount of time it has, and it is much more practical to concentrate on the simple sound and rely purely on aural memory.
Figure 1:
Source: http://www.chinesepod.com/

Figure 2:

Figure 1 is an image most students of Chinese are familiar with. Based on this tonal chart (and accompanying audio), students try to pronounce Mandarin tones, a task which is truly not that easy, because in a real situation tones, as Figure 2 shows, are not as simple as they appear in Figure 1. A number of things happen when students try to do this. First, they try to pronounce something they see. 10 different students might have 10 different opinions on what a straight ascending line from the 3rd position all the way to the 5th position sounds like. Of course the teacher pronounces the tone for the students to hear, but a side effect is that students associate this sound with an image (an image which is not a perfect notation of the sound itself), which will cause interference later. Figure 2, on the other hand, represents a perfect notation of the four tones of Mandarin, but can you imagine trying to reproduce a tone based on what you see in Figure 2? Compare this effort to trying to imitate the following sound based only on what you hear:
Figures 1 and 2 both represent tones in single isolated syllables. Regular speech consists of many of these syllables, which in addition change their isolated pitch under certain conditions and in certain combinations, so if a student thinks about tonal graphs while trying to reproduce every syllable as he speaks, the process will become very tiring and the image interference will turn into a serious problem.[1]
Students are also expected to get very close to the correct pronunciation on their first try. This is almost impossible, because you need to hear, not see, the tone many times to realize what it approximately sounds like. Graphs might help you realize which tones are higher and which are lower, but they are still only an approximate reference (which no one mentions) and a big source of interference (which no one realizes). I can still remember half of our class tilting their heads up and rising slightly off their chairs while trying to pronounce the 2nd tone. Talk about interference! Again, Chinese people do not know that the 2nd tone is a rising tone, nor do they know that their language has tones at all. They have to learn this at school, just like we Westerners have to learn what a past participle is, what the dative case is, etc.
I also think that it is very confusing to extract the 4 Mandarin tones and talk about them in isolation rather than permanently merge them with a specific syllable and meaning. The reason is that, among other things, it makes Mandarin seem multiplied by four. A good example is the syllable Gei3. This syllable can only be pronounced in the third tone and has only one possible meaning: to give. So the only way you can hear or say the word 'to give' in Mandarin is to pronounce the word Gei in the third tone, but you as a student of Mandarin do not know this. If you study Mandarin the traditional way, you are prepared for the fact that the syllable Gei might have 4 tones and thus at least 4 different meanings, so when you hear this syllable in speech, you don't trust your ears and literally fail to hear a word that you know how to pronounce, because the tone itself doesn't make it distinctive enough for you as a beginner or intermediate student. I say forget about the fact that there are 4 tones, but be aware that there are some syllables that can be pronounced high or low (I will explain later why) and that, based on this, they can under some circumstances have completely different meanings.
When I started learning Mandarin I was constantly asking myself – is it really possible to speak a language this way? Every syllable can have 4 possible tones. How can anyone speak Mandarin and have in mind all this information and all possible combinations? All these different meanings that syllables with the same tone might have, the difficult pronunciation and on top of that 4 possible tones, which change in combination with other tones…and of course no one can speak effortlessly like that.

Our languages also have tones, but they rarely change the meaning of words. We use them naturally and don’t even notice that we are doing so. We are perfectly capable of learning tones as adults as well. Mike Campbell, for instance, has described a fairly complicated system of permanently present tones in American English, some of which have the same features as Cantonese tones (low rising, mid rising and so on)[2]. A lot of foreigners studying English master these tones without even noticing. I don’t know who applied a scientific approach to learning Mandarin, or why that person decided that it was the correct one, but he made things a lot more complicated and difficult than they already were. I also don’t know why everyone else, including me, has followed.

With all this said, the fact remains that Mandarin does have 4 tones, which do change the meaning of syllables, are very important and need to be learned. Even later, when I had completely changed my approach, they were an additional burden throughout the learning process.

Sound only approach

So the question now is how to effectively bypass the image-sound association and learn the words correctly concentrating on their sound only.

I wrote these two articles quite some time ago, and ever since I first learned the word 比利時 I have been trying to figure out how I did it and how it could be done again, and to come up with a system which would hopefully work under any circumstances. I came up with something which I guess could be called the Basic pronunciation table, which looks something like this:

1-1 醫生/犧牲/誇張

1-2 出門

1-3 喝酒/分享

1-4 發現

2-1 孰知

2-2 敵人

2-3 游泳

2-4 融化

3-1 火鍋/早餐

3-2 五十/可能

3-3 好醜

3-4 冷氣

4-1 鬧鐘/露天

4-2 不行

4-3 熱水

4-4 麥片


1-0 他的

2-0 誰的

3-0 妳的

4-0 算了

I’ve been filling out this table over the course of my studies with words like 比利時: words that I learned effortlessly, pronounce correctly and whose tones I do not consciously know when I pronounce them. The table represents the 16+4 tonal combinations in Mandarin disyllabic words and expressions, so that in the first row and the first column you have a 1st tone 1st tone combination and in the fourth row and the fourth column you have a 4th tone 4th tone combination.
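
The table can be pictured as a small lookup structure. What follows is only a minimal sketch in Python, not a description of how the table was actually kept: the names MODEL_WORDS, model_word and empty_slots are made up for illustration, and the neutral tone is written as 0, as in the table above. The tone numbers live only in the keys, so a study tool built this way could show a learner the model word without ever showing the numbers.

```python
# Model words copied from the table above.
# Keys are (first tone, second tone); 0 stands for the neutral tone.
MODEL_WORDS = {
    (1, 1): ["醫生", "犧牲", "誇張"],
    (1, 2): ["出門"],
    (1, 3): ["喝酒", "分享"],
    (1, 4): ["發現"],
    (2, 1): ["孰知"],
    (2, 2): ["敵人"],
    (2, 3): ["游泳"],
    (2, 4): ["融化"],
    (3, 1): ["火鍋", "早餐"],
    (3, 2): ["五十", "可能"],
    (3, 3): ["好醜"],
    (3, 4): ["冷氣"],
    (4, 1): ["鬧鐘", "露天"],
    (4, 2): ["不行"],
    (4, 3): ["熱水"],
    (4, 4): ["麥片"],
    (1, 0): ["他的"],
    (2, 0): ["誰的"],
    (3, 0): ["妳的"],
    (4, 0): ["算了"],
}


def model_word(first, second):
    """Return the first model word stored for a tone combination."""
    return MODEL_WORDS[(first, second)][0]


def empty_slots(table):
    """List the 16+4 combinations that have no model word yet."""
    all_slots = [(a, b) for a in range(1, 5) for b in range(0, 5)]
    return [slot for slot in all_slots if not table.get(slot)]


print(model_word(4, 2))
print(len(MODEL_WORDS), "slots filled,", len(empty_slots(MODEL_WORDS)), "still empty")
```

A learner who is still building a table would start from an empty dictionary, and empty_slots would then list the combinations that still need a 'safe' word.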

A very raw approach to building your own pronunciation table would be something along the lines of the following: 

  1. Create your own similar 4x4+4 table and start filling it out with words which you definitely know and feel you pronounce correctly, but do not consciously know what the tones in them are.
  2. Learn to pronounce every new word using the model word in your pronunciation table. Say your model word for the 1st tone 1st tone combination is 先生 and you want to learn the word 醫生: you will pronounce it based on your ‘safe’ pronunciation of 先生. Ideally you should have a conversation partner who would tell you that 醫生 is pronounced like 先生 (instead of the traditional “it’s two first tones” explanation). This way you could completely avoid consciously knowing what the tones in the word 醫生 are. Getting your conversation partner to memorize your pronunciation table could be a challenge though.
  3. As you are slowly filling out your table, if you find a good candidate word, do not ask your conversation partner which tones the word has. Only ask him or her to pronounce it slowly, listen to it and do your best to pronounce it. You have to train yourself to do so, because up until now you’ve probably been trying to pronounce words according to the tonal graphs that you memorized during your first Mandarin lesson. If you don’t get the word right, return to it later or move on. It took me a very long time to fill in all 20 spaces in my pronunciation table, so there is no rush. It is also key to let your conversation partner know how to correct you. If you struggle with a part of a syllable (initials or finals), he or she should let you move on from these, only tell you that you're not getting the tones right and say the word again. The 4 Mandarin tones, especially in disyllabic combinations, are so 'outstanding' and unique that you will eventually understand what they sound like and learn how to reproduce them.
I don’t know how well this concept can work for others. For me it works fine, but unfortunately I came up with it too late. I know most of the words in Mandarin I need to know and what I do now is I try to re-learn my old vocabulary based on my safe pronunciation table model words, which is pretty hard - especially in the most basic combinations that we've learned during our first Mandarin lessons.

There are at least two main problems that arise as you are learning words using this system:

  1. You need to get the tones right in the beginning, otherwise you will not have a model word for your blank spot in the pronunciation table. You need to be corrected by a native speaker or a foreigner with native-like pronunciation until you realize what you're doing wrong.
  2. Because the point is not to write down anything until you’ve completely filled out your table, remembering words for the table could be a challenge.
You experience both of these problems when you learn Chinese the traditional way as well; the difference now is that you will only use your aural memory and not your visual one (you can't write anything down). After you’ve completely filled out your table, you could group new words that you learn into these 16+4 categories, or write model pronunciation words next to each new word you learn. This of course rules out bulk-learning of vocabulary, as it would still be very overwhelming (writing your model pronunciation word next to every new Mandarin word you learn would be very time consuming), but in my opinion bulk-learning vocabulary is pointless with Chinese anyway.
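
Grouping new words under their model words could be sketched as follows. This is a hypothetical illustration only: it assumes that each new word comes with numbered pinyin (for example zhong1wen2), that a neutral tone may be written as 0 or 5, and that one model word per combination is taken from the table above. The names MODEL_WORD, tone_pair and group_by_model are invented here. The numbered pinyin would stay inside the tool, so the learner only ever sees the model word.

```python
import re
from collections import defaultdict

# One model word per combination, taken from the pronunciation table above.
# Keys are (first tone, second tone); 0 stands for the neutral tone.
MODEL_WORD = {
    (1, 1): "醫生", (1, 2): "出門", (1, 3): "喝酒", (1, 4): "發現",
    (2, 1): "孰知", (2, 2): "敵人", (2, 3): "游泳", (2, 4): "融化",
    (3, 1): "火鍋", (3, 2): "五十", (3, 3): "好醜", (3, 4): "冷氣",
    (4, 1): "鬧鐘", (4, 2): "不行", (4, 3): "熱水", (4, 4): "麥片",
    (1, 0): "他的", (2, 0): "誰的", (3, 0): "妳的", (4, 0): "算了",
}


def tone_pair(numbered_pinyin):
    """Extract the two tone digits from numbered pinyin (assumed input format)."""
    # A trailing 5 is treated as the neutral tone, i.e. the same as 0.
    tones = tuple(int(d) % 5 for d in re.findall(r"[0-5]", numbered_pinyin))
    if len(tones) != 2:
        raise ValueError(f"expected two syllables with tone digits: {numbered_pinyin!r}")
    return tones


def group_by_model(vocabulary):
    """Group (word, numbered pinyin) pairs under the matching model word."""
    groups = defaultdict(list)
    for word, pinyin in vocabulary:
        groups[MODEL_WORD[tone_pair(pinyin)]].append(word)
    return dict(groups)


new_words = [("中文", "zhong1wen2"), ("說話", "shuo1hua4"), ("看書", "kan4shu1")]
for model, words in group_by_model(new_words).items():
    print(f"pronounced like {model}: {', '.join(words)}")
```

The sketch deliberately handles only two-syllable words, matching the table; longer words would need a different scheme.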

My pronunciation table is composed of disyllabic words for a number of reasons, most notably because most words as well as most common expressions in Mandarin are made up of 2 syllables (我是, 我有, 不是, 不會 etc.) and because syllables in pairs (and their tones) are much easier to remember than syllables in isolation. I talked about this in an earlier article from 9.7.2011:

When it comes to tones, it helped me a lot that I was always trying to learn syllable pairs instead of single syllables. It is much easier to learn a syllable pair (most Mandarin words are basically syllable pairs: 說話, 看書 etc.) than to learn isolated syllables on their own. Learning words this way, you have a relation between two sounds that either contrast or are the same, and it is much easier to remember them than to remember the sound of a single syllable. Subsequently it is also much easier to isolate a syllable from a syllable pair and then use it in a different word, remembering its sound (for instance you learn the sound of 說話, then isolate the 話 and use it in 普通話).

I feel that in a way this pronunciation table is analogous to the grammar tables of Italian or Slovak, for instance. You learn the basic conjugation of parlare and then conjugate all other verbs that end in -are in the same way.

If any students are willing to test this concept and share their results or the problems they encountered, that would be great. Maybe this concept could be turned into a real study tool that could help students get around the obstacle of Mandarin pronunciation and allow them to concentrate on the rest of the challenges that Mandarin poses.

Of course in that case, since the words I eventually used for my pronunciation table aren’t very practical, a more reasonable table could look something like this[3]:



[1] I know the example I gave is overly simplistic and I used it only to make a point. There are several problems that arise as you try to pronounce (and remember) Mandarin syllables based on their sound only and I will try to address this further in this article.
[2] Please see his video for reference - http://www.youtube.com/watch?v=QvWXg_lxVTM
[3] Ideally, the table should be built up of all the sounds of Mandarin and during the first lessons students could memorize something catchy or funny built up of these sounds.


  1. Great idea Vlad! No language pronounces its vowels exactly the same way in isolation as it does in combination with other sounds. Why should Mandarin? Your idea of referencing tone combinations that one already says well when learning new words makes a lot of sense.

    Thanks for sharing.

  2. Thank you for the comment Ryan. I would like to develop this a little more and see how it'd work with complete beginners that know nothing about Mandarin at all.

    Obviously if you're not used to the subtle sound changes in Mandarin and can't rely on any written explanations telling you how to correctly pronounce the sounds, a native speaker or a similar person who would correct you and explain to you what it is that you are doing wrong is absolutely crucial. Most of the classes in the world today do have native speakers so this should be ok.

  3. Interesting idea! Some of the examples you gave are off though: 他們 is 1-0, 事情 is 4-0, and I haven't managed to double-check all of them. However, quite often the second character of a two-character combination acquires the neutral tone, no matter which tone it would have on its own. There are combinations of 1-2 and 4-2, just not these.

  4. Judith,

    thank you for the comment.

    Actually 他們 is 1-2 and 事情 is 4-2. The thing is I am in Taiwan and forgot that the pronunciation is different in China.

    1. Actually 他们 is 1-0 and 事情 is 4-0 in connected speech. The tones you give for the individual characters are correct, but in these combinations they are as Judith stated.
      If you listen closely to native speakers you won't ever hear 事情 pronounced 4-2. In some combinations, the second character will lose its tone.

      Hope that's helpful.


    2. Dear David,

      I do not want to argue, but I can assure you that all I hear in Taiwan is 事情 pronounced as 4-2 and 他们 as 1-2, whether in connected speech or isolated, or in individual morphemes (which is very rare for 们 and 情 actually). I have been in Taiwan for almost 3 years now and I don't think I ever heard anyone pronounce these words differently, but then again I didn't consciously look for the tones of these particular words in speech either. When I first came to Taiwan, it struck me as odd and pleasant at the same time that they keep the tones in such frequent expressions, and I have actually read some studies about this too, so this is a frequent phenomenon in Taiwan.

      Either way, I can record an interview with any of my Taiwanese friends, which would be the easiest way to verify what I say. I might be wrong, who knows:)

      kind regards


    3. Try another one.

    4. Of course it's 2-3. That doesn't really have much to do with what I wrote, though.

  5. Super post Vladimir,

    I just started learning Mandarin.

    1. Dear Naomi,

      thank you for your comment. I'm glad you found the post helpful.

      good luck with your studies


  6. I strongly agree, and I think you are on the right track. Like you, I have experienced learning a particular word, 白糖糕, without knowing either the tones or the detailed standard pronunciation -- which I only recently figured out. Regardless, I have never failed to have that word understood instantly whenever I ask for that nice sweet -- which has been quite a few times.

    At this point, I am not even sure that what I have been saying has the correct initial consonant of the 2nd character as is shown in the pinyin. The way I learned it, I thought it was similar to English "k", at the back of the throat, rather than English "t", near the front of the mouth. Also, what I heard in the first syllable sounded to me like "ba" rather than "bai".

    As an aside, you might be interested to know that I got to your page by searching Google for "16 mandarin tone combinations" (without the quotes). The fact that I arrived here shows something about how powerful Google searching has become. Before doing the search, in fact, I had independently come up with the idea of memorizing 16 bi-tones, and was trying to find a table with examples.

    (Actually, I think it is somewhat more complicated than 16 bi-tones, because of neutral tones and also the difference in sound between two separate single-character words versus a two-syllable word using the same two tones.)

    For me, it seems almost pointless to try learning pronunciation by reading pinyin or memorizing the isolated tone sounds. That may be putting it too strongly, but that is the way I feel.

  7. Hello Ralph,

    thank you for the reply and observations.

    "At this point, I am not even sure that what I have been saying has the correct initial consonant of the 2nd character as is shown in the pinyin. The way I learned it, I thought it was similar to English "k", at the back of the throat, rather than English "t", near the front of the mouth. Also, what I heard in the first syllable sounded to me like "ba" rather than "bai"."

    If you're interested in this, try looking up a video called Mandarin surface pronunciation (or something like that, I can't remember the exact name anymore) by Glossika. He covers some of these situations very well.

    "(Actually, I think it is somewhat more complicated than 16 bi-tones, because of neutral tones and also the difference in sound between two separate single-character words versus a two-syllable word using the same two tones.)"

    I also made a table for the neutral tone combinations. When it comes to the pronunciation of two separate single character words and using the same two characters in one word where a tonal change would occur, whatever the resulting change would be, it is also in the table, you would only need to treat the new pronunciation completely independently and not worry about tonal changes (which is what Chinese people do as well actually).

    The whole table is only a concept and I was hoping for comments like yours to give it more shape and maybe eventually see if it could ever be used as a supplement to Mandarin teaching. Thank you very much for the input:)

    wish you all the best with your studies


  8. Nice approach. The other day I read some comment on Youtube which went a little like this: "I wish Cantonese was the official language of PRC. Call me crazy, but I think it's easier to speak it than Mandarin. Whenever I speak Mandarin it feels forced. etc"

    This got me thinking and I agree that thinking about every syllable and the tone even in pairs is tiresome and sounds unnatural.

    For example when I say 桌子. It sounds like "taaaaaaable" would in English. Too exaggerated.

    This guy/girl who wrote the comment is probably one of many students who say one sentence a million times wondering if they got it right, while being too lousy about their Cantonese.

    So people should pay attention to the right pronunciation including the intonation, but not too much nor too little.

  9. Hello and thank you for the comment.

    my 'nice approach' is starting to show some flaws:) It is not very practical when it comes to mass vocabulary learning. When learning new words, it is very time consuming to write the English meaning of the word, pinyin and the model word for the tonal combination just for one informational unit. Using pinyin with numbers that mark tones is much faster.

    I think however, that it works very well when it comes to learning and understanding how tones work and should be pronounced. If a student in the beginning stages would first learn this table (preferably through funny, easy to remember dialogues and not as an isolated table) and create a steady point of reference, it wouldn't be necessary to write the model pronunciation word after the pinyin during vocabulary acquisition anymore and the traditional approach (e.g. zhong1wen2, shuo1hua4) would be sufficient. One would already have the tonal combinations internalized and associated to the pronunciation table which could always be used as a reference.

    On a completely different note, I thought that Cantonese words might be easier to remember since there are more tonal combinations and thus more unique or rarely used combinations and students would not be confused that much.


  10. Haha. Not to mention tonal languages which DO have tones implemented in their orthography, such as Vietnamese or Hmong.

    I still think it's not so much about the number of tones as about how far apart the tones are. For example in Vietnamese, one of the six tones is broken into two by a glottal stop, so it's pretty easy to recognize and produce. Sẽ = se-e.

    Thanks for your reply

    P.S. Is Cantonese still on your hit list?

  11. I wish it was:) Hope one day I'll live somewhere where the language is spoken on a daily basis. I just can't find the motivation to learn difficult languages otherwise.

    On a side note, I was wondering the other day, how people use dictionaries in languages like Vietnamese. There must be like 5 pages of entries of the same syllable, followed by another 5 pages of a different one with nothing to distinguish them by. Do you happen to know if Chinese characters are present in dictionaries like this as an aid?


  12. I hope this helps:


    From what I see, they start with a letter and then add the different tones, before moving on to another letter and doing the same thing with it.

    a à á ạ ã ả
    ă ằ ắ ặ ẵ ẳ
    â ầ ấ ậ ẫ ẩ

    Not sure if it's all visible, but a, ă and â are different letters with different pronunciation but the same tone (ngang).

    The Chinese characters are no longer used, but they are still encountered in Vietnam on some monuments and sights.

    Actually thanks to these closed vowels and open vowels, there are not as many homophones as in Mandarin, but they still do exist.

    Bóng đá - football
    đá - a rock / to kick

  13. I only flipped through the dictionary a bit, but as you say the situation with homophones is not so bad. In the end it is not so bad with Chinese either. Even if some words are perfect homophones, it is very hard to confuse words like that in a real context.

    What I was wondering about was, whether there was some sort of a morphemic dictionary, something that could be analogous to Chinese character dictionaries. Maybe there you might have a long list of ă entries, followed by a long list of ằ entries.

  14. I really like your idea that our teaching the stupid four tones scares away many eager people. I am a Chinese teacher. Usually we try to use communicational Chinese for a month and then go on to tones. The disadvantage is that it's hard for the students to correct their wrong tones if there are any I didn't help them with in time. Anyway, I still would like to try not to teach tones at first.


  15. Hello again,

    a lot of time has passed since I wrote this article and I still don't know what could replace the traditional Chinese teaching methods. Maybe some sort of combination of a tonal-awareness system and a non tonal-awareness system. I really don't know.

    It feels to me like when Chinese people try to learn English and memorise the many verbal tenses and endless lists of vocabulary, but when they have to speak, it is really very difficult for them and often also very difficult to understand.

    A solution to this would be to learn whole sentences and not think about grammar at all, but that would be very time consuming and difficult. I really don't know:) I'm starting to be convinced that the only way really 'anyone' can learn Chinese perfectly is to spend time in China as a child.

  16. It seems to me that Chinese people LOVE the idea that their language is “terribly difficult”... I started some four months ago and now I can have simple conversations in Mandarin with Chinese people I meet on the streets of my city... I mean talking... (reading and writing are another league!!!)
    Compared to Spanish (my mother tongue), Portuguese (I am a high school teacher) or German (I am fluent in it), Hanyü is a piece of cake!!! Gustavo Pereyra (www.8belts.com)

  17. Everyone's entitled to their opinion :)