Showing posts with label Chinese characters.

October 16, 2024

Two third tones rule in Mandarin.

An interesting thing happened to me yesterday. For the first time since I started learning Mandarin (and that includes 3 years of demanding studies at our university's Chinese department, 5 years of complete immersion in Taiwan and 10 years of working as an interpreter for the police, judiciary and prosecution, all in all a daily grind of 17 years), I managed to 'feel' rather than 'know' that a 3-3 tonal combination in Mandarin should be a 2-3 combination.

In other words, I 'felt', rather than 'knew because I had learned the rule', that when two third tones occur in close proximity in Mandarin, it "can't be like that" and the first of the two must be pronounced as a second tone. (The full rule is a bit more complex, but for the sake of simplicity, this is the version I want to go with for now.)

Before that, for the whole 17 years since I started studying Mandarin, I had to pay attention to which tones were coming up in my speech, and whenever a 3-3 combination was approaching, I had to consciously 'initialize' the 3-3 > 2-3 rule and pronounce the combination as 2-3.

Even after 17 years it was still a bit difficult and a bit tiring in the sense that it cost me my attention, and it put slight pressure on the syntax and the overall sequence of thoughts around the 3-3 point in my speech. 

Yesterday, as a form of practice, I was reading a transcript of a podcast out loud with a native-speaker friend. We've been reading it regularly for about 2 weeks now, and suddenly I came to a point in the text where "Alice很努力" appeared. Because I was reading too quickly, I went on to read it as "Alice hěn nǔlì" and I said... "Wait a second, that can't be."
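The simplified version of the rule I described above can be written down in a few lines. This is only a rough sketch in Python, not a full model of Mandarin tone sandhi: the function name `apply_third_tone_sandhi` and the idea of representing each syllable by its tone number are assumptions made for illustration, and longer runs of third tones depend on phrasing in ways this sketch ignores.

```python
def apply_third_tone_sandhi(tones):
    """Return the tones as pronounced under the simplified 3-3 > 2-3 rule.

    `tones` is a list of tone numbers (1-4), one per syllable.
    """
    pronounced = list(tones)
    for i in range(len(tones) - 1):
        # A third tone directly followed by another (underlying) third tone
        # is pronounced as a second tone.
        if tones[i] == 3 and tones[i + 1] == 3:
            pronounced[i] = 2
    return pronounced


# 很努力: hen3 nu3 li4
print(apply_third_tone_sandhi([3, 3, 4]))  # [2, 3, 4]
```

For 很努力 the underlying tones 3-3-4 come out as 2-3-4, which is the "hén nǔlì" my ear was expecting.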

November 02, 2023

Analysis of Lion-Eating Poet in the Stone Den 施氏食獅史



The following is my short analysis of the famous Lion-Eating Poet in the Stone Den (施氏食獅史) poem. 


About the poem:


"Lion-Eating Poet in the Stone Den" (Chinese: 施氏食獅史; pinyin: Shī-shì shí shī shǐ) is a short narrative poem written in Classical Chinese that is composed of about 94 characters (depending on the specific version) in which every word is pronounced shi ([ʂɻ̩]) when read in present-day Standard Mandarin, with only the tones differing.[1]


The poem was written in the 1930s by the Chinese linguist Yuen Ren Chao as a linguistic demonstration. The poem is coherent and grammatical in Classical Chinese, but due to the number of Chinese homophones, it becomes difficult to understand in oral speech. In Mandarin, the poem is incomprehensible when read aloud, since only four syllables cover all the words of the poem. The poem is more comprehensible—but still not very intelligible—when read in other varieties of Chinese such as Cantonese, in which it has 22 different syllables, or Hokkien Chinese, in which it has 15 different syllables.


Source: https://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_Den
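To make the "only the tones differ" point concrete, here is a small Python sketch that uses nothing but the title and its pinyin as given in the passage above. The helper name `strip_tone` is hypothetical; it removes the tone marks with the standard `unicodedata` module so that the readings can be compared with and without tone.

```python
import unicodedata

# Title characters paired with the pinyin given in the quoted passage.
title = "施氏食獅史"
pinyin = ["shī", "shì", "shí", "shī", "shǐ"]


def strip_tone(syllable):
    """Remove tone diacritics by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


toned = set(pinyin)                         # distinct syllables with tone
toneless = {strip_tone(s) for s in pinyin}  # distinct syllables ignoring tone

print(len(toned), sorted(toneless))  # 4 ['shi']
```

Even the five characters of the title already use all four tones of the single syllable shi.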


My Analysis


Chinese:


《施氏食獅史》


石室詩士施氏,嗜獅,誓食十獅。
氏時時適市視獅。
十時,適十獅適市。
是時,適施氏適市。
氏視是十獅,恃矢勢,使是十獅逝世。
氏拾是十獅屍,適石室。
石室濕,氏使侍拭石室。
石室拭,氏始試食是十獅。
食時,始識是十獅屍,實十石獅屍。
試釋是事。

December 25, 2022

愛 爱 Love Chinese character etymology and structure


A friend of mine, a western native speaker of Mandarin Chinese, who grew up in Taiwan, recently embarked on the journey of learning how to read Chinese. He sent me a few pages of a book that he is reading to help him understand the Chinese writing system better and asked for my opinion. 

A few minutes of reading it turned into a few hours of thinking, writing and research and I thought I would publish my reply to him about one specific section in this book as an article on my blog as it covers a few interesting concepts and recurring themes. 

In the introduction, the author uses the character 愛 as an example to teach his students that, quote, "100% (not a single exception) of Chinese words is composed of root words. (sic)". The author's writing is a bit difficult to understand, and there was context before and after this sentence that would make it a bit clearer, but what the author essentially tried to say was that for every single Chinese character, it is possible to tell what the character means just by understanding which roots it consists of, and that these roots can always be seen clearly in the character itself. A simple example would be looking at the Chinese character 人 and seeing that it is a person.

The author then proceeds to further demonstrate this with the character 愛 and as we will see, his system unfortunately falls apart. He writes:

愛 (love) is the composite of (sic):

1. Top part of 受 (receiving) which means holding hands (sic)

2. 心 (hearts (sic))

3. Bottom part of 夏 (Summer) which means walking slowly (sic).

So love is that hearts hold hands and walk slowly together (sic).

I think it should be obvious that this is storytelling and not scientific research, and I think it is also important to prove why the author is wrong. 

First of all, arbitrarily deciding that the 愛 character is formed by ripping off the top of 受 and the bottom of 夏 and putting a 心 between them because it fits our explanation is like working with a completely faulty set of equations while solving a math problem and then arbitrarily changing the resulting number after the equal sign to the one we want manually so that it fits our teacher's correct result. 

As for the etymology of 愛, I had never researched this character before, so I convened my little etymology research team, consisting of me and my Taiwanese friend :), and this is what we found out:

First, let's make the character a bit bigger:


Just by looking at it, we cannot really tell what elements/roots/radicals/standalone characters etc. the character 愛 is made of, contrary to what the author says. We see 心 xin1 - heart in the middle and 夊 sui1 - walk slowly at the bottom. The top, however, is 爫 over 冖, which is clearly a simplification/fusion of something that was there before but that we cannot recognise now.

February 24, 2019

Chinese character Zen storytelling

So an innocent question under one of my videos about Chinese character etymology (https://youtu.be/Svb7rulL5aE) led me to about an hour of research and I wrote a reply to the comment which I thought was worth publishing as an entire article on my blog. Gotta love science :)

The main reason I thought this comment was worth publishing as an article (apart from the fact that it was hopefully good research and took some time) is that it is absolutely paramount to be scientific and very careful not to interpret the structure of Chinese characters purely on the basis of what they look like today, and not to resort to, or believe, Chinese character Zen storytelling. I really can't stress this enough.

As Wikipedia teaches us about the Scientific method: "It (the Scientific method) involves careful observation, applying rigorous skepticism about what is observed, given that cognitive assumptions can distort how one interprets the observation."

This could not be more true when it comes to Chinese characters.

The character I analyzed, 黎 (today pronounced lí, meaning 'many, numerous'), is structurally made up today of 禾 (grain), 人 (person), 水 (water) and a mysterious 勹 + 丿.

I could come up with 20 different Zen combinations as to how grain + person + water + (勹 + 丿)  could mean 'many, numerous'. Try it yourself before you read the rest of the article and compare it to what I wrote. Just for the fun of it and just off the top of my head:

黎 character

Modern meaning: many, numerous
Modern pronunciation: lí

Structural composition today:

禾 (grain)
人 (person)
水 (water)
and a mysterious 勹 + 丿

Top of head, seemingly cool interpretation:

'It's a person having to endure the burden of a lot of work because he has to irrigate a lot of grain with a lot of water'. All pointing to the meaning 'many, numerous'.

Let's pretend the 勹 + 丿 is not even there.

It took me, as someone who has spent a lot of time researching Chinese characters, about 30-60 minutes of research with a lot of modern tools to really understand the structure of this character and there still are blind spots in the analysis as you will see. What I'm trying to say is that if someone gives you a cool, funny, mysterious, 'Zen' interpretation of a character (like: 'It's a person having to endure the burden of a lot of work because he has to irrigate a lot of grain with a lot of water' in the case of 黎), please be very skeptical. There usually is much much more to it. Based on the difficulty of researching only this one character hopefully you will be able to appreciate why being scientific is a good thing.

December 14, 2018

The most complex Chinese character

What is the most complex Chinese character?

When it comes to the character with the greatest number of strokes and the greatest number of elements I was able to find, it is this character, which is pronounced dhō, with 341 strokes:

http://uncyclopedia.wikia.com/wiki/File:Chinese_character_extreme.svg

It supposedly means “Impossibly complex pictogram-based writing system that takes a person a thousand thousand years to learn.” In my opinion, however, this character should not count, for a number of reasons. The only reference to it I was able to find was on the Uncyclopedia website, which is a website where you can: “Discover, share and add your best comedic writing!” So dhō is very probably just a character recently invented for fun, where the author took several very complicated existing and non-existing characters, added them together, added a few non-standard strokes, called the character dhō and gave it the meaning I mentioned earlier.

October 06, 2018

Some thoughts on the reliability of 說文解字

Public Domain, https://commons.wikimedia.org/w/index.php?curid=35309
Disclaimer: This article will be very technical and very probably very uninteresting if you are not familiar with Chinese character etymology. My apologies in advance.


I got into a debate with someone online under one of my videos recently. The video was about the 辡 character phonetic series. At the beginning of the video I argued that 辡 was a character formed by two 辛 characters. According to my sources, 辡 means 'litigation', one of the older meanings of 辛 was 'criminal', and 辡 'litigation' is a semantic compound character in which two 辛 'criminal' components point to its meaning (two criminals litigating in front of a court).

Since 辛 doesn't mean 'criminal' today, someone correctly asked in the comments what my sources were.

I wrote:

(I am) Inferring (that 辛 had the meaning of criminal) from the following:

《說文》《辡部》辡:辠人相與訟也。从二辛。凡辡之屬皆从辡。

And the existence and ancient interpretation of characters like 宰 辠 and 辜

《說文》《宀部》宰:辠人在屋下執事者。从宀从辛。辛,辠也。
《說文》《辛部》辠:犯法也。从辛从自,言辠人蹙鼻苦辛之憂。秦以辠似皇字,改為罪。
《說文》《辛部》辜:辠也。从辛古聲。

To which the person argued that the 說文解字 dictionary is not a reliable source and that it regularly misinterprets characters, among other things because it uses an extremely limited data set and because not a single entry in the entire work makes use of 甲骨文 data (which was unavailable at the time).

A debate ensued which went on ad infinitum and produced enough material to be published as a small article:

February 28, 2016

Pictograms

Pictograms are Chinese characters which really look like pictures of what they represent. 人 'person' for instance, is a picture of a person, 女 'woman' is a picture of a woman and 月 'moon' is a picture of a moon.


Woman

The problem with pictograms is that since they consist of only a couple of strokes, at first glance it isn't always clear what they represent. The reason is that they were created a long time ago (2500+ years ago) and their appearance has changed a great deal since. When they were created, these characters resembled what they represented much more closely.

February 19, 2016

Chinese character types

Chinese characters look the same to us Westerners in the sense that they all seem equally complicated, but when we look closer, we find that there are actually many structurally different types of Chinese characters. Some of them are really pictures of whatever they represent, so for instance 人 'person' is really a picture of a person and 女 'woman' is really a picture of a woman. 

Then there are other types of Chinese characters, much more abstract and much more complicated, with fancy names like the phono-semantic compounds or derived characters and I would like to introduce them to you one by one. 


This series of articles is the end result of a project which I have been working on for over three and a half years. I will be publishing a book on Chinese characters and wanted to give you a glimpse of what will be inside, as well as ask you for any comments or suggestions you might have to make the book as enjoyable and useful as possible.

In line with my philosophy of minimalism and effectivism, the book will be very clean and easy to use, combining the absolutely best modern Chinese character research with the best learner experience.

A lot of time and effort has been put into transforming the complicated research data into easy to understand 'look once, understand immediately' chunks. No clutter, or lumping of information onto the reader. Just an enjoyable learning experience.

For more information and regular updates about this and my other projects, feel free to subscribe to my mailing list.

Learn more about:
Pictograms
Compound pictograms

August 21, 2014

Derived characters

While I was still at the Chinese department, during our lectures on Chinese writing, our professors taught us about 6 Chinese character types: pictograms (象形字), simple indicatives (指事字), semantic compounds (會意字), phono-semantic compounds (形聲字), phonetic loans (假借字) and derived characters (轉注字). (For further reading on Chinese character types see this post).

While they explained the first 5 in quite some detail, when it came to the sixth and last category, we were told that these characters still require further research and that no one really understands them well. Or so they said.

望 wang4 'hope, expect' is a derived character. Let's look at its definition from the 說文解字 (100 CE) dictionary first:

November 17, 2013

Chinese character etymology and Chinese character phonetic series

This is the first lecture in what will hopefully be a longer series on Chinese character etymology and Chinese character phonetic series. In this lecture I try to explain what phono-semantic compound characters (形聲字) are, and to explain the 才 phonetic series and the etymology of all the characters in it.




Characters in this video:

才 cai2 - talent, material. Leading phonetic character of the group.

財 cai2 - money, wealth
材 cai2 - material
在 zai4 - to be located at
載 zai4 - to give someone a ride
裁 cai2 - to cut
戴 dai4 - to wear (clothes), to put on
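What holds this group together is easy to check mechanically. The sketch below is only an illustration in Python: the readings are copied from the list above, while the helper `split_reading` and its simple regular expression are my own assumptions and only handle plain "initial + final + tone number" readings like these.

```python
import re

# Readings copied from the list above (pinyin with tone numbers).
series = {
    "才": "cai2", "財": "cai2", "材": "cai2", "在": "zai4",
    "載": "zai4", "裁": "cai2", "戴": "dai4",
}


def split_reading(reading):
    """Split e.g. 'cai2' into (initial, final, tone) = ('c', 'ai', 2)."""
    match = re.fullmatch(r"([b-df-hj-np-tv-z]*)([a-z]+)([1-5])", reading)
    if match is None:
        raise ValueError(f"unexpected reading: {reading!r}")
    initial, final, tone = match.groups()
    return initial, final, int(tone)


initials = {split_reading(r)[0] for r in series.values()}
finals = {split_reading(r)[1] for r in series.values()}

print(sorted(initials))  # ['c', 'd', 'z']
print(finals)            # {'ai'}
```

The initials and tones vary, but every character in the series shares the final -ai with the leading phonetic 才.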

September 23, 2013

Understanding Chinese Characters

Introduction

Chinese characters are a very complex system for recording the Chinese language in writing. Most of what seems to be a mix of illegible symbols is part of a logical but complex writing system that was gradually developed around 2300 - 3000 years ago, with the oldest confirmed characters dating back to around 1200 - 1050 BC. In this article I will try to briefly explain what you need to know in order to understand Chinese characters and what you should know before you start studying them.

Some basic facts:

  • The earliest confirmed evidence of the Chinese script yet discovered is the body of inscriptions on oracle bones (cattle bones and turtle shells used in divination and fortune telling ceremonies) from the late Shang dynasty (1200-1050 BC) - Wikipedia
  • According to some studies (including my own), you need to know only about 2500 characters to read the newspaper.
  • About 80% of all characters are made up of two elements - one telling you how to read the character and the other telling you what it means. This is good news, because based on these two elements you should be able to work out the pronunciation and meaning of 4 out of 5 Chinese characters. 80% is a huge number, and if you learn how to read this type of character and understand the system behind it, your learning progress will be much faster.

Benchmarks in character evolution

Oracle bone script

November 15, 2012

New YouTube channel

Hello everyone, 

I have launched a new YouTube channel as a supplement to my blog, where I would like to share some ideas about language learning. I'm currently working on the How to write Chinese characters playlist, in which you can find videos explaining in detail how to write Chinese characters. In each video I show how to write the characters, explain which writing rules apply to them and which details to look out for in order to write them correctly, and give a little background about their structure and history. The characters for these videos were selected based on my character frequency research, starting from the most frequent one. You can find more information about my character frequency study here.





In the future, I would like to do more videos like this on Mandarin Chinese pronunciation and on other languages as well. I would also like to record interviews with fellow language learners and post them on my channel.


Hope you enjoy the channel, and if you find the videos useful, feel free to subscribe.



Vladimir

November 05, 2012

Chinese character frequency list - Interview articles

Abstract

In this study I tried to analyze the Chinese character composition of about 60 interview articles in two Taiwanese online magazines, evaluate the data, produce a character frequency chart and a character knowledge vs text recognition chart, do absolute character prediction calculations and compare the data with previous analyses that I have done. I sampled a total of 45 235 characters and found that there was a total of 1865 unique characters in this sample. Based on my calculations I also found that in order to recognize 100% of any given number of interview articles (I use the word 'recognize' and not 'understand' on purpose throughout the article), one needs to know 2084 unique Chinese characters. When comparing this data to my previous news character analyses, I found that the interview character frequency list contains many more direct speech elements than the news article character frequency list does, and I've mathematically proven that interview articles are easier to read for beginner and intermediate students of Mandarin Chinese than news articles are.
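For readers who want to try this kind of counting themselves, the two basic quantities in the abstract (a character frequency list, and the share of a text that a reader recognizes with a given set of known characters) can be sketched in Python as follows. This is only an illustration, not the procedure or tooling used in the study: the function names, the restriction to the basic CJK Unified Ideographs block and the tiny sample sentence are all assumptions.

```python
from collections import Counter


def is_han(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def character_frequencies(text):
    """Count Chinese characters, ignoring punctuation, digits and Latin letters."""
    return Counter(ch for ch in text if is_han(ch))


def coverage(freq, known):
    """Share of the running characters in the sample that appear in `known`."""
    total = sum(freq.values())
    recognized = sum(n for ch, n in freq.items() if ch in known)
    return recognized / total if total else 0.0


sample = "我很努力,他也很努力。"  # hypothetical two-clause sample text
freq = character_frequencies(sample)

print(sum(freq.values()), len(freq))           # 9 6  (sampled vs unique characters)
print(round(coverage(freq, set("很努力")), 2))  # 0.67
```

Plotting `coverage` against the number of most frequent characters known gives a "character knowledge vs text recognition" type of curve.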


Introduction


In the past posts I was trying to analyze the frequency of words and characters based on the data that I sampled over a period of 6 weeks from 4 sections of Taiwanese news (please see the Character frequency analysis, Word frequency analysis and Character prediction analysis articles for more information).


March 19, 2012

Amount of characters and words necessary to read news articles

Abstract

Hello everyone and welcome to my never-ending study again. In the last two posts I was trying to count the number of unique Chinese characters and words in Taiwanese news by analyzing 80 news articles from Taiwan over a period of six weeks. In my study I found that there was a total of 2105 unique characters and 5901 unique words in the 80 articles I analyzed, which were separated into four sections: 國際 (international), 政治 (domestic politics), 社會 (society) and 財經 (economics). But as I said, 80 articles was not enough, so I tried to extend the study. Using the sampled data I did some calculations and tried to predict what the number of unique characters and words in any given number of articles would be. I found that there would be a total of 2174 unique characters and 8424 unique words, and a person would thus need to know this many characters and words to recognize 100% of any given number of news articles, provided these news articles were from the same 4 news sections I analyzed.

Introduction

The main task was to predict what the evolution of the unique character and word charts would be and at what point on the y-axis they'd stop ascending. The corresponding x-axis value to that point would be the total number of characters a person needs to know in order to recognize 100% of a random news article, as long as it is from one of the 4 sampled news sections. As you can see by looking at the following two charts, both of them have ascending trends, with the Word knowledge chart ending in a sharp ascent and seemingly not converging to any number.
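The raw material for this kind of prediction is a growth curve: how many unique characters have been seen after each additional article. The Python sketch below only shows how such a curve can be computed; it is not the prediction calculation itself, and the function name and the three short "articles" are made up for illustration.

```python
def unique_character_growth(articles):
    """Cumulative number of unique Chinese characters after each article."""
    seen = set()
    curve = []
    for text in articles:
        # Only count characters in the basic CJK Unified Ideographs block.
        seen.update(ch for ch in text if "\u4e00" <= ch <= "\u9fff")
        curve.append(len(seen))
    return curve


# Three hypothetical mini "articles".
articles = ["我很努力", "他也很努力", "我也努力"]
print(unique_character_growth(articles))  # [4, 6, 6]
```

When the curve stops rising, as it does between the second and third sample here, new articles no longer contribute new characters; with real data the question is at which value that flattening happens.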


March 06, 2012

Chinese word frequency list - News

In the last post I analyzed 80 news articles from Taiwan over a period of 6 weeks, provided some basic statistics and tried to come up with a Chinese character frequency list by counting the occurrence of unique characters in these articles. In this post I would like to write about the word frequency analysis of these articles.

Research method

I again analyzed the same 80 articles which were divided into 4 areas: 國際 (international), 政治 (domestic politics), 社會 (society) and 財經 (economics) with 20 articles in each area.

During the whole word frequency analysis process, the biggest problem was actually separating Mandarin words from each other. As I mentioned in the previous post, and as most of those studying or speaking Chinese know, words are not separated by spaces in Chinese. Counting the occurrence of unique words, as opposed to unique characters, therefore requires much more work: unless you want to count word frequency with pen and paper, a computer program has to do the work for you, and for the program to know what to count, there has to be something that separates words from one another. There are fairly complicated computer programs that can do this sort of indexation for Mandarin automatically, but since I didn't have any of those, I had to do the indexation manually.

In order to count the occurrence of unique words in an English article for instance, the process would be much easier, because spaces between words in English texts mark very clearly where a word starts and where a word ends and a computer program can thus use these spaces as index markers to count words and consider everything in between those spaces to be separate word units. In Mandarin this is unfortunately not possible.
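The difference is easy to see in a few lines of Python. This is just a sketch: the English and Mandarin sentences are made-up examples, and the "/" separator stands in for whatever marker one chooses to insert by hand (it is an assumption, not necessarily the marker used in this study).

```python
from collections import Counter

# English: spaces already mark the word boundaries.
english = "the cat saw the dog"
print(Counter(english.split()).most_common(1))  # [('the', 2)]

# Mandarin: there is nothing to split on, so the whole sentence
# comes back as a single "word".
mandarin = "他很努力"
print(mandarin.split())  # ['他很努力']

# After manual indexation with a hypothetical "/" separator,
# the same counting approach works again.
indexed = "他/很/努力"
print(indexed.split("/"))  # ['他', '很', '努力']
```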

Take the following sentence for example:

February 21, 2012

Chinese character frequency list - News articles

I think a lot of those studying Mandarin Chinese have sooner or later started to wonder how many characters one really needs in order to normally function in a Chinese-only world or what for instance the most frequent 500 Chinese characters are. I personally have heard a lot of numbers and saw several Chinese character frequency lists, but often didn't understand why this or that character made it to the top 500 or why the list said I needed this or that number of characters to read something when I had the feeling the number was either overstated or understated so I decided to try to do a little study on my own. 

I tried to analyze how many characters and words are approximately necessary to read news in Mandarin. I chose four sections of Taiwanese news - politics, international, society and finance, all written in traditional Chinese characters during a 6 week sample period.

If I'm correct, the field of computational linguistics deals with projects of this kind, and I'm sure that there are several teams of experts at linguistic departments worldwide that must have done similar research using much more sophisticated methods than I have. After the amount of effort it took me to analyze these few articles, I have a lot of respect for what they do.

January 06, 2012

Efficiency of Chinese characters

By Vladimir Skultety, M.A., B.A.

A lot of people say that Chinese characters are inefficient, because they are too complicated and there are too many of them. By contrast, they say that western alphabetic scripts are much easier to learn, much easier to write and are thus much more efficient.

In this article, I tried to somewhat objectively analyze the situation, which was a bit hard, because I like Chinese characters a lot, but either way I looked at it, I still think that characters are at least as efficient and in some cases even much more efficient than western alphabetic scripts. 

Negatives:
  • There’s a lot of them. I don’t like numbers but it is true, that you need to know at least 2500 – 3000 characters to read something.  (Edit 5.5.2012 - strangely enough, after my study I found that you would actually only need about 2180 characters to read the newspaper)
  • It’s much more difficult to remember characters compared to the simple 35 or so letters of an alphabetic script
  • They are easy to forget
  • They are easy to confuse
  • You not only need to learn how to recognize them, you need to learn how to write them by hand which doubles your effort
  • They are impractical when you need to look up something in a list (dictionary, telephone list)
Positives:

January 05, 2011

The Chinese script

Listen to MP3

Hello everyone.

Since there is a fairly large number of YouTube channels, blogs and podcasts where people can get very good information on language learning or anything related to this field, I thought that I might be talking about things that have been said many times before, so I decided to try to do this recording in a more academic way. I’d like to discuss a rather specific topic, but one that still might be interesting to listeners not so familiar with the subject – the Chinese script.

I hope you’ll enjoy it and wish you a belated Happy New Year.


Chinese script

- writing is a form in which you can express language units

Characteristics of the Chinese script:

  1. morphemographic
  2. syllabic
 Characters:

Han dynasty reform
8 strokes:

Yong3