In the past day or two, a couple of people have asked me (in various forms) how Chinese writing works. The best thing to read on this subject is Ramsey’s phenomenal book on the languages of China; the Wikipedia article on Chinese writing isn’t bad either.

This is the first thing to remember: Chinese is not written in an alphabet.

Think about that for a second. What’s the conventional definition for a “word”? Spaces on both sides of a string of letters? That definition isn’t gonna fly in the case of Chinese. We simply don’t write with spaces.
In Chinese writing, each character represents one idea, and characters can be combined to form bigger word-units. Because the script is not an alphabet, explicit information on how a word is (or should be) pronounced is generally not encoded in the character.

Here’s an example:

English: I am writing in Chinese.

Chinese: 我在寫中文。

The Chinese sentence consists of 5 characters. 我 means “I”, 在 is the equivalent of “am …-ing”, 寫 is “to write”, and 中文 means “Chinese” (lit. “central writing”).
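
For readers who think in code, here is a small Python sketch of what "no spaces" means in practice. It only shows that whitespace splitting finds no word boundaries and that the sentence is five characters long. The variable names and the hand-split word list are illustrative assumptions based on the glosses above, not a real word-segmentation method.

```python
# The example sentence, without the final full stop.
sentence = "我在寫中文"

# Splitting on whitespace finds no boundaries: the sentence comes back as one piece.
pieces = sentence.split()

# Iterating over the string yields one character at a time.
characters = list(sentence)

# Hand-made segmentation following the glosses above (an assumption for
# illustration; nothing in the text itself marks these boundaries).
words = ["我", "在", "寫", "中文"]

print(pieces)                       # a single piece, the whole sentence
print(len(characters))              # number of characters
print("".join(words) == sentence)   # the word-units add up to the sentence
```

The point is that the grouping of 中 and 文 into one word-unit comes from the reader's knowledge of the language, not from anything visible on the page.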

Now a Mandarin speaker would pronounce the sentence one way, and a Cantonese speaker another way. The pronunciations of these characters can be different enough that the two are not mutually intelligible.

Of course, it is not only the pronunciation that differs among the dialects. Grammar and word choice can also differ. For example, the way to form the -ing in Cantonese is “… 緊”, so a more natural way to write the sentence above in “Cantonese” (i.e. with Cantonese grammar) would be「我寫緊中文」.

Many of the words used in the dialects (particularly those that are not used in Mandarin) were written with characters that are now obscure or have been phased out. This is because written Chinese has been standardized on Mandarin, regardless of which dialect one speaks natively. Hence, even for a native Cantonese speaker (like me), it can be quite challenging to read prose written in “Cantonese”, simply because it is not the usual way to write. Several conventions are in use for replacing the “lost” characters with modern variants. The common technique is to choose a character that sounds similar to the missing one, even though its meaning is not appropriate, which makes prose written “in dialect” difficult to understand without reading it aloud.

Hope this answers the question once and for all.

