The Structure of Chinese


What comes to mind when you think of Chinese? Time and time again I’ve heard people say things like “it’s a picture for every word”, and many a language’s version of “it’s all Greek to me” points to Chinese instead. It’s always this exotic, unknown tongue from a faraway land. In this first article in my new Bel Canton section, I’ll start with the mother of Cantonese – Chinese – breaking down for you how it’s actually composed, and showing you why it isn’t as mystical as it appears to be.

But before I talk about the structure of Chinese languages, I’d like to briefly describe the structure of the Chinese language family so as to clear up some ambiguities concerning what I’ll be discussing.

The Chinese language family

You won’t believe how many times I’ve had this conversation, but the linguistic status of the Chinese languages is pretty much still undefined, especially among Chinese-speaking communities themselves. The confusion usually comes from the unity in writing: despite the many tongues existing in China, everyone writes the same way, with (mostly) the same set of characters, and can understand one another through the script, even if they don’t share the same speech. This has a long history: for a long time China used Classical Chinese as the written standard, from which the spoken languages deviated, pretty much like Latin and the Romance languages. It wasn’t until about a century ago that the written standard was reformed according to Mandarin, the language of government at the time. Hence to this day, the other spoken languages are still labeled dialects (方言). That’s putting aside the political subtext which clearly influences this choice of terminology, since dialects are subordinate to a language, but this is not the right place for that. Nevertheless, it is indeed said that it’s the “square characters” – our logograms, as opposed to an alphabet – that have given us mutual intelligibility in writing and thus engendered a tendency towards unification throughout Chinese history.

But since this is a blog on languages instead of history or politics, I’ll take a linguistic standpoint here: among linguists, it is pretty much agreed that Chinese is a family of languages rather than one language with dialects. This is mostly due to the lack of mutual intelligibility, i.e. if I hadn’t learnt Mandarin, I would be to a large extent unable to make myself understood to a northerner, and vice versa. Today there are seven major languages or dialect groups, each having their own dialects. These are quite divergent dialect groups as well, since there is no standardisation within each dialect group. Nowadays when you go to China, you’re most likely to hear Mandarin, whereas in Chinatowns in countries like Britain or Canada, you’re likely to hear Cantonese (though this has been changing quite quickly due to Chinese emigration). I can discuss the differences in another post.

Even though Chinese consists of many different languages, with diverging vocabulary and grammatical details, there are some features common to the family, just like any other language family. And these are what I’ll be talking about in the rest of the post.

The basic unit of Chinese writing

Let’s begin with my favourite activity – breaking down stuff, shall we? Now before I delve into this section, I want to remind you that spoken and written language can be very different. Of course there is a strong connection between the two, since script is supposed to notate speech, but sometimes we should keep them apart. As you can see in the subheading, I’ll be dealing with writing first; a lot of the concepts can be applied to speech too, but that’s a story for another day.

Words and characters

Think about the English phrase

One language is never enough.

Now look at a Chinese version of it. For the sake of this blog I translated it into Cantonese, which is hardly different from standard Chinese:

一種語言永遠唔夠

If you can’t read Chinese yet, try not to jump to the conclusion of “it’s all squiggles”. Instead examine it carefully, and you should see eight roughly square blocks. Of course, these blocks are what we know as characters (字 [zi6]). And when pronounced, each character corresponds to exactly one syllable, just like the English words “fun” or “do”; no more, no less. If I were to transcribe it using Jyutping, it would look something like this (each “word” shows the pronunciation of one character):

jat1 zung2 jyu5 jin4 wing5 jyun5 m4 gau3
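
If you like to see things as data, here is a minimal Python sketch of that one-character-one-syllable pairing. It only uses the sentence and the transcription from this post; the helper name `split_tone` is my own invention for illustration, and the pattern assumes plain Jyutping syllables written as lowercase letters followed by a tone number from 1 to 6.

```python
import re

# The sentence and its Jyutping transcription, as given in this post.
sentence = "一種語言永遠唔夠"
jyutping = "jat1 zung2 jyu5 jin4 wing5 jyun5 m4 gau3"

syllables = jyutping.split()

# One character, one syllable: both sequences must have the same length.
assert len(sentence) == len(syllables)

def split_tone(syllable):
    """Split a Jyutping syllable into (letters, tone number).

    Hypothetical helper: assumes lowercase letters plus one digit from 1 to 6.
    """
    match = re.fullmatch(r"([a-z]+)([1-6])", syllable)
    if match is None:
        raise ValueError(f"not a Jyutping syllable: {syllable!r}")
    return match.group(1), int(match.group(2))

# Pair every character with the letters and tone of its syllable.
pairs = [(char, *split_tone(syl)) for char, syl in zip(sentence, syllables)]

for char, letters, tone in pairs:
    print(char, letters, tone)
```

Each printed line shows one character next to its syllable and tone, which is all the transcription above really encodes.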

You might have noticed I’ve been avoiding the term ‘word’ to describe Chinese. This is because…well, it’s not that easy to define a word. Unlike most languages, particularly European ones, Chinese doesn’t use spaces to separate words. Instead, we just string the entire sentence together, which can make it hard to judge whether a particular section of the sentence constitutes one word or more. Some say a character is a word, but I reckon that’s too confusing. I’d rather anchor the definition to the conventional Chinese term 詞 [ci4]: the majority of modern Chinese words, by this convention, consist of two characters/syllables. A certain group of them, the idioms (成語 [sing4 jyu5]), have four. And monosyllabic (one-syllable) words are aplenty too, more so in Cantonese than in standard written Chinese / Mandarin.

On this blog, I’ll stick to this conventional usage of ‘word’, which sometimes coincides with ‘character’, but usually describes something above the latter.

Building meaning from meaning

After briefly defining the basic units of the language, let’s take a look at how they contribute to the conveyance of meaning. This is where the fun begins. You see, Chinese is known as an isolating language, which is just fancy jargon for saying each character is one self-contained, or isolated, unit of meaning. That is the basic unit of Chinese. The vast majority of characters carry their own definitions. Some are deprecated and no longer used as words on their own (in fact a good portion are only used in longer words), but can still be looked up in a dictionary. If you randomly write down some English letters, like pdfh or ig, they may or may not constitute a meaningful item, whereas whenever you write down any Chinese character, even if it’s just a line (一), you’ve written down meaning.

Look at that sentence again, now broken down into words (again, it isn’t totally clear what is or is not a word):

一 種 語言 永遠 唔 夠
one type(counter) language forever not enough

As you can see, each word corresponds pretty well to an English word. At this level, it is already apparent that one character can carry a meaning that takes English quite a few letters to express. But what if we break down the longer words even further?

  • 語言: both characters mean “speech”, as in things you say. Classical Chinese used to have a large number of one-syllable words. But in the course of its evolution, many pairs of similar words have been grouped together to form two-syllable words, which are now the norm.
  • 永遠: forever long. Same thing, except the two don’t have identical meaning. 永 itself carries the meaning of ‘forever’, as evidenced by other words such as 永不 (forever not, i.e. never). 遠 not only completes the two-syllable rhythmic pattern, but also gives it a metaphorical image.
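
The same breakdown can be written out as a small data structure. This is only a sketch: the segmentation and glosses are copied by hand from this post rather than computed, and the names `parts` and `describe` are made up for illustration.

```python
# Hand-made segmentation and glosses copied from this post (not computed).
words = ["一", "種", "語言", "永遠", "唔", "夠"]
glosses = ["one", "type (counter)", "language", "forever", "not", "enough"]

# Character-by-character glosses for the two-character words,
# following the breakdown in the bullet points.
parts = {
    "語言": ["speech", "speech"],
    "永遠": ["forever", "long"],
}

# Chinese writes no spaces, so joining the words restores the sentence.
assert "".join(words) == "一種語言永遠唔夠"

def describe(word, gloss):
    """Show a word with its gloss and, if known, its character-level parts."""
    if word not in parts:
        return f"{word} ({gloss})"
    pieces = " + ".join(
        f"{char} '{meaning}'" for char, meaning in zip(word, parts[word])
    )
    return f"{word} ({gloss}) = {pieces}"

for word, gloss in zip(words, glosses):
    print(describe(word, gloss))
```

One-character words come out with just their gloss, while the two-character words also list what each character contributes.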

This is why, when learning Chinese vocabulary, apart from using mnemonic devices, another boost is to break words down, so that you know what’s happening inside each word. Of course, as you may have heard, characters themselves consist of radicals, which generally hint at the character’s own pronunciation and ‘category’. But since this article is getting long, I’ll skip this for now.

So far we’ve gone through the hierarchy of the language that differs so greatly from what you’re probably used to. From radicals to characters to words to phrases and sentences, meaning is to be found at almost every level. Next time I introduce some new words, I’ll start by breaking them down like I did here, like a guide to building Lego towers from scratch. (Not sponsored.)

Now that you’ve come to appreciate the importance and ubiquity of meaning at every single level of the Chinese writing hierarchy, you can start to appreciate jokes about foreign names transliterated into Chinese having unwanted meanings. A classic one is 德國 [dak1 gwok3] (lit. moral country), which refers to Germany. The ‘moral’ part (dak1) actually comes from an abbreviation of Germany’s name for itself, Deutschland. Dak1…Deu…see the resemblance? Ugh, I know it sounds a bit far-fetched. It isn’t easy to approximate foreign words with Chinese characters, where every syllable has all sorts of restrictions. But that is why we actually dedicate a certain set of characters almost exclusively to transliterating foreign names, such as 斯 [si1] and 爾 [ji5]. (The former represents ‘s’ sounds, while the latter is the closest we have to an ‘r’: since we don’t have that sound at all, we take the Mandarin transliteration instead, where 爾 is pronounced er.) The meanings of these characters are mostly forgotten or ignored/deprecated, probably as a cause or a result of this – who knows?

From now on I’ll start creating content pertaining to different aspects of the Chinese/Cantonese language, and interesting collections of words or usage examples that you might miss from textbooks or be baffled by when watching a film. Stay tuned and join me again next time!