Monday, 6 February 2012

Presenting Chinese in a pdf

Presenting Chinese in a way that is useful for learners is tricky. The main challenge is that the learner often does not know how a character is pronounced. A second challenge is that standard Chinese text does not mark word boundaries. However, simply splitting the text character by character makes no sense either - one somehow has to divide the text into words, or possibly "natural units", that make reading and understanding easier.

Automatically making this split is a challenging theoretical problem, but there are already methods which come pretty close to solving it. More about those challenges in later posts.
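To give a feel for the problem, here is a minimal sketch of one of the simplest baselines, greedy forward longest-match against a word list. This is not necessarily the method I use, and the names `segment`, `LEXICON` and `max_len` are made up for the illustration; a real lexicon would hold tens of thousands of entries.

```python
def segment(text, lexicon, max_len=4):
    """Split text into words by greedy forward longest-match."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Tiny hypothetical lexicon, for illustration only.
LEXICON = {"我们", "喜欢", "学习", "中文"}

print(segment("我们喜欢学习中文", LEXICON))
# ['我们', '喜欢', '学习', '中文']
```

Greedy matching of this kind goes wrong whenever the longest match at a position is not the intended word, which is part of why the general problem is hard.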

One of my early ideas was to put the tone of each character above it, and the pinyin pronunciation below it. In a pdf file, the result is rather pleasing aesthetically, as below.

If it were possible, I would have copyrighted this idea, as it is not something I have seen before. However, some people have used LaTeX to present tone marks above characters, just without the pinyin pronunciation.

In addition, I have chosen to highlight "difficult words" in bold and add pinyin to them (only tones if the words are common enough), and to underline all names. For this particular file I have made some manual adjustments, but much of the work needed to generate files in this format automatically is already done.

The logic behind this is that the annotations should make reading easier for the student, but without cluttering the page too much. For example, adding pinyin to all characters, as is done in some books, just makes reading more difficult for someone who already knows most of the characters.
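The annotation rule can be sketched roughly as follows. This is only one reading of the scheme described above: the frequency rank, the `difficult_rank` threshold of 3000, and the names `Word` and `annotate` are assumptions made for the illustration, not values taken from my actual files.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    rank: int              # hypothetical frequency rank, 1 = most common
    is_name: bool = False


def annotate(word, difficult_rank=3000):
    """Return the set of annotations to draw for one word."""
    marks = {"tones"}                  # assumed: tones go above every word
    if word.rank > difficult_rank:     # rare word, so treat it as difficult
        marks |= {"bold", "pinyin"}    # bold type, pinyin underneath
    if word.is_name:
        marks.add("underline")         # names are always underlined
    return marks


print(sorted(annotate(Word("北京", rank=500, is_name=True))))
# ['tones', 'underline']
```

Keeping the decision in one small function makes it easy to move the threshold and see how cluttered the page becomes.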

All in all, this solution works pretty well for viewing a text in print, and I would argue that this display is more pleasing than that of most Chinese textbooks.
