PinHua - A Chinese Low-Conflict Romanization System

Goal

One-to-many mapping between a Chinese character and a code

Low conflict rate

Based on the Pinyin system

Add two extra letters based on the two leading strokes to reduce conflict rate

Choose one Pinyin for each character, using data source A and data source B.

If a character is in data source B, choose the first Pinyin.

If a character is not in data source B and has a Pinyin with tone 5, choose the one with tone 5.

Otherwise, choose the first Pinyin in alphabetical order.
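The selection rules above can be sketched as a small function. This is only an illustration under assumptions: both data sources are modeled as dictionaries mapping a character to a list of readings, and each reading is written as a syllable followed by a tone digit (for example "de5"). The name `choose_pinyin` and the sample data are hypothetical, not part of PinHua itself.

```python
def choose_pinyin(char, source_a, source_b):
    """Pick one toned Pinyin reading for `char`.

    Assumed format: each source maps a character to a list of readings
    such as "de5" (syllable followed by a tone digit 1-5).
    """
    # Rule 1: a character listed in source B takes B's first reading.
    if source_b.get(char):
        return source_b[char][0]

    readings = source_a.get(char, [])
    if not readings:
        raise KeyError(f"no Pinyin reading for {char!r}")

    # Rule 2: prefer a reading with tone 5.
    # Assumption: if several readings have tone 5, take the
    # alphabetically first one.
    tone5 = sorted(r for r in readings if r.endswith("5"))
    if tone5:
        return tone5[0]

    # Rule 3: otherwise take the alphabetically first reading.
    return min(readings)


# Hypothetical sample data, for illustration only.
sample_a = {"的": ["de5", "di2", "di4"], "行": ["xing2", "hang2"]}
sample_b = {"行": ["xing2", "hang2"]}

print(choose_pinyin("的", sample_a, sample_b))  # de5 (tone 5 preferred)
print(choose_pinyin("行", sample_a, sample_b))  # xing2 (first reading in B)
print(choose_pinyin("行", sample_a, {}))        # hang2 (alphabetical order)
```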

Use "yu" to represent "ü"

Add "e" to Pinyin syllables without a vowel: "ng" -> "eng", "hng" -> "heng"
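The two spelling rules can be sketched as follows. The function name `normalize_syllable` is hypothetical, the input is assumed to be a lowercase syllable with the tone already removed, and the placement of the added "e" (after a leading "h", otherwise at the start) is an assumption generalized from the two examples given.

```python
VOWELS = set("aeiou")


def normalize_syllable(syllable):
    """Apply the "yu" and "e" rules to a toneless, lowercase syllable."""
    # Rule: write "ü" as "yu".
    s = syllable.replace("ü", "yu")

    # Rule: a syllable with no vowel gets an "e".
    if not any(ch in VOWELS for ch in s):
        # Assumption: the "e" goes after a leading "h", else at the start,
        # which reproduces "ng" -> "eng" and "hng" -> "heng".
        head = "h" if s.startswith("h") else ""
        s = head + "e" + s[len(head):]
    return s


print(normalize_syllable("ng"))   # eng
print(normalize_syllable("hng"))  # heng
print(normalize_syllable("lü"))   # lyu
print(normalize_syllable("ma"))   # ma
```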

Pick the two leading strokes of each character, using data source C

If there is only one stroke, repeat the first stroke

Convert the two strokes to letters as shown in the table below

Insert the two stroke-based letters and the tone letter after the first letter
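A minimal sketch of the stroke and assembly steps is shown here. The stroke-to-letter mapping and the tone letter are passed in as parameters because their actual values come from PinHua's own tables; the stroke names, `DEMO_STROKE_LETTERS`, and the tone letter "x" used below are placeholders invented for illustration, and the function names are hypothetical.

```python
def leading_strokes(strokes):
    """Return the first two strokes, repeating the first if there is only one."""
    if not strokes:
        raise ValueError("a character needs at least one stroke")
    first = strokes[0]
    second = strokes[1] if len(strokes) > 1 else first
    return first, second


def build_code(pinyin, strokes, stroke_letters, tone_letter):
    """Insert two stroke letters and the tone letter after the first Pinyin letter."""
    first, second = leading_strokes(strokes)
    inserted = stroke_letters[first] + stroke_letters[second] + tone_letter
    return pinyin[0] + inserted + pinyin[1:]


# Placeholder mapping, NOT the real PinHua stroke table.
DEMO_STROKE_LETTERS = {"heng": "a", "shu": "b"}

# One-stroke character: the single stroke is used twice.
print(build_code("yi", ["heng"], DEMO_STROKE_LETTERS, "x"))          # yaaxi
# Two or more strokes: only the first two are used.
print(build_code("shi", ["heng", "shu"], DEMO_STROKE_LETTERS, "x"))  # sabxhi
```

Passing the tables in as arguments keeps the sketch independent of the real mapping, so the actual stroke and tone tables can be substituted without changing the functions.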

A. Frequency and Pinyin of Chinese characters from http://lingua.mtsu.edu/chinese-computing/statistics/char/list.php?Which=MO (Modern Chinese Character Frequency List by Jun Da (jda@mtsu.edu))