PinHua - A Chinese Low-Conflict Romanization System

Goal

One-to-many mapping between a Chinese character and a code

Low conflict rate

Based on Pinyin system

Add up to two extra letters, based on the two leading strokes, to reduce the conflict rate

Choose one Pinyin for each character, using data source A and data source B.

If a character is in data source B, choose the first Pinyin.

If a character is not in data source B and has a Pinyin using tone 5, choose the one with tone 5

Otherwise, choose the first Pinyin in alphabetical order
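
The selection rules above can be sketched as a small function. This is only an illustration: the function name `choose_pinyin`, the representation of readings as strings ending in a tone digit (such as "de5"), and the tie-break when several tone-5 readings exist are all assumptions, not part of the specification.

```python
def choose_pinyin(readings_a, readings_b=None):
    """Pick one reading for a character (hypothetical helper).

    readings_a: readings known for the character, e.g. ["di4", "de5"].
    readings_b: readings from data source B in their listed order,
                or None/empty if the character is not in data source B.
    Assumption: each reading is a string ending in a tone digit 1-5.
    """
    # Rule 1: the character is in data source B, so take its first Pinyin.
    if readings_b:
        return readings_b[0]
    # Rule 2: prefer a tone-5 reading. If there are several, take the
    # alphabetically first one (an assumption; the rules do not say).
    tone5 = sorted(r for r in readings_a if r.endswith("5"))
    if tone5:
        return tone5[0]
    # Rule 3: otherwise take the first reading in alphabetical order.
    return min(readings_a)
```

For example, `choose_pinyin(["di4", "de5", "di2"])` returns `"de5"`, while `choose_pinyin(["zhong4", "chong2"])` returns `"chong2"`.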

Use "uy" to represent "ü"

Add "e" to a Pinyin that has no vowel: "ng" -> "eng", "hng" -> "heng"
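
A minimal sketch of the two spelling adjustments above, applied to a syllable without its tone. The function name `normalize_syllable` is hypothetical. Only "ng" and "hng" are given as examples in the rules; treating other vowelless syllables (such as "m" or "hm") the same way is an assumption.

```python
import re


def normalize_syllable(syllable):
    """Normalize a toneless Pinyin syllable (hypothetical helper)."""
    # Write "ü" as "uy".
    s = syllable.lower().replace("ü", "uy")
    # If no vowel letter is present, insert "e" before the final nasal.
    # "ng" -> "eng" and "hng" -> "heng" come from the rules; handling a
    # bare "n" or "m" ending the same way is an assumption.
    if not re.search(r"[aeiou]", s):
        s = re.sub(r"(ng|n|m)$", r"e\1", s)
    return s
```

With this sketch, `normalize_syllable("lü")` gives `"luy"` and `normalize_syllable("hng")` gives `"heng"`; syllables that already contain a vowel and no "ü" are returned unchanged.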

Pick the two leading strokes of each character, using data source C

If there is only one stroke, pick only the first stroke

If the two leading strokes are the same, pick only the first stroke
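
The stroke-picking rules above can be sketched as follows. The format of data source C is not described here, so this example assumes each character's strokes are available as a string of one-character stroke codes (for instance "1234"); that format and the name `leading_strokes` are illustrative assumptions.

```python
def leading_strokes(strokes):
    """Return the one or two leading strokes used for the code.

    strokes: the character's stroke sequence as a string of
             one-character stroke codes (assumed format).
    """
    first_two = strokes[:2]
    # Two identical leading strokes collapse to a single stroke.
    if len(first_two) == 2 and first_two[0] == first_two[1]:
        return first_two[:1]
    # Otherwise keep both strokes, or the only stroke if there is just one.
    return first_two
```

Under these assumptions, `leading_strokes("1234")` yields `"12"`, `leading_strokes("1123")` yields `"1"`, and a single-stroke character such as `"1"` yields `"1"`.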

Convert the one or two strokes to letters using the table below

Insert the one or two stroke-based letters and the tone letter before the first vowel letter
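
The insertion step can be sketched as below. The function name `insert_code` is hypothetical, and the stroke and tone letters are passed in as arguments because the conversion table is not reproduced in this sketch. It also assumes that the vowel letters are a, e, i, o, u and that the stroke letters come before the tone letter.

```python
import re


def insert_code(syllable, stroke_letters, tone_letter):
    """Insert stroke and tone letters before the first vowel letter.

    syllable:       normalized, toneless Pinyin (e.g. "zhong", "eng").
    stroke_letters: one or two letters derived from the leading strokes.
    tone_letter:    the letter that stands for the tone.
    """
    match = re.search(r"[aeiou]", syllable)
    if match is None:
        # Normalized syllables should always contain a vowel letter.
        raise ValueError(f"no vowel letter in {syllable!r}")
    i = match.start()
    return syllable[:i] + stroke_letters + tone_letter + syllable[i:]
```

Using uppercase placeholders to make the inserted letters visible, `insert_code("zhong", "XY", "T")` returns `"zhXYTong"`, and `insert_code("eng", "X", "T")` returns `"XTeng"`.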

A. Frequency and Pinyin of Chinese characters from http://lingua.mtsu.edu/chinese-computing/statistics/char/list.php?Which=MO (Modern Chinese Character Frequency List by Jun Da (jda@mtsu.edu))