PinHua - A Chinese Low-Conflict Romanization System


One-to-many mapping between a Chinese character and a code
Low conflict rate
Based on the Pinyin system
Adds two extra letters, derived from the two leading strokes, to reduce the conflict rate


1. Generate mapping from character to Pinyin

Choose one Pinyin for each character, using data source A and data source B.
If a character is in data source B, choose its first Pinyin.
If a character is not in data source B and has a Pinyin with tone 5, choose the one with tone 5.
Otherwise, choose the first Pinyin in alphabetical order.
Use "yu" to represent "ü".
Add "e" to a Pinyin without a vowel: "ng" -> "eng", "hng" -> "heng".
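
The selection and spelling rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: readings are written as numbered Pinyin strings such as "zhong1", and the names choose_pinyin, normalize, readings_a and first_reading_b are hypothetical. The real layout of data sources A and B is not described here. When a character has several tone-5 readings, the sketch takes the alphabetically first one, which is also an assumption.

```python
# Spellings for the vowel-less syllables listed in the rules above.
VOWELLESS = {"ng": "eng", "hng": "heng"}


def choose_pinyin(char, readings_a, first_reading_b):
    """Pick one numbered Pinyin reading for a character.

    readings_a: dict of character -> list of readings (assumed shape of source A).
    first_reading_b: dict of character -> first reading (assumed shape of source B).
    """
    # Rule 1: data source B wins when it knows the character.
    if char in first_reading_b:
        return first_reading_b[char]
    # Plain code-point ordering stands in for "alphabetical order" (assumption).
    candidates = sorted(readings_a[char])
    # Rule 2: prefer a tone-5 reading if one exists.
    for reading in candidates:
        if reading.endswith("5"):
            return reading
    # Rule 3: otherwise take the alphabetically first reading.
    return candidates[0]


def normalize(reading):
    """Apply the spelling rules: "ü" -> "yu", and add "e" to vowel-less syllables."""
    base, tone = reading[:-1], reading[-1]
    base = base.replace("ü", "yu")
    base = VOWELLESS.get(base, base)
    return base + tone


# Toy data for illustration only; not taken from the actual data sources.
toy_a = {"的": ["di4", "de5", "di2"]}
toy_b = {"中": "zhong1"}
print(choose_pinyin("的", toy_a, toy_b))  # de5
print(normalize("ng2"))                   # eng2
```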

2. Generate stroke letters

Pick the two leading strokes of each character, using data source C
If a character has only one stroke, repeat that stroke
Convert the two strokes to letters using the Stroke-to-Letter Conversion Table below
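
A minimal Python sketch of this step follows. It assumes that the stroke sequence of a character is available as a string of the five symbols used in the Stroke-to-Letter Conversion Table. The function name stroke_letters and that input format are hypothetical; data source C may encode strokes differently.

```python
# Mapping copied from the Stroke-to-Letter Conversion Table.
STROKE_TO_LETTER = {"-": "w", "|": "o", "/": "r", "\\": "u", "~": "v"}


def stroke_letters(strokes):
    """Return the two letters for the two leading strokes of a character."""
    if not strokes:
        raise ValueError("a character needs at least one stroke")
    first = strokes[0]
    # A one-stroke character repeats its only stroke.
    second = strokes[1] if len(strokes) > 1 else first
    return STROKE_TO_LETTER[first] + STROKE_TO_LETTER[second]


print(stroke_letters("-"))    # ww
print(stroke_letters("|~-"))  # ov
```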

3. Generate PinHua

Insert the two stroke-based letters and the tone letter after the first letter of the Pinyin
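
The assembly can be sketched in Python as below. The sketch assumes a numbered Pinyin string such as "zhong1" and an already computed pair of stroke letters, placed before the tone letter in the order the rule lists them. The function name to_pinhua and the sample inputs are hypothetical and only illustrate the insertion.

```python
# Mapping copied from the Tone-To-Letter Table.
TONE_TO_LETTER = {"1": "w", "2": "r", "3": "v", "4": "u", "5": "o"}


def to_pinhua(pinyin, stroke_pair):
    """Insert the stroke letters and the tone letter after the first Pinyin letter."""
    base, tone = pinyin[:-1], pinyin[-1]
    return base[0] + stroke_pair + TONE_TO_LETTER[tone] + base[1:]


# Illustrative inputs: the stroke letters "ov" and "ww" are assumed, not looked up.
print(to_pinhua("zhong1", "ov"))  # zovwhong
print(to_pinhua("e4", "ww"))      # ewwu
```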

Stroke-to-Letter Conversion Table

- w
| o
/ r
\ u
~ v

Tone-To-Letter Table

Tone  Mark  Letter
1     -     w
2     /     r
3     v     v
4     \     u
5     .     o

Data Sources

A. Frequency and Pinyin of Chinese characters from (Modern Chinese Character Frequency List by Jun Da (
B. Popular pronunciation of Chinese characters from (The most common Chinese characters in order of frequency, by © 2003 – 2009 Patrick Hassel Zein )
C. Stroke of Chinese characters from (From