CharacterDB:About
CharacterDB is an open-source database on the structure of Han characters. Chinese characters (Hanzi, Kanji, Hanja) have a complex structure that can be exploited for a variety of tasks, such as character input methods or teaching.
In a collaborative effort, we want to gather semantic data on the structure of the visual forms (glyphs) of Chinese characters as used in Hanzi, Kanji and Hanja. A structured input system is offered that covers several aspects:
- the order of strokes used to write a Chinese character, based on a classification of stroke types,
- the decomposition of a character's glyph into individual components that are themselves characters in their own right,
- the mapping of glyphs to a specific locale, indicating writing variations between Chinese, Japanese and Korean usage.
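To make the three aspects above more concrete, here is a minimal sketch of how a single glyph record could be modelled. It is not CharacterDB's actual schema or export format: the `Glyph` class, its field names, the helper `glyphs_for_locale`, the stroke labels (`S`, `HZ`, `H`) and the locale codes (`C`, `J`, `K`) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Glyph:
    """Hypothetical record for one glyph; not the real CharacterDB schema."""
    character: str
    stroke_order: List[str]   # stroke type labels, in writing order
    components: List[str]     # components that are characters themselves
    locales: Set[str]         # assumed locale codes: C, J, K

    def stroke_count(self) -> int:
        # The stroke count follows directly from the stroke order.
        return len(self.stroke_order)


def glyphs_for_locale(glyphs: List[Glyph], locale: str) -> List[Glyph]:
    """Return the glyphs mapped to the given (assumed) locale code."""
    return [g for g in glyphs if locale in g.locales]


# Illustrative labels: S = vertical, HZ = horizontal with a turn, H = horizontal.
KOU_STROKES = ["S", "HZ", "H"]

# Sample data for illustration only.
kou = Glyph("口", KOU_STROKES, [], {"C", "J", "K"})
pin = Glyph("品", KOU_STROKES * 3, ["口", "口", "口"], {"C", "J", "K"})

if __name__ == "__main__":
    print(pin.character, pin.stroke_count(), pin.components)
    print([g.character for g in glyphs_for_locale([kou, pin], "J")])
```

In this sketch the character 品 is described by its nine strokes and by its three components 口, each of which is a character in its own right. Keeping the locale as a set on the glyph is just one possible design; a real system might store a separate glyph per locale instead.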
This data can be exported and used in any system designed with a focus on CJK languages. The initiative itself grew out of the cjklib project, which aims to offer a programming library built on top of the characters encoded in Unicode.

