FOSS handwriting recognition input with Linux

Handwriting recognition systems have been standard tools in China and Japan since the 1980s, using graphics tablets packaged with specialized software, alongside many keyboard-only input systems. In Europe, we had to wait until the end of the 1990s for handwriting systems to appear in open source software, and until more recently for them to become widely accessible. These methods can also help people without hands to write using their feet, their mouth, or another device that lets them manipulate a more traditional writing tool such as a pen or a brush.

Kanjipad (English-language web site) is probably one of the oldest handwriting recognition systems for Linux (it dates from 1997), but it is not very active these days.

Some newer free software has reached the open source ecosystem since then:

* Cellwriter (about 2007) takes a more general approach that requires the user to train it first by filling in character charts. While this can be done in a few tens of minutes for alphabetic writing with a few dozen characters, that is not the case for complex writing systems that include pictographs and spatial combinations of basic characters, which together yield several thousand characters. For Chinese Han writing, for example, training the 215 basic pictographs, their key (radical) variants (at left, at top), and the most frequent combinations could help a lot. Other Chinese writing systems such as Dongba are purely pictographic and require learning thousands of different pictures, with no easy method.

See the excellent doc.ubuntu-fr.org site (in French) for a nice, short introduction to Cellwriter.

Learning session in Cellwriter (Credit: doc.ubuntu-fr.org)
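
To make the "train first, then recognize" idea more concrete, here is a minimal sketch of a trainable recognizer in Python. It is not Cellwriter's actual code or algorithm: the `TemplateRecognizer` class, its helper functions, and the simple nearest-template comparison are all assumptions made for illustration. Each character is a list of strokes, and each stroke is a list of (x, y) points.

```python
import math

def resample(points, n=16):
    """Resample one stroke to n points evenly spaced along its length."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if len(points) == 1 or total == 0:
        return [points[0]] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while j < len(dists) - 2 and dists[j + 1] < target:
            j += 1
        seg = dists[j + 1] - dists[j]
        t = 0.0 if seg == 0 else (target - dists[j]) / seg
        out.append((points[j][0] + t * (points[j + 1][0] - points[j][0]),
                    points[j][1] + t * (points[j + 1][1] - points[j][1])))
    return out

def normalize(strokes):
    """Scale the whole character into a unit box so size and position do not matter."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    min_x, min_y = min(xs), min(ys)
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [[((x - min_x) / scale, (y - min_y) / scale) for x, y in s]
            for s in strokes]

def features(strokes, n=16):
    return [resample(s, n) for s in normalize(strokes)]

def distance(a, b):
    """Mean point-to-point distance; infinite if the stroke counts differ."""
    if len(a) != len(b):
        return math.inf
    total = sum(math.hypot(xa - xb, ya - yb)
                for sa, sb in zip(a, b)
                for (xa, ya), (xb, yb) in zip(sa, sb))
    return total / (len(a) * len(a[0]))

class TemplateRecognizer:
    """Hypothetical trainable recognizer: stores user samples, returns the nearest one."""
    def __init__(self):
        self.templates = []  # list of (label, features)

    def train(self, label, strokes):
        self.templates.append((label, features(strokes)))

    def recognize(self, strokes):
        f = features(strokes)
        scored = [(distance(tpl, f), label) for label, tpl in self.templates]
        if not scored or min(scored)[0] == math.inf:
            return None
        return min(scored)[1]

# Training session: one sample per character, as in a character chart.
rec = TemplateRecognizer()
rec.train("L", [[(0, 0), (0, 10), (6, 10)]])
rec.train("T", [[(0, 0), (10, 0)], [(5, 0), (5, 10)]])
rec.train("X", [[(0, 0), (10, 10)], [(10, 0), (0, 10)]])

# A larger, slightly wobbly "T" is still matched to the trained sample.
print(rec.recognize([[(1, 1), (21, 0)], [(11, 1), (10, 21)]]))
```

With one sample per character, the training effort grows with the size of the character set, which is why this approach is quick for an alphabet and slow for thousands of Han characters.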

The most active handwriting recognition input method projects for Linux today

* Tomoe (English and Japanese), a handwriting recognition system that is progressing quickly.
* Stroke-editor (English), by Huzheng (Chinese), creator of Stardict (English and Chinese), the excellent free multilingual dictionary that is still a reference. Stroke-editor helps Tomoe learn to recognize writing by feeding the Tomoe server.
* Tegaki (English), specialized in Chinese and Japanese handwriting recognition. It uses several engines: Zinnia (English and Japanese), Wagomu (developed by the Tegaki team), and the more experimental KanjiVG, which partially uses Tomoe. Tegaki has an interface for iBus, the most complete and advanced input method framework for Linux, and the one that handles the most languages with complex input methods.

These projects were created with Far East Asian languages in mind, mainly hanzi (Han characters in Chinese, also called kanji in Japanese, hanja in Korean, hán tự in Vietnamese, …), but they are also able to analyze most of the world's writing systems.

27 September update: There are some problems with the recognition of characters like 道 (dao, “the path”, as in Taoism; also read “do” in Japanese, as in “karate do”, “bushi do”, etc.). Someone proposed a solution in 2010 on a forum about Cantonese (in English), which was to update the XML-based database with this character. It looks like every character using the same 廴 / 辵 key has the same problem: you need to write the key before the inner character, as in Japanese, whereas in the Chinese method (the original writing method) you should first write the inner character and then the outer key.
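
The following toy Python sketch illustrates why stroke order can matter to a recognizer that compares strokes in the order they were drawn, and why adding an entry to the database can fix it. It does not reproduce how Tomoe, Tegaki, or their XML database actually work: the function names, the labels `target` and `decoy`, and the two-stroke shapes are all hypothetical. Each stroke is reduced to a (start point, end point) pair.

```python
import math

def stroke_distance(a, b):
    """Distance between two strokes, each given as (start_point, end_point)."""
    (a0, a1), (b0, b1) = a, b
    return math.dist(a0, b0) + math.dist(a1, b1)

def ordered_distance(drawn, template):
    """Compare strokes pairwise in drawing order: first with first, second with second."""
    if len(drawn) != len(template):
        return math.inf
    return sum(stroke_distance(d, t) for d, t in zip(drawn, template))

def best_match(drawn, database):
    """database maps a label to a list of templates; returns (distance, label)."""
    return min((ordered_distance(drawn, tpl), label)
               for label, templates in database.items()
               for tpl in templates)

# Hypothetical two-stroke character: an inner part and an outer "key".
inner = ((4, 0), (8, 4))
outer = ((0, 8), (10, 10))

database = {
    # The only stored template expects the outer key to be written first.
    "target": [[outer, inner]],
    # An unrelated character whose strokes happen to resemble inner-then-outer.
    "decoy": [[((4, 1), (8, 5)), ((0, 7), (10, 9))]],
}

# The user writes the inner part first, then the outer key.
drawn = [inner, outer]
print(best_match(drawn, database)[1])   # the decoy wins: the order does not match

# Fix in the spirit of the forum suggestion: store the other stroke order too.
database["target"].append([inner, outer])
print(best_match(drawn, database)[1])   # now the target is found
```

A real engine may be more tolerant of stroke order than this sketch, but the same principle applies: if the stored model only knows one order, writing in another order lowers the match quality.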

Online

* qhanzi
* kanji.sljfaq

In Java, so I have not tested them

* Hanzi Recognizer, dictionary + handwriting recognition in a jar.
* Hanzidict, dictionary + handwriting recognition, also in Java.