genpyt - Man Page

generate the PINYIN lexicon

Synopsis

genpyt lexicon-file result-file log-file slm-file

Description

genpyt generates the PINYIN lexicon. It only works in the zh_CN.UTF-8 locale.

Arguments

lexicon-file

Specify a dictionary file. It should be a line-based text file in UTF-8 encoding. Each line looks like:

   CCC  id  [pinyin'pinyin'pinyin]*

A default dictionary file can be found at /usr/share/sunpinyin/dict.utf8.
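As an illustration of the line format above, the following is a minimal Python sketch of a reader for one lexicon line. It is not part of genpyt. It assumes that fields are separated by whitespace, that id is a decimal integer, and that each optional pronunciation is a run of syllables joined by apostrophes; the names LexiconEntry and parse_lexicon_line are hypothetical, and the sample line in the comment is made up rather than taken from dict.utf8.

```python
from typing import List, NamedTuple


class LexiconEntry(NamedTuple):
    """One parsed lexicon line (hypothetical helper type)."""
    word: str                        # the CCC field
    word_id: int                     # the id field
    pronunciations: List[List[str]]  # one syllable list per pronunciation


def parse_lexicon_line(line: str) -> LexiconEntry:
    """Parse a line shaped like: CCC  id  [pinyin'pinyin'pinyin]*

    Assumption: fields are whitespace-separated and id is an integer.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a word and an id: %r" % line)
    word, raw_id = fields[0], fields[1]
    # Each remaining field is one pronunciation; split it into syllables.
    pronunciations = [field.split("'") for field in fields[2:]]
    return LexiconEntry(word, int(raw_id), pronunciations)


# Made-up sample line, for illustration only:
# parse_lexicon_line("中国 42 zhong'guo")
```

A line with no pronunciation field yields an empty pronunciations list, which matches the "zero or more" reading of the trailing asterisk in the pattern.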

result-file

The output binary PINYIN lexicon file. This lexicon contains a trie representing the key tree of PINYIN, and all of the candidate words are sorted using the unigram information in slm-file. This file can be used with sunpinyin input method engines.

log-file

Specify the file to which the log is written. The log-file can be seen as a human-readable representation of the binary output file.

slm-file

The language model from which the unigram information is retrieved. Typically, the slm-file is generated by slmthread.
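The four arguments can be combined into a single invocation. The Python sketch below builds the argument list in the order given in the Synopsis and sets LC_ALL to zh_CN.UTF-8 for the child process. Setting LC_ALL is an assumption about how to satisfy the locale requirement, and it only helps if that locale is installed on the system. The function names and the output file names in the usage comment are hypothetical; only /usr/share/sunpinyin/dict.utf8 comes from this page.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def build_genpyt_call(lexicon_file: str, result_file: str,
                      log_file: str, slm_file: str
                      ) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for running genpyt.

    Argument order follows the Synopsis:
    genpyt lexicon-file result-file log-file slm-file
    """
    argv = ["genpyt", lexicon_file, result_file, log_file, slm_file]
    env = dict(os.environ)
    # Assumption: forcing LC_ALL is enough to meet the locale requirement.
    env["LC_ALL"] = "zh_CN.UTF-8"
    return argv, env


def run_genpyt(lexicon_file: str, result_file: str,
               log_file: str, slm_file: str) -> None:
    """Run genpyt and raise CalledProcessError on a non-zero exit status."""
    argv, env = build_genpyt_call(lexicon_file, result_file,
                                  log_file, slm_file)
    subprocess.run(argv, env=env, check=True)


# Hypothetical usage (output and model file names are placeholders):
# run_genpyt("/usr/share/sunpinyin/dict.utf8",
#            "lexicon.out", "genpyt.log", "lm.slm")
```

Keeping the command construction separate from the call makes it possible to inspect or log the exact command before anything is executed.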

Author

Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.

See Also

slmthread(1).

Info

2024-01-27 perl v5.38.2 User Contributed Perl Documentation