Package sunpinyin-data

Little-endian data files for sunpinyin


The sunpinyin-data package contains the lexicon data and its index data
files required by the sunpinyin input methods.

Version: 3.0.0

General Commands

genpyt         generate the PINYIN lexicon
getwordfreq    print word frequency information from the language model
idngram_merge  merge idngram files into one
ids2ngram      generate an n-gram data file from an ids file
mmseg          segment Chinese text by maximum matching
slmbuild       generate a language model from an idngram file
slminfo        get information about a back-off language model
slmpack        convert the ARPA format of the SunPinyin back-off language model to its binary representation
slmprune       prune the back-off language model to a reasonable size
slmseg         segment Chinese text by maximum matching
slmthread      thread the language model
tslmendian     change the byte order of sunpinyin's threaded back-off language model
tslminfo       get information about a threaded back-off language model
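
The mmseg and slmseg entries above refer to maximum matching segmentation. As a rough illustration of the general idea only, here is a minimal sketch of forward maximum matching in Python. It is not sunpinyin's implementation: the function name forward_max_match, the toy lexicon, and the sample sentence are all hypothetical, and the real tools may handle ambiguity differently.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching over `text` (illustrative sketch).

    At each position, take the longest lexicon word that starts there;
    if no lexicon word matches, emit a single character and move on.
    """
    if max_len is None:
        # Longest entry in the lexicon bounds the search window (at least 1).
        max_len = max(1, max((len(w) for w in lexicon), default=1))
    tokens = []
    i = 0
    while i < len(text):
        # Try candidates from longest to shortest.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon, not taken from the sunpinyin data files.
toy_lexicon = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", toy_lexicon))
```

With this toy lexicon the greedy strategy picks 研究生 first and leaves 命 as a single character, which shows why purely greedy matching can mis-segment ambiguous text.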