makehmmerdb man page

makehmmerdb — build a HMMER binary database file from a sequence file


makehmmerdb [options] <seqfile> <binaryfile>


makehmmerdb is used to create a binary file from a DNA sequence file. This binary file may be used as a target database for the DNA search tool nhmmer. Using default settings in nhmmer, this yields a roughly 10-fold acceleration with a small loss of sensitivity on benchmarks. (This method has been extensively tested, but should still be treated as somewhat experimental.)
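
As a rough illustration of the synopsis, the following Python sketch assembles a makehmmerdb command line and can optionally run it. The helper names build_makehmmerdb_cmd and run_makehmmerdb and the file names are hypothetical, only options documented on this page are used, and the sketch assumes makehmmerdb is installed and on the PATH.

```python
import subprocess


def build_makehmmerdb_cmd(seqfile, binaryfile, informat=None,
                          bin_length=None, sa_freq=None, block_size=None):
    """Return the argument list for a makehmmerdb call (hypothetical helper).

    Options left as None are omitted so makehmmerdb applies its own defaults.
    """
    cmd = ["makehmmerdb"]
    if informat is not None:
        cmd += ["--informat", str(informat)]
    if bin_length is not None:
        cmd += ["--bin_length", str(bin_length)]
    if sa_freq is not None:
        cmd += ["--sa_freq", str(sa_freq)]
    if block_size is not None:
        cmd += ["--block_size", str(block_size)]
    # Positional arguments follow the synopsis: <seqfile> <binaryfile>.
    cmd += [seqfile, binaryfile]
    return cmd


def run_makehmmerdb(seqfile, binaryfile, **options):
    """Run makehmmerdb; assumes the program is installed and on the PATH."""
    subprocess.run(build_makehmmerdb_cmd(seqfile, binaryfile, **options),
                   check=True)


if __name__ == "__main__":
    # Placeholder file names; this prints the command without running it.
    print(" ".join(build_makehmmerdb_cmd("genome.fa", "genome.bin", sa_freq=8)))
```

Building the argument list separately from running it keeps the command easy to inspect or log before any large database is processed.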



Other Options

--informat <s>

Assert that the sequence database file is in format <s>. Accepted formats include fasta, embl, genbank, ddbj, uniprot, stockholm, pfam, a2m, and afa. The default is to autodetect the format of the file.

--bin_length <n>

Bin length. The binary file depends on a data structure called the FM index, which organizes a permuted copy of the sequence in bins of length <n>. A longer bin length leads to smaller files (because data is stored for each bin, and longer bins mean fewer bins) and possibly slower query time. The default is 256. Values much larger than 512 may noticeably reduce speed.

--sa_freq <n>

Suffix array sample rate. The FM index structure also samples from the underlying suffix array for the sequence database. More frequent sampling (a smaller value for <n>) yields a larger file and faster search (until the file becomes large enough for I/O to be a bottleneck). The value must be a power of 2. The default is 8.
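
The power-of-2 requirement and the size trade-off can be illustrated with a short Python sketch. It assumes, as a simplification not stated on this page, that one suffix array entry is kept for every <n> positions. The function names are hypothetical and the real file layout may differ.

```python
def is_power_of_two(n):
    """Return True if n is a positive integer power of 2 (1, 2, 4, 8, ...)."""
    return isinstance(n, int) and n > 0 and (n & (n - 1)) == 0


def sampled_entries(seq_length, sa_freq=8):
    """Estimate how many suffix array entries are kept for seq_length letters.

    Assumption: one entry is kept per sa_freq positions (ceiling division).
    """
    if not is_power_of_two(sa_freq):
        raise ValueError("--sa_freq must be a power of 2")
    return -(-seq_length // sa_freq)
```

Under this assumption, halving <n> doubles the number of stored samples, which is the file size versus search speed trade-off described above.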

--block_size <n>

The input sequence is broken into blocks of size <n> million letters. An FM index is built for each block, rather than building an FM index for the entire sequence database. The default is 50. Larger blocks do not seem to yield a substantial speed increase.
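
To get a feel for how many FM indexes a given database would need, the block count can be estimated with the Python sketch below. The function name block_count is hypothetical, and the estimate assumes plain consecutive blocks, ignoring any overlap between blocks or special handling of sequence boundaries that the program may apply.

```python
def block_count(total_letters, block_size=50):
    """Estimate the number of blocks (one FM index each) for a database.

    block_size is in millions of letters, matching --block_size.
    Assumption: blocks are consecutive and do not overlap.
    """
    if total_letters < 0 or block_size <= 0:
        raise ValueError("total_letters must be >= 0 and block_size > 0")
    letters_per_block = block_size * 1_000_000
    # Ceiling division: a partly filled final block still counts as a block.
    return -(-total_letters // letters_per_block)
```

With the default block size, for example, a database of 120 million letters would be split into 3 blocks under this estimate.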

See Also

See hmmer(1) for a master man page with a list of all the individual man pages for programs in the HMMER package.

For complete documentation, see the user guide that came with your HMMER distribution (Userguide.pdf), or see the HMMER web page.


