esl-selectn - Man Page

select random subset of lines from file

Synopsis

esl-selectn [options] nlines filename

Description

esl-selectn selects nlines lines at random from file filename and outputs them on stdout.

If filename is - (a single dash), input is read from stdin.

esl-selectn uses an efficient reservoir sampling algorithm that requires only a single pass through filename, and memory proportional to nlines (and, importantly, not to the size of filename itself). It can therefore be used to create large-scale statistical sampling experiments, especially in combination with other Easel miniapplications.
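The single-pass, bounded-memory behavior can be illustrated with a minimal Python sketch of a textbook reservoir sampling scheme (Algorithm R). This is not taken from the Easel source, and esl-selectn's actual implementation may differ in its details. The function name reservoir_sample and its parameters are hypothetical, and the seed handling here (None for an arbitrary seed) is a Python convention rather than the --seed convention of esl-selectn.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], n: int,
                     seed: Optional[int] = None) -> List[str]:
    """Return up to n items chosen uniformly at random from lines.

    Hypothetical helper for illustration only. Memory use is
    proportional to n, and the input is read exactly once.
    """
    rng = random.Random(seed)      # seed=None lets Python choose a seed
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < n:
            # Fill the reservoir with the first n lines.
            reservoir.append(line)
        else:
            # Line i (0-based) replaces a random slot with probability n/(i+1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = line
    return reservoir
```

Because the input is consumed as an iterator, the same function works on a file object or on sys.stdin, which mirrors reading from a named file or from - on the command line. If the input has fewer than n lines, this sketch simply returns all of them.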

Options

-h

Print brief help; includes version number and summary of all options, including expert options.

--seed <d>

Set the random number seed to <d>, an integer >= 0. The default is 0, which means to use a randomly selected seed. A seed > 0 makes the sample reproducible: different runs of the same command produce identical samples.

See Also

http://bioeasel.org/

Author

http://eddylab.org

Info

Nov 2020 Easel 0.48 Easel Manual