dbcolscorrelate - Man Page

find the coefficient of correlation over columns

Synopsis

    dbcolscorrelate column1 column2 [column3...]

Description

Compute the Pearson coefficient of correlation over two (or more) columns.

The output is one line of correlations.

With exactly two columns, a single output column named correlation is created.

With more than two columns, correlations are computed for each pairwise combination of columns, and each output column is given a name which is the concatenation of the two source column names, joined with an underscore.
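
The naming rule can be illustrated with a short Python sketch. It is not part of Fsdb, the column i_z is hypothetical, and the order in which pairs are emitted is an assumption; only the "joined with an underscore" rule comes from the description above.

```python
from itertools import combinations


def pairwise_column_names(columns):
    """Return one output name per unordered pair of input columns.

    Assumption: pairs are taken in the order the columns were given.
    """
    return [f"{a}_{b}" for a, b in combinations(columns, 2)]


# i_z is a hypothetical third column, used only for illustration.
names = pairwise_column_names(["i_x", "i_y", "i_z"])
print(names)
```

Three input columns give three pairs, four give six, and so on, so the single output line grows quadratically with the number of columns.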

By default, we compute the population correlation coefficient (usually designated rho, ρ) and assume we see all members of the population. With the --sample option we instead compute the sample correlation coefficient, usually designated r. (Be careful: the full-population default here is the opposite of the default in dbcolstats.)

This program requires a complete copy of the input data on disk.

Options

--sample

Select the sample Pearson product-moment correlation coefficient (the "sample correlation coefficient", usually designated r).

--no-sample

Select the population Pearson product-moment correlation coefficient (the "correlation coefficient", usually designated ρ). This is the default.

--weight COL

Weight the correlation by column COL.
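
As a rough illustration of what weighting means, the Python sketch below uses one common definition of a weighted Pearson correlation, in which each row contributes in proportion to its weight. Whether dbcolscorrelate uses exactly this formula is an assumption, and the function name and data are made up for the example.

```python
import math


def weighted_pearson(xs, ys, ws):
    """Weighted Pearson correlation (one common definition, assumed here)."""
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    # Weighted covariance and variances; the 1/total factors cancel in the ratio.
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    var_y = sum(w * (y - mean_y) ** 2 for w, y in zip(ws, ys))
    return cov / math.sqrt(var_x * var_y)


# Illustrative data only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 9.0]

equal = weighted_pearson(xs, ys, [1, 1, 1, 1])
doubled_last = weighted_pearson(xs, ys, [1, 1, 1, 2])
print(equal, doubled_last)
```

Under this definition, an integer weight of 2 on a row has the same effect as listing that row twice, and equal weights reduce to the unweighted coefficient.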

-f FORMAT or --format FORMAT

Specify a printf(3)-style format for output statistics. Defaults to %.5g.
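
To see what the default format does to a value, here is a small Python sketch. Python's % operator follows the same printf conventions for these conversions; the input value is simply an unrounded correlation chosen for illustration.

```python
value = 0.8164205163

default_style = "%.5g" % value   # five significant digits, the documented default
fixed_style = "%.3f" % value     # three digits after the decimal point

print(default_style)  # prints 0.81642
print(fixed_style)    # prints 0.816
```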

-T TmpDir

Where to put temporary files. If -T is not specified, the environment variable TMPDIR is used. The default is /tmp.

This module also supports the standard fsdb options:

-d

Enable debugging output.

-i or --input InputSource

Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue object.

-o or --output OutputDestination

Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue object.

--autorun or --noautorun

By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.

--help

Show help.

--man

Show full manual.

Sample Usage

Input

    #fsdb i_x i_y
    10.0   8.04
    8.0    6.95
    13.0   7.58
    9.0    8.81
    11.0   8.33
    14.0   9.96
    6.0    7.24
    4.0    4.26
    12.0  10.84
    7.0    4.82
    5.0    5.68

Command

    cat TEST/anscombe_quartet.in | dbcolscorrelate i_x i_y

Output

    #fsdb correlation:d
    0.81642
    #  | dbcolscorrelate i_x i_y
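
The output above can be checked independently with a few lines of Python. This is a minimal sketch of the textbook Pearson formula applied to the eleven input rows, not the program's own implementation; the names ROWS and pearson are introduced only for this example.

```python
import math

# The (i_x, i_y) rows from the sample input.
ROWS = [
    (10.0, 8.04), (8.0, 6.95), (13.0, 7.58), (9.0, 8.81),
    (11.0, 8.33), (14.0, 9.96), (6.0, 7.24), (4.0, 4.26),
    (12.0, 10.84), (7.0, 4.82), (5.0, 5.68),
]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [x for x, _ in ROWS]
ys = [y for _, y in ROWS]
r = pearson(xs, ys)
print("%.5g" % r)  # prints 0.81642
```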

See Also

Fsdb, dbcolstatscores, dbcolsregression, dbrvstatdiff.

Info

2024-07-01 perl v5.40.0 User Contributed Perl Documentation