# dbrvstatdiff man page

dbrvstatdiff — evaluate statistical differences between two random variables

## Synopsis

``````    dbrvstatdiff [-f format] [-c ConfRating]
[-h HypothesizedDifference] m1c sd1c n1c m2c sd2c n2c``````

`OR`

``    dbrvstatdiff [-f format] [-c ConfRating] m1c n1c m2c n2c``

## Description

Produce statistics on the difference of sets of random variables. If a hypothesized difference is given (with `-h`), it also performs a Student's t-test.

Random variables are specified by:

`m1c`, `m2c`

The column names of means of random variables.

`sd1c`, `sd2c`

The column names of standard deviations of random variables.

`n1c`, `n2c`

The column names of the number of samples for each random variable.

These values can be computed with dbcolstats.
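As a rough illustration of what these three inputs are, the sketch below computes a mean, a standard deviation, and a sample count in plain Python for the case `a` values from the third sample at the end of this page. It is not how dbcolstats is implemented. The choice of the sample (n-1) standard deviation is an assumption, inferred from the `stddev` of 0.083666 shown in that sample's output.

```python
import statistics

# Case "a" trial values from the third sample in this page.
values = [1, 1.1, 0.9, 1, 1.1]

mean = statistics.mean(values)      # the kind of value found in the m1c/m2c columns
stddev = statistics.stdev(values)   # sample (n-1) standard deviation, for sd1c/sd2c
n = len(values)                     # number of samples, for n1c/n2c

print("%.5g %.5g %d" % (mean, stddev, n))
```

These agree with the `copylast_mean`, `copylast_stddev`, and `copylast_n` values in that sample's output.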

Creates up to twelve new columns:

`diff`

The difference RV2 - RV1, that is, the mean in `m2c` minus the mean in `m1c`.

`diff_pct`

The percentage difference, (RV2 - RV1) / RV1.

`diff_conf_{half,low,high}` and `diff_conf_pct_{half,low,high}`

The half-widths of the confidence intervals, and their low and high bounds, for the absolute and relative (percentage) difference.

`t_test`

The T-test value for the given hypothesized difference.

`t_test_result`

Whether the hypothesis holds at the given confidence rating. Will be either “rejected” or “not-rejected”.

`t_test_break`

The hypothesized value that is the break-even point for the T-test.

`t_test_break_pct`

The break-even point as a percentage of the mean in `m1c`.
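The first two columns can be sketched in a few lines of Python. This is only an illustration: the function name `diff_columns` is hypothetical. It assumes `diff_pct` is taken relative to the first mean and scaled by 100, which is consistent with the first sample in the Sample Usage section.

```python
def diff_columns(m1, m2):
    """Return (diff, diff_pct) for means m1 (from m1c) and m2 (from m2c)."""
    diff = m2 - m1
    diff_pct = 100.0 * diff / m1   # relative to the first mean, as a percentage
    return diff, diff_pct

# Means from the first sample: m1c is the mean2 column (0.17),
# m2c is the mean1 column (0.22).
diff, diff_pct = diff_columns(0.17, 0.22)
print("%.5g %.5g" % (diff, diff_pct))
```

With the default output format these agree with the `diff` and `diff_pct` values in the first sample output.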

Confidence intervals are not printed if standard deviations are not provided. Confidence intervals assume normal distributions with common variances.

T-tests are only computed if a hypothesized difference is provided. Hypothesized differences should be preceded by `<=`, `>=`, or `=`. T-tests assume normal distributions with common variances.
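The common-variance assumption corresponds to the textbook pooled-variance calculation, sketched below in Python. This is not the tool's source code. The function name `pooled_diff_stats` is hypothetical, the critical value `t_crit` is supplied by the caller (here 2.306, the tabulated two-sided 95% value for 8 degrees of freedom), and the form of the test statistic, (diff - hypothesized) / standard error, is an assumption. The inputs are taken from the third sample at the end of this page.

```python
import math

def pooled_diff_stats(m1, sd1, n1, m2, sd2, n2, t_crit, hypothesized=0.0):
    """Pooled-variance statistics for the difference m2 - m1.

    t_crit is the two-sided Student's t critical value for n1 + n2 - 2
    degrees of freedom, supplied by the caller (for example from a t-table).
    """
    dof = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / dof
    # Standard error of the difference of the two means.
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = m2 - m1
    half = t_crit * std_err
    return {
        "diff": diff,
        "diff_conf_half": half,
        "diff_conf_low": diff - half,
        "diff_conf_high": diff + half,
        "t_test": (diff - hypothesized) / std_err,
    }

# Third sample: case b first (mean, stddev, n), then case a (copylast_* columns).
stats = pooled_diff_stats(1.98, 0.083666, 5, 1.02, 0.083666, 5, t_crit=2.306)
for name, value in stats.items():
    print("%s: %.5g" % (name, value))
```

Under these assumptions the printed values agree with the `diff`, `diff_conf_*`, and `t_test` lines of the third sample output.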

## Options

-c FRACTION or --confidence FRACTION

Specify FRACTION for the confidence interval. Defaults to 0.95 for a 95% confidence factor (alpha = 0.05).

-f FORMAT or --format FORMAT

Specify a printf(3)-style format for output statistics. Defaults to `%.5g`.

-h DIFF or --hypothesis DIFF

Specify the hypothesized difference as `DIFF`, where `DIFF` is something like `<=0` or `>=0`, etc.
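To give a feel for what the `-f` format does to the numbers, the following Python sketch applies the default `%.5g` and a fixed-point alternative to a few unrounded values. Python's `%` operator follows the printf conventions for these conversions. The tool itself is not written in Python, so treat this as an illustration only; the raw values are made up to resemble those in the sample outputs.

```python
fmt = "%.5g"   # the documented default for -f

# Unrounded values of the kind that end up in the output columns.
raw = [29.411764705882355, 0.0026137568, -18.1423]
formatted = [fmt % x for x in raw]
print(formatted)

# A fixed-point alternative, as might be passed with -f '%.2f'.
fixed = ["%.2f" % x for x in raw]
print(fixed)
```

`%.5g` keeps five significant digits regardless of magnitude, while `%.2f` keeps two decimal places and so loses small values such as confidence half-widths.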

This module also supports the standard fsdb options:

-d

Enable debugging output.

-i or --input InputSource

Read from InputSource, typically a file name, or `-` for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

-o or --output OutputDestination

Write to OutputDestination, typically a file name, or `-` for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

--autorun or --noautorun

By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The `--(no)autorun` option controls that behavior within Perl.

--help

Show help.

--man

Show full manual.

## Sample Usage

### Input

``````    #fsdb title mean2 stddev2 n2 mean1 stddev1 n1
example6.12 0.17 0.0020 5 0.22 0.0010 4``````

### Command

``    cat data.fsdb | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1``

### Output

``````    #fsdb title mean2 stddev2 n2 mean1 stddev1 n1 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high
example6.12 0.17    0.0020  5       0.22    0.0010  4       0.05    29.412  0.0026138       0.047386        0.052614        1.5375  27.874  30.949
#  | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1``````

### Input 2

(example 7.10 from Scheaffer and McClave):

``````    #fsdb title n1 x1 sd1 n2 x2 sd2
example7.10 9 35.22 24.44 9 31.56 20.03``````

### Command 2

``    dbrvstatdiff -h '<=0' x2 sd2 n2 x1 sd1 n1``

### Output 2

``````    #fsdb title n1 x1 sd1 n2 x2 sd2 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result
example7.10 9 35.22 24.44 9 31.56 20.03 3.66 0.11597 4.7125 -1.0525 8.3725 0.14932 -0.033348 0.26529 1.6465 not-rejected
#  | /global/us/edu/ucla/cs/ficus/users/johnh/BIN/DB/dbrvstatdiff -h <=0 x2 sd2 n2 x1 sd1 n1``````

### Case 3

A common use case is to have one file with a set of trials from two experiments, and to use dbrvstatdiff to see if they are different.

### Input 3

``````    #fsdb case trial value
a  1  1
a  2  1.1
a  3  0.9
a  4  1
a  5  1.1
b  1  2
b  2  2.1
b  3  1.9
b  4  2
b  5  1.9``````

### Command 3

``````    cat two_trial.fsdb |
dbmultistats -k case value |
dbcolcopylast mean stddev n |
dbrow '_case eq "b"' |
dbrvstatdiff -h '=0' mean stddev n copylast_mean copylast_stddev copylast_n |
dblistize``````

### Output 3

``````        #fsdb -R C case mean stddev pct_rsd conf_range conf_low conf_high conf_pct sum sum_squared min max n copylast_mean copylast_stddev copylast_n diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result t_test_break t_test_break_pct
case: b
mean: 1.98
stddev: 0.083666
pct_rsd: 4.2256
conf_range: 0.10387
conf_low: 1.8761
conf_high: 2.0839
conf_pct: 0.95
sum: 9.9
sum_squared: 19.63
min: 1.9
max: 2.1
n: 5
copylast_mean: 1.02
copylast_stddev: 0.083666
copylast_n: 5
diff: -0.96
diff_pct: -48.485
diff_conf_half: 0.12202
diff_conf_low: -1.082
diff_conf_high: -0.83798
diff_conf_pct_half: 6.1627
diff_conf_pct_low: -54.648
diff_conf_pct_high: -42.322
t_test: -18.142
t_test_result: rejected
t_test_break: -1.082
t_test_break_pct: -54.648

#  | dbmultistats -k case value
#   | dbcolcopylast mean stddev n
#   | dbrow _case eq "b"
#   | dbrvstatdiff -h =0 mean stddev n copylast_mean copylast_stddev copylast_n
#   | dbfilealter -R C``````

(The hypothesis that the difference is zero is rejected, so one cannot say that the two cases are statistically equal.)