# dbrvstatdiff man page

dbrvstatdiff — evaluate statistical differences between two random variables

## Synopsis

```
dbrvstatdiff [-f format] [-c ConfRating]
[-h HypothesizedDifference] m1c sd1c n1c m2c sd2c n2c
```

`OR`

`dbrvstatdiff [-f format] [-c ConfRating] m1c n1c m2c n2c`

## Description

Produces statistics on the difference between sets of random variables. If a hypothesized difference is given (with `"-h"`), it does a Student's t-test.

Random variables are specified by:

- "m1c", "m2c": The column names of the means of the random variables.
- "sd1c", "sd2c": The column names of the standard deviations of the random variables.
- "n1c", "n2c": The column names of the sample counts for each random variable.

These values can be computed with dbcolstats.
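As a rough illustration of what those three inputs are, the sketch below derives a mean, standard deviation, and sample count from raw samples in Python. It is not how dbcolstats is implemented: the function name `summarize` is hypothetical, and the code assumes the sample (n - 1) standard deviation, which agrees with the figures shown for case `a` in Case 3 on this page.

```python
import statistics

def summarize(samples):
    """Return (mean, sample standard deviation, count) for a list of numbers."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples for a standard deviation")
    mean = statistics.mean(samples)
    stddev = statistics.stdev(samples)  # assumption: n - 1 denominator
    return mean, stddev, n

# Values for case "a" taken from the Case 3 input on this page.
mean_a, stddev_a, n_a = summarize([1, 1.1, 0.9, 1, 1.1])
print(f"{mean_a:.5g} {stddev_a:.5g} {n_a}")
```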

Creates up to twelve new columns:

- "diff": The difference RV2 - RV1.
- "diff_pct": The percentage difference, (RV2 - RV1) / RV1.
- "diff_conf_{half,low,high}" and "diff_conf_pct_{half,low,high}": The half-width of the confidence interval and its low and high bounds, for the absolute and relative (percentage) difference.
- "t_test": The T-test value for the given hypothesized difference.
- "t_test_result": Whether the test passes at the given confidence rating. Will be either "rejected" or "not-rejected".
- "t_test_break": The hypothesized value that is the break-even point for the T-test.
- "t_test_break_pct": The break-even point as a percent of m1c.
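The arithmetic behind the difference columns can be sketched in a few lines of Python. This is an illustration inferred from the sample output on this page, not the program's source: the function name `diff_columns` is hypothetical, and it assumes that the percentage columns are taken relative to the first mean (m1c) and that the half-width is already known.

```python
def diff_columns(m1, m2, conf_half=None):
    """Sketch of the diff* columns; RV1 is the first mean given, RV2 the second."""
    diff = m2 - m1
    row = {"diff": diff, "diff_pct": 100.0 * diff / m1}
    if conf_half is not None:
        # Absolute bounds, then the same bounds as a percentage of m1 (assumption).
        row["diff_conf_low"] = diff - conf_half
        row["diff_conf_high"] = diff + conf_half
        row["diff_conf_pct_half"] = 100.0 * conf_half / m1
        row["diff_conf_pct_low"] = 100.0 * (diff - conf_half) / m1
        row["diff_conf_pct_high"] = 100.0 * (diff + conf_half) / m1
    return row

# Means and half-width from the first sample on this page.
row = diff_columns(0.17, 0.22, conf_half=0.0026138)
for name, value in row.items():
    print(f"{name}: {value:.5g}")
```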

Confidence intervals are not printed if standard deviations are not provided. Confidence intervals assume normal distributions with common variances.
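Under the common-variance assumption, the textbook half-width is the critical t value times the pooled standard error. The Python sketch below shows that calculation; it is an assumption about the method rather than a copy of the program's code. The name `pooled_conf_half` is hypothetical, and the critical value is passed in by the caller (here 2.365, the usual table entry for a two-sided 95% interval with 7 degrees of freedom).

```python
import math

def pooled_conf_half(sd1, n1, sd2, n2, t_crit):
    """Half-width of the confidence interval on a difference of two means,
    assuming normal distributions with a common variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t_crit * std_err

# Standard deviations and counts from the first sample on this page.
half = pooled_conf_half(0.0020, 5, 0.0010, 4, t_crit=2.365)
print(f"{half:.5g}")
```

With these inputs the result agrees with the diff_conf_half value in the first sample output to the precision shown there.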

T-tests are only computed if a hypothesized difference is provided. Hypothesized differences should be preceded by `<=`, `>=`, or `=`. T-tests assume normal distributions with common variances.
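For reference, a pooled-variance t statistic can be sketched as follows. This is the textbook formula and is only assumed to match what the program computes; the name `pooled_t_statistic` is hypothetical. The decision shown covers the two-sided `=` case only, with the critical value (2.306 for 8 degrees of freedom at 95%) supplied from a t table.

```python
import math

def pooled_t_statistic(m1, sd1, n1, m2, sd2, n2, hypothesized=0.0):
    """Return (t, degrees of freedom) for the difference RV2 - RV1,
    assuming normal distributions with a common variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return ((m2 - m1) - hypothesized) / std_err, df

# Means, standard deviations, and counts from Case 3 on this page.
t, df = pooled_t_statistic(1.98, 0.083666, 5, 1.02, 0.083666, 5)
rejected = abs(t) > 2.306  # two-sided test of "=0" (assumed decision rule)
print(f"t={t:.5g} df={df} {'rejected' if rejected else 'not-rejected'}")
```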

## Options

- **-c FRACTION** or **--confidence FRACTION**: Specify FRACTION for the confidence interval. Defaults to 0.95 for a 95% confidence factor (alpha = 0.05).
- **-f FORMAT** or **--format FORMAT**: Specify a printf(3)-style format for output statistics. Defaults to `"%.5g"`.
- **-h DIFF** or **--hypothesis DIFF**: Specify the hypothesized difference as `"DIFF"`, where `"DIFF"` is something like `"<=0"` or `">=0"`, etc.
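To see what the default `"%.5g"` format does to a value, the following Python sketch applies printf-style formatting with the `%` operator, which follows the same conversion rules for these cases. The helper `format_stat` is hypothetical and only illustrates the format string; it is not part of Fsdb.

```python
def format_stat(value, fmt="%.5g"):
    """Format one statistic with a printf-style format string."""
    return fmt % value

print(format_stat(100 * 0.05 / 0.17))  # five significant digits
print(format_stat(1234567.0))          # large values switch to exponent form
print(format_stat(0.05, "%.2f"))       # a fixed two-decimal alternative
```

A format such as `"%.2f"` trades significant digits for a fixed number of decimals, which can hide small differences.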

This module also supports the standard fsdb options:

- **-d**: Enable debugging output.
- **-i** or **--input** InputSource: Read from InputSource, typically a file name, or `"-"` for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
- **-o** or **--output** OutputDestination: Write to OutputDestination, typically a file name, or `"-"` for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
- **--autorun** or **--noautorun**: By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the *run()* method. The `"--(no)autorun"` option controls that behavior within Perl.
- **--help**: Show help.
- **--man**: Show full manual.

## Sample Usage

### Input

```
#fsdb title mean2 stddev2 n2 mean1 stddev1 n1
example6.12 0.17 0.0020 5 0.22 0.0010 4
```

### Command

`cat data.fsdb | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1`

### Output

```
#fsdb title mean2 stddev2 n2 mean1 stddev1 n1 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high
example6.12 0.17 0.0020 5 0.22 0.0010 4 0.05 29.412 0.0026138 0.047386 0.052614 1.5375 27.874 30.949
# | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1
```

### Input 2

(example 7.10 from Scheaffer and McClave):

```
#fsdb title n1 x1 sd1 n2 x2 sd2
example7.10 9 35.22 24.44 9 31.56 20.03
```

### Command 2

`dbrvstatdiff -h '<=0' x2 sd2 n2 x1 sd1 n1`

### Output 2

```
#fsdb title n1 x1 sd1 n2 x2 sd2 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result
example7.10 9 35.22 24.44 9 31.56 20.03 3.66 0.11597 4.7125 -1.0525 8.3725 0.14932 -0.033348 0.26529 1.6465 not-rejected
# | /global/us/edu/ucla/cs/ficus/users/johnh/BIN/DB/dbrvstatdiff -h <=0 x2 sd2 n2 x1 sd1 n1
```

### Case 3

A common use case is to have one file with a set of trials from two experiments, and to use dbrvstatdiff to see if they are different.

### Input 3

```
#fsdb case trial value
a 1 1
a 2 1.1
a 3 0.9
a 4 1
a 5 1.1
b 1 2
b 2 2.1
b 3 1.9
b 4 2
b 5 1.9
```

### Command 3

```
cat two_trial.fsdb |
dbmultistats -k case value |
dbcolcopylast mean stddev n |
dbrow '_case eq "b"' |
dbrvstatdiff -h '=0' mean stddev n copylast_mean copylast_stddev copylast_n |
dblistize
```

### Output 3

```
#fsdb -R C case mean stddev pct_rsd conf_range conf_low conf_high conf_pct sum sum_squared min max n copylast_mean copylast_stddev copylast_n diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result t_test_break t_test_break_pct
case: b
mean: 1.98
stddev: 0.083666
pct_rsd: 4.2256
conf_range: 0.10387
conf_low: 1.8761
conf_high: 2.0839
conf_pct: 0.95
sum: 9.9
sum_squared: 19.63
min: 1.9
max: 2.1
n: 5
copylast_mean: 1.02
copylast_stddev: 0.083666
copylast_n: 5
diff: -0.96
diff_pct: -48.485
diff_conf_half: 0.12202
diff_conf_low: -1.082
diff_conf_high: -0.83798
diff_conf_pct_half: 6.1627
diff_conf_pct_low: -54.648
diff_conf_pct_high: -42.322
t_test: -18.142
t_test_result: rejected
t_test_break: -1.082
t_test_break_pct: -54.648
# | dbmultistats -k case value
# | dbcolcopylast mean stddev n
# | dbrow _case eq "b"
# | dbrvstatdiff -h =0 mean stddev n copylast_mean copylast_stddev copylast_n
# | dbfilealter -R C
```

(The hypothesis that the difference is zero is rejected, so one cannot say that the two cases are statistically equal.)

## See Also

Fsdb. dbcolstats. dbcolcopylast.

## Author and Copyright

Copyright (C) 1991-2015 by John Heidemann <johnh@isi.edu>

This program is distributed under the terms of the GNU General Public License, version 2. See the file COPYING with the distribution for details.