compsize - Man Page

calculate compression ratio of a set of files on btrfs

Synopsis

compsize file-or-dir [ file-or-dir ... ]

Description

compsize takes a list of files on a btrfs filesystem (recursing into directories) and reports the compression types in use and the effective compression ratio.

Because a compression ratio is not meaningful for part of an extent, every used extent is considered in its entirety.  Every extent is also counted exactly once, even if it's reflinked multiple times.

The program gives a report similar to:
Processed 90319 files.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       79%      1.4G         1.8G         1.9G
none       100%      1.0G         1.0G         1.0G
lzo         53%      446M         833M         843M

The fields above are:

Type

compression algorithm

Perc

disk usage/uncompressed (compression ratio)

Disk Usage

blocks on the disk; this is what storing these files actually costs you (save for RAID considerations)

Uncompressed

uncompressed extents; what you would need without compression - includes deduplication savings and pinned extent waste

Referenced

apparent file sizes (sans holes); this is what a traditional filesystem that supports holes and efficient tail packing, or tar -S, would need to store these files
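As a rough illustration of how the fields relate, the sketch below parses the sample report shown earlier and recomputes the ratio for one row. The helper names (parse_size, parse_report) are hypothetical, and the assumption that the K/M/G suffixes are powers of 1024 is mine rather than something this page states. Because the human-friendly sizes are rounded, the recomputed ratio only approximates the Perc column.

```python
SAMPLE_REPORT = """\
Processed 90319 files.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       79%      1.4G         1.8G         1.9G
none       100%      1.0G         1.0G         1.0G
lzo         53%      446M         833M         843M
"""

# Assumption: size suffixes are binary multiples (K = 1024 bytes, and so on).
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '446M' to an approximate byte count.

    A value without a known suffix is treated as a plain byte count.
    """
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def parse_report(report):
    """Return {type: {perc, disk, uncompressed, referenced}} for each data row."""
    rows = {}
    for line in report.splitlines():
        fields = line.split()
        # Data rows have exactly five fields, the second being a percentage.
        # The "Processed" line and the column header do not match this shape.
        if len(fields) == 5 and fields[1].endswith("%"):
            rows[fields[0]] = {
                "perc": int(fields[1].rstrip("%")),
                "disk": parse_size(fields[2]),
                "uncompressed": parse_size(fields[3]),
                "referenced": parse_size(fields[4]),
            }
    return rows


rows = parse_report(SAMPLE_REPORT)
lzo = rows["lzo"]
# Perc is disk usage divided by uncompressed size.
print(f"lzo: reported {lzo['perc']}%, recomputed {lzo['disk'] / lzo['uncompressed']:.3f}")
```

For exact figures, the raw byte counts from -b would be a better input than the rounded sizes used here.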

Consider an example: a 128K file is stored as a single extent A, which compressed to a single 4K page.  It then receives a write of 32K at offset 32K, which also compressed to a single 4K page, stored as extent B.

The file now appears as:
         +-------+-------+---------------+
extent A | used  | waste | used          |
         +-------+-------+---------------+
extent B         | used  |
                 +-------+

The "waste" inside extent A can't be gotten rid of until the whole extent is rewritten (for example by defrag).  If the extent is compressed, the whole of it needs to be read every time that part of the file is read, thus the "waste" is still required.

In this case, we have: Disk Usage: 8K, Uncompressed: 160K, Referenced: 128K.
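The arithmetic behind these figures can be checked with a short sketch. The per-extent dictionaries are a made-up representation for illustration only, not how compsize stores its data. Extent A is referenced for 96K because 32K of its 128K is shadowed by extent B.

```python
KIB = 1024

# Hypothetical per-extent bookkeeping for the example above.
extents = {
    # 128K of data compressed to one 4K page; 32K of it is now "waste".
    "A": {"disk": 4 * KIB, "uncompressed": 128 * KIB, "referenced": 96 * KIB},
    # The 32K overwrite, also compressed to one 4K page.
    "B": {"disk": 4 * KIB, "uncompressed": 32 * KIB, "referenced": 32 * KIB},
}

# Each extent is counted once and in its entirety.
totals = {
    field: sum(extent[field] for extent in extents.values())
    for field in ("disk", "uncompressed", "referenced")
}

# Perc is disk usage divided by uncompressed size.
perc = 100 * totals["disk"] / totals["uncompressed"]

print(f"Disk Usage: {totals['disk'] // KIB}K")            # Disk Usage: 8K
print(f"Uncompressed: {totals['uncompressed'] // KIB}K")  # Uncompressed: 160K
print(f"Referenced: {totals['referenced'] // KIB}K")      # Referenced: 128K
print(f"Perc: {perc:.0f}%")                               # Perc: 5%
```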

Options

-b/--bytes

Show raw byte counts rather than human-friendly sizes.

-x/--one-file-system

Skip files and directories on different file systems.
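A minimal sketch of passing these options when calling compsize from a script follows. The function names build_compsize_cmd and run_compsize are hypothetical, and placing the options before the paths is an assumption about conventional option parsing rather than something the synopsis spells out.

```python
import subprocess


def build_compsize_cmd(paths, raw_bytes=False, one_file_system=False):
    """Build an argument list for compsize using the long option names."""
    cmd = ["compsize"]
    if raw_bytes:
        cmd.append("--bytes")  # same as -b
    if one_file_system:
        cmd.append("--one-file-system")  # same as -x
    cmd.extend(paths)
    return cmd


def run_compsize(paths, **options):
    """Run compsize and return its report as text.

    Raises subprocess.CalledProcessError if compsize exits with an error,
    for example when it is run without sufficient privileges.
    """
    result = subprocess.run(
        build_compsize_cmd(paths, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, run_compsize(["/home"], raw_bytes=True, one_file_system=True) would request raw byte counts for /home without crossing into other file systems; the path is only a placeholder.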

Signals

USR1

Displays partial data for files processed so far.
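On a long scan, the signal can be sent from a controlling script. This is a sketch only: start_compsize and request_partial_report are hypothetical helper names, and it assumes compsize is on the PATH and started with enough privileges.

```python
import signal
import subprocess


def start_compsize(paths):
    """Start compsize in the background; its output goes to our terminal."""
    return subprocess.Popen(["compsize", *paths])


def request_partial_report(proc):
    """Send USR1 to a running compsize so it displays partial data."""
    proc.send_signal(signal.SIGUSR1)
```

From a shell, sending USR1 to the process ID with kill has the same effect.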

Caveats

Recently written files may show as not taking any space until they're actually allocated and compressed; this happens once they're synced or on natural writeout, typically on the order of 30 seconds.

The ioctls used by this program require root.

Inline extents are considered to be always unique, even if they share the same bytes on the disk.

This program doesn't currently support filesystems above 8TB on 32-bit machines, but neither do other btrfs tools.

Info

2017-09-04 btrfs