public-inbox-xcpdb - Man Page

upgrade Xapian DB formats

Synopsis

public-inbox-xcpdb [Options] INBOX_DIR

public-inbox-xcpdb [Options] --all

Description

public-inbox-xcpdb is similar to copydatabase(1) for upgrading to the latest database format supported by Xapian (e.g. "glass" or "honey"), but is designed to tolerate and accept parallel Xapian database modifications from public-inbox-watch(1), public-inbox-mda(1), public-inbox-learn(1), and public-inbox-index(1).

This command is rarely used, as Xapian DB formats rarely change.

Options

--all

Copy all inboxes configured in ~/.public-inbox/config. This is an alternative to specifying individual inbox directories on the command line.

-c
--compact

In addition to performing the copy operation, run xapian-compact(1) on each Xapian shard after copying but before finalizing it. Compacting a Xapian database takes only around 5% of the time required to copy it.

Compared to public-inbox-compact(1), use of this option is preferable for gigantic inboxes where the coarse-grained lock currently required for public-inbox-compact(1) can cause the compaction to take hours at a time.

-R N
--reshard=N

Reshard the Xapian database on a v2 inbox to N shards.  Since xapian-compact(1) is not suitable for merging, users can rely on this switch to reshard the existing Xapian database(s) to any positive value of N.

This is useful in case the Xapian DB was created with too few or too many shards given the capabilities of the current hardware.
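As a rough illustration of resharding, the sketch below prints the command it would run. The inbox path and the shard count of 4 are placeholder assumptions, and "run" is a hypothetical helper variable used only to make this a dry run; set it to an empty string to execute the command for real.

```shell
# Placeholder values (assumptions): substitute your own v2 inbox and shard count.
INBOX_DIR=/path/to/inbox
SHARDS=4

# Dry-run switch: "echo" prints the command instead of running it.
# Set run= (empty) to execute it.
run=echo

# Copy the Xapian DB into $SHARDS shards; -R "$SHARDS" is the short form.
$run public-inbox-xcpdb --reshard="$SHARDS" "$INBOX_DIR"
```

A sensible value of N depends on the hardware, so the number above should not be read as a recommendation.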

--blocksize
--no-full
--fuller

These options are passed directly to xapian-compact(1) when used with --compact.
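A minimal sketch of combining --compact with one of the pass-through options follows. The inbox path is a placeholder, --no-full is picked purely as an example (see xapian-compact(1) for what each option does), and "run" is a hypothetical dry-run variable that is not part of public-inbox.

```shell
# Placeholder path (assumption): substitute a real inbox directory.
INBOX_DIR=/path/to/inbox

# Dry-run switch: "echo" prints the command instead of running it.
# Set run= (empty) to execute it.
run=echo

# Copy the Xapian DB and compact each shard; --no-full is handed
# to xapian-compact(1) because --compact is also given.
$run public-inbox-xcpdb --compact --no-full "$INBOX_DIR"
```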

--no-fsync

Disable fsync(2) and fdatasync(2). See "--no-fsync" in public-inbox-index(1) for caveats.

Available in public-inbox 1.6.0+.

--sequential-shard

Copy each shard sequentially, ignoring --jobs.  This also affects indexing done at the end of a run.

--batch-size=BYTES
--max-size=BYTES

See public-inbox-index(1) for a description of these options.

These indexing options affect indexing at the end of a run. public-inbox-xcpdb may run in parallel with public-inbox-index(1), and public-inbox-xcpdb needs to reindex changes made to the old Xapian DBs by public-inbox-index(1) while it was running.
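The following sketch shows how the shard-ordering and indexing options might be combined on one command line. The inbox path and the batch size of 1048576 bytes are arbitrary placeholders rather than tuned values, and "run" is a hypothetical dry-run variable; consult public-inbox-index(1) before choosing real values.

```shell
# Placeholder values (assumptions): substitute your own path and size.
INBOX_DIR=/path/to/inbox
BATCH_BYTES=1048576

# Dry-run switch: "echo" prints the command instead of running it.
# Set run= (empty) to execute it.
run=echo

# Copy shards one at a time and apply the batch size to the
# indexing pass performed at the end of the run.
$run public-inbox-xcpdb --sequential-shard --batch-size="$BATCH_BYTES" "$INBOX_DIR"
```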

Environment

PI_CONFIG

The default config file, normally "~/.public-inbox/config". See public-inbox-config(5).
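As a hedged example, PI_CONFIG can be exported before invoking the command so that a non-default config file is consulted. The config path below is hypothetical, the pairing with --all assumes that --all reads its inbox list from whichever config file is in effect, and "run" is a made-up dry-run variable.

```shell
# Hypothetical config location (assumption): substitute your own.
export PI_CONFIG=/srv/public-inbox/config

# Dry-run switch: "echo" prints the command instead of running it.
# Set run= (empty) to execute it.
run=echo

# Copy the Xapian DBs of every inbox listed in the config file.
$run public-inbox-xcpdb --all
```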

XAPIAN_FLUSH_THRESHOLD

The number of documents to update before committing changes to disk.  This environment variable is handled directly by Xapian; refer to the Xapian API documentation for more details.

Default: 10000
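A small sketch of overriding the flush threshold for a single session is shown below. The value 50000 and the inbox path are arbitrary placeholders, not recommendations, and "run" is a hypothetical dry-run variable.

```shell
# Arbitrary illustrative value (assumption); Xapian itself reads this variable.
export XAPIAN_FLUSH_THRESHOLD=50000

# Placeholder path (assumption): substitute a real inbox directory.
INBOX_DIR=/path/to/inbox

# Dry-run switch: "echo" prints the command instead of running it.
# Set run= (empty) to execute it.
run=echo

# The exported variable is inherited by the command's environment.
$run public-inbox-xcpdb "$INBOX_DIR"
```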

Upgrading

This tool is intended for admins upgrading Xapian search databases used by public-inbox, NOT users upgrading public-inbox itself.

In particular, it DOES NOT upgrade the schema used by the PSGI search interface (see public-inbox-index(1)).

Limitations

Do not use public-inbox-purge(1) or public-inbox-edit(1) while this is running; old (purged or edited) data may show up.

Normal invocations of public-inbox-index(1) can safely run while this is running, too.  However, reindexing via the "--reindex" switch of public-inbox-index(1) will be a waste of computing resources.

Contact

Feedback welcome via plain-text mail to <mailto:meta@public-inbox.org>

The mail archives are hosted at <https://public-inbox.org/meta/> and <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>

See Also

copydatabase(1), xapian-compact(1), public-inbox-index(1)

Referenced By

public-inbox-edit(1), public-inbox-purge(1), public-inbox-v1-format(5), public-inbox-v2-format(5).

1993-10-02 public-inbox.git public-inbox user manual