public-inbox-clone - Man Page

"git clone --mirror" wrapper

Synopsis

public-inbox-clone [Options] INBOX_URL [INBOX_DIR]

public-inbox-clone [Options] ROOT_URL [DESTINATION] # public-inbox 2.0+

Description

public-inbox-clone is a wrapper around git clone --mirror for making the initial clone of a remote HTTP(S) public-inbox.  It allows cloning multi-epoch v2 inboxes with a single command and zero configuration.

In public-inbox 2.0+, public-inbox-clone can create and maintain a mirror of multiple inboxes or code repositories using manifest.js.gz files, much as grok-pull(1) from grokmirror does.  public-inbox-fetch(1) is NOT required when using this mode.

It does not run public-inbox-init(1) nor public-inbox-index(1).  Those commands must be run separately if serving/searching the mirror is required.  As-is, public-inbox-clone is suitable for creating a git-only backup without Xapian and SQLite indices.

When cloning a single inbox, public-inbox-clone creates a Makefile with handy targets to update the inbox once indexed. This Makefile may be edited by the user; it will not be rewritten by public-inbox-fetch(1) unless it is removed completely.

public-inbox-clone does not use nor require any extra configuration files (not even ~/.public-inbox/config), but it can download snippets suitable for adding to any public-inbox-config(5) file.

public-inbox-fetch(1) may be used to keep a single INBOX_DIR up-to-date.

For v2 inboxes, it will create a $INBOX_DIR/manifest.js.gz file to speed up subsequent public-inbox-fetch(1) runs.

Options

--epoch=RANGE

Restrict clones of public-inbox-v2-format(5) inboxes to the given range of epochs.  The range may be a single non-negative integer or a (possibly open-ended) LOW..HIGH range of non-negative integers.  ~ may be prefixed to either (or both) integer values to represent the offset from the maximum possible value.

For example, --epoch=~0 alone clones only the latest epoch, and --epoch=~2.. clones the three latest epochs.

Default: 0..~0 or 0.. or ..~0 (all epochs; the three forms are equivalent)
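
The RANGE rules above can be illustrated with a small POSIX shell sketch. It is not taken from public-inbox; the function name resolve_epoch and the idea of passing the maximum epoch as a second argument are assumptions made only for illustration.

```shell
# resolve_epoch RANGE MAX
# Hypothetical helper (not part of public-inbox): prints "LOW HIGH" for a
# remote whose newest epoch is MAX, following the documented RANGE rules.
resolve_epoch() {
    range=$1
    max=$2
    case $range in
    *..*) low=${range%%..*}; high=${range#*..} ;;  # LOW..HIGH form
    *)    low=$range; high=$range ;;               # single epoch
    esac
    # An omitted side of an open-ended range spans to the first/last epoch.
    [ -n "$low" ] || low=0
    [ -n "$high" ] || high='~0'
    # A leading ~ is an offset counted back from the maximum epoch.
    case $low in
    '~'*) off=${low#?}; low=$(($max - $off)) ;;
    esac
    case $high in
    '~'*) off=${high#?}; high=$(($max - $off)) ;;
    esac
    echo "$low $high"
}

# With epochs 0 through 5 on the remote:
resolve_epoch '~0' 5     # prints "5 5" (latest epoch only)
resolve_epoch '~2..' 5   # prints "3 5" (three latest epochs)
resolve_epoch '0..' 5    # prints "0 5" (all epochs)
```

The arguments are quoted here because some shells expand a word that begins with ~ followed by a digit.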

-I PATTERN
--include=PATTERN

When cloning a top-level with multiple inboxes via manifest, only clone inboxes and repositories matching the given wildcard pattern (the wildcards *, ?, and [] are supported).

This is a new option in public-inbox 2.0+

--exclude=PATTERN

When cloning a top-level with multiple inboxes via manifest, ignore inboxes and repositories matching the given wildcard pattern.  Supports the same wildcards as --include.

This is a new option in public-inbox 2.0+

--inbox-config=always|v2|v1|never

Whether or not to retrieve the $INBOX/_/text/config/raw HTTP(S) endpoint when cloning.

Since we can't deduce v1 inboxes from code repositories, setting this to v2 or never can allow faster clones of code repositories if no v1 inboxes are present.

Default: always

This is a new option in public-inbox 2.0+

--inbox-version=NUM

Force a remote public-inbox version (must be 1 or 2). This is auto-detected by default, and this option exists mainly for testing.

This is a new option in public-inbox 2.0+

--objstore=DIR

Enables space savings when the remote manifest.js.gz includes forkgroup entries as generated by grokmirror 2.x.

If DIR does not start with /, ./, or ../, it is treated as relative to the DESTINATION directory.  If only --objstore= is specified where DIR is an empty string (""), then objstore ($DESTINATION/objstore) is the implied value of DIR.

This is a new option in public-inbox 2.0+
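
The path rule described for DIR can be sketched as a shell function. This is only an illustration of the documented behavior, not public-inbox's own code; the name resolve_dest_path and its argument order are hypothetical.

```shell
# resolve_dest_path VALUE DEFAULT DESTINATION
# Hypothetical helper (not part of public-inbox) mirroring the documented
# rule: an empty VALUE implies DEFAULT, and anything not starting with
# /, ./ or ../ is taken relative to DESTINATION.
resolve_dest_path() {
    value=$1
    default=$2
    dest=$3
    [ -n "$value" ] || value=$default
    case $value in
    /*|./*|../*) printf '%s\n' "$value" ;;        # used exactly as given
    *)           printf '%s\n' "$dest/$value" ;;  # relative to DESTINATION
    esac
}

resolve_dest_path '' objstore /path/to/code-mirror
# prints /path/to/code-mirror/objstore
resolve_dest_path /srv/objstore objstore /path/to/code-mirror
# prints /srv/objstore
```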

--manifest=FILE

When incrementally updating an existing mirror, load the given manifest (typically manifest.js.gz) to speed up updates.

By default, public-inbox writes the retrieved manifest to $DESTINATION/manifest.js.gz.  This option also changes that destination to the specified FILE.

If FILE does not start with /, ./, or ../, it is treated as relative to the DESTINATION directory.  If only --manifest= is specified where FILE is an empty string (""), then manifest.js.gz ($DESTINATION/manifest.js.gz) is the implied value of FILE.

This is a new option in public-inbox 2.0+

--remote-manifest=URL|RELATIVE_PATH

Use an alternate location for the remote manifest.js.gz file.  This may be specified as a full absolute URL (e.g. --remote-manifest=https://80x24.org/lore/pub/manifest.js.gz), or as a pathname relative to the ROOT_URL (e.g. --remote-manifest=pub/manifest.js.gz when ROOT_URL is https://80x24.org/lore/).

By default, ROOT_URL/manifest.js.gz is used.

This is a new option in public-inbox 2.0+

--project-list=FILE

When cloning code repos from a manifest, generate a cgit-compatible project list.

If FILE does not start with /, ./, or ../, it is treated as relative to the DESTINATION directory.  If only --project-list= is specified where FILE is an empty string (""), then projects.list ($DESTINATION/projects.list) is the implied value of FILE.

This is a new option in public-inbox 2.0+

--post-update-hook=COMMAND

Hooks to run after a repository is cloned or updated.  COMMAND will have the bare git repository destination given as its first and only argument.

For v2 inboxes, this operates on a per-epoch basis.

May be specified multiple times to run multiple commands in the order specified on the command-line.

This is a new option in public-inbox 2.0+
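
As a minimal sketch of what a hook might look like, the function below appends each repository path it receives to a log file. The function name log_updated_repo, the HOOK_LOG variable, and the default log location are all assumptions for illustration; none of them come from public-inbox.

```shell
# Hypothetical hook body (not shipped with public-inbox).
# "$1" is the bare git repository path, the only argument a hook receives.
# HOOK_LOG is an assumed environment variable naming the log file.
log_updated_repo() {
    repo=$1
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$repo" \
        >>"${HOOK_LOG:-/tmp/updated-repos.log}"
}
```

Saved in an executable script that ends with a call such as log_updated_repo "$1", the script's path could then be given as COMMAND.  For a v2 inbox, one line would be logged per epoch repository.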

-p
--prune

Pass the --prune and --prune-tags flags to git-fetch(1) calls on incremental clones.

This is a new option in public-inbox 2.0+

--purge

Deletes entire repos which no longer exist in the remote manifest, or are filtered out by --include= or --exclude=.

This is only useful when using --manifest.

This is a new option in public-inbox 2.0+

--exit-code

Exit with 127 if no updates are done when relying on a manifest.  Updates include fingerprint mismatches in the manifest, new symlinks, new repositories, and repositories removed from the --project-list.

This is a new option in public-inbox 2.0+
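
A wrapper script might branch on this status as sketched below. The function name classify_status is hypothetical, and treating 0 as "something was updated" and every other non-zero status as a failure is an assumption rather than documented behavior.  Note that shells also report 127 when a command cannot be found, so a missing public-inbox-clone binary would look the same as "no updates".

```shell
# classify_status CODE
# Hypothetical helper (not part of public-inbox) that maps an exit status
# from a run with --exit-code to a short label.
classify_status() {
    case $1 in
    0)   echo updated ;;    # assumed: at least one update was applied
    127) echo unchanged ;;  # documented: no updates were done
    *)   echo failed ;;     # assumed: any other status is an error
    esac
}

# Intended use after a mirror run (placeholders as in the Synopsis):
#   public-inbox-clone --exit-code [Options] ROOT_URL DESTINATION
#   classify_status $?
```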

-k
--keep-going

Continue as much as possible after an error.

This is a new option in public-inbox 2.0+

-n
--dry-run

Show what would be done, without making any changes.

This is a new option in public-inbox 2.0+

-q
--quiet

Quiets progress messages.  The flag is also passed to git-fetch(1).

-v
--verbose

Increases verbosity.  The flag is also passed to git-fetch(1).

--torsocks=auto|no|yes
--no-torsocks

Whether to wrap git(1) and curl(1) commands with torsocks(1).

Default: auto

-j JOBS
--jobs=JOBS

The number of parallel processes to spawn at once for various network operations using git(1) and/or curl(1).

Examples

To mirror the most recent epoch of each of the dwarves and LTTng inboxes:
  public-inbox-clone --epoch=~0 \
        --include='*lttng*' --include='*dwarves' \
        https://80x24.org/lore/ /path/to/inbox-mirror

https://lore.kernel.org/ may be used instead of https://80x24.org/lore/

To mirror all code repos of the sparse project:
  public-inbox-clone --objstore= --project-list= --prune \
        --include='*sparse*' --inbox-config=never \
        --remote-manifest=https://80x24.org/lore/pub/manifest.js.gz \
        https://80x24.org/lore/ /path/to/code-mirror

https://git.kernel.org/ may be used instead of https://80x24.org/lore/ and the --remote-manifest option can be omitted.

Contact

Feedback welcome via plain-text mail to <mailto:meta@public-inbox.org>

The mail archives are hosted at <https://public-inbox.org/meta/> and <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>

See Also

public-inbox-fetch(1), public-inbox-init(1), public-inbox-index(1)

Info

1993-10-02 public-inbox.git public-inbox user manual