
Package python3-datalad

Keep code, data, containers under control with git and git-annex


DataLad makes data management and data distribution more accessible. To do
that, it stands on the shoulders of Git and Git-annex to deliver a
decentralized system for data exchange. This includes automated ingestion of
data from online portals and exposing it in readily usable form as Git(-annex)
repositories, so-called datasets. The actual data storage and permission
management, however, remains with the original data providers.

The full documentation is available at docs.datalad.org, and
handbook.datalad.org provides a hands-on crash course on DataLad.


A number of extensions are available that provide additional functionality for
DataLad. Extensions are separate packages that are to be installed in addition
to DataLad. In order to install DataLad customized for a particular domain, one
can simply install an extension directly, and DataLad itself will be
automatically installed with it. An annotated list of extensions is available
in the DataLad handbook.
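
As a rough illustration of the point above, the following Python sketch checks whether DataLad and an extension can be imported in the current environment. The module name `datalad_container` (from the datalad-container extension) is used only as an example; substitute the module of whichever extension you installed. The helper name `installed` is hypothetical and not part of DataLad.

```python
from importlib.util import find_spec


def installed(modules):
    """Map each top-level module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # 'datalad_container' is an example extension module, assumed here
    # for illustration only.
    for name, present in installed(["datalad", "datalad_container"]).items():
        print(f"{name}: {'found' if present else 'missing'}")
```

If the extension is reported as found, `datalad` should be found as well, since installing an extension pulls in DataLad itself.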


If you have a problem or would like to ask a question about how to use DataLad,
please submit a question to NeuroStars.org with a datalad tag. NeuroStars.org
is a platform similar to StackOverflow but dedicated to neuroinformatics.

All previous DataLad questions are available here:

Version: 1.0.1

General Commands

datalad comprehensive data management solution
datalad-add-archive-content add content of an archive under git annex control.
datalad-add-readme add basic information about DataLad datasets to a README file
datalad-addurls create and update a dataset from a list of URLs.
datalad-check-dates find repository dates that are more recent than a reference date.
datalad-clean clean up after DataLad (possible temporary files etc.)
datalad-clone obtain a dataset (copy) from a URL or local directory
datalad-configuration get and set dataset, dataset-clone-local, or global configuration
datalad-copy-file copy files and their availability metadata from one dataset to another.
datalad-create create a new dataset from scratch.
datalad-create-sibling create a dataset sibling on a UNIX-like Shell (local or SSH)-accessible machine
datalad-create-sibling-gin create a dataset sibling on a GIN site (with content hosting)
datalad-create-sibling-gitea create a dataset sibling on a Gitea site
datalad-create-sibling-github create dataset sibling on GitHub.org (or an enterprise deployment).
datalad-create-sibling-gitlab create dataset sibling at a GitLab site
datalad-create-sibling-gogs create a dataset sibling on a GOGS site
datalad-create-sibling-ria create a sibling to a dataset in a RIA store
datalad-create-test-dataset create test (meta-)dataset.
datalad-diff report differences between two states of a dataset (hierarchy)
datalad-download-url download content
datalad-drop drop content of individual files or entire (sub)datasets
datalad-export-archive export the content of a dataset as a TAR/ZIP archive.
datalad-export-archive-ora export an archive of a local annex object store for the ORA remote.
datalad-export-to-figshare export the content of a dataset as a ZIP archive to figshare
datalad-foreach-dataset run a command or Python code on the dataset and/or each of its sub-datasets.
datalad-get get any dataset content (files/directories/subdatasets).
datalad-install install one or many datasets from remote URL(s) or local PATH source(s).
datalad-no-annex configure a dataset to never put some content into the dataset's annex
datalad-push push a dataset to a known sibling.
datalad-remove remove components from datasets
datalad-rerun re-execute previous `datalad run` commands.
datalad-run run an arbitrary shell command and record its impact on a dataset.
datalad-run-procedure run prepared procedures (DataLad scripts) on a dataset
datalad-save save the current state of a dataset
datalad-shell-completion display shell script for enabling shell completion for DataLad.
datalad-siblings manage sibling configuration
datalad-sshrun run command on remote machines via SSH.
datalad-status report on the state of dataset content.
datalad-subdatasets report subdatasets and their properties.
datalad-uninstall DEPRECATED: use the DROP command
datalad-unlock unlock file(s) of a dataset
datalad-update update a dataset from a sibling.
datalad-wtf generate a report about the DataLad installation and configuration
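
Each page name in the list above corresponds to a subcommand of the `datalad` executable: the page `datalad-save` documents `datalad save`, and so on. The sketch below shows that mapping in Python. The helper `manpage_to_argv`, the directory name `my-dataset`, and the commit message are hypothetical and chosen for illustration; the code only prints command lines and does not run them.

```python
import shlex


def manpage_to_argv(page, *args):
    """Turn a page name such as 'datalad-save' into a command line list."""
    if page == "datalad":
        return ["datalad", *args]
    prefix = "datalad-"
    if not page.startswith(prefix):
        raise ValueError(f"not a DataLad page name: {page!r}")
    # Everything after the 'datalad-' prefix is the subcommand name.
    return ["datalad", page[len(prefix):], *args]


if __name__ == "__main__":
    # A minimal create/save/status sequence, printed rather than executed.
    steps = [
        manpage_to_argv("datalad-create", "my-dataset"),
        manpage_to_argv("datalad-save", "-m", "Add raw data"),
        manpage_to_argv("datalad-status"),
    ]
    for argv in steps:
        print(shlex.join(argv))
```

To execute a step, the resulting list could be passed to `subprocess.run`, assuming the `datalad` executable is on the PATH.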