gdal-vector-pipeline - Man Page

Name

gdal-vector-pipeline — Process a vector dataset

Added in version 3.11.

Synopsis

Usage: gdal vector pipeline [OPTIONS] <PIPELINE>

Process a vector dataset.

Positional arguments:

Common Options:
  -h, --help              Display help message and exit
  --json-usage            Display usage as JSON document and exit
  --config <KEY>=<VALUE>  Configuration option [may be repeated]
  --progress              Display progress bar

<PIPELINE> is of the form: read|concat [READ-OPTIONS] ( ! <STEP-NAME> [STEP-OPTIONS] )* ! write [WRITE-OPTIONS]

A pipeline chains several steps, separated with the ! (exclamation mark) character. The first step must be read or concat, and the last one write. Each step has its own positional or non-positional arguments. Apart from read, concat and write, all other steps can potentially be used several times in a pipeline.
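The chaining rule described above can be illustrated with a small dry-run sketch. It only assembles and prints a command line and does not invoke gdal. The file names (in.gpkg, out.gpkg), the filter expression and the add_step helper are placeholders chosen for illustration, not part of GDAL.

```shell
#!/bin/sh
# Dry-run sketch: assemble a pipeline string from individual steps.
# in.gpkg, out.gpkg and the "pop" attribute are hypothetical placeholders.

pipeline=""

add_step() {
    # Append one step, inserting the "!" separator between steps.
    if [ -z "$pipeline" ]; then
        pipeline="$1"
    else
        pipeline="$pipeline ! $1"
    fi
}

add_step 'read in.gpkg'                     # first step: read (or concat)
add_step 'reproject --dst-crs=EPSG:32632'   # intermediate step
add_step 'filter --where "pop > 1e6"'       # intermediate step
add_step 'write out.gpkg --overwrite'       # last step: write

# Print the command instead of running it.
echo "gdal vector pipeline ! $pipeline"
```

When typing such a command in an interactive shell, note that some shells (for example bash with history expansion enabled) treat ! specially; keeping a space after each ! as in the examples of this page generally avoids surprises.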

In addition to read and write, the potential steps are:

  concat      Details for options can be found in gdal vector concat.
  clip        Details for options can be found in gdal vector clip.
  edit        Details for options can be found in gdal vector edit.
  filter      Details for options can be found in gdal vector filter.
  geom        Details for options can be found in gdal vector geom.
  reproject   Details for options can be found in gdal vector reproject.
  select      Details for options can be found in gdal vector select.
  sql         Details for options can be found in gdal vector sql.

Description

gdal vector pipeline processes a vector dataset by chaining several processing steps in a single command.

GDALG Output (On-the-Fly / Streamed Dataset)

A pipeline can be serialized as a JSON file using the GDALG output format. The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline on the fly, in a streamed way.

The command_line member of the JSON file should nominally be the whole command line without the final write step, and is what is generated by gdal vector pipeline ! .... ! write out.gdalg.json.

{
    "type": "gdal_streamed_alg",
    "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
}
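As a rough sketch of the relationship between a full command line and the command_line member, the following script strips the final write step from a pipeline and writes the result into a GDALG file by hand. The file names (in.gpkg, out.gpkg, manual.gdalg.json) are placeholders, and the sed pattern and the absence of JSON escaping are simplifying assumptions, not how GDAL itself generates the file.

```shell
#!/bin/sh
# Sketch: derive a GDALG file from a full pipeline command line.
# in.gpkg, out.gpkg and manual.gdalg.json are hypothetical placeholder names.
full='gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite'

# Drop the final " ! write ..." step. This simple pattern assumes that no
# "!" character appears inside the write step's own arguments.
command_line=$(printf '%s\n' "$full" | sed 's/ ! write [^!]*$//')

# Emit the JSON document. No escaping is performed, so this only works
# when the command line contains no double quotes or backslashes.
cat > manual.gdalg.json <<EOF
{
    "type": "gdal_streamed_alg",
    "command_line": "$command_line"
}
EOF

# Show the generated file.
cat manual.gdalg.json
```

In normal use, letting gdal vector pipeline write the .gdalg.json file itself, as in Example 2, is simpler and avoids the escaping limitations of this sketch.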

The final write step can be included, but it must then explicitly specify the stream output format (--output-format=streamed) together with an output dataset name, whose value is not significant.

{
    "type": "gdal_streamed_alg",
    "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write --output-format=streamed streamed_dataset"
}

Examples

Example 1: Reproject a GeoPackage file to CRS EPSG:32632 (“WGS 84 / UTM zone 32N”)

$ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite

Example 2: Serialize the command of a reprojection of a GeoPackage file in a GDALG file, and later read it

$ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write in_epsg_32632.gdalg.json --overwrite
$ gdal vector info in_epsg_32632.gdalg.json

Example 3: Union 2 source shapefiles (with similar structure), reproject them to EPSG:32632, keep only cities larger than 1 million inhabitants and write to a GeoPackage

$ gdal vector pipeline --progress ! concat --single --dst-crs=EPSG:32632 france.shp belgium.shp ! filter --where "pop > 1e6" ! write out.gpkg --overwrite

Author

Even Rouault <even.rouault@spatialys.com>

Info

Jul 12, 2025 GDAL