mass-prebuild - Man Page

the mass-prebuild set of tools

Description

The mass pre-builder (mpb(1)) is a set of tools that helps the user run mass rebuilds around a limited set of packages, in order to assess the stability of a given update.

The idea is rather simple. Given a package or a set of packages, called the "main packages", the mass pre-builder calculates the list of their direct reverse dependencies: packages that explicitly list one of the main packages in their "BuildRequires" field. The tooling first builds the main packages using the distribution’s facilities, which should include a set of test cases that validate general functionality. Assuming these packages build successfully, they are then used as base packages to build the reverse dependencies and execute their own test cases.
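As an illustration of what "direct reverse dependencies" means, here is a minimal Python sketch. It is not how mpb itself computes the list; the function name and the BuildRequires data are hypothetical and only serve to show the selection rule described above.

```python
def direct_reverse_deps(build_requires, main_packages):
    """Return the packages whose BuildRequires mention a main package.

    build_requires maps a package name to the list of packages it
    explicitly requires at build time (hypothetical data model).
    """
    mains = set(main_packages)
    return sorted(
        pkg
        for pkg, reqs in build_requires.items()
        if pkg not in mains and mains & set(reqs)
    )


# Hypothetical BuildRequires data, for illustration only.
build_requires = {
    "autoconf": ["m4"],
    "automake": ["autoconf", "perl"],
    "libtool": ["autoconf", "automake"],
    "zlib": ["gcc"],
}

revdeps = direct_reverse_deps(build_requires, ["autoconf"])
print(revdeps)
```

Only packages that name a main package directly are selected; indirect dependents are not part of this list.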

That gives a first set of results: there may be successful builds (hopefully the majority), but also failures that may or may not be caused by the modifications made to the main packages. To reduce the uncertainty and keep the list of packages to analyze short, the mass pre-builder creates another mass build as soon as a failure is detected. This build runs in parallel to the original one, but without the changes that were introduced into the main packages: a pristine build. The pristine build therefore only includes a sublist of the reverse dependencies, namely the ones that failed in the original run.

Once all the package builds are done, there will therefore be 3 major categories:

  1. The successful ones
  2. The ones that failed only with the modified packages
  3. The ones that failed with both the modifications and the pristine version

Out of these, the first category can likely be ignored, since the packages don’t seem to have been affected by the changes.

The second category needs much more attention. These are the ones that cry: "Hey, there seems to be a big issue with your changes!". Each failure needs to be analyzed to figure out whether the problem is due to the changes that have been introduced, or to a mistake on the user's side (e.g. use of a deprecated feature that got removed).

The last category is a bit trickier. Since these builds also fail with the pristine packages, a failure that originates from the new changes may be hidden behind the pre-existing one.
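The three categories can be expressed as a small decision rule. The following Python sketch assumes build results are available as plain dictionaries mapping a package name to a success flag; this data model and the category labels are hypothetical and are not mpb's internal representation.

```python
def classify(modified, pristine):
    """Sort packages into the three result categories.

    modified: package -> True if the build with the modified main
              packages succeeded.
    pristine: package -> True if the pristine build succeeded; only
              packages that failed in `modified` need an entry.
    """
    categories = {
        "success": [],
        "failed_modified_only": [],
        "failed_both": [],
    }
    for pkg, ok in sorted(modified.items()):
        if ok:
            categories["success"].append(pkg)
        elif pristine[pkg]:
            # Fails only with the changes: needs close attention.
            categories["failed_modified_only"].append(pkg)
        else:
            # Fails either way: a new problem may be hidden here.
            categories["failed_both"].append(pkg)
    return categories


# Hypothetical results, for illustration only.
modified = {"automake": True, "libtool": False, "gettext": False}
pristine = {"libtool": True, "gettext": False}

result = classify(modified, pristine)
print(result)
```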

Examples of workflows

The package pre-release use case (medium sized)

This is the primary motivation for mpb and may therefore give some insights on decisions taken for default behavior.

This use case is basically the one for autoconf: an official release will come out soon, and considering the impact of this component, it makes sense to pre-test it and provide feedback to the community. The background is that autoconf 2.70 led to failures that were not spotted during the tests executed by the community. This resulted in an autoconf 2.71 release soon after 2.70, to fix the major ones. The mass pre-builder was created to check autoconf 2.72 candidates regularly and avoid the same kind of problems.

In this example, an autoconf pre-release tarball is created, which has a specific name generated through git describe. This name is re-used as the project name for MPB. If failures are detected, more tarballs may need to be created while running git bisect.

Each SRPM created that way gets its own folder, while the data collected by MPB is centralized.

The dependencies need to be calculated automatically and built at their last known version. Whenever possible, the builds should be ordered so that if C depends on B, while both depend on Autoconf, then C waits for B to be built before starting. This should help detect C build failures due to a B misconfiguration (which itself could be due to the new Autoconf version).
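The ordering constraint described above is a topological sort of the dependency graph. The sketch below uses Python's standard graphlib module (Python 3.9 or later) to group packages into batches that could be built in parallel. It illustrates the idea only and makes no claim about how mpb implements it; the package names B, C and D are placeholders.

```python
from graphlib import TopologicalSorter


def build_batches(deps):
    """Group packages into batches; each batch only depends on earlier ones.

    deps maps a package to the set of packages that must be built first.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# C depends on B, and B, C and D all depend on autoconf.
deps = {
    "B": {"autoconf"},
    "C": {"autoconf", "B"},
    "D": {"autoconf"},
}

batches = build_batches(deps)
print(batches)
```

Here C lands in a later batch than B, so a failure of C can be examined knowing that B was already built against the new autoconf.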

From experience, I know that there are transient failures during builds in COPR, so builds should be retried once or twice.

In principle, we should run the build for all architectures, but x86_64 is enough to get a good idea of the status. The latest version of each package needs to be built, so we go for rawhide.

Thus, the configuration looks as follows:

name: autoconf-2.72a.17.0f330
packages:
  autoconf:
    src_type: file
    src: /home/anexample/work/fedora/autoconf/autoconf-2.72a.17.0f330-1.fc38.src.rpm
retry: 2
data: /home/anexample/work/mpb/autoconf-2.72

The data collected by MPB will be stored under /home/anexample/work/mpb/autoconf-2.72/autoconf-2.72a.17.0f330/. If failures are detected and the problem is assumed to come from autoconf, a new build will be generated with a dedicated name. This new build may either be a full build, if there is confidence in the quality of the fix, or a limited one, using a configuration file generated by mpb-failedconf.

Note

Here it shouldn’t be forgotten that the whole idea is to validate the main package. Any new version should therefore go through a full rebuild at some point.

If a failure is due to a bug in a reverse dependency, it falls into one of two categories:

  • There are only a handful of failures: they can be restarted manually while modifying the commit ID to be used for the build.
  • There are too many failures to restart manually: mpb-failedconf is used to generate a base config which only contains the failures from the previous build.

In the second case, the generated configuration may need to be modified, e.g. by replacing all the committish: * lines with committish: "@last_build", to ensure that newer versions of the packages will be built.
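The committish replacement mentioned above can be scripted. The following Python sketch rewrites every committish line of a configuration text with a regular expression. The sample text is a hypothetical stand-in for what mpb-failedconf might produce; the real generated layout may differ, so check the result before using it.

```python
import re


def use_last_build(config_text):
    """Replace the value of every `committish:` line with "@last_build"."""
    return re.sub(
        r"^([ \t]*committish:).*$",
        r'\1 "@last_build"',
        config_text,
        flags=re.MULTILINE,
    )


# Hypothetical generated configuration, for illustration only.
sample = (
    "revdeps:\n"
    "  automake:\n"
    "    committish: abc123\n"
    "  libtool:\n"
    "    committish: def456\n"
)

updated = use_last_build(sample)
print(updated)
```

A plain text substitution keeps the rest of the file untouched, including comments and key order, which a load-and-dump through a YAML library would not guarantee.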

The package release use case (medium size, complex dependencies)

This use case is similar to the previous one, so in principle there may be no difference. Still, let’s modify the scenario slightly.

As the packager, you use a fork of the original distgit to prepare the new release. If downstream patches are needed, they are added to distgit and applied on the fly.

Since the dependency graph is complex, MPB may not be able to calculate it, and brute force is to be used for the builds. The project naming is irrelevant.

packages:
  ruby:
    src_type: git
    src: https://src.fedoraproject.org/fork/anexample/rpms/ruby
retry: dynamic
data: /home/anexample/work/mpb/ruby-3.5

Since no name is provided for the project, MPB will assign one automatically. By default, the name will be mpb.N, where N is the build ID.

The data collected will therefore be stored under /home/anexample/work/mpb/ruby-3.5/mpb.5/ (assuming build ID 5).
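The storage location in both examples follows the same pattern: the data directory joined with the project name, where the name falls back to mpb.N. A minimal Python sketch of that rule follows; the function name is hypothetical and the code only mirrors the behavior described in this page.

```python
from pathlib import PurePosixPath


def project_dir(data, build_id, name=None):
    """Return the directory where a build's data would be stored."""
    # Without an explicit name, fall back to mpb.N (N is the build ID).
    project = name if name else f"mpb.{build_id}"
    return PurePosixPath(data) / project


unnamed = project_dir("/home/anexample/work/mpb/ruby-3.5", 5)
named = project_dir(
    "/home/anexample/work/mpb/autoconf-2.72",
    1,  # hypothetical build ID, unused when a name is given
    name="autoconf-2.72a.17.0f330",
)
print(unnamed)
print(named)
```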

Note the package fields. The default value for src_type is distgit, which refers to the official package repository for a given distribution. When using non-official distribution git repositories (like in this case, a fork), it is recommended to set this value to git. The default value for committish is the branch corresponding to the Fedora release being built for, in our case rawhide. For each new MPB build, the latest version of our fork of the ruby package will therefore be used.

The retry field is set to dynamic: brute force is applied on the reverse dependency builds. A simplified explanation of this is that as long as there are successful builds, the failed packages will be rebuilt (that isn’t exactly true, but the idea is there).
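The simplified explanation above can be sketched as a loop that keeps retrying failed packages while each round still produces at least one success. This Python sketch models only that simplification, not mpb's actual algorithm; the can_build callback and the dependency data are hypothetical.

```python
def dynamic_retry(packages, can_build):
    """Retry failed builds as long as a round yields new successes.

    can_build(pkg, built) returns True if pkg builds successfully given
    the set of packages already built (hypothetical callback).
    """
    built, pending = set(), set(packages)
    rounds = 0
    while pending:
        rounds += 1
        # All builds of a round see the same set of finished packages.
        succeeded = {pkg for pkg in pending if can_build(pkg, built)}
        if not succeeded:
            break  # No progress: the remaining failures are final.
        built |= succeeded
        pending -= succeeded
    return built, pending, rounds


# Hypothetical dependencies: a package builds once its deps are built.
deps = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"missing"}}

built, failed, rounds = dynamic_retry(deps, lambda pkg, done: deps[pkg] <= done)
print(sorted(built), sorted(failed), rounds)
```

Without a computed build order, each round resolves one more layer of the dependency chain, which is why brute force eventually converges.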

The rest of the workflow is similar to the previous one (playing with mpb-failedconf whenever necessary).

The multi-package use case

It may happen that a package owner is responsible for multiple packages that depend on each other and therefore need to be built in a specific order.

This can be achieved through the following configuration snippet:

packages:
  componentA:
    priority: 0
  componentB:
    priority: 1
  componentC:
    priority: 2

That way, MPB will make sure that componentA is built first, followed by componentB and then componentC.
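To make the effect of the priority field concrete, the following Python sketch groups packages by priority and lists the groups in ascending order. It assumes that lower values are built first, as in the snippet above, and that a missing priority counts as 0; the latter is an assumption of this sketch, not documented behavior.

```python
from itertools import groupby


def by_priority(packages):
    """Return package names grouped by ascending priority."""

    def priority(item):
        # Assumption: a package without a priority is treated as 0.
        return item[1].get("priority", 0)

    items = sorted(packages.items(), key=lambda kv: (priority(kv), kv[0]))
    return [[name for name, _ in group] for _, group in groupby(items, key=priority)]


packages = {
    "componentC": {"priority": 2},
    "componentA": {"priority": 0},
    "componentB": {"priority": 1},
}

order = by_priority(packages)
print(order)
```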

The package pre-release use case (big size, overly complex dependencies)

This use case is typically the one that could be expected from a compiler, where thousands of packages will need to be rebuilt.

arch: all
packages:
  gcc:
    deps_only: True
enable_priorities: False
data: /home/anexample/work/mpb/gcc-14
copr:
  additional_repos:
  - https://anexample.fedorapeople.org/fedora-gcc14-${arch}/

Let’s go through all these new options.

The arch field is set to all, which is an MPB special option that selects all the architectures supported by COPR for rawhide.

The deps_only field tells MPB that it isn’t necessary to rebuild gcc, and that the build should be assumed to be a success. The reason is that building gcc takes about 30h in COPR, so other means were used for this package, like Koji or a dedicated build machine. The resulting packages were put in a custom repository, which is given to COPR through the additional_repos field.

Since there are about 9k components that depend on gcc, it makes no sense to try to calculate the priorities for the builds: they will all be executed in the same COPR batch (with lower priority). For the same reason, it may make sense to have a smoke-test for this project, where a limited set of packages will be built:

arch: all
name: gcc-14_smoketest
packages:
  gcc:
    deps_only: True
revdeps:
  automake:
    committish: "@last_build"
  libtool:
    committish: "@last_build"
data: /home/anexample/work/mpb/gcc-14
copr:
  additional_repos:
  - https://anexample.fedorapeople.org/fedora-gcc14-${arch}/

Resources

More information about the mass-prebuilder can be found at https://gitlab.com/fedora/packager-tools/mass-prebuild.

See Also

mpb(1), mpb.config(5), mpb-failedconf(1), mpb-report(1), mpb-whatrequires(1)

2025-09-24