Intent
Goals:
ISO build process SHOULD continue to only include the packages it needs
Modular packages MUST install correctly from the SIMP ISO or the local mirror created from the SIMP ISO
Modular packages installed from the SIMP ISO MUST upgrade correctly
Circumstances:
If necessary, the ISO MAY mirror the entire base OS’s DVD AppStream/ repository in order to avoid red…
However, the ISO MUST NOT mirror an entire repository (like epel-modular) just to provide a few packages from a single module stream (389-directory-server:stable).
No matter what, the old createrepo command MUST NOT EVER be run on a repo with modular packages; use createrepo_c instead, or modulemd-tools (in EPEL8, but very WIP and buggy).
Circumstances:
The introduction of 389ds in 6.6.0 will require the 389-directory-server:stable module stream from the epel-modular repository.
Conclusion:
Repackages even though they ship as a subset from the upstream repository’s
Approach
Download the modulemd metadata from the source repo of each modular RPM at the same time the RPM is acquired.
Use unique name + stream + version + context + architecture (N:S:V:C:A) combinations from the resulting modular RPMs to determine which “slim” module streams to reconstruct. (We don’t care about /P for this.)
For each unique “slim” modular stream: generate modulemd metadata for all relevant RPMs
Combine all “slim” modules' modulemd data into a single modulemd.yaml data structure
Create the modular repository using createrepo_c (or createrepo_mod, just not createrepo)
RPM data
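The grouping step in the approach above (one “slim” module stream per unique N:S:V:C:A) can be sketched roughly as follows. This is a minimal illustration, not the build's actual code: the `ModularRpm` record, the `group_slim_streams` helper, and all sample values are hypothetical, and it assumes the N:S:V:C:A fields have already been read from each RPM's modulemd data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class ModularRpm(NamedTuple):
    """Hypothetical record: one downloaded RPM and the module build it came from."""
    filename: str
    name: str     # N: module name
    stream: str   # S: module stream
    version: int  # V: module build version
    context: str  # C: module context
    arch: str     # A: module architecture

    @property
    def nsvca(self) -> str:
        """The N:S:V:C:A key used to tell "slim" module streams apart."""
        return f"{self.name}:{self.stream}:{self.version}:{self.context}:{self.arch}"


def group_slim_streams(rpms: Iterable[ModularRpm]) -> Dict[str, List[str]]:
    """Map each unique N:S:V:C:A to the sorted RPM filenames that belong to it."""
    groups = defaultdict(set)
    for rpm in rpms:
        groups[rpm.nsvca].add(rpm.filename)
    return {key: sorted(files) for key, files in sorted(groups.items())}


# Placeholder values for illustration only; these are not real module builds.
EXAMPLE_RPMS = [
    ModularRpm("example-a.rpm", "389-directory-server", "stable", 1, "ctx0", "x86_64"),
    ModularRpm("example-b.rpm", "389-directory-server", "stable", 1, "ctx0", "x86_64"),
    ModularRpm("example-c.rpm", "other-module", "1.0", 7, "ctx1", "x86_64"),
]

if __name__ == "__main__":
    for key, files in group_slim_streams(EXAMPLE_RPMS).items():
        print(key, files)
```

Each key in the result would then drive one generated modulemd document listing only the RPMs in its value.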
Implementing slim modular repos
Problems that are probably solved
yumdownloader can’t see RPMs in modules/streams that aren’t enabled
Add a field to packages.yaml to specify N:S: for each modular RPM
Identify and enable all unique N: from packages.yaml (fail if there are conflicting S:)
dnf module enable each N:S: before beginning to use yumdownloader
Individual yumdownloader runs can change repository mirrors, which may be out of sync with each other and have different modulemd data.
When using yumdownloader, the modulemd metadata must be fetched at the same time as the RPM is downloaded, in order to preserve the precise state of that RPM’s modular metadata.
In upstream repositories, it’s possible that a single RPM could be part of multiple streams (nothing in the modulemd data prevents this). We need a way to decide which stream to use.
We will need to add an explicit N:S: to packages.yaml anyway, to decide which modules/streams to enable. This will explicitly set the stream.
However: there is no way to hint streams in the simple *pkglist.txt files for minimal BaseOS packages (unless we do something elaborate, like add comment keywords)
Most BaseOS EL8 modules have a default stream; use that if it exists
We can also default to the only stream when a module has just one.
This is hacky, but it will work for EL8.3: Base OS (i.e., AppStream) modules without a default stream are currently very rare, and at the moment all of them have a single stream:

    # dnf module --disablerepo=\* --enablerepo=appstream list | grep -v '\[d\]'
    CentOS Linux 8 - AppStream
    Name               Stream  Profiles  Summary
    389-ds             1.4               389 Directory Server (base)
    libselinux-python  2.8     common    Python 2 bindings for libselinux
    mod_auth_openidc   2.3               Apache module suporting OpenID Connect authentication
    parfait            0.5     common    Parfait Module
    pki-core           10.6              PKI Core module for PKI 10.6 or later
    pki-deps           10.6              PKI Dependencies module for PKI 10.6 or later

This leaves a rare edge case (current population: 0) that will fail: Base OS modules that have multiple streams but no default stream.
We should probably have a way of formally declaring N:S for *pkglist.txt Base OS RPMs in the future. Some possibilities:
A separate *pkglist.modularity.txt file?
N:S-declaring directives in the comments of *pkglist.txt?
Could this be combined with packages.yaml? (not easy to see how)
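The stream-selection order discussed above (an explicit N:S: first, then the module's default stream, then the only stream, otherwise fail) might be expressed as in this sketch. The `pick_stream` function and its parameters are hypothetical; it assumes the available streams and any default have already been collected from the repository's modulemd data.

```python
from typing import Iterable, Optional


def pick_stream(
    module: str,
    streams: Iterable[str],
    explicit: Optional[str] = None,
    default: Optional[str] = None,
) -> str:
    """Choose one stream for a module, or raise ValueError if the choice is ambiguous."""
    available = sorted(set(streams))
    # 1. An explicit N:S: declaration (e.g. from packages.yaml) always wins.
    if explicit is not None:
        if explicit not in available:
            raise ValueError(f"{module}: declared stream {explicit!r} is not available")
        return explicit
    # 2. Otherwise fall back to the module's default stream, if it has one.
    if default is not None:
        return default
    # 3. Otherwise accept the only stream, if there is exactly one.
    if len(available) == 1:
        return available[0]
    # 4. Multiple streams and no default: the rare edge case that must fail.
    raise ValueError(f"{module}: no default stream and {len(available)} streams to choose from")


if __name__ == "__main__":
    # 389-ds has a single stream (1.4) and no default in the listing above.
    print(pick_stream("389-ds", ["1.4"]))
```

Failing loudly in the last case keeps the edge case visible if a future release ever introduces such a module.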
Unsolved problems
🎗 What are the specific flow differences in “Fetch RPM” between Base OS (AppStream) and External yumdownloader (epel-modular)?
With yumdownloader, how can we get the source repo’s modulemd data for each RPM?
Option 1: see if yumdownloader can be convinced to display it, like --urls (haven’t found it yet)
Option 2: walk up the dir tree until we find metadata (hacky, expensive)
Option 3: find/define the DNF cache and somehow fish out the modulemd data for that specific package
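To make Option 2 concrete, here is a rough sketch of the directory walk. It assumes the RPM's download URL is known and that the repository keeps its index at repodata/repomd.xml with the modulemd file listed as a data entry of type "modules"; the helper names, the mirror URL, and the sample XML are hypothetical, and no network access is performed here.

```python
import posixpath
import xml.etree.ElementTree as ET
from typing import Iterator, Optional
from urllib.parse import urlsplit, urlunsplit

REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def candidate_repomd_urls(rpm_url: str) -> Iterator[str]:
    """Yield possible repomd.xml URLs, walking from the RPM's directory up to the root."""
    parts = urlsplit(rpm_url)
    directory = posixpath.dirname(parts.path)
    while True:
        path = posixpath.join(directory, "repodata/repomd.xml")
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        if directory in ("/", ""):
            break
        directory = posixpath.dirname(directory)


def modules_href(repomd_xml: str) -> Optional[str]:
    """Return the location href of the 'modules' data entry, or None if there is none."""
    root = ET.fromstring(repomd_xml)
    for data in root.findall("repo:data", REPO_NS):
        if data.get("type") == "modules":
            location = data.find("repo:location", REPO_NS)
            if location is not None:
                return location.get("href")
    return None


# Simplified sample; a real repomd.xml carries checksums, sizes, and timestamps too.
SAMPLE_REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="modules"><location href="repodata/modules.yaml.gz"/></data>
</repomd>"""

if __name__ == "__main__":
    url = "https://mirror.example.org/epel/8/Modular/x86_64/Packages/e/example-1.0-1.rpm"
    for candidate in candidate_repomd_urls(url):
        print(candidate)
    print(modules_href(SAMPLE_REPOMD))
```

A caller would try each candidate in order and stop at the first one that exists, which is where the “expensive” part of this option comes from.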
With yumdownloader, different RPMs could be sourced from different versions (V:) of the same module stream if yumdownloader pulls them from different repo mirrors that are out of sync with each other. Using the heuristic of a “slim” module stream per unique N:S:V:C:A, this would result in multiple module streams instead of one.
This is a rare edge case that V: is specifically intended to catch. We should probably fail and refuse to proceed with the build using RPMs from multiple different versions of the same module stream; there’s a small chance that not all the RPMs are still intended to be part of the module. There may be other reasons to do with inter-modular dependencies. Not sure if failing is the best way; input welcome.
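If failing turns out to be the right call, the check itself is small. The sketch below assumes the build already holds the N:S:V:C:A string for every downloaded modular RPM; `find_mixed_versions` and `require_single_version` are hypothetical names, and keying on N:S alone (ignoring context and architecture) is a simplifying assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_mixed_versions(nsvcas: Iterable[str]) -> Dict[str, List[int]]:
    """Return {N:S: sorted versions} for every module stream seen with more than one V."""
    versions = defaultdict(set)
    for nsvca in nsvcas:
        name, stream, version, _context, _arch = nsvca.split(":")
        versions[f"{name}:{stream}"].add(int(version))
    return {
        key: sorted(found)
        for key, found in sorted(versions.items())
        if len(found) > 1
    }


def require_single_version(nsvcas: Iterable[str]) -> None:
    """Refuse to continue when any module stream mixes RPMs from several versions."""
    mixed = find_mixed_versions(nsvcas)
    if mixed:
        details = "; ".join(f"{key} has versions {found}" for key, found in mixed.items())
        raise RuntimeError(f"RPMs come from out-of-sync mirrors: {details}")


if __name__ == "__main__":
    # Placeholder N:S:V:C:A values for illustration only.
    print(find_mixed_versions([
        "389-directory-server:stable:1:ctx0:x86_64",
        "389-directory-server:stable:2:ctx0:x86_64",
    ]))
```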
Unknown Unknowns
Should/how would we persist cached modulemd metadata for already-downloaded RPMs between builds?