There are (at least) two types of package managers

September 14, 2021

These days, it seems that everything has a package manager (or package management system). Linux distributions and other Unixes have long had them (Debian apt, Fedora DNF, FreeBSD ports, etc.), as do languages: Rust has Cargo, Node.js has NPM, and Perl has the famous CPAN. However, after dealing with a number of them, I have come to feel that there are at least two rather different sorts of things being lumped together under the name "package managers". I don't have good names for them, so for this entry I'll call them program managers and module managers.

Program managers are what Linuxes and other Unixes have. Their job is to install programs (or packages more generally) and their dependencies (almost always globally), and keep everything up to date. In general, programs and packages within a single distribution version all depend on the same versions of things; if two programs depend on two different versions of something else, either one program can't be packaged for the distribution or there needs to be a second package of the something else, so both versions can be (globally) installed and used at the same time. While program managers often theoretically allow packages to express relatively complex dependency constraints (eg 'versions X to Y of this other package, except version Z'), this power is rarely used in practice because the entire collection is expected to be coherent.
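To make the 'versions X to Y, except version Z' style of constraint concrete, here is a minimal sketch in Python. The `Constraint` class, the `parse_version` helper, and the version numbers are all made up for illustration; no real program manager represents constraints exactly this way.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version such as '1.5' into a comparable tuple (1, 5)."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Constraint:
    """Hypothetical constraint: an inclusive range minus some excluded versions."""
    lowest: Version
    highest: Version
    excluded: FrozenSet[Version] = frozenset()

    def allows(self, version: Version) -> bool:
        in_range = self.lowest <= version <= self.highest
        return in_range and version not in self.excluded


# 'versions 1.2 to 1.8 of this other package, except version 1.5'
constraint = Constraint(
    lowest=parse_version("1.2"),
    highest=parse_version("1.8"),
    excluded=frozenset({parse_version("1.5")}),
)

print(constraint.allows(parse_version("1.4")))
print(constraint.allows(parse_version("1.5")))
```

In a coherent distribution there is usually only one candidate version to test such a constraint against, which is part of why this expressiveness tends to go unused.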

Module managers are what languages have. Their job is to manage the dependencies of various different pieces of code, automatically determining what versions of additional modules can satisfy all of the constraints of your code, the modules your code uses, and so on (and then fetch, install, and update them). There is no idea of a "distribution version" (and thus no idea of required package versions being the same within one); instead there is a cloud of various versions of various packages, with a complex interlinked network of dependencies and version requirements. Module managers support relatively complex dependency constraints and these constraints are frequently used, partly because modules get updated at any time without promises of retaining backward compatibility in their API.
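As a rough illustration of 'determining what versions can satisfy all of the constraints', here is a toy backtracking resolver. Everything in it is an assumption made for the example: the `INDEX` of modules, the module names, the inclusive version ranges, and the rule that only one version of each module may be chosen. Real module managers use considerably more sophisticated algorithms, and some allow several versions of one module to coexist.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, ...]
Range = Tuple[Version, Version]  # inclusive (lowest, highest)

# Hypothetical index: module -> published version -> {dependency: allowed range}
INDEX: Dict[str, Dict[Version, Dict[str, Range]]] = {
    "web": {
        (1, 0): {"json": ((1, 0), (1, 9))},
        (2, 0): {"json": ((2, 0), (2, 9))},
    },
    "log": {
        (1, 0): {"json": ((1, 0), (1, 9))},
    },
    "json": {(1, 4): {}, (2, 1): {}},
}


def resolve(
    pending: List[Tuple[str, Range]], chosen: Dict[str, Version]
) -> Optional[Dict[str, Version]]:
    """Pick one version per module so that every pending requirement holds."""
    if not pending:
        return chosen
    (name, (low, high)), rest = pending[0], pending[1:]
    if name in chosen:
        # Already picked: the earlier choice must satisfy this requirement too.
        return resolve(rest, chosen) if low <= chosen[name] <= high else None
    # Try the newest candidates first and back off on conflict.
    for version in sorted(INDEX.get(name, {}), reverse=True):
        if not low <= version <= high:
            continue
        deps = list(INDEX[name][version].items())
        result = resolve(rest + deps, {**chosen, name: version})
        if result is not None:
            return result
    return None


ANY: Range = ((0,), (999,))
solution = resolve([("web", ANY), ("log", ANY)], {})
print(solution)
```

In this made-up index the newest `web` cannot be used, because `log` only works with the 1.x series of `json`, so the resolver backs off to `web` 1.0. This kind of interlinked trade-off is what a module manager has to sort out and a program manager mostly never faces.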

(Modules may signal API breaks with semantic versioning, but they don't promise not to make them at all. If your code says 'I accept the latest version of module X, whatever it is' and module X releases a new major version with a new API, this is socially acceptable on the part of module X and the result to your code is your fault.)
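The difference between 'whatever is latest' and a requirement that respects semantic versioning can be sketched in a few lines. The function names and the list of published versions here are hypothetical, and the compatibility rule is a simplification (same major version, no older than what you tested with) rather than any particular module manager's exact behaviour.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


def latest(available: List[Version]) -> Version:
    """'I accept the latest version, whatever it is.'"""
    return max(available)


def latest_compatible(available: List[Version], known_good: Version) -> Version:
    """Newest version with the same major number that is not older than known_good.

    Raises ValueError if nothing published qualifies.
    """
    candidates = [
        v for v in available if v[0] == known_good[0] and v >= known_good
    ]
    return max(candidates)


# Hypothetical release history of module X, including a new major version.
published = [(1, 2, 0), (1, 3, 1), (2, 0, 0)]

print(latest(published))                        # picks up the API break
print(latest_compatible(published, (1, 2, 0)))  # stays within the 1.x series
```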

Module managers can operate in a global mode, but this is not really natural to them. The natural mode for modern module managers is to be applied to an individual top-level entity (your code, a module, a program, etc.) to gather its requirements. It's expected that there are many top-level entities on the system and not all of them can use the same version of any particular dependency; package version management is per top-level entity, not global.
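One way to see why global mode is unnatural is to try to merge per-entity version choices into a single system-wide set. The project names, the pinned versions, and the `merge_globally` function below are invented for the sketch.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Hypothetical per-project pins: each top-level entity has its own answer.
project_pins: Dict[str, Dict[str, Version]] = {
    "blog": {"json": (1, 4)},
    "api": {"json": (2, 1)},
}


def merge_globally(pins: Dict[str, Dict[str, Version]]) -> Dict[str, Version]:
    """Try to force every project onto one shared version of each module."""
    merged: Dict[str, Version] = {}
    for project, deps in pins.items():
        for name, version in deps.items():
            if merged.setdefault(name, version) != version:
                raise ValueError(
                    f"{project} needs {name} {version}, "
                    f"but {merged[name]} is already installed globally"
                )
    return merged


try:
    merge_globally(project_pins)
except ValueError as err:
    print("global install fails:", err)
```

Kept separately, both sets of pins are perfectly valid; it is only the attempt to have one global answer that creates the conflict.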

The package repository used by a program manager doesn't necessarily keep older versions of packages around (within a single distribution version), because they're both unnecessary (all other packages use the latest version) and undesirable (they've been superseded by an updated version). The package repository used by a module manager has to keep around essentially all versions of all modules ever published, because someone out there might be requiring that version specifically.
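The two retention policies can be sketched as two tiny repository classes. Both classes and their method names are hypothetical and exist only to contrast the behaviour; real repositories are obviously far more involved.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


class ProgramRepository:
    """Within one distribution version, a new package replaces the old one."""

    def __init__(self) -> None:
        self.current: Dict[str, Version] = {}

    def publish(self, name: str, version: Version) -> None:
        self.current[name] = version  # the superseded version is dropped

    def available(self, name: str) -> List[Version]:
        return [self.current[name]] if name in self.current else []


class ModuleRepository:
    """Every published version stays fetchable, since someone may require it."""

    def __init__(self) -> None:
        self.history: Dict[str, List[Version]] = {}

    def publish(self, name: str, version: Version) -> None:
        self.history.setdefault(name, []).append(version)

    def available(self, name: str) -> List[Version]:
        return list(self.history.get(name, []))


programs, modules = ProgramRepository(), ModuleRepository()
for repo in (programs, modules):
    repo.publish("json", (1, 4))
    repo.publish("json", (2, 1))

print(programs.available("json"))
print(modules.available("json"))
```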

A program manager and its backend are almost always implicitly a closed universe, where the people operating it only consider the needs and dependencies of the packages it contains. If you have your own packages, you're on your own to keep them up to date as the program manager's packages change versions. A module manager and its backend are explicitly an open universe; it's expected that you have your own outside code that requires packages from the module manager in a way that's invisible and unpredictable to the module manager.

(People are often hostile to the local module manager client reporting very much information about what they're using it for to the module system's operators. Many people only want to expose what packages they actually fetch, and even then they may hide this with local caches and other mechanisms.)