danw@174: YUM vs RAZOR danw@174: ------------ danw@174: danw@174: At a very high level, yum's depsolver does something roughly danw@174: equivalent to: danw@174: danw@174: - For each package being installed or removed danw@174: danw@174: - For each relevant property (provides, requires, conflicts, danw@174: obsoletes): danw@174: danw@174: - Figure out what additional packages need to be added to danw@174: or removed from the system to satisfy this property danw@174: danw@174: which ends up being roughly O(N^2 * M) where N is the total number of danw@174: properties and M is the number of packages being acted on. danw@174: danw@174: (I just figured that out off the top of my head, and I'm not totally danw@174: familiar with the yum code, so it may be wrong.) danw@174: danw@174: Razor's depsolver is something like: danw@174: danw@174: - do { danw@174: danw@174: - For each property to be added to or removed from the system: danw@174: danw@174: - Figure out what packages need to be added to or removed danw@174: from the system to satisfy this property danw@174: danw@174: - } until we stop adding/remove more packages danw@174: danw@174: with the key being that it's very easy to find the PROVIDES danw@174: corresponding to a REQUIRES and vice versa, because the property danw@174: arrays are sorted, and so all properties with the same "name" will be danw@174: adjacent to one another in the array, allowing many dependencies to be danw@174: satisified in essentially constant time. (Actually... we've been danw@174: calling it constant, but it's really O(log N) for heavily-depended-on danw@174: packages, because the more packages you have, the more variations on danw@174: "requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're danw@174: going to have to scan through.) danw@174: danw@174: Ideally though, each iteration of the inner loop body happens in danw@174: constant time, and thus the inner loop as a whole is O(N), and thus danw@174: the depsolver as a whole is O(N * M) (or at least, less than danw@174: O(N * M * log N). danw@174: danw@174: danw@174: FILE DEPENDENCIES danw@174: ----------------- danw@174: danw@174: Whenever we add a package with a file REQUIRES to a razor_set, we also danw@174: add a PROVIDES for that file to the package or packages which provide danw@174: that file. This means that if we later add another package that danw@174: requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve danw@174: its file requirement exactly like we would resolve a property danw@174: requirement, in nearly constant time. danw@174: danw@174: When adding a *new* file requirement (ie, a requirement on a file that danw@174: no existing package depends on), we still have to scan through the danw@174: file tree, which is O(log N) in the number of files. danw@174: danw@174: (AFAICT, there's no reason yum couldn't do the same optimization. danw@174: Also, AFAICT, yum currently sticks property dependencies and file danw@174: dependencies into the same hash table, so that if any package in the danw@174: transaction has a file dependency, it causes *property* dependencies danw@174: to become slower to resolve as well...) danw@174: danw@174: danw@174: THE DEPSOLVER danw@174: ------------- danw@174: danw@157: We start with two razor_sets, system and upstream, and a list of danw@157: requested installations and removals. danw@157: danw@157: FIXME: what about multiple upstream repos? Having to deal with danw@157: arbitrary numbers of razor_sets is possible, but will probably be danw@157: messy... It might be easier to either store all upstream repo data danw@157: in a single .repo file, or else merge all upstream .repo files danw@157: together into a single razor_set at startup. (Or some combination danw@157: of those.) danw@157: danw@157: We create a bit array of the packages in each set, indicating which danw@157: ones are installed; the system bitarray starts out all 1s, and the danw@157: upstream bitarray all 0s. Each bit is only allowed to change state danw@157: once during the transaction; an installed package can be removed, or danw@157: an uninstalled package installed, but trying to reinstall a removed danw@157: package, or uninstall a newly-installed package is an error. This danw@157: means the packages break down into four categories: danw@157: danw@157: - installed (1 bit in the system bit array) danw@157: - to-be-removed (0 bit in the system bit array) danw@157: - to-be-installed (1 bit in the upstream bit array) danw@157: - installable (0 bit in the upstream bit array) danw@157: danw@157: danw@174: Depsolver algorithm: danw@157: danw@158: - Create new razor_transaction_packages ("rtp"s) for each danw@161: requested install or remove. These will be "unresolved", because danw@161: we haven't yet found the razor_packages that correspond to them. danw@157: danw@158: - while there are new rtps: danw@157: danw@161: - sort the new rtps danw@157: danw@161: - Walk the system property list, upstream property list, and danw@161: new rtp list in parallel, and: danw@157: danw@161: - For each uninstalled PROVIDES: danw@161: danw@161: - If the property is a valid package name (that is, danw@161: either it's a package providing its own name, or it danw@161: has a matching OBSOLETES), and it matches the name danw@161: of a new rtp of type INSTALL or FORCED_UPDATE with danw@174: an unresolved new_package: danw@174: danw@174: - If the upstream package has the same version as danw@174: the system package, we have an UP_TO_DATE error danw@174: (FIXME: not quite right. This doesn't deal with danw@174: the case where we try to update an application danw@174: because of a library update, and it turns out danw@174: there's no new version of the application, but danw@174: there IS a compat package containing the old danw@174: version of the library.) danw@174: danw@174: - Otherwise, set the rtp's new_package to point to danw@174: the package providing this property and set the danw@174: appropriate bit in the upstream bit array. danw@157: danw@158: - For each to-be-installed non-file REQUIRES: danw@157: danw@158: - See if there's an installed or to-be-installed danw@158: package that PROVIDES that property. danw@157: danw@158: - If not, see if there's an installable package that danw@158: PROVIDES that property, and create a new INSTALL rtp danw@158: for it if so. danw@157: danw@158: - If not, see if there's a to-be-removed package that danw@158: PROVIDES that property. (If we find such a package, danw@158: we have a CONTRADICTION error.) danw@157: danw@158: - If none of the above, then we have an UNSATISFIABLE danw@158: error danw@157: danw@158: - For each to-be-installed file REQUIRES: danw@157: danw@158: - (We create fake file PROVIDES to match file REQUIRES danw@158: when importing/merging razor sets, so if there is danw@158: already another installed package that REQUIRES this danw@158: file, there will be a PROVIDES listed for it as well.) danw@157: danw@158: - See if there's an installed package that PROVIDES danw@158: that file. danw@157: danw@158: - If not, do a binary search of the system file tree danw@158: looking to see if some installed package provides danw@158: that file but does not have a PROVIDES for it. danw@157: danw@158: - If not, see if there's an installable package that danw@158: PROVIDES that property, and create a new INSTALL rtp danw@158: for it if so. danw@157: danw@158: - (If we actually work with multiple upstream danw@158: razor_sets, then we will need to search the upstream danw@158: file trees at this point, because it's possible that danw@158: a package in one upstream repo would require a file danw@158: in another upstream repo. But if we merge the danw@158: multiple upstream repos into a single razor_set at danw@158: some point, then we would not need to do that, danw@158: because it would be guaranteed that we would have danw@158: already created a fake PROVIDES if any package danw@158: provides the file.) danw@157: danw@158: - If no installed or installable package provides the danw@158: file, see if there's a to-be-removed package that danw@158: provides the file. (If we find such a package, we danw@158: have a CONTRADICTION error.) danw@157: danw@158: - If none of the above, then we have an UNSATISFIABLE danw@158: error danw@157: danw@158: - For each to-be-installed PROVIDES: danw@157: danw@158: - Check if the new PROVIDES conflicts with an danw@158: installed CONFLICTS. If so, create a new danw@158: FORCED_UPDATE rtp for the installed package, so we danw@158: can try to upgrade it to a non-conflicting version. danw@158: (If we can't, we'll have an OLD_CONFLICT error.) danw@157: danw@158: - Check if the new PROVIDES conflicts with an danw@158: installed OBSOLETES *and* the PROVIDES property danw@158: corresponds to the name of its package. (That is, danw@158: OBSOLETES are only matched against package names, danw@158: not arbitrary provided properties.) If so, we have danw@158: an ALREADY_OBSOLETE error. danw@157: danw@158: - Check if the new PROVIDES conflicts with a danw@158: to-be-installed CONFLICTS. If so, we have a danw@158: CONTRADICTION error. danw@157: danw@158: - For each to-be-installed CONFLICTS: danw@157: danw@158: - Basically the reverse of the previous case: check if danw@158: the new CONFLICTS conflicts with an installed danw@158: PROVIDES. If so, create a new FORCED_UPDATE rtp for danw@158: the installed package, so we can try to upgrade it danw@158: to a non-conflicting version. (If we can't, we'll danw@158: have an NEW_CONFLICT error.) danw@157: danw@158: - Check if the new CONFLICTS conflicts with a danw@158: to-be-installed PROVIDES. If so, we have a danw@158: CONTRADICTION error. danw@157: danw@158: - For each to-be-installed OBSOLETES: danw@157: danw@158: - Check if there's an installed package that PROVIDES danw@158: that property. If so, create an OBSOLETED rtp for danw@158: the installed package. danw@157: danw@158: - If not, check if there's a to-be-installed package danw@158: that PROVIDES that property. If so, we have a danw@158: CONTRADICTION error. danw@157: danw@161: danw@161: - For each installed PROVIDES: danw@161: danw@161: - If the property is a valid package name (that is, danw@161: it's a package providing its own name), and it danw@161: matches the name of a new rtp with an unresolved danw@161: old_package, then set the rtp's old_package to point danw@161: to the package providing this property and clear the danw@161: appropriate bit in the system bit array. danw@161: danw@158: - For each to-be-removed PROVIDES: danw@157: danw@158: - If there's also an identical to-be-installed danw@158: PROVIDES, we're ok and can skip this danw@157: danw@158: - Otherwise, for each installed REQUIRES of this danw@158: property: danw@157: danw@158: - Look for some other installed or to-be-installed danw@158: property that satisfies the REQUIRES. danw@157: danw@158: - If there isn't one, then for each installed danw@158: package in this REQUIRES's package list: danw@157: danw@158: - If the PROVIDES was lost because the old danw@158: package was REMOVEd (not FORCED_UPDATE or danw@158: OBSOLETED), then create a new REMOVE rtp for danw@158: this package. danw@157: danw@158: - Otherwise, create a new FORCED_UPDATE rtp danw@158: for this package. danw@157: danw@158: - (We don't need to look at to-be-installed REQUIRES danw@158: of this property, because if there are any, they danw@158: will cause a CONTRADICTION error when we try to danw@158: re-satisfy them the next time through.)