Dependency Solver


At a very high level, yum's depsolver does something roughly
equivalent to:

    - For each package being installed or removed

        - For each relevant property (provides, requires, conflicts,
          obsoletes):

            - Figure out what additional packages need to be added to
              or removed from the system to satisfy this property

which ends up being roughly O(N^2 * M) where N is the total number of
properties and M is the number of packages being acted on.

(I just figured that out off the top of my head, and I'm not totally
familiar with the yum code, so it may be wrong.)

Razor's depsolver is something like:

    - do {

        - For each property to be added to or removed from the system:

            - Figure out what packages need to be added to or removed
              from the system to satisfy this property

    - } until we stop adding/removing more packages

with the key being that it's very easy to find the PROVIDES
corresponding to a REQUIRES and vice versa, because the property
arrays are sorted, and so all properties with the same "name" will be
adjacent to one another in the array, allowing many dependencies to be
satisfied in essentially constant time. (Actually... we've been
calling it constant, but it's really O(log N) for heavily-depended-on
packages, because the more packages you have, the more variations on
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
going to have to scan through.)
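
To make the "sorted property array" idea concrete, here is a minimal
sketch in Python. It is not razor's actual code or data layout: the
record shape (name, relation, version, package), the sample packages,
and the helper providers_of() are all made up for illustration, and it
uses a binary search where the real depsolver walks its sorted lists
in parallel. The only point is that sorting by name makes every
PROVIDES for one name a single contiguous slice.

```python
from bisect import bisect_left, bisect_right

# Hypothetical flat property records: (name, relation, version, package).
# Sorting the tuples orders them by name first, so all properties that
# share a name end up adjacent to one another.
provides = sorted([
    ("foo", "=", "1.1", "foo-1.1"),
    ("bar", "=", "2.0", "bar-2.0"),
    ("foo", "=", "1.0", "foo-compat-1.0"),
    ("libbaz.so.1", "", "", "baz-0.9"),
])

# Parallel list of just the names, used as the binary-search key.
names = [record[0] for record in provides]


def providers_of(name):
    """Return the contiguous run of PROVIDES records for `name`."""
    lo = bisect_left(names, name)    # first record with this name
    hi = bisect_right(names, name)   # one past the last such record
    return provides[lo:hi]


matches = providers_of("foo")
```

Version matching ("foo > 1.0" and so on) is deliberately left out; it
would be a scan over the returned slice, which is the part that grows
for heavily-depended-on names.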

Ideally though, each iteration of the inner loop body happens in
constant time, and thus the inner loop as a whole is O(N), and thus
the depsolver as a whole is O(N * M) (or at least, less than
O(N * M * log N)).


FILE DEPENDENCIES
-----------------

Whenever we add a package with a file REQUIRES to a razor_set, we also
add a PROVIDES for that file to the package or packages which provide
that file. This means that if we later add another package that
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
its file requirement exactly like we would resolve a property
requirement, in nearly constant time.

When adding a *new* file requirement (ie, a requirement on a file that
no existing package depends on), we still have to scan through the
file tree, which is O(log N) in the number of files.

(AFAICT, there's no reason yum couldn't do the same optimization.
Also, AFAICT, yum currently sticks property dependencies and file
dependencies into the same hash table, so that if any package in the
transaction has a file dependency, it causes *property* dependencies
to become slower to resolve as well...)


THE RULES
---------

This is what we have figured out for transaction-solving rules;
neither yum nor rpm's algorithm seems to be explained in full
anywhere...

    1. Every requested install in the initial package set must be
       satisfied as either a new install or an update:

         - if the requested package name is the name of an upstream
           package:

             - if there is not a corresponding already-installed
               package, then install the upstream package

             - else if the upstream package is newer than the
               already-installed package, then update the package

             - else it's an error (UP_TO_DATE)

         - else if the requested package name is the name of an
           already-installed package:

             - if there is an upstream package that obsoletes the
               already-installed package, then behave as though the
               user had requested that that package be installed
               instead.

             - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)

         - else it's an error (INSTALL_UNAVAILABLE)

    2. Every requested removal in the initial package set must be
       satisfied as a removal. If any requested package name is not
       the name of an installed package, it's an error
       (REMOVE_NOT_INSTALLED).

    REQUIRES processing:

    3. If a package being installed or updated-to REQUIRES a property
       that is not provided by any installed or to-be-installed
       package, we need to find an installable package that provides
       that property. If we find one, install/update it. If not, it's
       an error (UNSATISFIABLE). (If we find an upstream package
       providing the property that corresponds to a system package
       that's being removed, then it's a CONTRADICTION.)

    4. If an already-installed package REQUIRES a property which is
       only provided by a package that is being removed, then that
       package needs to be removed as well.

    5. If an already-installed package REQUIRES a property which is
       only provided by a package that is being upgraded or obsoleted
       (to a new package which does not provide that property), then:

         - if there is an update for the installed package, then update
           the installed package

         - else if there is another installable package that provides
           the required property, then install that.

         - else it's an error (UNSATISFIABLE?)

    CONFLICTS processing:

    6. If a package being installed or updated-to CONFLICTS with a
       property provided by an installed package:

         - if there is an update for the installed package, which the
           new package does not conflict with, then update the
           installed package.

         - else it's an error (NEW_CONFLICT)

    7. If an already-installed package CONFLICTS with a property
       provided by a to-be-installed package:

         - if there is an update for the installed package, which does
           not conflict with the new package, then update the installed
           package.

         - else it's an error (NEW_CONFLICT)

    8. If a package being installed or updated-to CONFLICTS with a
       property provided by a to-be-installed package, then it's an
       error (CONTRADICTION).

    OBSOLETES processing. NOTE: OBSOLETES are only matched against
    package names, not against arbitrary provided properties

    9. If a package being installed or updated-to OBSOLETES an
       installed package, then obsolete that package. (ie, remove it,
       but treat it as updated for purposes of dangling REQUIRES).

    10. If an already-installed package OBSOLETES a to-be-installed
        package, then it's an error. (ALREADY_OBSOLETE)

    11. If a package being installed or updated-to OBSOLETES another
        package being installed or updated-to, then it's an error
        (CONTRADICTION).



THE DEPSOLVER
-------------

We start with two razor_sets, system and upstream, and a list of
requested installations and removals.

    FIXME: what about multiple upstream repos? Having to deal with
    arbitrary numbers of razor_sets is possible, but will probably be
    messy... It might be easier to either store all upstream repo data
    in a single .rzdb file, or else merge all upstream .rzdb files
    together into a single razor_set at startup. (Or some combination
    of those.)

We create a bit array of the packages in each set, indicating which
ones are installed; the system bitarray starts out all 1s, and the
upstream bitarray all 0s. Each bit is only allowed to change state
once during the transaction; an installed package can be removed, or
an uninstalled package installed, but trying to reinstall a removed
package, or uninstall a newly-installed package is an error. This
means the packages break down into four categories:

    - installed (1 bit in the system bit array)
    - to-be-removed (0 bit in the system bit array)
    - to-be-installed (1 bit in the upstream bit array)
    - installable (0 bit in the upstream bit array)


Depsolver algorithm:

    - Create new razor_transaction_packages ("rtp"s) for each
      requested install or remove. These will be "unresolved", because
      we haven't yet found the razor_packages that correspond to them.

    - while there are new rtps:

        - sort the new rtps

        - Walk the system property list, upstream property list, and
          new rtp list in parallel, and:

            - For each uninstalled PROVIDES:

                - If the property is a valid package name (that is,
                  either it's a package providing its own name, or it
                  has a matching OBSOLETES), and it matches the name
                  of a new rtp of type INSTALL or FORCED_UPDATE with
                  an unresolved new_package:

                    - If the upstream package has the same version as
                      the system package, we have an UP_TO_DATE error
                      (FIXME: not quite right. This doesn't deal with
                      the case where we try to update an application
                      because of a library update, and it turns out
                      there's no new version of the application, but
                      there IS a compat package containing the old
                      version of the library.)

                    - Otherwise, set the rtp's new_package to point to
                      the package providing this property and set the
                      appropriate bit in the upstream bit array.

            - For each to-be-installed non-file REQUIRES:

                - See if there's an installed or to-be-installed
                  package that PROVIDES that property.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                - If not, see if there's a to-be-removed package that
                  PROVIDES that property. (If we find such a package,
                  we have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed file REQUIRES:

                - (We create fake file PROVIDES to match file REQUIRES
                  when importing/merging razor sets, so if there is
                  already another installed package that REQUIRES this
                  file, there will be a PROVIDES listed for it as well.)

                - See if there's an installed package that PROVIDES
                  that file.

                - If not, do a binary search of the system file tree
                  looking to see if some installed package provides
                  that file but does not have a PROVIDES for it.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                    - (If we actually work with multiple upstream
                      razor_sets, then we will need to search the
                      upstream file trees at this point, because it's
                      possible that a package in one upstream repo
                      would require a file in another upstream repo.
                      But if we merge the multiple upstream repos into
                      a single razor_set at some point, then we would
                      not need to do that, because it would be
                      guaranteed that we would have already created a
                      fake PROVIDES if any package provides the file.)

                - If no installed or installable package provides the
                  file, see if there's a to-be-removed package that
                  provides the file. (If we find such a package, we
                  have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed PROVIDES:

                - Check if the new PROVIDES conflicts with an
                  installed CONFLICTS. If so, create a new
                  FORCED_UPDATE rtp for the installed package, so we
                  can try to upgrade it to a non-conflicting version.
                  (If we can't, we'll have an OLD_CONFLICT error.)

                - Check if the new PROVIDES conflicts with an
                  installed OBSOLETES *and* the PROVIDES property
                  corresponds to the name of its package. (That is,
                  OBSOLETES are only matched against package names,
                  not arbitrary provided properties.) If so, we have
                  an ALREADY_OBSOLETE error.

                - Check if the new PROVIDES conflicts with a
                  to-be-installed CONFLICTS. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed CONFLICTS:

                - Basically the reverse of the previous case: check if
                  the new CONFLICTS conflicts with an installed
                  PROVIDES. If so, create a new FORCED_UPDATE rtp for
                  the installed package, so we can try to upgrade it
                  to a non-conflicting version. (If we can't, we'll
                  have a NEW_CONFLICT error.)

                - Check if the new CONFLICTS conflicts with a
                  to-be-installed PROVIDES. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed OBSOLETES:

                - Check if there's an installed package that PROVIDES
                  that property. If so, create an OBSOLETED rtp for
                  the installed package.

                - If not, check if there's a to-be-installed package
                  that PROVIDES that property. If so, we have a
                  CONTRADICTION error.


            - For each installed PROVIDES:

                - If the property is a valid package name (that is,
                  it's a package providing its own name), and it
                  matches the name of a new rtp with an unresolved
                  old_package, then set the rtp's old_package to point
                  to the package providing this property and clear the
                  appropriate bit in the system bit array.

            - For each to-be-removed PROVIDES:

                - If there's also an identical to-be-installed
                  PROVIDES, we're ok and can skip this

                - Otherwise, for each installed REQUIRES of this
                  property:

                    - Look for some other installed or to-be-installed
                      property that satisfies the REQUIRES.

                    - If there isn't one, then for each installed
                      package in this REQUIRES's package list:

                        - If the PROVIDES was lost because the old
                          package was REMOVEd (not FORCED_UPDATE or
                          OBSOLETED), then create a new REMOVE rtp for
                          this package.

                        - Otherwise, create a new FORCED_UPDATE rtp
                          for this package.

                    - (We don't need to look at to-be-installed
                      REQUIRES of this property, because if there are
                      any, they will cause a CONTRADICTION error when
                      we try to re-satisfy them the next time through.)
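
The bit-array bookkeeping that the algorithm relies on (system bits
start at 1, upstream bits start at 0, and each bit may change only
once per transaction) can be sketched roughly as follows. This is an
illustration in Python, not razor's implementation: the class
PackageBits, the function category(), and the package counts are all
hypothetical names and values chosen for the example.

```python
class PackageBits:
    """One flag per package in a set; each flag may flip at most once."""

    def __init__(self, count, installed):
        # System sets start all True (installed); upstream sets start
        # all False (installable).
        self.initial = installed
        self.bits = [installed] * count

    def flip(self, index):
        """Change a package's state, refusing a second change."""
        if self.bits[index] != self.initial:
            # Reinstalling a removed package, or uninstalling a
            # newly-installed one, is treated as an error.
            raise ValueError("bit %d already changed in this transaction"
                             % index)
        self.bits[index] = not self.initial


def category(bitarray, index, is_system):
    """Map a bit to one of the four package categories."""
    if is_system:
        return "installed" if bitarray.bits[index] else "to-be-removed"
    return "to-be-installed" if bitarray.bits[index] else "installable"


system = PackageBits(3, installed=True)
upstream = PackageBits(2, installed=False)

system.flip(1)     # remove system package 1
upstream.flip(0)   # install upstream package 0
```

Storing the initial value alongside the bits is just one way to detect
a second flip; a separate "touched" array would work equally well.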