rhughes@241: YUM vs RAZOR rhughes@241: ------------ rhughes@241: rhughes@241: At a very high level, yum's depsolver does something roughly rhughes@241: equivalent to: rhughes@241: rhughes@241: - For each package being installed or removed rhughes@241: rhughes@241: - For each relevant property (provides, requires, conflicts, rhughes@241: obsoletes): rhughes@241: rhughes@241: - Figure out what additional packages need to be added to rhughes@241: or removed from the system to satisfy this property rhughes@241: rhughes@241: which ends up being roughly O(N^2 * M) where N is the total number of rhughes@241: properties and M is the number of packages being acted on. rhughes@241: rhughes@241: (I just figured that out off the top of my head, and I'm not totally rhughes@241: familiar with the yum code, so it may be wrong.) rhughes@241: rhughes@241: Razor's depsolver is something like: rhughes@241: rhughes@241: - do { rhughes@241: rhughes@241: - For each property to be added to or removed from the system: rhughes@241: rhughes@241: - Figure out what packages need to be added to or removed rhughes@241: from the system to satisfy this property rhughes@241: rhughes@241: - } until we stop adding/remove more packages rhughes@241: rhughes@241: with the key being that it's very easy to find the PROVIDES rhughes@241: corresponding to a REQUIRES and vice versa, because the property rhughes@241: arrays are sorted, and so all properties with the same "name" will be rhughes@241: adjacent to one another in the array, allowing many dependencies to be rhughes@241: satisified in essentially constant time. (Actually... we've been rhughes@241: calling it constant, but it's really O(log N) for heavily-depended-on rhughes@241: packages, because the more packages you have, the more variations on rhughes@241: "requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're rhughes@241: going to have to scan through.) rhughes@241: rhughes@241: Ideally though, each iteration of the inner loop body happens in rhughes@241: constant time, and thus the inner loop as a whole is O(N), and thus rhughes@241: the depsolver as a whole is O(N * M) (or at least, less than rhughes@241: O(N * M * log N). rhughes@241: rhughes@241: rhughes@241: FILE DEPENDENCIES rhughes@241: ----------------- rhughes@241: rhughes@241: Whenever we add a package with a file REQUIRES to a razor_set, we also rhughes@241: add a PROVIDES for that file to the package or packages which provide rhughes@241: that file. This means that if we later add another package that rhughes@241: requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve rhughes@241: its file requirement exactly like we would resolve a property rhughes@241: requirement, in nearly constant time. rhughes@241: rhughes@241: When adding a *new* file requirement (ie, a requirement on a file that rhughes@241: no existing package depends on), we still have to scan through the rhughes@241: file tree, which is O(log N) in the number of files. rhughes@241: rhughes@241: (AFAICT, there's no reason yum couldn't do the same optimization. rhughes@241: Also, AFAICT, yum currently sticks property dependencies and file rhughes@241: dependencies into the same hash table, so that if any package in the rhughes@241: transaction has a file dependency, it causes *property* dependencies rhughes@241: to become slower to resolve as well...) rhughes@241: rhughes@241: rhughes@241: THE RULES rhughes@241: --------- rhughes@241: rhughes@241: This is what we have figured out for transaction-solving rules; rhughes@241: neither yum nor rpm's algorithm seems to be explained in full rhughes@241: anywhere... rhughes@241: rhughes@241: 1. Every requested install in the initial package set must be rhughes@241: satisfied as either a new install or an update: rhughes@241: rhughes@241: - if the requested package name is the name of an upstream rhughes@241: package: rhughes@241: rhughes@241: - if there is not a corresponding already-installed rhughes@241: package, then install the upstream package rhughes@241: rhughes@241: - else if the upstream package is newer than the rhughes@241: already-installed package, then update the package rhughes@241: rhughes@241: - else it's an error (UP_TO_DATE) rhughes@241: rhughes@241: - else if the requested package name is the name of an rhughes@241: already-installed package: rhughes@241: rhughes@241: - if there is an upstream package that obsoletes the rhughes@241: already-installed package, then behave as though the rhughes@241: user had requested that that package be installed rhughes@241: instead. rhughes@241: rhughes@241: - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?) rhughes@241: rhughes@241: - else it's an error (INSTALL_UNAVAILABLE) rhughes@241: rhughes@241: 2. Every requested removal in the initial package set must be rhughes@241: satisfied as a removal. If any requested package name is not rhughes@241: the name of an installed package, it's an error rhughes@241: (REMOVE_NOT_INSTALLED) rhughes@241: rhughes@241: REQUIRES processing: rhughes@241: rhughes@241: 3. If a package being installed or updated-to REQUIRES a property rhughes@241: that is not provided by any installed or to-be-installed rhughes@241: package, we need to find an installable package that provides rhughes@241: that property. If we find one, install/update it. If not, it's rhughes@241: an error (UNSATISFIABLE). (If we find an upstream package rhughes@241: providing the property that corresponds to a system package rhughes@241: that's being removed, then it's a CONTRADICTION.) rhughes@241: rhughes@241: 4. If an already-installed package REQUIRES a property which is rhughes@241: only provided by a package that is being removed, then that rhughes@241: package needs to be removed as well. rhughes@241: rhughes@241: 5. If an already-installed package REQUIRES a property which is rhughes@241: only provided by a package that is being upgraded or obsoleted rhughes@241: (to a new package which does not provide that property), then: rhughes@241: rhughes@241: - if there is an update for the installed package, then update rhughes@241: the installed package rhughes@241: rhughes@241: - else if there is another installable package that provides rhughes@241: the required property, then install that. rhughes@241: rhughes@241: - else it's an error (UNSATISFIABLE?) rhughes@241: rhughes@241: CONFLICTS processing rhughes@241: rhughes@241: 6. If a package being installed or updated-to CONFLICTS with a rhughes@241: property provided by an installed package: rhughes@241: rhughes@241: - if there is an update for the installed package, which the rhughes@241: new package does not conflict with, then update the rhughes@241: installed package. rhughes@241: rhughes@241: - else it's an error (NEW_CONFLICT) rhughes@241: rhughes@241: 7. If an already-installed package CONFLICTS with a property rhughes@241: provided by a to-be-installed package: rhughes@241: rhughes@241: - if there is an update for the installed package, which does rhughes@241: not conflict with the new package, then update the installed rhughes@241: package. rhughes@241: rhughes@241: - else it's an error (NEW_CONFLICT) rhughes@241: rhughes@241: 8. If a package being installed or updated-to CONFLICTS with a rhughes@241: property provided by a to-be-installed package, then it's an rhughes@241: error (CONTRADICTION). rhughes@241: rhughes@241: OBSOLETES processing. NOTE: OBSOLETES are only matched against rhughes@241: package names, not against arbitrary provided properties rhughes@241: rhughes@241: 9. If a package being installed or updated-to OBSOLETES an rhughes@241: installed package, then obsolete that package. (ie, remove it, rhughes@241: but treat it as updated for purposes of dangling REQUIRES). rhughes@241: rhughes@241: 10. If an already-installed package OBSOLETES a to-be-installed rhughes@241: package, then it's an error. (ALREADY_OBSOLETE) rhughes@241: rhughes@241: 11. If a package being installed or updated-to OBSOLETES another rhughes@241: package being installed or updated-to, then it's an error rhughes@241: (CONTRADICTION). rhughes@241: rhughes@241: rhughes@241: rhughes@241: THE DEPSOLVER rhughes@241: ------------- rhughes@241: rhughes@241: We start with two razor_sets, system and upstream, and a list of rhughes@241: requested installations and removals. rhughes@241: rhughes@241: FIXME: what about multiple upstream repos? Having to deal with rhughes@241: arbitrary numbers of razor_sets is possible, but will probably be rhughes@241: messy... It might be easier to either store all upstream repo data rhughes@241: in a single .repo file, or else merge all upstream .repo files rhughes@241: together into a single razor_set at startup. (Or some combination rhughes@241: of those.) rhughes@241: rhughes@241: We create a bit array of the packages in each set, indicating which rhughes@241: ones are installed; the system bitarray starts out all 1s, and the rhughes@241: upstream bitarray all 0s. Each bit is only allowed to change state rhughes@241: once during the transaction; an installed package can be removed, or rhughes@241: an uninstalled package installed, but trying to reinstall a removed rhughes@241: package, or uninstall a newly-installed package is an error. This rhughes@241: means the packages break down into four categories: rhughes@241: rhughes@241: - installed (1 bit in the system bit array) rhughes@241: - to-be-removed (0 bit in the system bit array) rhughes@241: - to-be-installed (1 bit in the upstream bit array) rhughes@241: - installable (0 bit in the upstream bit array) rhughes@241: rhughes@241: rhughes@241: Depsolver algorithm: rhughes@241: rhughes@241: - Create new razor_transaction_packages ("rtp"s) for each rhughes@241: requested install or remove. These will be "unresolved", because rhughes@241: we haven't yet found the razor_packages that correspond to them. rhughes@241: rhughes@241: - while there are new rtps: rhughes@241: rhughes@241: - sort the new rtps rhughes@241: rhughes@241: - Walk the system property list, upstream property list, and rhughes@241: new rtp list in parallel, and: rhughes@241: rhughes@241: - For each uninstalled PROVIDES: rhughes@241: rhughes@241: - If the property is a valid package name (that is, rhughes@241: either it's a package providing its own name, or it rhughes@241: has a matching OBSOLETES), and it matches the name rhughes@241: of a new rtp of type INSTALL or FORCED_UPDATE with rhughes@241: an unresolved new_package: rhughes@241: rhughes@241: - If the upstream package has the same version as rhughes@241: the system package, we have an UP_TO_DATE error rhughes@241: (FIXME: not quite right. This doesn't deal with rhughes@241: the case where we try to update an application rhughes@241: because of a library update, and it turns out rhughes@241: there's no new version of the application, but rhughes@241: there IS a compat package containing the old rhughes@241: version of the library.) rhughes@241: rhughes@241: - Otherwise, set the rtp's new_package to point to rhughes@241: the package providing this property and set the rhughes@241: appropriate bit in the upstream bit array. rhughes@241: rhughes@241: - For each to-be-installed non-file REQUIRES: rhughes@241: rhughes@241: - See if there's an installed or to-be-installed rhughes@241: package that PROVIDES that property. rhughes@241: rhughes@241: - If not, see if there's an installable package that rhughes@241: PROVIDES that property, and create a new INSTALL rtp rhughes@241: for it if so. rhughes@241: rhughes@241: - If not, see if there's a to-be-removed package that rhughes@241: PROVIDES that property. (If we find such a package, rhughes@241: we have a CONTRADICTION error.) rhughes@241: rhughes@241: - If none of the above, then we have an UNSATISFIABLE rhughes@241: error rhughes@241: rhughes@241: - For each to-be-installed file REQUIRES: rhughes@241: rhughes@241: - (We create fake file PROVIDES to match file REQUIRES rhughes@241: when importing/merging razor sets, so if there is rhughes@241: already another installed package that REQUIRES this rhughes@241: file, there will be a PROVIDES listed for it as well.) rhughes@241: rhughes@241: - See if there's an installed package that PROVIDES rhughes@241: that file. rhughes@241: rhughes@241: - If not, do a binary search of the system file tree rhughes@241: looking to see if some installed package provides rhughes@241: that file but does not have a PROVIDES for it. rhughes@241: rhughes@241: - If not, see if there's an installable package that rhughes@241: PROVIDES that property, and create a new INSTALL rtp rhughes@241: for it if so. rhughes@241: rhughes@241: - (If we actually work with multiple upstream rhughes@241: razor_sets, then we will need to search the upstream rhughes@241: file trees at this point, because it's possible that rhughes@241: a package in one upstream repo would require a file rhughes@241: in another upstream repo. But if we merge the rhughes@241: multiple upstream repos into a single razor_set at rhughes@241: some point, then we would not need to do that, rhughes@241: because it would be guaranteed that we would have rhughes@241: already created a fake PROVIDES if any package rhughes@241: provides the file.) rhughes@241: rhughes@241: - If no installed or installable package provides the rhughes@241: file, see if there's a to-be-removed package that rhughes@241: provides the file. (If we find such a package, we rhughes@241: have a CONTRADICTION error.) rhughes@241: rhughes@241: - If none of the above, then we have an UNSATISFIABLE rhughes@241: error rhughes@241: rhughes@241: - For each to-be-installed PROVIDES: rhughes@241: rhughes@241: - Check if the new PROVIDES conflicts with an rhughes@241: installed CONFLICTS. If so, create a new rhughes@241: FORCED_UPDATE rtp for the installed package, so we rhughes@241: can try to upgrade it to a non-conflicting version. rhughes@241: (If we can't, we'll have an OLD_CONFLICT error.) rhughes@241: rhughes@241: - Check if the new PROVIDES conflicts with an rhughes@241: installed OBSOLETES *and* the PROVIDES property rhughes@241: corresponds to the name of its package. (That is, rhughes@241: OBSOLETES are only matched against package names, rhughes@241: not arbitrary provided properties.) If so, we have rhughes@241: an ALREADY_OBSOLETE error. rhughes@241: rhughes@241: - Check if the new PROVIDES conflicts with a rhughes@241: to-be-installed CONFLICTS. If so, we have a rhughes@241: CONTRADICTION error. rhughes@241: rhughes@241: - For each to-be-installed CONFLICTS: rhughes@241: rhughes@241: - Basically the reverse of the previous case: check if rhughes@241: the new CONFLICTS conflicts with an installed rhughes@241: PROVIDES. If so, create a new FORCED_UPDATE rtp for rhughes@241: the installed package, so we can try to upgrade it rhughes@241: to a non-conflicting version. (If we can't, we'll rhughes@241: have an NEW_CONFLICT error.) rhughes@241: rhughes@241: - Check if the new CONFLICTS conflicts with a rhughes@241: to-be-installed PROVIDES. If so, we have a rhughes@241: CONTRADICTION error. rhughes@241: rhughes@241: - For each to-be-installed OBSOLETES: rhughes@241: rhughes@241: - Check if there's an installed package that PROVIDES rhughes@241: that property. If so, create an OBSOLETED rtp for rhughes@241: the installed package. rhughes@241: rhughes@241: - If not, check if there's a to-be-installed package rhughes@241: that PROVIDES that property. If so, we have a rhughes@241: CONTRADICTION error. rhughes@241: rhughes@241: rhughes@241: - For each installed PROVIDES: rhughes@241: rhughes@241: - If the property is a valid package name (that is, rhughes@241: it's a package providing its own name), and it rhughes@241: matches the name of a new rtp with an unresolved rhughes@241: old_package, then set the rtp's old_package to point rhughes@241: to the package providing this property and clear the rhughes@241: appropriate bit in the system bit array. rhughes@241: rhughes@241: - For each to-be-removed PROVIDES: rhughes@241: rhughes@241: - If there's also an identical to-be-installed rhughes@241: PROVIDES, we're ok and can skip this rhughes@241: rhughes@241: - Otherwise, for each installed REQUIRES of this rhughes@241: property: rhughes@241: rhughes@241: - Look for some other installed or to-be-installed rhughes@241: property that satisfies the REQUIRES. rhughes@241: rhughes@241: - If there isn't one, then for each installed rhughes@241: package in this REQUIRES's package list: rhughes@241: rhughes@241: - If the PROVIDES was lost because the old rhughes@241: package was REMOVEd (not FORCED_UPDATE or rhughes@241: OBSOLETED), then create a new REMOVE rtp for rhughes@241: this package. rhughes@241: rhughes@241: - Otherwise, create a new FORCED_UPDATE rtp rhughes@241: for this package. rhughes@241: rhughes@241: - (We don't need to look at to-be-installed REQUIRES rhughes@241: of this property, because if there are any, they rhughes@241: will cause a CONTRADICTION error when we try to rhughes@241: re-satisfy them the next time through.)