danw@174: YUM vs RAZOR danw@174: ------------ danw@174: danw@174: At a very high level, yum's depsolver does something roughly danw@174: equivalent to: danw@174: danw@174: - For each package being installed or removed danw@174: danw@174: - For each relevant property (provides, requires, conflicts, danw@174: obsoletes): danw@174: danw@174: - Figure out what additional packages need to be added to danw@174: or removed from the system to satisfy this property danw@174: danw@174: which ends up being roughly O(N^2 * M) where N is the total number of danw@174: properties and M is the number of packages being acted on. danw@174: danw@174: (I just figured that out off the top of my head, and I'm not totally danw@174: familiar with the yum code, so it may be wrong.) danw@174: danw@174: Razor's depsolver is something like: danw@174: danw@174: - do { danw@174: danw@174: - For each property to be added to or removed from the system: danw@174: danw@174: - Figure out what packages need to be added to or removed danw@174: from the system to satisfy this property danw@174: danw@174: - } until we stop adding/remove more packages danw@174: danw@174: with the key being that it's very easy to find the PROVIDES danw@174: corresponding to a REQUIRES and vice versa, because the property danw@174: arrays are sorted, and so all properties with the same "name" will be danw@174: adjacent to one another in the array, allowing many dependencies to be danw@174: satisified in essentially constant time. (Actually... we've been danw@174: calling it constant, but it's really O(log N) for heavily-depended-on danw@174: packages, because the more packages you have, the more variations on danw@174: "requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're danw@174: going to have to scan through.) danw@174: danw@174: Ideally though, each iteration of the inner loop body happens in danw@174: constant time, and thus the inner loop as a whole is O(N), and thus danw@174: the depsolver as a whole is O(N * M) (or at least, less than danw@174: O(N * M * log N). danw@174: danw@174: danw@174: FILE DEPENDENCIES danw@174: ----------------- danw@174: danw@174: Whenever we add a package with a file REQUIRES to a razor_set, we also danw@174: add a PROVIDES for that file to the package or packages which provide danw@174: that file. This means that if we later add another package that danw@174: requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve danw@174: its file requirement exactly like we would resolve a property danw@174: requirement, in nearly constant time. danw@174: danw@174: When adding a *new* file requirement (ie, a requirement on a file that danw@174: no existing package depends on), we still have to scan through the danw@174: file tree, which is O(log N) in the number of files. danw@174: danw@174: (AFAICT, there's no reason yum couldn't do the same optimization. danw@174: Also, AFAICT, yum currently sticks property dependencies and file danw@174: dependencies into the same hash table, so that if any package in the danw@174: transaction has a file dependency, it causes *property* dependencies danw@174: to become slower to resolve as well...) danw@174: danw@174: danw@179: THE RULES danw@179: --------- danw@179: danw@179: This is what we have figured out for transaction-solving rules; danw@179: neither yum nor rpm's algorithm seems to be explained in full danw@179: anywhere... danw@179: danw@179: 1. Every requested install in the initial package set must be danw@179: satisfied as either a new install or an update: danw@179: danw@179: - if the requested package name is the name of an upstream danw@179: package: danw@179: danw@179: - if there is not a corresponding already-installed danw@179: package, then install the upstream package danw@179: danw@179: - else if the upstream package is newer than the danw@179: already-installed package, then update the package danw@179: danw@179: - else it's an error (UP_TO_DATE) danw@179: danw@179: - else if the requested package name is the name of an danw@179: already-installed package: danw@179: danw@179: - if there is an upstream package that obsoletes the danw@179: already-installed package, then behave as though the danw@179: user had requested that that package be installed danw@179: instead. danw@179: danw@179: - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?) danw@179: danw@179: - else it's an error (INSTALL_UNAVAILABLE) danw@179: danw@179: 2. Every requested removal in the initial package set must be danw@179: satisfied as a removal. If any requested package name is not danw@179: the name of an installed package, it's an error danw@179: (REMOVE_NOT_INSTALLED) danw@179: danw@179: REQUIRES processing: danw@179: danw@179: 3. If a package being installed or updated-to REQUIRES a property danw@179: that is not provided by any installed or to-be-installed danw@179: package, we need to find an installable package that provides danw@179: that property. If we find one, install/update it. If not, it's danw@179: an error (UNSATISFIABLE). (If we find an upstream package danw@179: providing the property that corresponds to a system package danw@179: that's being removed, then it's a CONTRADICTION.) danw@179: danw@179: 4. If an already-installed package REQUIRES a property which is danw@179: only provided by a package that is being removed, then that danw@179: package needs to be removed as well. danw@179: danw@179: 5. If an already-installed package REQUIRES a property which is danw@179: only provided by a package that is being upgraded or obsoleted danw@179: (to a new package which does not provide that property), then: danw@179: danw@179: - if there is an update for the installed package, then update danw@179: the installed package danw@179: danw@179: - else if there is another installable package that provides danw@179: the required property, then install that. danw@179: danw@179: - else it's an error (UNSATISFIABLE?) danw@179: danw@179: CONFLICTS processing danw@179: danw@179: 6. If a package being installed or updated-to CONFLICTS with a danw@179: property provided by an installed package: danw@179: danw@179: - if there is an update for the installed package, which the danw@179: new package does not conflict with, then update the danw@179: installed package. danw@179: danw@179: - else it's an error (NEW_CONFLICT) danw@179: danw@179: 7. If an already-installed package CONFLICTS with a property danw@179: provided by a to-be-installed package: danw@179: danw@179: - if there is an update for the installed package, which does danw@179: not conflict with the new package, then update the installed danw@179: package. danw@179: danw@179: - else it's an error (NEW_CONFLICT) danw@179: danw@179: 8. If a package being installed or updated-to CONFLICTS with a danw@179: property provided by a to-be-installed package, then it's an danw@179: error (CONTRADICTION). danw@179: danw@179: OBSOLETES processing. NOTE: OBSOLETES are only matched against danw@179: package names, not against arbitrary provided properties danw@179: danw@179: 9. If a package being installed or updated-to OBSOLETES an danw@179: installed package, then obsolete that package. (ie, remove it, danw@179: but treat it as updated for purposes of dangling REQUIRES). danw@179: danw@179: 10. If an already-installed package OBSOLETES a to-be-installed danw@179: package, then it's an error. (ALREADY_OBSOLETE) danw@179: danw@179: 11. If a package being installed or updated-to OBSOLETES another danw@179: package being installed or updated-to, then it's an error danw@179: (CONTRADICTION). danw@179: danw@179: danw@179: danw@174: THE DEPSOLVER danw@174: ------------- danw@174: danw@157: We start with two razor_sets, system and upstream, and a list of danw@157: requested installations and removals. danw@157: danw@157: FIXME: what about multiple upstream repos? Having to deal with danw@157: arbitrary numbers of razor_sets is possible, but will probably be danw@157: messy... It might be easier to either store all upstream repo data danw@157: in a single .repo file, or else merge all upstream .repo files danw@157: together into a single razor_set at startup. (Or some combination danw@157: of those.) danw@157: danw@157: We create a bit array of the packages in each set, indicating which danw@157: ones are installed; the system bitarray starts out all 1s, and the danw@157: upstream bitarray all 0s. Each bit is only allowed to change state danw@157: once during the transaction; an installed package can be removed, or danw@157: an uninstalled package installed, but trying to reinstall a removed danw@157: package, or uninstall a newly-installed package is an error. This danw@157: means the packages break down into four categories: danw@157: danw@157: - installed (1 bit in the system bit array) danw@157: - to-be-removed (0 bit in the system bit array) danw@157: - to-be-installed (1 bit in the upstream bit array) danw@157: - installable (0 bit in the upstream bit array) danw@157: danw@157: danw@174: Depsolver algorithm: danw@157: danw@158: - Create new razor_transaction_packages ("rtp"s) for each danw@161: requested install or remove. These will be "unresolved", because danw@161: we haven't yet found the razor_packages that correspond to them. danw@157: danw@158: - while there are new rtps: danw@157: danw@161: - sort the new rtps danw@157: danw@161: - Walk the system property list, upstream property list, and danw@161: new rtp list in parallel, and: danw@157: danw@161: - For each uninstalled PROVIDES: danw@161: danw@161: - If the property is a valid package name (that is, danw@161: either it's a package providing its own name, or it danw@161: has a matching OBSOLETES), and it matches the name danw@161: of a new rtp of type INSTALL or FORCED_UPDATE with danw@174: an unresolved new_package: danw@174: danw@174: - If the upstream package has the same version as danw@174: the system package, we have an UP_TO_DATE error danw@174: (FIXME: not quite right. This doesn't deal with danw@174: the case where we try to update an application danw@174: because of a library update, and it turns out danw@174: there's no new version of the application, but danw@174: there IS a compat package containing the old danw@174: version of the library.) danw@174: danw@174: - Otherwise, set the rtp's new_package to point to danw@174: the package providing this property and set the danw@174: appropriate bit in the upstream bit array. danw@157: danw@158: - For each to-be-installed non-file REQUIRES: danw@157: danw@158: - See if there's an installed or to-be-installed danw@158: package that PROVIDES that property. danw@157: danw@158: - If not, see if there's an installable package that danw@158: PROVIDES that property, and create a new INSTALL rtp danw@158: for it if so. danw@157: danw@158: - If not, see if there's a to-be-removed package that danw@158: PROVIDES that property. (If we find such a package, danw@158: we have a CONTRADICTION error.) danw@157: danw@158: - If none of the above, then we have an UNSATISFIABLE danw@158: error danw@157: danw@158: - For each to-be-installed file REQUIRES: danw@157: danw@158: - (We create fake file PROVIDES to match file REQUIRES danw@158: when importing/merging razor sets, so if there is danw@158: already another installed package that REQUIRES this danw@158: file, there will be a PROVIDES listed for it as well.) danw@157: danw@158: - See if there's an installed package that PROVIDES danw@158: that file. danw@157: danw@158: - If not, do a binary search of the system file tree danw@158: looking to see if some installed package provides danw@158: that file but does not have a PROVIDES for it. danw@157: danw@158: - If not, see if there's an installable package that danw@158: PROVIDES that property, and create a new INSTALL rtp danw@158: for it if so. danw@157: danw@158: - (If we actually work with multiple upstream danw@158: razor_sets, then we will need to search the upstream danw@158: file trees at this point, because it's possible that danw@158: a package in one upstream repo would require a file danw@158: in another upstream repo. But if we merge the danw@158: multiple upstream repos into a single razor_set at danw@158: some point, then we would not need to do that, danw@158: because it would be guaranteed that we would have danw@158: already created a fake PROVIDES if any package danw@158: provides the file.) danw@157: danw@158: - If no installed or installable package provides the danw@158: file, see if there's a to-be-removed package that danw@158: provides the file. (If we find such a package, we danw@158: have a CONTRADICTION error.) danw@157: danw@158: - If none of the above, then we have an UNSATISFIABLE danw@158: error danw@157: danw@158: - For each to-be-installed PROVIDES: danw@157: danw@158: - Check if the new PROVIDES conflicts with an danw@158: installed CONFLICTS. If so, create a new danw@158: FORCED_UPDATE rtp for the installed package, so we danw@158: can try to upgrade it to a non-conflicting version. danw@158: (If we can't, we'll have an OLD_CONFLICT error.) danw@157: danw@158: - Check if the new PROVIDES conflicts with an danw@158: installed OBSOLETES *and* the PROVIDES property danw@158: corresponds to the name of its package. (That is, danw@158: OBSOLETES are only matched against package names, danw@158: not arbitrary provided properties.) If so, we have danw@158: an ALREADY_OBSOLETE error. danw@157: danw@158: - Check if the new PROVIDES conflicts with a danw@158: to-be-installed CONFLICTS. If so, we have a danw@158: CONTRADICTION error. danw@157: danw@158: - For each to-be-installed CONFLICTS: danw@157: danw@158: - Basically the reverse of the previous case: check if danw@158: the new CONFLICTS conflicts with an installed danw@158: PROVIDES. If so, create a new FORCED_UPDATE rtp for danw@158: the installed package, so we can try to upgrade it danw@158: to a non-conflicting version. (If we can't, we'll danw@158: have an NEW_CONFLICT error.) danw@157: danw@158: - Check if the new CONFLICTS conflicts with a danw@158: to-be-installed PROVIDES. If so, we have a danw@158: CONTRADICTION error. danw@157: danw@158: - For each to-be-installed OBSOLETES: danw@157: danw@158: - Check if there's an installed package that PROVIDES danw@158: that property. If so, create an OBSOLETED rtp for danw@158: the installed package. danw@157: danw@158: - If not, check if there's a to-be-installed package danw@158: that PROVIDES that property. If so, we have a danw@158: CONTRADICTION error. danw@157: danw@161: danw@161: - For each installed PROVIDES: danw@161: danw@161: - If the property is a valid package name (that is, danw@161: it's a package providing its own name), and it danw@161: matches the name of a new rtp with an unresolved danw@161: old_package, then set the rtp's old_package to point danw@161: to the package providing this property and clear the danw@161: appropriate bit in the system bit array. danw@161: danw@158: - For each to-be-removed PROVIDES: danw@157: danw@158: - If there's also an identical to-be-installed danw@158: PROVIDES, we're ok and can skip this danw@157: danw@158: - Otherwise, for each installed REQUIRES of this danw@158: property: danw@157: danw@158: - Look for some other installed or to-be-installed danw@158: property that satisfies the REQUIRES. danw@157: danw@158: - If there isn't one, then for each installed danw@158: package in this REQUIRES's package list: danw@157: danw@158: - If the PROVIDES was lost because the old danw@158: package was REMOVEd (not FORCED_UPDATE or danw@158: OBSOLETED), then create a new REMOVE rtp for danw@158: this package. danw@157: danw@158: - Otherwise, create a new FORCED_UPDATE rtp danw@158: for this package. danw@157: danw@158: - (We don't need to look at to-be-installed REQUIRES danw@158: of this property, because if there are any, they danw@158: will cause a CONTRADICTION error when we try to danw@158: re-satisfy them the next time through.)