diff -r 000000000000 -r fda83d91e600 docs/solver.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/docs/solver.xml Fri Jul 03 19:06:29 2009 +0100 @@ -0,0 +1,370 @@ + + + + + Dependency Solver + + + At a very high level, yum's depsolver does something roughly + equivalent to: + + - For each package being installed or removed + + - For each relevant property (provides, requires, conflicts, + obsoletes): + + - Figure out what additional packages need to be added to + or removed from the system to satisfy this property + + which ends up being roughly O(N^2 * M) where N is the total number of + properties and M is the number of packages being acted on. + +(I just figured that out off the top of my head, and I'm not totally +familiar with the yum code, so it may be wrong.) + +Razor's depsolver is something like: + + - do { + + - For each property to be added to or removed from the system: + + - Figure out what packages need to be added to or removed + from the system to satisfy this property + + - } until we stop adding/remove more packages + +with the key being that it's very easy to find the PROVIDES +corresponding to a REQUIRES and vice versa, because the property +arrays are sorted, and so all properties with the same "name" will be +adjacent to one another in the array, allowing many dependencies to be +satisified in essentially constant time. (Actually... we've been +calling it constant, but it's really O(log N) for heavily-depended-on +packages, because the more packages you have, the more variations on +"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're +going to have to scan through.) + +Ideally though, each iteration of the inner loop body happens in +constant time, and thus the inner loop as a whole is O(N), and thus +the depsolver as a whole is O(N * M) (or at least, less than +O(N * M * log N). + + +FILE DEPENDENCIES +----------------- + +Whenever we add a package with a file REQUIRES to a razor_set, we also +add a PROVIDES for that file to the package or packages which provide +that file. This means that if we later add another package that +requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve +its file requirement exactly like we would resolve a property +requirement, in nearly constant time. + +When adding a *new* file requirement (ie, a requirement on a file that +no existing package depends on), we still have to scan through the +file tree, which is O(log N) in the number of files. + +(AFAICT, there's no reason yum couldn't do the same optimization. +Also, AFAICT, yum currently sticks property dependencies and file +dependencies into the same hash table, so that if any package in the +transaction has a file dependency, it causes *property* dependencies +to become slower to resolve as well...) + + +THE RULES +--------- + +This is what we have figured out for transaction-solving rules; +neither yum nor rpm's algorithm seems to be explained in full +anywhere... + + 1. Every requested install in the initial package set must be + satisfied as either a new install or an update: + + - if the requested package name is the name of an upstream + package: + + - if there is not a corresponding already-installed + package, then install the upstream package + + - else if the upstream package is newer than the + already-installed package, then update the package + + - else it's an error (UP_TO_DATE) + + - else if the requested package name is the name of an + already-installed package: + + - if there is an upstream package that obsoletes the + already-installed package, then behave as though the + user had requested that that package be installed + instead. + + - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?) + + - else it's an error (INSTALL_UNAVAILABLE) + + 2. Every requested removal in the initial package set must be + satisfied as a removal. If any requested package name is not + the name of an installed package, it's an error + (REMOVE_NOT_INSTALLED) + + REQUIRES processing: + + 3. If a package being installed or updated-to REQUIRES a property + that is not provided by any installed or to-be-installed + package, we need to find an installable package that provides + that property. If we find one, install/update it. If not, it's + an error (UNSATISFIABLE). (If we find an upstream package + providing the property that corresponds to a system package + that's being removed, then it's a CONTRADICTION.) + + 4. If an already-installed package REQUIRES a property which is + only provided by a package that is being removed, then that + package needs to be removed as well. + + 5. If an already-installed package REQUIRES a property which is + only provided by a package that is being upgraded or obsoleted + (to a new package which does not provide that property), then: + + - if there is an update for the installed package, then update + the installed package + + - else if there is another installable package that provides + the required property, then install that. + + - else it's an error (UNSATISFIABLE?) + + CONFLICTS processing + + 6. If a package being installed or updated-to CONFLICTS with a + property provided by an installed package: + + - if there is an update for the installed package, which the + new package does not conflict with, then update the + installed package. + + - else it's an error (NEW_CONFLICT) + + 7. If an already-installed package CONFLICTS with a property + provided by a to-be-installed package: + + - if there is an update for the installed package, which does + not conflict with the new package, then update the installed + package. + + - else it's an error (NEW_CONFLICT) + + 8. If a package being installed or updated-to CONFLICTS with a + property provided by a to-be-installed package, then it's an + error (CONTRADICTION). + + OBSOLETES processing. NOTE: OBSOLETES are only matched against + package names, not against arbitrary provided properties + + 9. If a package being installed or updated-to OBSOLETES an + installed package, then obsolete that package. (ie, remove it, + but treat it as updated for purposes of dangling REQUIRES). + + 10. If an already-installed package OBSOLETES a to-be-installed + package, then it's an error. (ALREADY_OBSOLETE) + + 11. If a package being installed or updated-to OBSOLETES another + package being installed or updated-to, then it's an error + (CONTRADICTION). + + + +THE DEPSOLVER +------------- + +We start with two razor_sets, system and upstream, and a list of +requested installations and removals. + + FIXME: what about multiple upstream repos? Having to deal with + arbitrary numbers of razor_sets is possible, but will probably be + messy... It might be easier to either store all upstream repo data + in a single .rzdb file, or else merge all upstream .rzdb files + together into a single razor_set at startup. (Or some combination + of those.) + +We create a bit array of the packages in each set, indicating which +ones are installed; the system bitarray starts out all 1s, and the +upstream bitarray all 0s. Each bit is only allowed to change state +once during the transaction; an installed package can be removed, or +an uninstalled package installed, but trying to reinstall a removed +package, or uninstall a newly-installed package is an error. This +means the packages break down into four categories: + + - installed (1 bit in the system bit array) + - to-be-removed (0 bit in the system bit array) + - to-be-installed (1 bit in the upstream bit array) + - installable (0 bit in the upstream bit array) + + +Depsolver algorithm: + + - Create new razor_transaction_packages ("rtp"s) for each + requested install or remove. These will be "unresolved", because + we haven't yet found the razor_packages that correspond to them. + + - while there are new rtps: + + - sort the new rtps + + - Walk the system property list, upstream property list, and + new rtp list in parallel, and: + + - For each uninstalled PROVIDES: + + - If the property is a valid package name (that is, + either it's a package providing its own name, or it + has a matching OBSOLETES), and it matches the name + of a new rtp of type INSTALL or FORCED_UPDATE with + an unresolved new_package: + + - If the upstream package has the same version as + the system package, we have an UP_TO_DATE error + (FIXME: not quite right. This doesn't deal with + the case where we try to update an application + because of a library update, and it turns out + there's no new version of the application, but + there IS a compat package containing the old + version of the library.) + + - Otherwise, set the rtp's new_package to point to + the package providing this property and set the + appropriate bit in the upstream bit array. + + - For each to-be-installed non-file REQUIRES: + + - See if there's an installed or to-be-installed + package that PROVIDES that property. + + - If not, see if there's an installable package that + PROVIDES that property, and create a new INSTALL rtp + for it if so. + + - If not, see if there's a to-be-removed package that + PROVIDES that property. (If we find such a package, + we have a CONTRADICTION error.) + + - If none of the above, then we have an UNSATISFIABLE + error + + - For each to-be-installed file REQUIRES: + + - (We create fake file PROVIDES to match file REQUIRES + when importing/merging razor sets, so if there is + already another installed package that REQUIRES this + file, there will be a PROVIDES listed for it as well.) + + - See if there's an installed package that PROVIDES + that file. + + - If not, do a binary search of the system file tree + looking to see if some installed package provides + that file but does not have a PROVIDES for it. + + - If not, see if there's an installable package that + PROVIDES that property, and create a new INSTALL rtp + for it if so. + + - (If we actually work with multiple upstream + razor_sets, then we will need to search the upstream + file trees at this point, because it's possible that + a package in one upstream repo would require a file + in another upstream repo. But if we merge the + multiple upstream repos into a single razor_set at + some point, then we would not need to do that, + because it would be guaranteed that we would have + already created a fake PROVIDES if any package + provides the file.) + + - If no installed or installable package provides the + file, see if there's a to-be-removed package that + provides the file. (If we find such a package, we + have a CONTRADICTION error.) + + - If none of the above, then we have an UNSATISFIABLE + error + + - For each to-be-installed PROVIDES: + + - Check if the new PROVIDES conflicts with an + installed CONFLICTS. If so, create a new + FORCED_UPDATE rtp for the installed package, so we + can try to upgrade it to a non-conflicting version. + (If we can't, we'll have an OLD_CONFLICT error.) + + - Check if the new PROVIDES conflicts with an + installed OBSOLETES *and* the PROVIDES property + corresponds to the name of its package. (That is, + OBSOLETES are only matched against package names, + not arbitrary provided properties.) If so, we have + an ALREADY_OBSOLETE error. + + - Check if the new PROVIDES conflicts with a + to-be-installed CONFLICTS. If so, we have a + CONTRADICTION error. + + - For each to-be-installed CONFLICTS: + + - Basically the reverse of the previous case: check if + the new CONFLICTS conflicts with an installed + PROVIDES. If so, create a new FORCED_UPDATE rtp for + the installed package, so we can try to upgrade it + to a non-conflicting version. (If we can't, we'll + have an NEW_CONFLICT error.) + + - Check if the new CONFLICTS conflicts with a + to-be-installed PROVIDES. If so, we have a + CONTRADICTION error. + + - For each to-be-installed OBSOLETES: + + - Check if there's an installed package that PROVIDES + that property. If so, create an OBSOLETED rtp for + the installed package. + + - If not, check if there's a to-be-installed package + that PROVIDES that property. If so, we have a + CONTRADICTION error. + + + - For each installed PROVIDES: + + - If the property is a valid package name (that is, + it's a package providing its own name), and it + matches the name of a new rtp with an unresolved + old_package, then set the rtp's old_package to point + to the package providing this property and clear the + appropriate bit in the system bit array. + + - For each to-be-removed PROVIDES: + + - If there's also an identical to-be-installed + PROVIDES, we're ok and can skip this + + - Otherwise, for each installed REQUIRES of this + property: + + - Look for some other installed or to-be-installed + property that satisfies the REQUIRES. + + - If there isn't one, then for each installed + package in this REQUIRES's package list: + + - If the PROVIDES was lost because the old + package was REMOVEd (not FORCED_UPDATE or + OBSOLETED), then create a new REMOVE rtp for + this package. + + - Otherwise, create a new FORCED_UPDATE rtp + for this package. + + - (We don't need to look at to-be-installed REQUIRES + of this property, because if there are any, they + will cause a CONTRADICTION error when we try to + re-satisfy them the next time through.) + +