1.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000
1.2 +++ b/docs/solver.xml Tue Apr 24 19:27:29 2018 +0100
1.3 @@ -0,0 +1,370 @@
1.4 +<?xml version="1.0" encoding="utf-8"?>
1.5 +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
1.6 +
1.7 +<chapter id="solver">
1.8 + <title>Dependency Solver</title>
1.9 +
1.10 + <para>
1.11 + At a very high level, yum's depsolver does something roughly
1.12 + equivalent to:
1.13 +
1.14 + - For each package being installed or removed
1.15 +
1.16 + - For each relevant property (provides, requires, conflicts,
1.17 + obsoletes):
1.18 +
1.19 + - Figure out what additional packages need to be added to
1.20 + or removed from the system to satisfy this property
1.21 +
1.22 + which ends up being roughly O(N^2 * M) where N is the total number of
1.23 + properties and M is the number of packages being acted on.
1.24 +
1.25 +(I just figured that out off the top of my head, and I'm not totally
1.26 +familiar with the yum code, so it may be wrong.)
1.27 +
1.28 +Razor's depsolver is something like:
1.29 +
1.30 + - do {
1.31 +
1.32 + - For each property to be added to or removed from the system:
1.33 +
1.34 + - Figure out what packages need to be added to or removed
1.35 + from the system to satisfy this property
1.36 +
1.37 + - } until we stop adding/removing packages
1.38 +
1.39 +with the key being that it's very easy to find the PROVIDES
1.40 +corresponding to a REQUIRES and vice versa, because the property
1.41 +arrays are sorted, and so all properties with the same "name" will be
1.42 +adjacent to one another in the array, allowing many dependencies to be
1.43 +satisfied in essentially constant time. (Actually... we've been
1.44 +calling it constant, but it's really O(log N) for heavily-depended-on
1.45 +packages, because the more packages you have, the more variations on
1.46 +"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
1.47 +going to have to scan through.)
1.48 +
1.49 +Ideally though, each iteration of the inner loop body happens in
1.50 +constant time, and thus the inner loop as a whole is O(N), and thus
1.51 +the depsolver as a whole is O(N * M) (or at least, less than
1.52 +O(N * M * log N)).
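
As a rough illustration of why the sorted arrays help, here is a sketch in Python (not razor's C). The (name, package) tuple layout and the match_requires name are made up for this example and are not razor's actual data structures. It walks a sorted REQUIRES list and a sorted PROVIDES list in parallel:

```python
def match_requires(requires, provides):
    """Pair each REQUIRES with the packages providing the same name.

    Both inputs are lists of (name, package) tuples sorted by name
    (a hypothetical layout). The index into `provides` only ever moves
    forward, so each lookup costs no more than the length of the run
    of same-named PROVIDES it has to scan.
    """
    matches = {}
    j = 0
    for name, pkg in requires:
        # Skip PROVIDES that sort before the name we are looking for.
        while j != len(provides) and name > provides[j][0]:
            j += 1
        # Same-named PROVIDES are adjacent, so collect the whole run.
        k = j
        providers = []
        while k != len(provides) and provides[k][0] == name:
            providers.append(provides[k][1])
            k += 1
        matches[(name, pkg)] = providers
    return matches
```

An empty provider list for a REQUIRES is the case where the depsolver would have to go looking for an installable package.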
1.53 +
1.54 +
1.55 +FILE DEPENDENCIES
1.56 +-----------------
1.57 +
1.58 +Whenever we add a package with a file REQUIRES to a razor_set, we also
1.59 +add a PROVIDES for that file to the package or packages which provide
1.60 +that file. This means that if we later add another package that
1.61 +requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
1.62 +its file requirement exactly like we would resolve a property
1.63 +requirement, in nearly constant time.
1.64 +
1.65 +When adding a *new* file requirement (ie, a requirement on a file that
1.66 +no existing package depends on), we still have to binary-search the
1.67 +file tree, which is O(log N) in the number of files.
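
A minimal sketch of that idea, in Python rather than razor's C. The flat sorted file list, the file_provides table and the resolve_file_requires name are all stand-ins invented for this example. The real file tree and property arrays are laid out differently.

```python
from bisect import bisect_left

# Hypothetical stand-ins for a razor_set: a sorted file list with the
# owning package of each file, plus a table of fake file PROVIDES.
files = sorted([
    ("/bin/sh", "bash"),
    ("/usr/bin/perl", "perl"),
    ("/usr/bin/python", "python"),
])
paths = [path for (path, _) in files]
file_provides = {}  # path -> owning package, filled in on first use


def resolve_file_requires(path):
    """Return the package owning `path`, or None if nothing ships it."""
    if path in file_provides:
        # Some earlier package already required this file, so it
        # resolves like any other property.
        return file_provides[path]
    # First requirement on this file: binary-search the file list.
    i = bisect_left(paths, path)
    if i == len(paths) or paths[i] != path:
        return None
    owner = files[i][1]
    file_provides[path] = owner  # record the fake PROVIDES for next time
    return owner
```

Only the first package requiring /bin/sh pays for the search. Later ones hit the recorded PROVIDES.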
1.68 +
1.69 +(AFAICT, there's no reason yum couldn't do the same optimization.
1.70 +Also, AFAICT, yum currently sticks property dependencies and file
1.71 +dependencies into the same hash table, so that if any package in the
1.72 +transaction has a file dependency, it causes *property* dependencies
1.73 +to become slower to resolve as well...)
1.74 +
1.75 +
1.76 +THE RULES
1.77 +---------
1.78 +
1.79 +This is what we have figured out for transaction-solving rules;
1.80 +neither yum nor rpm's algorithm seems to be explained in full
1.81 +anywhere...
1.82 +
1.83 + 1. Every requested install in the initial package set must be
1.84 + satisfied as either a new install or an update:
1.85 +
1.86 + - if the requested package name is the name of an upstream
1.87 + package:
1.88 +
1.89 + - if there is not a corresponding already-installed
1.90 + package, then install the upstream package
1.91 +
1.92 + - else if the upstream package is newer than the
1.93 + already-installed package, then update the package
1.94 +
1.95 + - else it's an error (UP_TO_DATE)
1.96 +
1.97 + - else if the requested package name is the name of an
1.98 + already-installed package:
1.99 +
1.100 + - if there is an upstream package that obsoletes the
1.101 + already-installed package, then behave as though the
1.102 + user had requested that that package be installed
1.103 + instead.
1.104 +
1.105 + - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
1.106 +
1.107 + - else it's an error (INSTALL_UNAVAILABLE)
1.108 +
1.109 + 2. Every requested removal in the initial package set must be
1.110 + satisfied as a removal. If any requested package name is not
1.111 + the name of an installed package, it's an error
1.112 + (REMOVE_NOT_INSTALLED)
1.113 +
1.114 + REQUIRES processing:
1.115 +
1.116 + 3. If a package being installed or updated-to REQUIRES a property
1.117 + that is not provided by any installed or to-be-installed
1.118 + package, we need to find an installable package that provides
1.119 + that property. If we find one, install/update it. If not, it's
1.120 + an error (UNSATISFIABLE). (If we find an upstream package
1.121 + providing the property that corresponds to a system package
1.122 + that's being removed, then it's a CONTRADICTION.)
1.123 +
1.124 + 4. If an already-installed package REQUIRES a property which is
1.125 + only provided by a package that is being removed, then that
1.126 + package needs to be removed as well.
1.127 +
1.128 + 5. If an already-installed package REQUIRES a property which is
1.129 + only provided by a package that is being upgraded or obsoleted
1.130 + (to a new package which does not provide that property), then:
1.131 +
1.132 + - if there is an update for the installed package, then update
1.133 + the installed package
1.134 +
1.135 + - else if there is another installable package that provides
1.136 + the required property, then install that.
1.137 +
1.138 + - else it's an error (UNSATISFIABLE?)
1.139 +
1.140 + CONFLICTS processing
1.141 +
1.142 + 6. If a package being installed or updated-to CONFLICTS with a
1.143 + property provided by an installed package:
1.144 +
1.145 + - if there is an update for the installed package, which the
1.146 + new package does not conflict with, then update the
1.147 + installed package.
1.148 +
1.149 + - else it's an error (NEW_CONFLICT)
1.150 +
1.151 + 7. If an already-installed package CONFLICTS with a property
1.152 + provided by a to-be-installed package:
1.153 +
1.154 + - if there is an update for the installed package, which does
1.155 + not conflict with the new package, then update the installed
1.156 + package.
1.157 +
1.158 + - else it's an error (OLD_CONFLICT)
1.159 +
1.160 + 8. If a package being installed or updated-to CONFLICTS with a
1.161 + property provided by a to-be-installed package, then it's an
1.162 + error (CONTRADICTION).
1.163 +
1.164 + OBSOLETES processing. NOTE: OBSOLETES are only matched against
1.165 + package names, not against arbitrary provided properties.
1.166 +
1.167 + 9. If a package being installed or updated-to OBSOLETES an
1.168 + installed package, then obsolete that package. (ie, remove it,
1.169 + but treat it as updated for purposes of dangling REQUIRES).
1.170 +
1.171 + 10. If an already-installed package OBSOLETES a to-be-installed
1.172 + package, then it's an error. (ALREADY_OBSOLETE)
1.173 +
1.174 + 11. If a package being installed or updated-to OBSOLETES another
1.175 + package being installed or updated-to, then it's an error
1.176 + (CONTRADICTION).
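
To make rule 1 concrete, here is a small Python sketch of the decision tree. The resolve_install name, the dictionary arguments and the use of plain tuples as versions are assumptions for illustration only (real rpm version comparison is more involved). Where the rule above leaves the error code as an open question, the sketch arbitrarily picks UP_TO_DATE.

```python
def resolve_install(name, installed, upstream, obsoleted_by):
    """Classify a requested install according to rule 1.

    installed / upstream map package name to a version tuple;
    obsoleted_by maps an installed package name to the upstream
    package that obsoletes it. All three are hypothetical inputs.
    """
    if name in upstream:
        if name not in installed:
            return ("INSTALL", name)
        if upstream[name] > installed[name]:
            return ("UPDATE", name)
        return ("ERROR", "UP_TO_DATE")
    if name in installed:
        if name in obsoleted_by:
            # Behave as though the obsoleting package had been requested.
            return resolve_install(obsoleted_by[name], installed,
                                   upstream, obsoleted_by)
        # The rule leaves open whether this is UP_TO_DATE or
        # INSTALL_UNAVAILABLE; this sketch picks the former.
        return ("ERROR", "UP_TO_DATE")
    return ("ERROR", "INSTALL_UNAVAILABLE")
```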
1.177 +
1.178 +
1.179 +
1.180 +THE DEPSOLVER
1.181 +-------------
1.182 +
1.183 +We start with two razor_sets, system and upstream, and a list of
1.184 +requested installations and removals.
1.185 +
1.186 + FIXME: what about multiple upstream repos? Having to deal with
1.187 + arbitrary numbers of razor_sets is possible, but will probably be
1.188 + messy... It might be easier to either store all upstream repo data
1.189 + in a single .rzdb file, or else merge all upstream .rzdb files
1.190 + together into a single razor_set at startup. (Or some combination
1.191 + of those.)
1.192 +
1.193 +We create a bit array of the packages in each set, indicating which
1.194 +ones are installed; the system bitarray starts out all 1s, and the
1.195 +upstream bitarray all 0s. Each bit is only allowed to change state
1.196 +once during the transaction; an installed package can be removed, or
1.197 +an uninstalled package installed, but trying to reinstall a removed
1.198 +package, or uninstall a newly-installed package is an error. This
1.199 +means the packages break down into four categories:
1.200 +
1.201 + - installed (1 bit in the system bit array)
1.202 + - to-be-removed (0 bit in the system bit array)
1.203 + - to-be-installed (1 bit in the upstream bit array)
1.204 + - installable (0 bit in the upstream bit array)
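
The change-once rule could be sketched like this in Python. The PackageBits class and the two category helpers are invented for this example. The real code would use packed bit arrays in C, not Python lists.

```python
class PackageBits:
    """One flag per package; each flag may change state at most once."""

    def __init__(self, count, initially_installed):
        self.bits = [initially_installed] * count
        self.changed = [False] * count

    def flip(self, index):
        # Reinstalling a removed package, or uninstalling a newly
        # installed one, would be a second change: treat it as an error.
        if self.changed[index]:
            raise ValueError("package %d already changed state" % index)
        self.bits[index] = not self.bits[index]
        self.changed[index] = True


def system_category(system, index):
    return "installed" if system.bits[index] else "to-be-removed"


def upstream_category(upstream, index):
    return "to-be-installed" if upstream.bits[index] else "installable"
```

A system set would start as PackageBits(n, True) and an upstream set as PackageBits(n, False), matching the all-1s and all-0s starting states described above.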
1.205 +
1.206 +
1.207 +Depsolver algorithm:
1.208 +
1.209 + - Create new razor_transaction_packages ("rtp"s) for each
1.210 + requested install or remove. These will be "unresolved", because
1.211 + we haven't yet found the razor_packages that correspond to them.
1.212 +
1.213 + - while there are new rtps:
1.214 +
1.215 + - sort the new rtps
1.216 +
1.217 + - Walk the system property list, upstream property list, and
1.218 + new rtp list in parallel, and:
1.219 +
1.220 + - For each uninstalled PROVIDES:
1.221 +
1.222 + - If the property is a valid package name (that is,
1.223 + either it's a package providing its own name, or it
1.224 + has a matching OBSOLETES), and it matches the name
1.225 + of a new rtp of type INSTALL or FORCED_UPDATE with
1.226 + an unresolved new_package:
1.227 +
1.228 + - If the upstream package has the same version as
1.229 + the system package, we have an UP_TO_DATE error
1.230 + (FIXME: not quite right. This doesn't deal with
1.231 + the case where we try to update an application
1.232 + because of a library update, and it turns out
1.233 + there's no new version of the application, but
1.234 + there IS a compat package containing the old
1.235 + version of the library.)
1.236 +
1.237 + - Otherwise, set the rtp's new_package to point to
1.238 + the package providing this property and set the
1.239 + appropriate bit in the upstream bit array.
1.240 +
1.241 + - For each to-be-installed non-file REQUIRES:
1.242 +
1.243 + - See if there's an installed or to-be-installed
1.244 + package that PROVIDES that property.
1.245 +
1.246 + - If not, see if there's an installable package that
1.247 + PROVIDES that property, and create a new INSTALL rtp
1.248 + for it if so.
1.249 +
1.250 + - If not, see if there's a to-be-removed package that
1.251 + PROVIDES that property. (If we find such a package,
1.252 + we have a CONTRADICTION error.)
1.253 +
1.254 + - If none of the above, then we have an UNSATISFIABLE
1.255 + error
1.256 +
1.257 + - For each to-be-installed file REQUIRES:
1.258 +
1.259 + - (We create fake file PROVIDES to match file REQUIRES
1.260 + when importing/merging razor sets, so if there is
1.261 + already another installed package that REQUIRES this
1.262 + file, there will be a PROVIDES listed for it as well.)
1.263 +
1.264 + - See if there's an installed package that PROVIDES
1.265 + that file.
1.266 +
1.267 + - If not, do a binary search of the system file tree
1.268 + looking to see if some installed package provides
1.269 + that file but does not have a PROVIDES for it.
1.270 +
1.271 + - If not, see if there's an installable package that
1.272 + PROVIDES that property, and create a new INSTALL rtp
1.273 + for it if so.
1.274 +
1.275 + - (If we actually work with multiple upstream
1.276 + razor_sets, then we will need to search the upstream
1.277 + file trees at this point, because it's possible that
1.278 + a package in one upstream repo would require a file
1.279 + in another upstream repo. But if we merge the
1.280 + multiple upstream repos into a single razor_set at
1.281 + some point, then we would not need to do that,
1.282 + because it would be guaranteed that we would have
1.283 + already created a fake PROVIDES if any package
1.284 + provides the file.)
1.285 +
1.286 + - If no installed or installable package provides the
1.287 + file, see if there's a to-be-removed package that
1.288 + provides the file. (If we find such a package, we
1.289 + have a CONTRADICTION error.)
1.290 +
1.291 + - If none of the above, then we have an UNSATISFIABLE
1.292 + error
1.293 +
1.294 + - For each to-be-installed PROVIDES:
1.295 +
1.296 + - Check if the new PROVIDES conflicts with an
1.297 + installed CONFLICTS. If so, create a new
1.298 + FORCED_UPDATE rtp for the installed package, so we
1.299 + can try to upgrade it to a non-conflicting version.
1.300 + (If we can't, we'll have an OLD_CONFLICT error.)
1.301 +
1.302 + - Check if the new PROVIDES matches an
1.303 + installed OBSOLETES *and* the PROVIDES property
1.304 + corresponds to the name of its package. (That is,
1.305 + OBSOLETES are only matched against package names,
1.306 + not arbitrary provided properties.) If so, we have
1.307 + an ALREADY_OBSOLETE error.
1.308 +
1.309 + - Check if the new PROVIDES conflicts with a
1.310 + to-be-installed CONFLICTS. If so, we have a
1.311 + CONTRADICTION error.
1.312 +
1.313 + - For each to-be-installed CONFLICTS:
1.314 +
1.315 + - Basically the reverse of the previous case: check if
1.316 + the new CONFLICTS conflicts with an installed
1.317 + PROVIDES. If so, create a new FORCED_UPDATE rtp for
1.318 + the installed package, so we can try to upgrade it
1.319 + to a non-conflicting version. (If we can't, we'll
1.320 + have a NEW_CONFLICT error.)
1.321 +
1.322 + - Check if the new CONFLICTS conflicts with a
1.323 + to-be-installed PROVIDES. If so, we have a
1.324 + CONTRADICTION error.
1.325 +
1.326 + - For each to-be-installed OBSOLETES:
1.327 +
1.328 + - Check if there's an installed package that PROVIDES
1.329 + that property. If so, create an OBSOLETED rtp for
1.330 + the installed package.
1.331 +
1.332 + - If not, check if there's a to-be-installed package
1.333 + that PROVIDES that property. If so, we have a
1.334 + CONTRADICTION error.
1.335 +
1.336 +
1.337 + - For each installed PROVIDES:
1.338 +
1.339 + - If the property is a valid package name (that is,
1.340 + it's a package providing its own name), and it
1.341 + matches the name of a new rtp with an unresolved
1.342 + old_package, then set the rtp's old_package to point
1.343 + to the package providing this property and clear the
1.344 + appropriate bit in the system bit array.
1.345 +
1.346 + - For each to-be-removed PROVIDES:
1.347 +
1.348 + - If there's also an identical to-be-installed
1.349 + PROVIDES, we're ok and can skip this
1.350 +
1.351 + - Otherwise, for each installed REQUIRES of this
1.352 + property:
1.353 +
1.354 + - Look for some other installed or to-be-installed
1.355 + property that satisfies the REQUIRES.
1.356 +
1.357 + - If there isn't one, then for each installed
1.358 + package in this REQUIRES's package list:
1.359 +
1.360 + - If the PROVIDES was lost because the old
1.361 + package was REMOVEd (not FORCED_UPDATE or
1.362 + OBSOLETED), then create a new REMOVE rtp for
1.363 + this package.
1.364 +
1.365 + - Otherwise, create a new FORCED_UPDATE rtp
1.366 + for this package.
1.367 +
1.368 + - (We don't need to look at to-be-installed REQUIRES
1.369 + of this property, because if there are any, they
1.370 + will cause a CONTRADICTION error when we try to
1.371 + re-satisfy them the next time through.)
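
Two pieces of the loop body above, sketched in Python under heavy simplification: the lookup order for a to-be-installed non-file REQUIRES, and the cascade triggered by a to-be-removed PROVIDES. The function names, the dictionary arguments, and the choice of the first installable candidate are assumptions made for this example. Razor itself does this while walking the sorted property lists.

```python
def resolve_requires(prop, installed, to_be_installed, installable,
                     to_be_removed):
    """Decide what to do about one to-be-installed non-file REQUIRES.

    Each argument after `prop` maps a property name to the list of
    packages in that category providing it (hypothetical inputs).
    """
    if installed.get(prop) or to_be_installed.get(prop):
        return ("SATISFIED", None)
    if installable.get(prop):
        # Queue a new INSTALL rtp; taking the first candidate is an
        # arbitrary choice made by this sketch.
        return ("INSTALL", installable[prop][0])
    if to_be_removed.get(prop):
        return ("ERROR", "CONTRADICTION")
    return ("ERROR", "UNSATISFIABLE")


def rtps_for_lost_provides(how_lost, requirers, still_provided):
    """Return the new rtps caused by one to-be-removed PROVIDES.

    how_lost is the rtp type that took the old provider away ("REMOVE",
    "FORCED_UPDATE" or "OBSOLETED"); requirers lists the installed
    packages that REQUIRE the property; still_provided says whether
    another installed or to-be-installed package still provides it.
    """
    if still_provided:
        return []
    kind = "REMOVE" if how_lost == "REMOVE" else "FORCED_UPDATE"
    return [(kind, pkg) for pkg in requirers]
```

Any rtps returned here would be fed back into the "while there are new rtps" loop, which is what makes removals and forced updates cascade.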
1.372 + </para>
1.373 +</chapter>