docs/solver.xml
changeset 376 d15a16347c77
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/docs/solver.xml	Tue Jul 07 22:50:22 2009 +0100
     1.3 @@ -0,0 +1,370 @@
     1.4 +<?xml version="1.0" encoding="utf-8"?>
     1.5 +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
     1.6 +
     1.7 +<chapter id="solver">
     1.8 +  <title>Dependency Solver</title>
     1.9 +
    1.10 +  <para>
    1.11 +    At a very high level, yum's depsolver does something roughly
    1.12 +    equivalent to:
    1.13 +
    1.14 +    - For each package being installed or removed
    1.15 +
    1.16 +	- For each relevant property (provides, requires, conflicts,
    1.17 +          obsoletes):
    1.18 +
    1.19 +	    - Figure out what additional packages need to be added to
    1.20 +	      or removed from the system to satisfy this property
    1.21 +
    1.22 +    which ends up being roughly O(N^2 * M) where N is the total number of
    1.23 +    properties and M is the number of packages being acted on.
    1.24 +
    1.25 +(I just figured that out off the top of my head, and I'm not totally
    1.26 +familiar with the yum code, so it may be wrong.)
    1.27 +
    1.28 +Razor's depsolver is something like:
    1.29 +
    1.30 +    - do {
    1.31 +
    1.32 +	- For each property to be added to or removed from the system:
    1.33 +
    1.34 +	    - Figure out what packages need to be added to or removed
    1.35 +	      from the system to satisfy this property
    1.36 +
    1.37 +    - } until we stop adding or removing packages
    1.38 +
    1.39 +with the key being that it's very easy to find the PROVIDES
    1.40 +corresponding to a REQUIRES and vice versa, because the property
    1.41 +arrays are sorted, and so all properties with the same "name" will be
    1.42 +adjacent to one another in the array, allowing many dependencies to be
    1.43 +satisfied in essentially constant time. (Actually... we've been
    1.44 +calling it constant, but it's really O(log N) for heavily-depended-on
    1.45 +packages, because the more packages you have, the more variations on
    1.46 +"requires foo", "requires foo = 1.1", "requires foo &gt; 1.0", etc you're
    1.47 +going to have to scan through.)
    1.48 +
    1.49 +Ideally though, each iteration of the inner loop body happens in
    1.50 +constant time, and thus the inner loop as a whole is O(N), and thus
    1.51 +the depsolver as a whole is O(N * M) (or at least, no worse than
    1.52 +O(N * M * log N)).
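
To illustrate the sorted-array lookup described above, here is a minimal
Python sketch (not razor's actual C code). The PROVIDES list, its tuple
layout, and the provides_named helper are all made up for illustration.
The only point is that one binary search finds the first entry with a
given name, and every other variation on that name sits right next to it.

```python
from bisect import bisect_left

# Hypothetical stand-in for a razor property array: (name, relation, version)
# tuples kept sorted, so all properties with the same name are adjacent.
PROVIDES = sorted([
    ("bar", "=", "2.0"),
    ("foo", "=", "1.0"),
    ("foo", "=", "1.1"),
    ("libfoo.so.1", "", ""),
])


def provides_named(props, name):
    """Return the contiguous run of entries in props whose name matches."""
    # The 1-tuple (name,) sorts just before any (name, relation, version),
    # so bisect_left lands on the first entry with that name, if any.
    start = bisect_left(props, (name,))
    end = start
    # Scan forward over the adjacent variations on the same name.
    while end != len(props) and props[end][0] == name:
        end += 1
    return props[start:end]
```

The forward scan over same-named entries is the part that grows for
heavily-depended-on packages; the binary search itself only runs once
per lookup.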
    1.53 +
    1.54 +
    1.55 +FILE DEPENDENCIES
    1.56 +-----------------
    1.57 +
    1.58 +Whenever we add a package with a file REQUIRES to a razor_set, we also
    1.59 +add a PROVIDES for that file to the package or packages which provide
    1.60 +that file. This means that if we later add another package that
    1.61 +requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
    1.62 +its file requirement exactly like we would resolve a property
    1.63 +requirement, in nearly constant time.
    1.64 +
    1.65 +When adding a *new* file requirement (i.e., a requirement on a file that
    1.66 +no existing package depends on), we still have to search the
    1.67 +file tree, which is O(log N) in the number of files.
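
A rough Python sketch of that two-step lookup, under some loud
assumptions: FILES is a hypothetical sorted list standing in for the file
tree, file_provides is a hypothetical dict standing in for the file
PROVIDES already recorded, and each file has only one provider (the text
above allows for several).

```python
from bisect import bisect_left

# Hypothetical data: sorted (path, package) pairs standing in for the file
# tree, and a dict standing in for file PROVIDES that were already added.
FILES = sorted([
    ("/bin/sh", "bash"),
    ("/usr/bin/perl", "perl"),
    ("/usr/bin/python", "python"),
])
file_provides = {}


def resolve_file_requires(path):
    """Return the package providing path, or None if nothing provides it."""
    # Fast path: some earlier package already required this file, so a
    # PROVIDES for it exists and we resolve it like any other property.
    if path in file_provides:
        return file_provides[path]
    # Slow path: binary search of the sorted file list.
    i = bisect_left(FILES, (path,))
    if i != len(FILES) and FILES[i][0] == path:
        # Record the PROVIDES so the next requirer takes the fast path.
        file_provides[path] = FILES[i][1]
        return FILES[i][1]
    return None
```

The first package that requires /bin/sh pays for the search; later
packages requiring the same file hit the recorded PROVIDES.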
    1.68 +
    1.69 +(AFAICT, there's no reason yum couldn't do the same optimization.
    1.70 +Also, AFAICT, yum currently sticks property dependencies and file
    1.71 +dependencies into the same hash table, so that if any package in the
    1.72 +transaction has a file dependency, it causes *property* dependencies
    1.73 +to become slower to resolve as well...)
    1.74 +
    1.75 +
    1.76 +THE RULES
    1.77 +---------
    1.78 +
    1.79 +This is what we have figured out for transaction-solving rules;
    1.80 +neither yum nor rpm's algorithm seems to be explained in full
    1.81 +anywhere...
    1.82 +
    1.83 +    1. Every requested install in the initial package set must be
    1.84 +       satisfied as either a new install or an update:
    1.85 +
    1.86 +	- if the requested package name is the name of an upstream
    1.87 +          package:
    1.88 +
    1.89 +	    - if there is not a corresponding already-installed
    1.90 +              package, then install the upstream package
    1.91 +
    1.92 +	    - else if the upstream package is newer than the
    1.93 +              already-installed package, then update the package
    1.94 +
    1.95 +	    - else it's an error (UP_TO_DATE)
    1.96 +
    1.97 +	- else if the requested package name is the name of an
    1.98 +          already-installed package:
    1.99 +
   1.100 +	    - if there is an upstream package that obsoletes the
   1.101 +              already-installed package, then behave as though the
   1.102 +              user had requested that that package be installed
   1.103 +              instead.
   1.104 +
   1.105 +	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
   1.106 +
   1.107 +	- else it's an error (INSTALL_UNAVAILABLE)
   1.108 +
   1.109 +    2. Every requested removal in the initial package set must be
   1.110 +       satisfied as a removal. If any requested package name is not
   1.111 +       the name of an installed package, it's an error
   1.112 +       (REMOVE_NOT_INSTALLED)
   1.113 +
   1.114 +    REQUIRES processing:
   1.115 +
   1.116 +    3. If a package being installed or updated-to REQUIRES a property
   1.117 +       that is not provided by any installed or to-be-installed
   1.118 +       package, we need to find an installable package that provides
   1.119 +       that property. If we find one, install/update it. If not, it's
   1.120 +       an error (UNSATISFIABLE). (If we find an upstream package
   1.121 +       providing the property that corresponds to a system package
   1.122 +       that's being removed, then it's a CONTRADICTION.)
   1.123 +
   1.124 +    4. If an already-installed package REQUIRES a property which is
   1.125 +       only provided by a package that is being removed, then that
   1.126 +       package needs to be removed as well.
   1.127 +
   1.128 +    5. If an already-installed package REQUIRES a property which is
   1.129 +       only provided by a package that is being upgraded or obsoleted
   1.130 +       (to a new package which does not provide that property), then:
   1.131 +
   1.132 +	- if there is an update for the installed package, then update
   1.133 +          the installed package
   1.134 +
   1.135 +	- else if there is another installable package that provides
   1.136 +          the required property, then install that.
   1.137 +
   1.138 +	- else it's an error (UNSATISFIABLE?)
   1.139 +
   1.140 +    CONFLICTS processing:
   1.141 +
   1.142 +    6. If a package being installed or updated-to CONFLICTS with a
   1.143 +       property provided by an installed package:
   1.144 +
   1.145 +	- if there is an update for the installed package, which the
   1.146 +          new package does not conflict with, then update the
   1.147 +          installed package.
   1.148 +
   1.149 +	- else it's an error (NEW_CONFLICT)
   1.150 +
   1.151 +    7. If an already-installed package CONFLICTS with a property
   1.152 +       provided by a to-be-installed package:
   1.153 +
   1.154 +	- if there is an update for the installed package, which does
   1.155 +          not conflict with the new package, then update the installed
   1.156 +          package.
   1.157 +
   1.158 +	- else it's an error (OLD_CONFLICT)
   1.159 +
   1.160 +    8. If a package being installed or updated-to CONFLICTS with a
   1.161 +       property provided by a to-be-installed package, then it's an
   1.162 +       error (CONTRADICTION).
   1.163 +
   1.164 +    OBSOLETES processing. NOTE: OBSOLETES are only matched against
   1.165 +    package names, not against arbitrary provided properties.
   1.166 +
   1.167 +    9. If a package being installed or updated-to OBSOLETES an
   1.168 +       installed package, then obsolete that package. (ie, remove it,
   1.169 +       but treat it as updated for purposes of dangling REQUIRES).
   1.170 +
   1.171 +   10. If an already-installed package OBSOLETES a to-be-installed
   1.172 +       package, then it's an error. (ALREADY_OBSOLETE)
   1.173 +
   1.174 +   11. If a package being installed or updated-to OBSOLETES another
   1.175 +       package being installed or updated-to, then it's an error
   1.176 +       (CONTRADICTION).
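
As a worked example of rule 1, here is a minimal Python sketch of the
install/update decision. Everything about the data layout is an
assumption: installed and upstream are plain dicts from package name to a
version tuple, obsoleted_by maps an installed name to the upstream package
that obsoletes it, and version comparison is plain tuple comparison rather
than real RPM version ordering.

```python
def classify_install(name, installed, upstream, obsoleted_by):
    """Apply rule 1 to one requested install; return (action, detail)."""
    if name in upstream:
        if name not in installed:
            return ("INSTALL", name)
        if upstream[name] > installed[name]:
            return ("UPDATE", name)
        return ("ERROR", "UP_TO_DATE")
    if name in installed:
        if name in obsoleted_by:
            # Behave as though the obsoleting package had been requested.
            return classify_install(obsoleted_by[name], installed,
                                    upstream, obsoleted_by)
        # The rules leave this open (UP_TO_DATE or INSTALL_UNAVAILABLE?);
        # this sketch arbitrarily picks UP_TO_DATE.
        return ("ERROR", "UP_TO_DATE")
    return ("ERROR", "INSTALL_UNAVAILABLE")


# Hypothetical package sets for illustration.
installed = {"foo": (1, 0), "old": (1, 0), "cur": (2, 0)}
upstream = {"foo": (1, 1), "cur": (2, 0), "new": (3, 0), "fresh": (1, 0)}
obsoleted_by = {"old": "new"}
```

With these sets, requesting "fresh" is a new install, "foo" is an update,
"cur" is UP_TO_DATE, "old" turns into an install of "new", and an unknown
name is INSTALL_UNAVAILABLE.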
   1.177 +
   1.178 +
   1.179 +
   1.180 +THE DEPSOLVER
   1.181 +-------------
   1.182 +
   1.183 +We start with two razor_sets, system and upstream, and a list of
   1.184 +requested installations and removals.
   1.185 +
   1.186 +    FIXME: what about multiple upstream repos? Having to deal with
   1.187 +    arbitrary numbers of razor_sets is possible, but will probably be
   1.188 +    messy... It might be easier to either store all upstream repo data
   1.189 +    in a single .rzdb file, or else merge all upstream .rzdb files
   1.190 +    together into a single razor_set at startup. (Or some combination
   1.191 +    of those.)
   1.192 +
   1.193 +We create a bit array of the packages in each set, indicating which
   1.194 +ones are installed; the system bit array starts out all 1s, and the
   1.195 +upstream bit array all 0s. Each bit is only allowed to change state
   1.196 +once during the transaction; an installed package can be removed, or
   1.197 +an uninstalled package installed, but trying to reinstall a removed
   1.198 +package, or uninstall a newly-installed package is an error. This
   1.199 +means the packages break down into four categories:
   1.200 +
   1.201 +    - installed       (1 bit in the system bit array)
   1.202 +    - to-be-removed   (0 bit in the system bit array)
   1.203 +    - to-be-installed (1 bit in the upstream bit array)
   1.204 +    - installable     (0 bit in the upstream bit array)
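
A small Python sketch of the bit arrays and the change-once rule. The
PackageBits class and the category helper are hypothetical names invented
for this example; razor's real data structures are C and will look
different.

```python
class PackageBits:
    """One bit per package, where each bit may change at most once."""

    def __init__(self, count, initial):
        self.bits = [initial] * count
        self.changed = [False] * count

    def flip(self, index):
        # Reinstalling a removed package, or removing a newly installed
        # one, would flip the same bit twice, which is an error.
        if self.changed[index]:
            raise ValueError("bit %d already changed" % index)
        self.bits[index] = 1 - self.bits[index]
        self.changed[index] = True


def category(bits, index, is_system):
    """Name the category of one package from its bit."""
    if is_system:
        return "installed" if bits.bits[index] else "to-be-removed"
    return "to-be-installed" if bits.bits[index] else "installable"


system = PackageBits(3, 1)     # system set starts out all 1s
upstream = PackageBits(2, 0)   # upstream set starts out all 0s
system.flip(0)                 # installed becomes to-be-removed
upstream.flip(1)               # installable becomes to-be-installed
```

Calling system.flip(0) a second time raises ValueError, which is the
"only allowed to change state once" rule in action.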
   1.205 +
   1.206 +
   1.207 +Depsolver algorithm:
   1.208 +
   1.209 +    - Create new razor_transaction_packages ("rtp"s) for each
   1.210 +      requested install or remove. These will be "unresolved", because
   1.211 +      we haven't yet found the razor_packages that correspond to them.
   1.212 +
   1.213 +    - while there are new rtps:
   1.214 +
   1.215 +	- sort the new rtps
   1.216 +
   1.217 +	- Walk the system property list, upstream property list, and
   1.218 +          new rtp list in parallel, and:
   1.219 +
   1.220 +	    - For each uninstalled (installable) PROVIDES:
   1.221 +
   1.222 +		- If the property is a valid package name (that is,
   1.223 +                  either it's a package providing its own name, or it
   1.224 +                  has a matching OBSOLETES), and it matches the name
   1.225 +                  of a new rtp of type INSTALL or FORCED_UPDATE with
   1.226 +                  an unresolved new_package:
   1.227 +
   1.228 +		    - If the upstream package has the same version as
   1.229 +		      the system package, we have an UP_TO_DATE error
   1.230 +		      (FIXME: not quite right. This doesn't deal with
   1.231 +		      the case where we try to update an application
   1.232 +		      because of a library update, and it turns out
   1.233 +		      there's no new version of the application, but
   1.234 +		      there IS a compat package containing the old
   1.235 +		      version of the library.)
   1.236 +
   1.237 +		    - Otherwise, set the rtp's new_package to point to
   1.238 +		      the package providing this property and set the
   1.239 +		      appropriate bit in the upstream bit array.
   1.240 +
   1.241 +	    - For each to-be-installed non-file REQUIRES:
   1.242 +
   1.243 +		- See if there's an installed or to-be-installed
   1.244 +		  package that PROVIDES that property.
   1.245 +
   1.246 +		- If not, see if there's an installable package that
   1.247 +		  PROVIDES that property, and create a new INSTALL rtp
   1.248 +		  for it if so.
   1.249 +
   1.250 +		- If not, see if there's a to-be-removed package that
   1.251 +		  PROVIDES that property. (If we find such a package,
   1.252 +		  we have a CONTRADICTION error.)
   1.253 +
   1.254 +		- If none of the above, then we have an UNSATISFIABLE
   1.255 +                  error
   1.256 +
   1.257 +	    - For each to-be-installed file REQUIRES:
   1.258 +
   1.259 +		- (We create fake file PROVIDES to match file REQUIRES
   1.260 +                  when importing/merging razor sets, so if there is
   1.261 +                  already another installed package that REQUIRES this
   1.262 +                  file, there will be a PROVIDES listed for it as well.)
   1.263 +
   1.264 +		- See if there's an installed package that PROVIDES
   1.265 +                  that file.
   1.266 +
   1.267 +		- If not, do a binary search of the system file tree
   1.268 +                  looking to see if some installed package provides
   1.269 +                  that file but does not have a PROVIDES for it.
   1.270 +
   1.271 +		- If not, see if there's an installable package that
   1.272 +		  PROVIDES that property, and create a new INSTALL rtp
   1.273 +		  for it if so.
   1.274 +
   1.275 +		- (If we actually work with multiple upstream
   1.276 +                  razor_sets, then we will need to search the upstream
   1.277 +                  file trees at this point, because it's possible that
   1.278 +                  a package in one upstream repo would require a file
   1.279 +                  in another upstream repo. But if we merge the
   1.280 +                  multiple upstream repos into a single razor_set at
   1.281 +                  some point, then we would not need to do that,
   1.282 +                  because it would be guaranteed that we would have
   1.283 +                  already created a fake PROVIDES if any package
   1.284 +                  provides the file.)
   1.285 +
   1.286 +		- If no installed or installable package provides the
   1.287 +		  file, see if there's a to-be-removed package that
   1.288 +		  provides the file. (If we find such a package, we
   1.289 +		  have a CONTRADICTION error.)
   1.290 +
   1.291 +		- If none of the above, then we have an UNSATISFIABLE
   1.292 +                  error
   1.293 +
   1.294 +	    - For each to-be-installed PROVIDES:
   1.295 +
   1.296 +		- Check if the new PROVIDES conflicts with an
   1.297 +		  installed CONFLICTS. If so, create a new
   1.298 +		  FORCED_UPDATE rtp for the installed package, so we
   1.299 +		  can try to upgrade it to a non-conflicting version.
   1.300 +		  (If we can't, we'll have an OLD_CONFLICT error.)
   1.301 +
   1.302 +		- Check if the new PROVIDES matches an
   1.303 +                  installed OBSOLETES *and* the PROVIDES property
   1.304 +                  corresponds to the name of its package. (That is,
   1.305 +                  OBSOLETES are only matched against package names,
   1.306 +                  not arbitrary provided properties.) If so, we have
   1.307 +                  an ALREADY_OBSOLETE error.
   1.308 +
   1.309 +		- Check if the new PROVIDES conflicts with a
   1.310 +		  to-be-installed CONFLICTS. If so, we have a
   1.311 +		  CONTRADICTION error.
   1.312 +
   1.313 +	    - For each to-be-installed CONFLICTS:
   1.314 +
   1.315 +		- Basically the reverse of the previous case: check if
   1.316 +		  the new CONFLICTS conflicts with an installed
   1.317 +		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
   1.318 +		  the installed package, so we can try to upgrade it
   1.319 +		  to a non-conflicting version. (If we can't, we'll
   1.320 +		  have a NEW_CONFLICT error.)
   1.321 +
   1.322 +		- Check if the new CONFLICTS conflicts with a
   1.323 +		  to-be-installed PROVIDES. If so, we have a
   1.324 +		  CONTRADICTION error.
   1.325 +
   1.326 +	    - For each to-be-installed OBSOLETES:
   1.327 +
   1.328 +		- Check if there's an installed package that PROVIDES
   1.329 +		  that property. If so, create an OBSOLETED rtp for
   1.330 +		  the installed package.
   1.331 +
   1.332 +		- If not, check if there's a to-be-installed package
   1.333 +		  that PROVIDES that property. If so, we have a
   1.334 +		  CONTRADICTION error.
   1.335 +
   1.336 +
   1.337 +	    - For each installed PROVIDES:
   1.338 +
   1.339 +		- If the property is a valid package name (that is,
   1.340 +                  it's a package providing its own name), and it
   1.341 +                  matches the name of a new rtp with an unresolved
   1.342 +                  old_package, then set the rtp's old_package to point
   1.343 +                  to the package providing this property and clear the
   1.344 +                  appropriate bit in the system bit array.
   1.345 +
   1.346 +	    - For each to-be-removed PROVIDES:
   1.347 +
   1.348 +		- If there's also an identical to-be-installed
   1.349 +		  PROVIDES, we're ok and can skip this
   1.350 +
   1.351 +		- Otherwise, for each installed REQUIRES of this
   1.352 +                  property:
   1.353 +
   1.354 +		    - Look for some other installed or to-be-installed
   1.355 +		      property that satisfies the REQUIRES.
   1.356 +
   1.357 +		    - If there isn't one, then for each installed
   1.358 +		      package in this REQUIRES's package list:
   1.359 +
   1.360 +			- If the PROVIDES was lost because the old
   1.361 +			  package was REMOVEd (not FORCED_UPDATE or
   1.362 +			  OBSOLETED), then create a new REMOVE rtp for
   1.363 +			  this package.
   1.364 +
   1.365 +			- Otherwise, create a new FORCED_UPDATE rtp
   1.366 +                          for this package.
   1.367 +
   1.368 +		- (We don't need to look at to-be-installed REQUIRES
   1.369 +		  of this property, because if there are any, they
   1.370 +		  will cause a CONTRADICTION error when we try to
   1.371 +		  re-satisfy them the next time through.)
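
The outer "while there are new rtps" loop is a fixed-point iteration, and
a heavily simplified Python sketch may make its shape clearer. This
version only handles REQUIRES for installs (no CONFLICTS, OBSOLETES,
removals, or versions), and every name in it (solve_installs,
upstream_requires, provided_by) is an assumption made up for the example.

```python
def solve_installs(requested, upstream_requires, provided_by, installed):
    """Return the set of packages to install; raise if UNSATISFIABLE."""
    to_install = set()
    new = sorted(requested)
    while new:                        # "while there are new rtps"
        batch, new = new, []
        for pkg in batch:
            if pkg in to_install:
                continue
            to_install.add(pkg)
            for prop in upstream_requires.get(pkg, []):
                provider = provided_by.get(prop)
                if provider is None:
                    raise LookupError("UNSATISFIABLE: " + prop)
                # Only queue providers that are neither installed nor
                # already scheduled; this is what makes the loop stop.
                if provider not in installed and provider not in to_install:
                    new.append(provider)
        new.sort()                    # "sort the new rtps"
    return to_install


# Hypothetical repository data for illustration.
upstream_requires = {"app": ["libfoo"], "libfoo-pkg": ["libc"]}
provided_by = {"libfoo": "libfoo-pkg", "libc": "glibc"}
result = solve_installs(["app"], upstream_requires, provided_by, {"glibc"})
```

Each pass only looks at the packages added by the previous pass, so the
loop ends as soon as a pass adds nothing new.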
   1.372 +  </para>
   1.373 +</chapter>