docs/DEPSOLVE.txt
changeset 300 455eaa569767
child 310 9a7691262ce6
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/docs/DEPSOLVE.txt	Sun Jun 29 10:55:17 2008 +0100
     1.3 @@ -0,0 +1,364 @@
     1.4 +YUM vs RAZOR
     1.5 +------------
     1.6 +
     1.7 +At a very high level, yum's depsolver does something roughly
     1.8 +equivalent to:
     1.9 +
    1.10 +    - For each package being installed or removed
    1.11 +
    1.12 +	- For each relevant property (provides, requires, conflicts,
    1.13 +          obsoletes):
    1.14 +
    1.15 +	    - Figure out what additional packages need to be added to
    1.16 +	      or removed from the system to satisfy this property
    1.17 +
    1.18 +which ends up being roughly O(N^2 * M) where N is the total number of
    1.19 +properties and M is the number of packages being acted on.
    1.20 +
    1.21 +(I just figured that out off the top of my head, and I'm not totally
    1.22 +familiar with the yum code, so it may be wrong.)
    1.23 +
    1.24 +Razor's depsolver is something like:
    1.25 +
    1.26 +    - do {
    1.27 +
    1.28 +	- For each property to be added to or removed from the system:
    1.29 +
    1.30 +	    - Figure out what packages need to be added to or removed
    1.31 +	      from the system to satisfy this property
    1.32 +
    1.33 +    - } until we stop adding/remove more packages
    1.34 +
    1.35 +with the key being that it's very easy to find the PROVIDES
    1.36 +corresponding to a REQUIRES and vice versa, because the property
    1.37 +arrays are sorted, and so all properties with the same "name" will be
    1.38 +adjacent to one another in the array, allowing many dependencies to be
    1.39 +satisified in essentially constant time. (Actually... we've been
    1.40 +calling it constant, but it's really O(log N) for heavily-depended-on
    1.41 +packages, because the more packages you have, the more variations on
    1.42 +"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
    1.43 +going to have to scan through.)
    1.44 +
    1.45 +Ideally though, each iteration of the inner loop body happens in
    1.46 +constant time, and thus the inner loop as a whole is O(N), and thus
    1.47 +the depsolver as a whole is O(N * M) (or at least, less than
    1.48 +O(N * M * log N).
    1.49 +
    1.50 +
    1.51 +FILE DEPENDENCIES
    1.52 +-----------------
    1.53 +
    1.54 +Whenever we add a package with a file REQUIRES to a razor_set, we also
    1.55 +add a PROVIDES for that file to the package or packages which provide
    1.56 +that file. This means that if we later add another package that
    1.57 +requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
    1.58 +its file requirement exactly like we would resolve a property
    1.59 +requirement, in nearly constant time.
    1.60 +
    1.61 +When adding a *new* file requirement (ie, a requirement on a file that
    1.62 +no existing package depends on), we still have to scan through the
    1.63 +file tree, which is O(log N) in the number of files.
    1.64 +
    1.65 +(AFAICT, there's no reason yum couldn't do the same optimization.
    1.66 +Also, AFAICT, yum currently sticks property dependencies and file
    1.67 +dependencies into the same hash table, so that if any package in the
    1.68 +transaction has a file dependency, it causes *property* dependencies
    1.69 +to become slower to resolve as well...)
    1.70 +
    1.71 +
    1.72 +THE RULES
    1.73 +---------
    1.74 +
    1.75 +This is what we have figured out for transaction-solving rules;
    1.76 +neither yum nor rpm's algorithm seems to be explained in full
    1.77 +anywhere...
    1.78 +
    1.79 +    1. Every requested install in the initial package set must be
    1.80 +       satisfied as either a new install or an update:
    1.81 +
    1.82 +	- if the requested package name is the name of an upstream
    1.83 +          package:
    1.84 +
    1.85 +	    - if there is not a corresponding already-installed
    1.86 +              package, then install the upstream package
    1.87 +
    1.88 +	    - else if the upstream package is newer than the
    1.89 +              already-installed package, then update the package
    1.90 +
    1.91 +	    - else it's an error (UP_TO_DATE)
    1.92 +
    1.93 +	- else if the requested package name is the name of an
    1.94 +          already-installed package:
    1.95 +
    1.96 +	    - if there is an upstream package that obsoletes the
    1.97 +              already-installed package, then behave as though the
    1.98 +              user had requested that that package be installed
    1.99 +              instead.
   1.100 +
   1.101 +	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
   1.102 +
   1.103 +	- else it's an error (INSTALL_UNAVAILABLE)
   1.104 +
   1.105 +    2. Every requested removal in the initial package set must be
   1.106 +       satisfied as a removal. If any requested package name is not
   1.107 +       the name of an installed package, it's an error
   1.108 +       (REMOVE_NOT_INSTALLED)
   1.109 +
   1.110 +    REQUIRES processing:
   1.111 +
   1.112 +    3. If a package being installed or updated-to REQUIRES a property
   1.113 +       that is not provided by any installed or to-be-installed
   1.114 +       package, we need to find an installable package that provides
   1.115 +       that property. If we find one, install/update it. If not, it's
   1.116 +       an error (UNSATISFIABLE). (If we find an upstream package
   1.117 +       providing the property that corresponds to a system package
   1.118 +       that's being removed, then it's a CONTRADICTION.)
   1.119 +
   1.120 +    4. If an already-installed package REQUIRES a property which is
   1.121 +       only provided by a package that is being removed, then that
   1.122 +       package needs to be removed as well.
   1.123 +
   1.124 +    5. If an already-installed package REQUIRES a property which is
   1.125 +       only provided by a package that is being upgraded or obsoleted
   1.126 +       (to a new package which does not provide that property), then:
   1.127 +
   1.128 +	- if there is an update for the installed package, then update
   1.129 +          the installed package
   1.130 +
   1.131 +	- else if there is another installable package that provides
   1.132 +          the required property, then install that.
   1.133 +
   1.134 +	- else it's an error (UNSATISFIABLE?)
   1.135 +
   1.136 +    CONFLICTS processing
   1.137 +
   1.138 +    6. If a package being installed or updated-to CONFLICTS with a
   1.139 +       property provided by an installed package:
   1.140 +
   1.141 +	- if there is an update for the installed package, which the
   1.142 +          new package does not conflict with, then update the
   1.143 +          installed package.
   1.144 +
   1.145 +	- else it's an error (NEW_CONFLICT)
   1.146 +
   1.147 +    7. If an already-installed package CONFLICTS with a property
   1.148 +       provided by a to-be-installed package:
   1.149 +
   1.150 +	- if there is an update for the installed package, which does
   1.151 +          not conflict with the new package, then update the installed
   1.152 +          package.
   1.153 +
   1.154 +	- else it's an error (NEW_CONFLICT)
   1.155 +
   1.156 +    8. If a package being installed or updated-to CONFLICTS with a
   1.157 +       property provided by a to-be-installed package, then it's an
   1.158 +       error (CONTRADICTION).
   1.159 +
   1.160 +    OBSOLETES processing. NOTE: OBSOLETES are only matched against
   1.161 +    package names, not against arbitrary provided properties
   1.162 +
   1.163 +    9. If a package being installed or updated-to OBSOLETES an
   1.164 +       installed package, then obsolete that package. (ie, remove it,
   1.165 +       but treat it as updated for purposes of dangling REQUIRES).
   1.166 +
   1.167 +   10. If an already-installed package OBSOLETES a to-be-installed
   1.168 +       package, then it's an error. (ALREADY_OBSOLETE)
   1.169 +
   1.170 +   11. If a package being installed or updated-to OBSOLETES another
   1.171 +       package being installed or updated-to, then it's an error
   1.172 +       (CONTRADICTION).
   1.173 +
   1.174 +
   1.175 +
   1.176 +THE DEPSOLVER
   1.177 +-------------
   1.178 +
   1.179 +We start with two razor_sets, system and upstream, and a list of
   1.180 +requested installations and removals.
   1.181 +
   1.182 +    FIXME: what about multiple upstream repos? Having to deal with
   1.183 +    arbitrary numbers of razor_sets is possible, but will probably be
   1.184 +    messy... It might be easier to either store all upstream repo data
   1.185 +    in a single .repo file, or else merge all upstream .repo files
   1.186 +    together into a single razor_set at startup. (Or some combination
   1.187 +    of those.)
   1.188 +
   1.189 +We create a bit array of the packages in each set, indicating which
   1.190 +ones are installed; the system bitarray starts out all 1s, and the
   1.191 +upstream bitarray all 0s. Each bit is only allowed to change state
   1.192 +once during the transaction; an installed package can be removed, or
   1.193 +an uninstalled package installed, but trying to reinstall a removed
   1.194 +package, or uninstall a newly-installed package is an error. This
   1.195 +means the packages break down into four categories:
   1.196 +
   1.197 +    - installed       (1 bit in the system bit array)
   1.198 +    - to-be-removed   (0 bit in the system bit array)
   1.199 +    - to-be-installed (1 bit in the upstream bit array)
   1.200 +    - installable     (0 bit in the upstream bit array)
   1.201 +
   1.202 +
   1.203 +Depsolver algorithm:
   1.204 +
   1.205 +    - Create new razor_transaction_packages ("rtp"s) for each
   1.206 +      requested install or remove. These will be "unresolved", because
   1.207 +      we haven't yet found the razor_packages that correspond to them.
   1.208 +
   1.209 +    - while there are new rtps:
   1.210 +
   1.211 +	- sort the new rtps
   1.212 +
   1.213 +	- Walk the system property list, upstream property list, and
   1.214 +          new rtp list in parallel, and:
   1.215 +
   1.216 +	    - For each uninstalled PROVIDES:
   1.217 +
   1.218 +		- If the property is a valid package name (that is,
   1.219 +                  either it's a package providing its own name, or it
   1.220 +                  has a matching OBSOLETES), and it matches the name
   1.221 +                  of a new rtp of type INSTALL or FORCED_UPDATE with
   1.222 +                  an unresolved new_package:
   1.223 +
   1.224 +		    - If the upstream package has the same version as
   1.225 +		      the system package, we have an UP_TO_DATE error
   1.226 +		      (FIXME: not quite right. This doesn't deal with
   1.227 +		      the case where we try to update an application
   1.228 +		      because of a library update, and it turns out
   1.229 +		      there's no new version of the application, but
   1.230 +		      there IS a compat package containing the old
   1.231 +		      version of the library.)
   1.232 +
   1.233 +		    - Otherwise, set the rtp's new_package to point to
   1.234 +		      the package providing this property and set the
   1.235 +		      appropriate bit in the upstream bit array.
   1.236 +
   1.237 +	    - For each to-be-installed non-file REQUIRES:
   1.238 +
   1.239 +		- See if there's an installed or to-be-installed
   1.240 +		  package that PROVIDES that property.
   1.241 +
   1.242 +		- If not, see if there's an installable package that
   1.243 +		  PROVIDES that property, and create a new INSTALL rtp
   1.244 +		  for it if so.
   1.245 +
   1.246 +		- If not, see if there's a to-be-removed package that
   1.247 +		  PROVIDES that property. (If we find such a package,
   1.248 +		  we have a CONTRADICTION error.)
   1.249 +
   1.250 +		- If none of the above, then we have an UNSATISFIABLE
   1.251 +                  error
   1.252 +
   1.253 +	    - For each to-be-installed file REQUIRES:
   1.254 +
   1.255 +		- (We create fake file PROVIDES to match file REQUIRES
   1.256 +                  when importing/merging razor sets, so if there is
   1.257 +                  already another installed package that REQUIRES this
   1.258 +                  file, there will be a PROVIDES listed for it as well.)
   1.259 +
   1.260 +		- See if there's an installed package that PROVIDES
   1.261 +                  that file.
   1.262 +
   1.263 +		- If not, do a binary search of the system file tree
   1.264 +                  looking to see if some installed package provides
   1.265 +                  that file but does not have a PROVIDES for it.
   1.266 +
   1.267 +		- If not, see if there's an installable package that
   1.268 +		  PROVIDES that property, and create a new INSTALL rtp
   1.269 +		  for it if so.
   1.270 +
   1.271 +		- (If we actually work with multiple upstream
   1.272 +                  razor_sets, then we will need to search the upstream
   1.273 +                  file trees at this point, because it's possible that
   1.274 +                  a package in one upstream repo would require a file
   1.275 +                  in another upstream repo. But if we merge the
   1.276 +                  multiple upstream repos into a single razor_set at
   1.277 +                  some point, then we would not need to do that,
   1.278 +                  because it would be guaranteed that we would have
   1.279 +                  already created a fake PROVIDES if any package
   1.280 +                  provides the file.)
   1.281 +
   1.282 +		- If no installed or installable package provides the
   1.283 +		  file, see if there's a to-be-removed package that
   1.284 +		  provides the file. (If we find such a package, we
   1.285 +		  have a CONTRADICTION error.)
   1.286 +
   1.287 +		- If none of the above, then we have an UNSATISFIABLE
   1.288 +                  error
   1.289 +
   1.290 +	    - For each to-be-installed PROVIDES:
   1.291 +
   1.292 +		- Check if the new PROVIDES conflicts with an
   1.293 +		  installed CONFLICTS. If so, create a new
   1.294 +		  FORCED_UPDATE rtp for the installed package, so we
   1.295 +		  can try to upgrade it to a non-conflicting version.
   1.296 +		  (If we can't, we'll have an OLD_CONFLICT error.)
   1.297 +
   1.298 +		- Check if the new PROVIDES conflicts with an
   1.299 +                  installed OBSOLETES *and* the PROVIDES property
   1.300 +                  corresponds to the name of its package. (That is,
   1.301 +                  OBSOLETES are only matched against package names,
   1.302 +                  not arbitrary provided properties.) If so, we have
   1.303 +                  an ALREADY_OBSOLETE error.
   1.304 +
   1.305 +		- Check if the new PROVIDES conflicts with a
   1.306 +		  to-be-installed CONFLICTS. If so, we have a
   1.307 +		  CONTRADICTION error.
   1.308 +
   1.309 +	    - For each to-be-installed CONFLICTS:
   1.310 +
   1.311 +		- Basically the reverse of the previous case: check if
   1.312 +		  the new CONFLICTS conflicts with an installed
   1.313 +		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
   1.314 +		  the installed package, so we can try to upgrade it
   1.315 +		  to a non-conflicting version. (If we can't, we'll
   1.316 +		  have an NEW_CONFLICT error.)
   1.317 +
   1.318 +		- Check if the new CONFLICTS conflicts with a
   1.319 +		  to-be-installed PROVIDES. If so, we have a
   1.320 +		  CONTRADICTION error.
   1.321 +
   1.322 +	    - For each to-be-installed OBSOLETES:
   1.323 +
   1.324 +		- Check if there's an installed package that PROVIDES
   1.325 +		  that property. If so, create an OBSOLETED rtp for
   1.326 +		  the installed package.
   1.327 +
   1.328 +		- If not, check if there's a to-be-installed package
   1.329 +		  that PROVIDES that property. If so, we have a
   1.330 +		  CONTRADICTION error.
   1.331 +
   1.332 +
   1.333 +	    - For each installed PROVIDES:
   1.334 +
   1.335 +		- If the property is a valid package name (that is,
   1.336 +                  it's a package providing its own name), and it
   1.337 +                  matches the name of a new rtp with an unresolved
   1.338 +                  old_package, then set the rtp's old_package to point
   1.339 +                  to the package providing this property and clear the
   1.340 +                  appropriate bit in the system bit array.
   1.341 +
   1.342 +	    - For each to-be-removed PROVIDES:
   1.343 +
   1.344 +		- If there's also an identical to-be-installed
   1.345 +		  PROVIDES, we're ok and can skip this
   1.346 +
   1.347 +		- Otherwise, for each installed REQUIRES of this
   1.348 +                  property:
   1.349 +
   1.350 +		    - Look for some other installed or to-be-installed
   1.351 +		      property that satisfies the REQUIRES.
   1.352 +
   1.353 +		    - If there isn't one, then for each installed
   1.354 +		      package in this REQUIRES's package list:
   1.355 +
   1.356 +			- If the PROVIDES was lost because the old
   1.357 +			  package was REMOVEd (not FORCED_UPDATE or
   1.358 +			  OBSOLETED), then create a new REMOVE rtp for
   1.359 +			  this package.
   1.360 +
   1.361 +			- Otherwise, create a new FORCED_UPDATE rtp
   1.362 +                          for this package.
   1.363 +
   1.364 +		- (We don't need to look at to-be-installed REQUIRES
   1.365 +		  of this property, because if there are any, they
   1.366 +		  will cause a CONTRADICTION error when we try to
   1.367 +		  re-satisfy them the next time through.)