docs/DEPSOLVE.txt
changeset 311 cd292c2de0b6
parent 310 9a7691262ce6
child 312 068a7429db5d
     1.1 --- a/docs/DEPSOLVE.txt	Wed Jul 02 18:46:47 2008 +0100
     1.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.3 @@ -1,364 +0,0 @@
     1.4 -YUM vs RAZOR
     1.5 -------------
     1.6 -
     1.7 -At a very high level, yum's depsolver does something roughly
     1.8 -equivalent to:
     1.9 -
    1.10 -    - For each package being installed or removed
    1.11 -
    1.12 -	- For each relevant property (provides, requires, conflicts,
    1.13 -          obsoletes):
    1.14 -
    1.15 -	    - Figure out what additional packages need to be added to
    1.16 -	      or removed from the system to satisfy this property
    1.17 -
    1.18 -which ends up being roughly O(N^2 * M) where N is the total number of
    1.19 -properties and M is the number of packages being acted on.
    1.20 -
    1.21 -(I just figured that out off the top of my head, and I'm not totally
    1.22 -familiar with the yum code, so it may be wrong.)
    1.23 -
    1.24 -Razor's depsolver is something like:
    1.25 -
    1.26 -    - do {
    1.27 -
    1.28 -	- For each property to be added to or removed from the system:
    1.29 -
    1.30 -	    - Figure out what packages need to be added to or removed
    1.31 -	      from the system to satisfy this property
    1.32 -
    1.33 -    - } until we stop adding/remove more packages
    1.34 -
    1.35 -with the key being that it's very easy to find the PROVIDES
    1.36 -corresponding to a REQUIRES and vice versa, because the property
    1.37 -arrays are sorted, and so all properties with the same "name" will be
    1.38 -adjacent to one another in the array, allowing many dependencies to be
    1.39 -satisified in essentially constant time. (Actually... we've been
    1.40 -calling it constant, but it's really O(log N) for heavily-depended-on
    1.41 -packages, because the more packages you have, the more variations on
    1.42 -"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
    1.43 -going to have to scan through.)
    1.44 -
    1.45 -Ideally though, each iteration of the inner loop body happens in
    1.46 -constant time, and thus the inner loop as a whole is O(N), and thus
    1.47 -the depsolver as a whole is O(N * M) (or at least, less than
    1.48 -O(N * M * log N).
    1.49 -
    1.50 -
    1.51 -FILE DEPENDENCIES
    1.52 ------------------
    1.53 -
    1.54 -Whenever we add a package with a file REQUIRES to a razor_set, we also
    1.55 -add a PROVIDES for that file to the package or packages which provide
    1.56 -that file. This means that if we later add another package that
    1.57 -requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
    1.58 -its file requirement exactly like we would resolve a property
    1.59 -requirement, in nearly constant time.
    1.60 -
    1.61 -When adding a *new* file requirement (ie, a requirement on a file that
    1.62 -no existing package depends on), we still have to scan through the
    1.63 -file tree, which is O(log N) in the number of files.
    1.64 -
    1.65 -(AFAICT, there's no reason yum couldn't do the same optimization.
    1.66 -Also, AFAICT, yum currently sticks property dependencies and file
    1.67 -dependencies into the same hash table, so that if any package in the
    1.68 -transaction has a file dependency, it causes *property* dependencies
    1.69 -to become slower to resolve as well...)
    1.70 -
    1.71 -
    1.72 -THE RULES
    1.73 ----------
    1.74 -
    1.75 -This is what we have figured out for transaction-solving rules;
    1.76 -neither yum nor rpm's algorithm seems to be explained in full
    1.77 -anywhere...
    1.78 -
    1.79 -    1. Every requested install in the initial package set must be
    1.80 -       satisfied as either a new install or an update:
    1.81 -
    1.82 -	- if the requested package name is the name of an upstream
    1.83 -          package:
    1.84 -
    1.85 -	    - if there is not a corresponding already-installed
    1.86 -              package, then install the upstream package
    1.87 -
    1.88 -	    - else if the upstream package is newer than the
    1.89 -              already-installed package, then update the package
    1.90 -
    1.91 -	    - else it's an error (UP_TO_DATE)
    1.92 -
    1.93 -	- else if the requested package name is the name of an
    1.94 -          already-installed package:
    1.95 -
    1.96 -	    - if there is an upstream package that obsoletes the
    1.97 -              already-installed package, then behave as though the
    1.98 -              user had requested that that package be installed
    1.99 -              instead.
   1.100 -
   1.101 -	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
   1.102 -
   1.103 -	- else it's an error (INSTALL_UNAVAILABLE)
   1.104 -
   1.105 -    2. Every requested removal in the initial package set must be
   1.106 -       satisfied as a removal. If any requested package name is not
   1.107 -       the name of an installed package, it's an error
   1.108 -       (REMOVE_NOT_INSTALLED)
   1.109 -
   1.110 -    REQUIRES processing:
   1.111 -
   1.112 -    3. If a package being installed or updated-to REQUIRES a property
   1.113 -       that is not provided by any installed or to-be-installed
   1.114 -       package, we need to find an installable package that provides
   1.115 -       that property. If we find one, install/update it. If not, it's
   1.116 -       an error (UNSATISFIABLE). (If we find an upstream package
   1.117 -       providing the property that corresponds to a system package
   1.118 -       that's being removed, then it's a CONTRADICTION.)
   1.119 -
   1.120 -    4. If an already-installed package REQUIRES a property which is
   1.121 -       only provided by a package that is being removed, then that
   1.122 -       package needs to be removed as well.
   1.123 -
   1.124 -    5. If an already-installed package REQUIRES a property which is
   1.125 -       only provided by a package that is being upgraded or obsoleted
   1.126 -       (to a new package which does not provide that property), then:
   1.127 -
   1.128 -	- if there is an update for the installed package, then update
   1.129 -          the installed package
   1.130 -
   1.131 -	- else if there is another installable package that provides
   1.132 -          the required property, then install that.
   1.133 -
   1.134 -	- else it's an error (UNSATISFIABLE?)
   1.135 -
   1.136 -    CONFLICTS processing
   1.137 -
   1.138 -    6. If a package being installed or updated-to CONFLICTS with a
   1.139 -       property provided by an installed package:
   1.140 -
   1.141 -	- if there is an update for the installed package, which the
   1.142 -          new package does not conflict with, then update the
   1.143 -          installed package.
   1.144 -
   1.145 -	- else it's an error (NEW_CONFLICT)
   1.146 -
   1.147 -    7. If an already-installed package CONFLICTS with a property
   1.148 -       provided by a to-be-installed package:
   1.149 -
   1.150 -	- if there is an update for the installed package, which does
   1.151 -          not conflict with the new package, then update the installed
   1.152 -          package.
   1.153 -
   1.154 -	- else it's an error (NEW_CONFLICT)
   1.155 -
   1.156 -    8. If a package being installed or updated-to CONFLICTS with a
   1.157 -       property provided by a to-be-installed package, then it's an
   1.158 -       error (CONTRADICTION).
   1.159 -
   1.160 -    OBSOLETES processing. NOTE: OBSOLETES are only matched against
   1.161 -    package names, not against arbitrary provided properties
   1.162 -
   1.163 -    9. If a package being installed or updated-to OBSOLETES an
   1.164 -       installed package, then obsolete that package. (ie, remove it,
   1.165 -       but treat it as updated for purposes of dangling REQUIRES).
   1.166 -
   1.167 -   10. If an already-installed package OBSOLETES a to-be-installed
   1.168 -       package, then it's an error. (ALREADY_OBSOLETE)
   1.169 -
   1.170 -   11. If a package being installed or updated-to OBSOLETES another
   1.171 -       package being installed or updated-to, then it's an error
   1.172 -       (CONTRADICTION).
   1.173 -
   1.174 -
   1.175 -
   1.176 -THE DEPSOLVER
   1.177 --------------
   1.178 -
   1.179 -We start with two razor_sets, system and upstream, and a list of
   1.180 -requested installations and removals.
   1.181 -
   1.182 -    FIXME: what about multiple upstream repos? Having to deal with
   1.183 -    arbitrary numbers of razor_sets is possible, but will probably be
   1.184 -    messy... It might be easier to either store all upstream repo data
   1.185 -    in a single .rzdb file, or else merge all upstream .rzdb files
   1.186 -    together into a single razor_set at startup. (Or some combination
   1.187 -    of those.)
   1.188 -
   1.189 -We create a bit array of the packages in each set, indicating which
   1.190 -ones are installed; the system bitarray starts out all 1s, and the
   1.191 -upstream bitarray all 0s. Each bit is only allowed to change state
   1.192 -once during the transaction; an installed package can be removed, or
   1.193 -an uninstalled package installed, but trying to reinstall a removed
   1.194 -package, or uninstall a newly-installed package is an error. This
   1.195 -means the packages break down into four categories:
   1.196 -
   1.197 -    - installed       (1 bit in the system bit array)
   1.198 -    - to-be-removed   (0 bit in the system bit array)
   1.199 -    - to-be-installed (1 bit in the upstream bit array)
   1.200 -    - installable     (0 bit in the upstream bit array)
   1.201 -
   1.202 -
   1.203 -Depsolver algorithm:
   1.204 -
   1.205 -    - Create new razor_transaction_packages ("rtp"s) for each
   1.206 -      requested install or remove. These will be "unresolved", because
   1.207 -      we haven't yet found the razor_packages that correspond to them.
   1.208 -
   1.209 -    - while there are new rtps:
   1.210 -
   1.211 -	- sort the new rtps
   1.212 -
   1.213 -	- Walk the system property list, upstream property list, and
   1.214 -          new rtp list in parallel, and:
   1.215 -
   1.216 -	    - For each uninstalled PROVIDES:
   1.217 -
   1.218 -		- If the property is a valid package name (that is,
   1.219 -                  either it's a package providing its own name, or it
   1.220 -                  has a matching OBSOLETES), and it matches the name
   1.221 -                  of a new rtp of type INSTALL or FORCED_UPDATE with
   1.222 -                  an unresolved new_package:
   1.223 -
   1.224 -		    - If the upstream package has the same version as
   1.225 -		      the system package, we have an UP_TO_DATE error
   1.226 -		      (FIXME: not quite right. This doesn't deal with
   1.227 -		      the case where we try to update an application
   1.228 -		      because of a library update, and it turns out
   1.229 -		      there's no new version of the application, but
   1.230 -		      there IS a compat package containing the old
   1.231 -		      version of the library.)
   1.232 -
   1.233 -		    - Otherwise, set the rtp's new_package to point to
   1.234 -		      the package providing this property and set the
   1.235 -		      appropriate bit in the upstream bit array.
   1.236 -
   1.237 -	    - For each to-be-installed non-file REQUIRES:
   1.238 -
   1.239 -		- See if there's an installed or to-be-installed
   1.240 -		  package that PROVIDES that property.
   1.241 -
   1.242 -		- If not, see if there's an installable package that
   1.243 -		  PROVIDES that property, and create a new INSTALL rtp
   1.244 -		  for it if so.
   1.245 -
   1.246 -		- If not, see if there's a to-be-removed package that
   1.247 -		  PROVIDES that property. (If we find such a package,
   1.248 -		  we have a CONTRADICTION error.)
   1.249 -
   1.250 -		- If none of the above, then we have an UNSATISFIABLE
   1.251 -                  error
   1.252 -
   1.253 -	    - For each to-be-installed file REQUIRES:
   1.254 -
   1.255 -		- (We create fake file PROVIDES to match file REQUIRES
   1.256 -                  when importing/merging razor sets, so if there is
   1.257 -                  already another installed package that REQUIRES this
   1.258 -                  file, there will be a PROVIDES listed for it as well.)
   1.259 -
   1.260 -		- See if there's an installed package that PROVIDES
   1.261 -                  that file.
   1.262 -
   1.263 -		- If not, do a binary search of the system file tree
   1.264 -                  looking to see if some installed package provides
   1.265 -                  that file but does not have a PROVIDES for it.
   1.266 -
   1.267 -		- If not, see if there's an installable package that
   1.268 -		  PROVIDES that property, and create a new INSTALL rtp
   1.269 -		  for it if so.
   1.270 -
   1.271 -		- (If we actually work with multiple upstream
   1.272 -                  razor_sets, then we will need to search the upstream
   1.273 -                  file trees at this point, because it's possible that
   1.274 -                  a package in one upstream repo would require a file
   1.275 -                  in another upstream repo. But if we merge the
   1.276 -                  multiple upstream repos into a single razor_set at
   1.277 -                  some point, then we would not need to do that,
   1.278 -                  because it would be guaranteed that we would have
   1.279 -                  already created a fake PROVIDES if any package
   1.280 -                  provides the file.)
   1.281 -
   1.282 -		- If no installed or installable package provides the
   1.283 -		  file, see if there's a to-be-removed package that
   1.284 -		  provides the file. (If we find such a package, we
   1.285 -		  have a CONTRADICTION error.)
   1.286 -
   1.287 -		- If none of the above, then we have an UNSATISFIABLE
   1.288 -                  error
   1.289 -
   1.290 -	    - For each to-be-installed PROVIDES:
   1.291 -
   1.292 -		- Check if the new PROVIDES conflicts with an
   1.293 -		  installed CONFLICTS. If so, create a new
   1.294 -		  FORCED_UPDATE rtp for the installed package, so we
   1.295 -		  can try to upgrade it to a non-conflicting version.
   1.296 -		  (If we can't, we'll have an OLD_CONFLICT error.)
   1.297 -
   1.298 -		- Check if the new PROVIDES conflicts with an
   1.299 -                  installed OBSOLETES *and* the PROVIDES property
   1.300 -                  corresponds to the name of its package. (That is,
   1.301 -                  OBSOLETES are only matched against package names,
   1.302 -                  not arbitrary provided properties.) If so, we have
   1.303 -                  an ALREADY_OBSOLETE error.
   1.304 -
   1.305 -		- Check if the new PROVIDES conflicts with a
   1.306 -		  to-be-installed CONFLICTS. If so, we have a
   1.307 -		  CONTRADICTION error.
   1.308 -
   1.309 -	    - For each to-be-installed CONFLICTS:
   1.310 -
   1.311 -		- Basically the reverse of the previous case: check if
   1.312 -		  the new CONFLICTS conflicts with an installed
   1.313 -		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
   1.314 -		  the installed package, so we can try to upgrade it
   1.315 -		  to a non-conflicting version. (If we can't, we'll
   1.316 -		  have an NEW_CONFLICT error.)
   1.317 -
   1.318 -		- Check if the new CONFLICTS conflicts with a
   1.319 -		  to-be-installed PROVIDES. If so, we have a
   1.320 -		  CONTRADICTION error.
   1.321 -
   1.322 -	    - For each to-be-installed OBSOLETES:
   1.323 -
   1.324 -		- Check if there's an installed package that PROVIDES
   1.325 -		  that property. If so, create an OBSOLETED rtp for
   1.326 -		  the installed package.
   1.327 -
   1.328 -		- If not, check if there's a to-be-installed package
   1.329 -		  that PROVIDES that property. If so, we have a
   1.330 -		  CONTRADICTION error.
   1.331 -
   1.332 -
   1.333 -	    - For each installed PROVIDES:
   1.334 -
   1.335 -		- If the property is a valid package name (that is,
   1.336 -                  it's a package providing its own name), and it
   1.337 -                  matches the name of a new rtp with an unresolved
   1.338 -                  old_package, then set the rtp's old_package to point
   1.339 -                  to the package providing this property and clear the
   1.340 -                  appropriate bit in the system bit array.
   1.341 -
   1.342 -	    - For each to-be-removed PROVIDES:
   1.343 -
   1.344 -		- If there's also an identical to-be-installed
   1.345 -		  PROVIDES, we're ok and can skip this
   1.346 -
   1.347 -		- Otherwise, for each installed REQUIRES of this
   1.348 -                  property:
   1.349 -
   1.350 -		    - Look for some other installed or to-be-installed
   1.351 -		      property that satisfies the REQUIRES.
   1.352 -
   1.353 -		    - If there isn't one, then for each installed
   1.354 -		      package in this REQUIRES's package list:
   1.355 -
   1.356 -			- If the PROVIDES was lost because the old
   1.357 -			  package was REMOVEd (not FORCED_UPDATE or
   1.358 -			  OBSOLETED), then create a new REMOVE rtp for
   1.359 -			  this package.
   1.360 -
   1.361 -			- Otherwise, create a new FORCED_UPDATE rtp
   1.362 -                          for this package.
   1.363 -
   1.364 -		- (We don't need to look at to-be-installed REQUIRES
   1.365 -		  of this property, because if there are any, they
   1.366 -		  will cause a CONTRADICTION error when we try to
   1.367 -		  re-satisfy them the next time through.)