docs/solver.xml
author J. Ali Harlow <ali@juiblex.co.uk>
Thu Jan 08 13:51:07 2009 +0000 (2009-01-08)
changeset 327 c85643dd7164
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">

<chapter id="solver">
  <title>Dependency Solver</title>

  <para>
    At a very high level, yum's depsolver does something roughly
    equivalent to:

    - For each package being installed or removed

        - For each relevant property (provides, requires, conflicts,
          obsoletes):

            - Figure out what additional packages need to be added to
              or removed from the system to satisfy this property

    which ends up being roughly O(N^2 * M) where N is the total number of
    properties and M is the number of packages being acted on.

(I just figured that out off the top of my head, and I'm not totally
familiar with the yum code, so it may be wrong.)

Razor's depsolver is something like:

    - do {

        - For each property to be added to or removed from the system:

            - Figure out what packages need to be added to or removed
              from the system to satisfy this property

    - } until we stop adding/removing more packages

with the key being that it's very easy to find the PROVIDES
corresponding to a REQUIRES and vice versa, because the property
arrays are sorted, and so all properties with the same "name" will be
adjacent to one another in the array, allowing many dependencies to be
satisfied in essentially constant time. (Actually... we've been
calling it constant, but it's really O(log N) for heavily-depended-on
packages, because the more packages you have, the more variations on
"requires foo", "requires foo = 1.1", "requires foo &gt; 1.0", etc. you're
going to have to scan through.)

Ideally though, each iteration of the inner loop body happens in
constant time, and thus the inner loop as a whole is O(N), and thus
the depsolver as a whole is O(N * M) (or at least, less than
O(N * M * log N)).
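
To make the "sorted array" point concrete, here is a minimal Python
sketch. It is not razor's actual code (razor is written in C); the
tuple layout, the EQ/GT relation tags and the names entries_for and
find_provides are all invented for this illustration. The idea is only
that a binary search lands on the first entry for a name, and every
other entry for that name sits right next to it.

```python
import bisect

# Hypothetical layout: (name, type, relation, version, package).
# Sorting by tuple keeps all entries with the same name adjacent.
PROPERTIES = sorted([
    ("bar", "PROVIDES", "EQ", "2.0", "bar-2.0"),
    ("foo", "PROVIDES", "EQ", "1.1", "foo-1.1"),
    ("foo", "REQUIRES", "GT", "1.0", "app-3.2"),
    ("foo", "REQUIRES", "EQ", "1.1", "tool-0.9"),
    ("zlib", "PROVIDES", "EQ", "1.2", "zlib-1.2"),
])


def entries_for(properties, name):
    """Return the adjacent run of entries whose name matches."""
    # A one-element tuple sorts before every longer tuple with the same
    # first element, so this finds the first entry for the name.
    lo = bisect.bisect_left(properties, (name,))
    hi = lo
    while hi != len(properties) and properties[hi][0] == name:
        hi += 1
    return properties[lo:hi]


def find_provides(properties, name):
    """Packages that PROVIDE the named property (versions ignored here)."""
    return [p[4] for p in entries_for(properties, name) if p[1] == "PROVIDES"]
```

The scan after the binary search is the part that grows with the
number of "requires foo ..." variations mentioned above.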


FILE DEPENDENCIES
-----------------

Whenever we add a package with a file REQUIRES to a razor_set, we also
add a PROVIDES for that file to the package or packages which provide
that file. This means that if we later add another package that
requires the same file (e.g., /bin/sh or /usr/bin/perl), we can resolve
its file requirement exactly like we would resolve a property
requirement, in nearly constant time.

When adding a *new* file requirement (i.e., a requirement on a file that
no existing package depends on), we still have to scan through the
file tree, which is O(log N) in the number of files.
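
A rough Python sketch of that fast-path/slow-path split follows. It is
an illustration only: the FileResolver class, its method name and the
tree_searches counter are made up, a dict stands in for the sorted
property array, and a sorted list stands in for the file tree.

```python
import bisect


class FileResolver:
    """Toy model: a sorted file list plus a PROVIDES index keyed by name."""

    def __init__(self, files):
        # files: iterable of (path, package) pairs.
        self.files = sorted(files)
        self.provides = {}      # property name -> list of packages
        self.tree_searches = 0  # counts slow-path lookups (illustration only)

    def resolve_file_requires(self, path):
        # Fast path: an earlier REQUIRES already produced a fake PROVIDES.
        if path in self.provides:
            return self.provides[path]
        # Slow path: binary search of the sorted file list.
        self.tree_searches += 1
        i = bisect.bisect_left(self.files, (path,))
        owners = []
        while i != len(self.files) and self.files[i][0] == path:
            owners.append(self.files[i][1])
            i += 1
        if owners:
            # Record a fake PROVIDES so the next lookup takes the fast path.
            self.provides[path] = owners
        return owners
```

For example, resolving "/bin/sh" twice against a resolver that knows
bash owns it searches the file list only once; the second call is
answered from the PROVIDES index.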

(AFAICT, there's no reason yum couldn't do the same optimization.
Also, AFAICT, yum currently sticks property dependencies and file
dependencies into the same hash table, so that if any package in the
transaction has a file dependency, it causes *property* dependencies
to become slower to resolve as well...)


THE RULES
---------

This is what we have figured out for transaction-solving rules;
neither yum's nor rpm's algorithm seems to be explained in full
anywhere...

    1. Every requested install in the initial package set must be
       satisfied as either a new install or an update:

        - if the requested package name is the name of an upstream
          package:

            - if there is not a corresponding already-installed
              package, then install the upstream package

            - else if the upstream package is newer than the
              already-installed package, then update the package

            - else it's an error (UP_TO_DATE)

        - else if the requested package name is the name of an
          already-installed package:

            - if there is an upstream package that obsoletes the
              already-installed package, then behave as though the
              user had requested that that package be installed
              instead.

            - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)

        - else it's an error (INSTALL_UNAVAILABLE)

    2. Every requested removal in the initial package set must be
       satisfied as a removal. If any requested package name is not
       the name of an installed package, it's an error
       (REMOVE_NOT_INSTALLED).

    REQUIRES processing:

    3. If a package being installed or updated-to REQUIRES a property
       that is not provided by any installed or to-be-installed
       package, we need to find an installable package that provides
       that property. If we find one, install/update it. If not, it's
       an error (UNSATISFIABLE). (If we find an upstream package
       providing the property that corresponds to a system package
       that's being removed, then it's a CONTRADICTION.)

    4. If an already-installed package REQUIRES a property which is
       only provided by a package that is being removed, then that
       package needs to be removed as well.

    5. If an already-installed package REQUIRES a property which is
       only provided by a package that is being upgraded or obsoleted
       (to a new package which does not provide that property), then:

        - if there is an update for the installed package, then update
          the installed package

        - else if there is another installable package that provides
          the required property, then install that.

        - else it's an error (UNSATISFIABLE?)

    CONFLICTS processing:

    6. If a package being installed or updated-to CONFLICTS with a
       property provided by an installed package:

        - if there is an update for the installed package, which the
          new package does not conflict with, then update the
          installed package.

        - else it's an error (NEW_CONFLICT)

    7. If an already-installed package CONFLICTS with a property
       provided by a to-be-installed package:

        - if there is an update for the installed package, which does
          not conflict with the new package, then update the installed
          package.

        - else it's an error (OLD_CONFLICT)

    8. If a package being installed or updated-to CONFLICTS with a
       property provided by another to-be-installed package, then
       it's an error (CONTRADICTION).

    OBSOLETES processing. NOTE: OBSOLETES are only matched against
    package names, not against arbitrary provided properties.

    9. If a package being installed or updated-to OBSOLETES an
       installed package, then obsolete that package (i.e., remove it,
       but treat it as updated for purposes of dangling REQUIRES).

   10. If an already-installed package OBSOLETES a to-be-installed
       package, then it's an error (ALREADY_OBSOLETE).

   11. If a package being installed or updated-to OBSOLETES another
       package being installed or updated-to, then it's an error
       (CONTRADICTION).
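
As a worked example of rule 1 only, the decision tree can be written
as a small Python function. This is a sketch under loud assumptions:
the function name, the dict-based inputs and the returned tags are
invented here, versions are compared as plain tuples (real RPM version
comparison is more involved), and the undecided case in the rule is
arbitrarily mapped to UP_TO_DATE.

```python
def classify_install(name, installed, upstream, obsoleted_by):
    """Apply rule 1 to one requested package name.

    installed / upstream: dicts mapping package name -> version tuple.
    obsoleted_by: dict mapping an installed package name to the name of
        an upstream package that obsoletes it (assumed to be present
        in upstream, so the recursion below ends after one step).
    Returns an (action, package_name) pair.
    """
    if name in upstream:
        if name not in installed:
            return ("INSTALL", name)
        if upstream[name] > installed[name]:
            return ("UPDATE", name)
        return ("ERROR_UP_TO_DATE", name)
    if name in installed:
        if name in obsoleted_by:
            # Behave as though the obsoleting package had been requested.
            return classify_install(obsoleted_by[name], installed,
                                    upstream, obsoleted_by)
        # The rule leaves this open: UP_TO_DATE or INSTALL_UNAVAILABLE.
        return ("ERROR_UP_TO_DATE", name)
    return ("ERROR_INSTALL_UNAVAILABLE", name)
```

With "old" installed and an upstream "new" that obsoletes it, asking
for "old" comes back as a request to install "new".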



THE DEPSOLVER
-------------

We start with two razor_sets, system and upstream, and a list of
requested installations and removals.

    FIXME: what about multiple upstream repos? Having to deal with
    arbitrary numbers of razor_sets is possible, but will probably be
    messy... It might be easier to either store all upstream repo data
    in a single .rzdb file, or else merge all upstream .rzdb files
    together into a single razor_set at startup. (Or some combination
    of those.)

We create a bit array of the packages in each set, indicating which
ones are installed; the system bit array starts out all 1s, and the
upstream bit array all 0s. Each bit is only allowed to change state
once during the transaction; an installed package can be removed, or
an uninstalled package installed, but trying to reinstall a removed
package, or uninstall a newly-installed package, is an error. This
means the packages break down into four categories:

    - installed       (1 bit in the system bit array)
    - to-be-removed   (0 bit in the system bit array)
    - to-be-installed (1 bit in the upstream bit array)
    - installable     (0 bit in the upstream bit array)
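
The four categories and the change-once rule can be modelled in a few
lines of Python. This is only a sketch: PackageBits and category are
invented names, and tracking "already changed" in a second list is an
assumption about how the once-only rule might be enforced, not a
description of razor's data structures.

```python
class PackageBits:
    """One bit per package; each bit may change state at most once."""

    def __init__(self, count, initial):
        self.bits = [initial] * count
        self.changed = [False] * count  # assumption: explicit change tracking

    def flip(self, index):
        if self.changed[index]:
            raise ValueError(
                "package %d already changed state in this transaction" % index)
        self.bits[index] = not self.bits[index]
        self.changed[index] = True


def category(system, upstream, which, index):
    """Name the category of package `index` in the "system" or "upstream" set."""
    if which == "system":
        return "installed" if system.bits[index] else "to-be-removed"
    return "to-be-installed" if upstream.bits[index] else "installable"
```

Here the system set would start as PackageBits(n, True) and the
upstream set as PackageBits(m, False); flipping the same index twice
raises an error, which is the "reinstall a removed package" case.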


Depsolver algorithm:

    - Create new razor_transaction_packages ("rtp"s) for each
      requested install or remove. These will be "unresolved", because
      we haven't yet found the razor_packages that correspond to them.

    - while there are new rtps:

        - sort the new rtps

        - Walk the system property list, upstream property list, and
          new rtp list in parallel, and:

            - For each uninstalled PROVIDES:

                - If the property is a valid package name (that is,
                  either it's a package providing its own name, or it
                  has a matching OBSOLETES), and it matches the name
                  of a new rtp of type INSTALL or FORCED_UPDATE with
                  an unresolved new_package:

                    - If the upstream package has the same version as
                      the system package, we have an UP_TO_DATE error
                      (FIXME: not quite right. This doesn't deal with
                      the case where we try to update an application
                      because of a library update, and it turns out
                      there's no new version of the application, but
                      there IS a compat package containing the old
                      version of the library.)

                    - Otherwise, set the rtp's new_package to point to
                      the package providing this property and set the
                      appropriate bit in the upstream bit array.

            - For each to-be-installed non-file REQUIRES:

                - See if there's an installed or to-be-installed
                  package that PROVIDES that property.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                - If not, see if there's a to-be-removed package that
                  PROVIDES that property. (If we find such a package,
                  we have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed file REQUIRES:

                - (We create fake file PROVIDES to match file REQUIRES
                  when importing/merging razor sets, so if there is
                  already another installed package that REQUIRES this
                  file, there will be a PROVIDES listed for it as well.)

                - See if there's an installed package that PROVIDES
                  that file.

                - If not, do a binary search of the system file tree
                  looking to see if some installed package provides
                  that file but does not have a PROVIDES for it.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                - (If we actually work with multiple upstream
                  razor_sets, then we will need to search the upstream
                  file trees at this point, because it's possible that
                  a package in one upstream repo would require a file
                  in another upstream repo. But if we merge the
                  multiple upstream repos into a single razor_set at
                  some point, then we would not need to do that,
                  because it would be guaranteed that we would have
                  already created a fake PROVIDES if any package
                  provides the file.)

                - If no installed or installable package provides the
                  file, see if there's a to-be-removed package that
                  provides the file. (If we find such a package, we
                  have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed PROVIDES:

                - Check if the new PROVIDES conflicts with an
                  installed CONFLICTS. If so, create a new
                  FORCED_UPDATE rtp for the installed package, so we
                  can try to upgrade it to a non-conflicting version.
                  (If we can't, we'll have an OLD_CONFLICT error.)

                - Check if the new PROVIDES conflicts with an
                  installed OBSOLETES *and* the PROVIDES property
                  corresponds to the name of its package. (That is,
                  OBSOLETES are only matched against package names,
                  not arbitrary provided properties.) If so, we have
                  an ALREADY_OBSOLETE error.

                - Check if the new PROVIDES conflicts with a
                  to-be-installed CONFLICTS. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed CONFLICTS:

                - Basically the reverse of the previous case: check if
                  the new CONFLICTS conflicts with an installed
                  PROVIDES. If so, create a new FORCED_UPDATE rtp for
                  the installed package, so we can try to upgrade it
                  to a non-conflicting version. (If we can't, we'll
                  have a NEW_CONFLICT error.)

                - Check if the new CONFLICTS conflicts with a
                  to-be-installed PROVIDES. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed OBSOLETES:

                - Check if there's an installed package that PROVIDES
                  that property. If so, create an OBSOLETED rtp for
                  the installed package.

                - If not, check if there's a to-be-installed package
                  that PROVIDES that property. If so, we have a
                  CONTRADICTION error.


            - For each installed PROVIDES:

                - If the property is a valid package name (that is,
                  it's a package providing its own name), and it
                  matches the name of a new rtp with an unresolved
                  old_package, then set the rtp's old_package to point
                  to the package providing this property and clear the
                  appropriate bit in the system bit array.

            - For each to-be-removed PROVIDES:

                - If there's also an identical to-be-installed
                  PROVIDES, we're ok and can skip this

                - Otherwise, for each installed REQUIRES of this
                  property:

                    - Look for some other installed or to-be-installed
                      property that satisfies the REQUIRES.

                    - If there isn't one, then for each installed
                      package in this REQUIRES's package list:

                        - If the PROVIDES was lost because the old
                          package was REMOVEd (not FORCED_UPDATE or
                          OBSOLETED), then create a new REMOVE rtp for
                          this package.

                        - Otherwise, create a new FORCED_UPDATE rtp
                          for this package.

                - (We don't need to look at to-be-installed REQUIRES
                  of this property, because if there are any, they
                  will cause a CONTRADICTION error when we try to
                  re-satisfy them the next time through.)
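
The overall shape of that loop (keep processing until a pass creates
no new rtps) can be shown with a heavily simplified Python sketch. It
is not the algorithm above: it matches properties by name only, has no
versions, file REQUIRES, CONFLICTS, OBSOLETES or FORCED_UPDATE, and
scans plain dicts instead of walking sorted lists in parallel. The
function name solve and the input layout are invented for the example.

```python
def solve(system, upstream, installs, removes):
    """Toy fixed-point loop over property names only.

    system / upstream: dicts mapping a package name to
        {"provides": set_of_names, "requires": set_of_names}.
    Returns (to_install, to_remove) as sets of package names.
    """
    to_install, to_remove = set(), set()
    new_rtps = ([("INSTALL", n) for n in installs] +
                [("REMOVE", n) for n in removes])

    def providers(prop, packages, names):
        return [n for n in names if prop in packages[n]["provides"]]

    while new_rtps:
        new_rtps.sort()
        batch, new_rtps = new_rtps, []
        for kind, name in batch:
            if kind == "INSTALL":
                if name not in upstream:
                    raise LookupError("INSTALL_UNAVAILABLE: " + name)
                to_install.add(name)
            else:
                if name not in system:
                    raise LookupError("REMOVE_NOT_INSTALLED: " + name)
                to_remove.add(name)
        kept = set(system) - to_remove
        # REQUIRES of to-be-installed packages.
        for name in sorted(to_install):
            for prop in sorted(upstream[name]["requires"]):
                if (providers(prop, system, kept) or
                        providers(prop, upstream, to_install)):
                    continue
                installable = sorted(set(upstream) - to_install)
                candidates = providers(prop, upstream, installable)
                if candidates:
                    new_rtps.append(("INSTALL", candidates[0]))
                elif providers(prop, system, to_remove):
                    raise RuntimeError("CONTRADICTION: " + prop)
                else:
                    raise RuntimeError("UNSATISFIABLE: " + prop)
        # REQUIRES of installed packages that lost their only provider.
        for name in sorted(kept):
            for prop in sorted(system[name]["requires"]):
                if not (providers(prop, system, kept) or
                        providers(prop, upstream, to_install)):
                    new_rtps.append(("REMOVE", name))
                    break
    return to_install, to_remove
```

Every pass that produces new rtps moves at least one package into
to_install or to_remove, and no package ever moves back, so the loop
has to stop; that mirrors the "each bit changes at most once" rule.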
  </para>
</chapter>