docs/DEPSOLVE.txt
author Kristian H?gsberg <krh@redhat.com>
Fri Jun 20 21:56:43 2008 -0400 (2008-06-20)
changeset 254 ccb1c11968ab
child 310 9a7691262ce6
permissions -rw-r--r--
Introduce install/remove iterators.

These iterator constructors lets you pass in two sets and creates
an iterator for the packages to remove or the packages to install.
The iterators will step through the packages in a sequence that respects
the pre, post, preun and postun modifiers.

Right now, the install order isn't actually implemented, this patch just
implements the API changes and updates the applications.
     1 YUM vs RAZOR
     2 ------------
     3 
     4 At a very high level, yum's depsolver does something roughly
     5 equivalent to:
     6 
     7     - For each package being installed or removed
     8 
     9 	- For each relevant property (provides, requires, conflicts,
    10           obsoletes):
    11 
    12 	    - Figure out what additional packages need to be added to
    13 	      or removed from the system to satisfy this property
    14 
    15 which ends up being roughly O(N^2 * M) where N is the total number of
    16 properties and M is the number of packages being acted on.
    17 
    18 (I just figured that out off the top of my head, and I'm not totally
    19 familiar with the yum code, so it may be wrong.)
    20 
    21 Razor's depsolver is something like:
    22 
    23     - do {
    24 
    25 	- For each property to be added to or removed from the system:
    26 
    27 	    - Figure out what packages need to be added to or removed
    28 	      from the system to satisfy this property
    29 
    30     - } until we stop adding/remove more packages
    31 
    32 with the key being that it's very easy to find the PROVIDES
    33 corresponding to a REQUIRES and vice versa, because the property
    34 arrays are sorted, and so all properties with the same "name" will be
    35 adjacent to one another in the array, allowing many dependencies to be
    36 satisified in essentially constant time. (Actually... we've been
    37 calling it constant, but it's really O(log N) for heavily-depended-on
    38 packages, because the more packages you have, the more variations on
    39 "requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
    40 going to have to scan through.)
    41 
    42 Ideally though, each iteration of the inner loop body happens in
    43 constant time, and thus the inner loop as a whole is O(N), and thus
    44 the depsolver as a whole is O(N * M) (or at least, less than
    45 O(N * M * log N).
    46 
    47 
    48 FILE DEPENDENCIES
    49 -----------------
    50 
    51 Whenever we add a package with a file REQUIRES to a razor_set, we also
    52 add a PROVIDES for that file to the package or packages which provide
    53 that file. This means that if we later add another package that
    54 requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
    55 its file requirement exactly like we would resolve a property
    56 requirement, in nearly constant time.
    57 
    58 When adding a *new* file requirement (ie, a requirement on a file that
    59 no existing package depends on), we still have to scan through the
    60 file tree, which is O(log N) in the number of files.
    61 
    62 (AFAICT, there's no reason yum couldn't do the same optimization.
    63 Also, AFAICT, yum currently sticks property dependencies and file
    64 dependencies into the same hash table, so that if any package in the
    65 transaction has a file dependency, it causes *property* dependencies
    66 to become slower to resolve as well...)
    67 
    68 
    69 THE RULES
    70 ---------
    71 
    72 This is what we have figured out for transaction-solving rules;
    73 neither yum nor rpm's algorithm seems to be explained in full
    74 anywhere...
    75 
    76     1. Every requested install in the initial package set must be
    77        satisfied as either a new install or an update:
    78 
    79 	- if the requested package name is the name of an upstream
    80           package:
    81 
    82 	    - if there is not a corresponding already-installed
    83               package, then install the upstream package
    84 
    85 	    - else if the upstream package is newer than the
    86               already-installed package, then update the package
    87 
    88 	    - else it's an error (UP_TO_DATE)
    89 
    90 	- else if the requested package name is the name of an
    91           already-installed package:
    92 
    93 	    - if there is an upstream package that obsoletes the
    94               already-installed package, then behave as though the
    95               user had requested that that package be installed
    96               instead.
    97 
    98 	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
    99 
   100 	- else it's an error (INSTALL_UNAVAILABLE)
   101 
   102     2. Every requested removal in the initial package set must be
   103        satisfied as a removal. If any requested package name is not
   104        the name of an installed package, it's an error
   105        (REMOVE_NOT_INSTALLED)
   106 
   107     REQUIRES processing:
   108 
   109     3. If a package being installed or updated-to REQUIRES a property
   110        that is not provided by any installed or to-be-installed
   111        package, we need to find an installable package that provides
   112        that property. If we find one, install/update it. If not, it's
   113        an error (UNSATISFIABLE). (If we find an upstream package
   114        providing the property that corresponds to a system package
   115        that's being removed, then it's a CONTRADICTION.)
   116 
   117     4. If an already-installed package REQUIRES a property which is
   118        only provided by a package that is being removed, then that
   119        package needs to be removed as well.
   120 
   121     5. If an already-installed package REQUIRES a property which is
   122        only provided by a package that is being upgraded or obsoleted
   123        (to a new package which does not provide that property), then:
   124 
   125 	- if there is an update for the installed package, then update
   126           the installed package
   127 
   128 	- else if there is another installable package that provides
   129           the required property, then install that.
   130 
   131 	- else it's an error (UNSATISFIABLE?)
   132 
   133     CONFLICTS processing
   134 
   135     6. If a package being installed or updated-to CONFLICTS with a
   136        property provided by an installed package:
   137 
   138 	- if there is an update for the installed package, which the
   139           new package does not conflict with, then update the
   140           installed package.
   141 
   142 	- else it's an error (NEW_CONFLICT)
   143 
   144     7. If an already-installed package CONFLICTS with a property
   145        provided by a to-be-installed package:
   146 
   147 	- if there is an update for the installed package, which does
   148           not conflict with the new package, then update the installed
   149           package.
   150 
   151 	- else it's an error (NEW_CONFLICT)
   152 
   153     8. If a package being installed or updated-to CONFLICTS with a
   154        property provided by a to-be-installed package, then it's an
   155        error (CONTRADICTION).
   156 
   157     OBSOLETES processing. NOTE: OBSOLETES are only matched against
   158     package names, not against arbitrary provided properties
   159 
   160     9. If a package being installed or updated-to OBSOLETES an
   161        installed package, then obsolete that package. (ie, remove it,
   162        but treat it as updated for purposes of dangling REQUIRES).
   163 
   164    10. If an already-installed package OBSOLETES a to-be-installed
   165        package, then it's an error. (ALREADY_OBSOLETE)
   166 
   167    11. If a package being installed or updated-to OBSOLETES another
   168        package being installed or updated-to, then it's an error
   169        (CONTRADICTION).
   170 
   171 
   172 
   173 THE DEPSOLVER
   174 -------------
   175 
   176 We start with two razor_sets, system and upstream, and a list of
   177 requested installations and removals.
   178 
   179     FIXME: what about multiple upstream repos? Having to deal with
   180     arbitrary numbers of razor_sets is possible, but will probably be
   181     messy... It might be easier to either store all upstream repo data
   182     in a single .repo file, or else merge all upstream .repo files
   183     together into a single razor_set at startup. (Or some combination
   184     of those.)
   185 
   186 We create a bit array of the packages in each set, indicating which
   187 ones are installed; the system bitarray starts out all 1s, and the
   188 upstream bitarray all 0s. Each bit is only allowed to change state
   189 once during the transaction; an installed package can be removed, or
   190 an uninstalled package installed, but trying to reinstall a removed
   191 package, or uninstall a newly-installed package is an error. This
   192 means the packages break down into four categories:
   193 
   194     - installed       (1 bit in the system bit array)
   195     - to-be-removed   (0 bit in the system bit array)
   196     - to-be-installed (1 bit in the upstream bit array)
   197     - installable     (0 bit in the upstream bit array)
   198 
   199 
   200 Depsolver algorithm:
   201 
   202     - Create new razor_transaction_packages ("rtp"s) for each
   203       requested install or remove. These will be "unresolved", because
   204       we haven't yet found the razor_packages that correspond to them.
   205 
   206     - while there are new rtps:
   207 
   208 	- sort the new rtps
   209 
   210 	- Walk the system property list, upstream property list, and
   211           new rtp list in parallel, and:
   212 
   213 	    - For each uninstalled PROVIDES:
   214 
   215 		- If the property is a valid package name (that is,
   216                   either it's a package providing its own name, or it
   217                   has a matching OBSOLETES), and it matches the name
   218                   of a new rtp of type INSTALL or FORCED_UPDATE with
   219                   an unresolved new_package:
   220 
   221 		    - If the upstream package has the same version as
   222 		      the system package, we have an UP_TO_DATE error
   223 		      (FIXME: not quite right. This doesn't deal with
   224 		      the case where we try to update an application
   225 		      because of a library update, and it turns out
   226 		      there's no new version of the application, but
   227 		      there IS a compat package containing the old
   228 		      version of the library.)
   229 
   230 		    - Otherwise, set the rtp's new_package to point to
   231 		      the package providing this property and set the
   232 		      appropriate bit in the upstream bit array.
   233 
   234 	    - For each to-be-installed non-file REQUIRES:
   235 
   236 		- See if there's an installed or to-be-installed
   237 		  package that PROVIDES that property.
   238 
   239 		- If not, see if there's an installable package that
   240 		  PROVIDES that property, and create a new INSTALL rtp
   241 		  for it if so.
   242 
   243 		- If not, see if there's a to-be-removed package that
   244 		  PROVIDES that property. (If we find such a package,
   245 		  we have a CONTRADICTION error.)
   246 
   247 		- If none of the above, then we have an UNSATISFIABLE
   248                   error
   249 
   250 	    - For each to-be-installed file REQUIRES:
   251 
   252 		- (We create fake file PROVIDES to match file REQUIRES
   253                   when importing/merging razor sets, so if there is
   254                   already another installed package that REQUIRES this
   255                   file, there will be a PROVIDES listed for it as well.)
   256 
   257 		- See if there's an installed package that PROVIDES
   258                   that file.
   259 
   260 		- If not, do a binary search of the system file tree
   261                   looking to see if some installed package provides
   262                   that file but does not have a PROVIDES for it.
   263 
   264 		- If not, see if there's an installable package that
   265 		  PROVIDES that property, and create a new INSTALL rtp
   266 		  for it if so.
   267 
   268 		- (If we actually work with multiple upstream
   269                   razor_sets, then we will need to search the upstream
   270                   file trees at this point, because it's possible that
   271                   a package in one upstream repo would require a file
   272                   in another upstream repo. But if we merge the
   273                   multiple upstream repos into a single razor_set at
   274                   some point, then we would not need to do that,
   275                   because it would be guaranteed that we would have
   276                   already created a fake PROVIDES if any package
   277                   provides the file.)
   278 
   279 		- If no installed or installable package provides the
   280 		  file, see if there's a to-be-removed package that
   281 		  provides the file. (If we find such a package, we
   282 		  have a CONTRADICTION error.)
   283 
   284 		- If none of the above, then we have an UNSATISFIABLE
   285                   error
   286 
   287 	    - For each to-be-installed PROVIDES:
   288 
   289 		- Check if the new PROVIDES conflicts with an
   290 		  installed CONFLICTS. If so, create a new
   291 		  FORCED_UPDATE rtp for the installed package, so we
   292 		  can try to upgrade it to a non-conflicting version.
   293 		  (If we can't, we'll have an OLD_CONFLICT error.)
   294 
   295 		- Check if the new PROVIDES conflicts with an
   296                   installed OBSOLETES *and* the PROVIDES property
   297                   corresponds to the name of its package. (That is,
   298                   OBSOLETES are only matched against package names,
   299                   not arbitrary provided properties.) If so, we have
   300                   an ALREADY_OBSOLETE error.
   301 
   302 		- Check if the new PROVIDES conflicts with a
   303 		  to-be-installed CONFLICTS. If so, we have a
   304 		  CONTRADICTION error.
   305 
   306 	    - For each to-be-installed CONFLICTS:
   307 
   308 		- Basically the reverse of the previous case: check if
   309 		  the new CONFLICTS conflicts with an installed
   310 		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
   311 		  the installed package, so we can try to upgrade it
   312 		  to a non-conflicting version. (If we can't, we'll
   313 		  have an NEW_CONFLICT error.)
   314 
   315 		- Check if the new CONFLICTS conflicts with a
   316 		  to-be-installed PROVIDES. If so, we have a
   317 		  CONTRADICTION error.
   318 
   319 	    - For each to-be-installed OBSOLETES:
   320 
   321 		- Check if there's an installed package that PROVIDES
   322 		  that property. If so, create an OBSOLETED rtp for
   323 		  the installed package.
   324 
   325 		- If not, check if there's a to-be-installed package
   326 		  that PROVIDES that property. If so, we have a
   327 		  CONTRADICTION error.
   328 
   329 
   330 	    - For each installed PROVIDES:
   331 
   332 		- If the property is a valid package name (that is,
   333                   it's a package providing its own name), and it
   334                   matches the name of a new rtp with an unresolved
   335                   old_package, then set the rtp's old_package to point
   336                   to the package providing this property and clear the
   337                   appropriate bit in the system bit array.
   338 
   339 	    - For each to-be-removed PROVIDES:
   340 
   341 		- If there's also an identical to-be-installed
   342 		  PROVIDES, we're ok and can skip this
   343 
   344 		- Otherwise, for each installed REQUIRES of this
   345                   property:
   346 
   347 		    - Look for some other installed or to-be-installed
   348 		      property that satisfies the REQUIRES.
   349 
   350 		    - If there isn't one, then for each installed
   351 		      package in this REQUIRES's package list:
   352 
   353 			- If the PROVIDES was lost because the old
   354 			  package was REMOVEd (not FORCED_UPDATE or
   355 			  OBSOLETED), then create a new REMOVE rtp for
   356 			  this package.
   357 
   358 			- Otherwise, create a new FORCED_UPDATE rtp
   359                           for this package.
   360 
   361 		- (We don't need to look at to-be-installed REQUIRES
   362 		  of this property, because if there are any, they
   363 		  will cause a CONTRADICTION error when we try to
   364 		  re-satisfy them the next time through.)