docs/solver.xml
author J. Ali Harlow <ali@juiblex.co.uk>
Sat Feb 11 23:50:26 2012 +0000 (2012-02-11)
changeset 423 6112bcc5d1cf
permissions -rw-r--r--
Add an error object.
This is intended to dis-entangle the two roles that the atomic
object has evolved into so that atomic need only be used where
atomic actions are actually being undertaken.
krh@311
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">

<chapter id="solver">
  <title>Dependency Solver</title>

  <para>
    At a very high level, yum's depsolver does something roughly
    equivalent to:

    - For each package being installed or removed

        - For each relevant property (provides, requires, conflicts,
          obsoletes):

            - Figure out what additional packages need to be added to
              or removed from the system to satisfy this property

    which ends up being roughly O(N^2 * M) where N is the total number of
    properties and M is the number of packages being acted on.

(I just figured that out off the top of my head, and I'm not totally
familiar with the yum code, so it may be wrong.)

Razor's depsolver is something like:

    - do {

        - For each property to be added to or removed from the system:

            - Figure out what packages need to be added to or removed
              from the system to satisfy this property

    - } until we stop adding/removing more packages

with the key being that it's very easy to find the PROVIDES
corresponding to a REQUIRES and vice versa, because the property
arrays are sorted, and so all properties with the same "name" will be
adjacent to one another in the array, allowing many dependencies to be
satisfied in essentially constant time. (Actually... we've been
calling it constant, but it's really O(log N) for heavily-depended-on
packages, because the more packages you have, the more variations on
"requires foo", "requires foo = 1.1", "requires foo &gt; 1.0", etc. you're
going to have to scan through.)

Ideally though, each iteration of the inner loop body happens in
constant time, and thus the inner loop as a whole is O(N), and thus
the depsolver as a whole is O(N * M) (or at least, less than
O(N * M * log N)).
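
To make the "adjacent in a sorted array" point concrete, here is a
minimal Python sketch. It is not Razor's code: the PROPERTIES list, the
tuple layout, and the helper names properties_named and providers_of
are all made up for illustration. It only shows how a binary search
lands on the run of entries sharing a name, so that the remaining scan
covers that run alone.

```python
from bisect import bisect_left, bisect_right

# Hypothetical flat property array, sorted by (name, kind, version).
# Razor's real storage layout is not described in this document.
PROPERTIES = sorted([
    ("bar", "PROVIDES", "2.0"),
    ("foo", "PROVIDES", "1.1"),
    ("foo", "REQUIRES", ""),
    ("foo", "REQUIRES", "1.0"),
    ("libc", "PROVIDES", "2.5"),
])


def properties_named(props, name):
    """Return the contiguous run of entries that share `name`."""
    # Copying the names keeps the sketch short; a real implementation
    # would search the array in place.
    names = [entry[0] for entry in props]
    lo = bisect_left(names, name)
    hi = bisect_right(names, name)
    return props[lo:hi]


def providers_of(props, name):
    """Return the PROVIDES entries for `name`, scanning only its run."""
    return [e for e in properties_named(props, name) if e[1] == "PROVIDES"]
```

Calling providers_of(PROPERTIES, "foo") touches only the three "foo"
entries, however many other properties the array holds.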


FILE DEPENDENCIES
-----------------

Whenever we add a package with a file REQUIRES to a razor_set, we also
add a PROVIDES for that file to the package or packages which provide
that file. This means that if we later add another package that
requires the same file (e.g., /bin/sh or /usr/bin/perl), we can resolve
its file requirement exactly like we would resolve a property
requirement, in nearly constant time.

When adding a *new* file requirement (i.e., a requirement on a file that
no existing package depends on), we still have to search the file
tree, which is O(log N) in the number of files.

(AFAICT, there's no reason yum couldn't do the same optimization.
Also, AFAICT, yum currently sticks property dependencies and file
dependencies into the same hash table, so that if any package in the
transaction has a file dependency, it causes *property* dependencies
to become slower to resolve as well...)
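
The fast path and slow path described above might look roughly like
the following Python sketch. Everything in it is an assumption made
for illustration: FILE_TREE is a sorted list standing in for the file
tree, a plain dict stands in for the sorted PROVIDES array, and the
function names are invented.

```python
from bisect import bisect_left

# Hypothetical stand-in for the file tree: sorted (path, owner) pairs.
FILE_TREE = sorted([
    ("/bin/sh", "bash"),
    ("/usr/bin/perl", "perl"),
    ("/usr/bin/python", "python"),
])


def find_file_owner(file_tree, path):
    """Binary search of the file tree: O(log N) in the number of files."""
    i = bisect_left(file_tree, (path, ""))
    if i != len(file_tree) and file_tree[i][0] == path:
        return file_tree[i][1]
    return None


def resolve_file_requires(provides, file_tree, path):
    """Return (owner, searched); `searched` is True if the slow path ran."""
    if path in provides:
        # A fake file PROVIDES already exists, so no tree search is needed.
        return provides[path], False
    owner = find_file_owner(file_tree, path)
    if owner is not None:
        # Record a fake PROVIDES so the next REQUIRES on this file is cheap.
        provides[path] = owner
    return owner, True
```

The first package to require /bin/sh pays for the tree search; every
later one finds the recorded PROVIDES instead.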


THE RULES
---------

This is what we have figured out for transaction-solving rules;
neither yum's nor rpm's algorithm seems to be explained in full
anywhere...

    1. Every requested install in the initial package set must be
       satisfied as either a new install or an update:

        - if the requested package name is the name of an upstream
          package:

            - if there is not a corresponding already-installed
              package, then install the upstream package

            - else if the upstream package is newer than the
              already-installed package, then update the package

            - else it's an error (UP_TO_DATE)

        - else if the requested package name is the name of an
          already-installed package:

            - if there is an upstream package that obsoletes the
              already-installed package, then behave as though the
              user had requested that the obsoleting package be
              installed instead.

            - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)

        - else it's an error (INSTALL_UNAVAILABLE)

    2. Every requested removal in the initial package set must be
       satisfied as a removal. If any requested package name is not
       the name of an installed package, it's an error
       (REMOVE_NOT_INSTALLED)
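
Rules 1 and 2 can be read as a small decision function. The Python
below is only a sketch of that reading: the dict-based inputs, the
tuple versions, and the names resolve_install and resolve_remove are
assumptions, and real rpm version comparison is more involved than
comparing tuples. It also assumes that an obsoleting package named in
obsoleted_by really is an upstream package.

```python
def resolve_install(name, installed, upstream, obsoleted_by):
    """Classify one requested install according to rule 1.

    installed / upstream map a package name to a comparable version.
    obsoleted_by maps an installed package name to the name of the
    upstream package that obsoletes it.
    """
    if name in upstream:
        if name not in installed:
            return ("INSTALL", name)
        if upstream[name] > installed[name]:
            return ("UPDATE", name)
        return ("ERROR", "UP_TO_DATE")
    if name in installed:
        if name in obsoleted_by:
            # Behave as though the obsoleting package had been requested.
            return resolve_install(obsoleted_by[name], installed,
                                   upstream, obsoleted_by)
        # The rules leave open whether this should be INSTALL_UNAVAILABLE.
        return ("ERROR", "UP_TO_DATE")
    return ("ERROR", "INSTALL_UNAVAILABLE")


def resolve_remove(name, installed):
    """Classify one requested removal according to rule 2."""
    if name in installed:
        return ("REMOVE", name)
    return ("ERROR", "REMOVE_NOT_INSTALLED")
```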

    REQUIRES processing:

    3. If a package being installed or updated-to REQUIRES a property
       that is not provided by any installed or to-be-installed
       package, we need to find an installable package that provides
       that property. If we find one, install/update it. If not, it's
       an error (UNSATISFIABLE). (If we find an upstream package
       providing the property that corresponds to a system package
       that's being removed, then it's a CONTRADICTION.)

    4. If an already-installed package REQUIRES a property which is
       only provided by a package that is being removed, then the
       already-installed package needs to be removed as well.

    5. If an already-installed package REQUIRES a property which is
       only provided by a package that is being upgraded or obsoleted
       (to a new package which does not provide that property), then:

        - if there is an update for the installed package, then update
          the installed package

        - else if there is another installable package that provides
          the required property, then install that.

        - else it's an error (UNSATISFIABLE?)
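
As a rough illustration of rule 3, the order of checks could be
sketched in Python as below. The provides and state dictionaries and
the function name satisfy_requires are hypothetical, and the
CONTRADICTION test is simplified to "some to-be-removed package
provides the property".

```python
def satisfy_requires(prop, provides, state):
    """Try to satisfy one REQUIRES of a package that is being installed.

    provides maps a property name to the packages providing it.
    state maps a package name to one of "installed", "to-be-installed",
    "installable" or "to-be-removed".
    """
    candidates = provides.get(prop, [])
    for pkg in candidates:
        if state[pkg] in ("installed", "to-be-installed"):
            return ("OK", pkg)
    for pkg in candidates:
        if state[pkg] == "installable":
            state[pkg] = "to-be-installed"   # pull the provider in
            return ("INSTALL", pkg)
    for pkg in candidates:
        if state[pkg] == "to-be-removed":
            return ("ERROR", "CONTRADICTION")
    return ("ERROR", "UNSATISFIABLE")
```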

    CONFLICTS processing:

    6. If a package being installed or updated-to CONFLICTS with a
       property provided by an installed package:

        - if there is an update for the installed package, which the
          new package does not conflict with, then update the
          installed package.

        - else it's an error (NEW_CONFLICT)

    7. If an already-installed package CONFLICTS with a property
       provided by a to-be-installed package:

        - if there is an update for the installed package, which does
          not conflict with the new package, then update the installed
          package.

        - else it's an error (OLD_CONFLICT)

    8. If a package being installed or updated-to CONFLICTS with a
       property provided by a to-be-installed package, then it's an
       error (CONTRADICTION).
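
The three CONFLICTS cases differ only in which side of the conflict is
new. A hedged Python sketch of that case split follows; the function
name and string states are invented, the case where both packages are
already installed is assumed to need no action, and the error for an
installed package's CONFLICTS uses the OLD_CONFLICT name that the
depsolver walkthrough later in this chapter gives to that situation.

```python
NEW = "to-be-installed"


def classify_conflict(conflicter_state, provider_state, update_fixes_it):
    """Decide what to do about one CONFLICTS/PROVIDES match.

    conflicter_state: state of the package carrying the CONFLICTS.
    provider_state:   state of the package providing the property.
    update_fixes_it:  True if the installed package has an update that
                      no longer takes part in the conflict.
    """
    if conflicter_state == NEW and provider_state == NEW:
        return ("ERROR", "CONTRADICTION")                      # rule 8
    if conflicter_state == NEW:                                # rule 6
        if update_fixes_it:
            return ("FORCED_UPDATE", "provider")
        return ("ERROR", "NEW_CONFLICT")
    if provider_state == NEW:                                  # rule 7
        if update_fixes_it:
            return ("FORCED_UPDATE", "conflicter")
        return ("ERROR", "OLD_CONFLICT")
    return ("OK", None)   # assumption: nothing new is involved
```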

    OBSOLETES processing. NOTE: OBSOLETES are only matched against
    package names, not against arbitrary provided properties.

    9. If a package being installed or updated-to OBSOLETES an
       installed package, then obsolete that package (i.e., remove it,
       but treat it as updated for purposes of dangling REQUIRES).

   10. If an already-installed package OBSOLETES a to-be-installed
       package, then it's an error (ALREADY_OBSOLETE).

   11. If a package being installed or updated-to OBSOLETES another
       package being installed or updated-to, then it's an error
       (CONTRADICTION).
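
The NOTE about name-only matching is easy to miss, so here is a tiny
Python illustration. The package data and both function names are
hypothetical; the point is only the contrast between matching on
package names (what OBSOLETES does) and matching on any provided
property (what REQUIRES and CONFLICTS do).

```python
def packages_hit_by_obsoletes(obsoletes_name, installed):
    """OBSOLETES matches package names only."""
    return sorted(name for name in installed if name == obsoletes_name)


def packages_providing(prop, installed):
    """REQUIRES and CONFLICTS match any provided property."""
    return sorted(name for name, props in installed.items() if prop in props)


# Hypothetical installed set: package name mapped to provided properties.
INSTALLED = {
    "sendmail": {"sendmail", "smtpdaemon"},
    "postfix": {"postfix", "smtpdaemon"},
}
```

Here an OBSOLETES on "smtpdaemon" hits nothing, even though two
installed packages provide that property.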


THE DEPSOLVER
-------------

We start with two razor_sets, system and upstream, and a list of
requested installations and removals.

    FIXME: what about multiple upstream repos? Having to deal with
    arbitrary numbers of razor_sets is possible, but will probably be
    messy... It might be easier to either store all upstream repo data
    in a single .rzdb file, or else merge all upstream .rzdb files
    together into a single razor_set at startup. (Or some combination
    of those.)

We create a bit array of the packages in each set, indicating which
ones are installed; the system bit array starts out all 1s, and the
upstream bit array all 0s. Each bit is only allowed to change state
once during the transaction; an installed package can be removed, or
an uninstalled package installed, but trying to reinstall a removed
package, or uninstall a newly-installed package, is an error. This
means the packages break down into four categories:

    - installed       (1 bit in the system bit array)
    - to-be-removed   (0 bit in the system bit array)
    - to-be-installed (1 bit in the upstream bit array)
    - installable     (0 bit in the upstream bit array)
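
A minimal Python sketch of the two bit arrays and the four categories
is shown below. The class and function names are invented, and keeping
a separate "changed" list is just one assumed way to enforce the
change-once rule; since each array starts out uniform, comparing a bit
against its starting value would work equally well.

```python
class PackageBits:
    """One bit per package; each bit may flip at most once."""

    def __init__(self, count, initial):
        self.bits = [initial] * count
        self.changed = [False] * count

    def flip(self, index):
        if self.changed[index]:
            raise ValueError("package %d already changed state" % index)
        self.bits[index] = 1 - self.bits[index]
        self.changed[index] = True


def category(is_system, bit):
    """Map (which array, bit value) to one of the four categories."""
    if is_system:
        return "installed" if bit else "to-be-removed"
    return "to-be-installed" if bit else "installable"


system = PackageBits(3, 1)     # starts out all 1s
upstream = PackageBits(2, 0)   # starts out all 0s
system.flip(0)                 # remove system package 0
upstream.flip(1)               # install upstream package 1
```

A second flip of either bit raises an error, which corresponds to
trying to reinstall a removed package or uninstall a newly-installed
one.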


Depsolver algorithm:

    - Create new razor_transaction_packages ("rtp"s) for each
      requested install or remove. These will be "unresolved", because
      we haven't yet found the razor_packages that correspond to them.

    - while there are new rtps:

        - sort the new rtps

        - Walk the system property list, upstream property list, and
          new rtp list in parallel, and:

            - For each installable (not yet installed) PROVIDES:

                - If the property is a valid package name (that is,
                  either it's a package providing its own name, or it
                  has a matching OBSOLETES), and it matches the name
                  of a new rtp of type INSTALL or FORCED_UPDATE with
                  an unresolved new_package:

                    - If the upstream package has the same version as
                      the system package, we have an UP_TO_DATE error
                      (FIXME: not quite right. This doesn't deal with
                      the case where we try to update an application
                      because of a library update, and it turns out
                      there's no new version of the application, but
                      there IS a compat package containing the old
                      version of the library.)

                    - Otherwise, set the rtp's new_package to point to
                      the package providing this property and set the
                      appropriate bit in the upstream bit array.

            - For each to-be-installed non-file REQUIRES:

                - See if there's an installed or to-be-installed
                  package that PROVIDES that property.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                - If not, see if there's a to-be-removed package that
                  PROVIDES that property. (If we find such a package,
                  we have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed file REQUIRES:

                - (We create fake file PROVIDES to match file REQUIRES
                  when importing/merging razor sets, so if there is
                  already another installed package that REQUIRES this
                  file, there will be a PROVIDES listed for it as well.)

                - See if there's an installed package that PROVIDES
                  that file.

                - If not, do a binary search of the system file tree
                  looking to see if some installed package provides
                  that file but does not have a PROVIDES for it.

                - If not, see if there's an installable package that
                  PROVIDES that file, and create a new INSTALL rtp
                  for it if so.

                - (If we actually work with multiple upstream
                  razor_sets, then we will need to search the upstream
                  file trees at this point, because it's possible that
                  a package in one upstream repo would require a file
                  in another upstream repo. But if we merge the
                  multiple upstream repos into a single razor_set at
                  some point, then we would not need to do that,
                  because it would be guaranteed that we would have
                  already created a fake PROVIDES if any package
                  provides the file.)

                - If no installed or installable package provides the
                  file, see if there's a to-be-removed package that
                  provides the file. (If we find such a package, we
                  have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed PROVIDES:

                - Check if the new PROVIDES conflicts with an
                  installed CONFLICTS. If so, create a new
                  FORCED_UPDATE rtp for the installed package, so we
                  can try to upgrade it to a non-conflicting version.
                  (If we can't, we'll have an OLD_CONFLICT error.)

                - Check if the new PROVIDES conflicts with an
                  installed OBSOLETES *and* the PROVIDES property
                  corresponds to the name of its package. (That is,
                  OBSOLETES are only matched against package names,
                  not arbitrary provided properties.) If so, we have
                  an ALREADY_OBSOLETE error.

                - Check if the new PROVIDES conflicts with a
                  to-be-installed CONFLICTS. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed CONFLICTS:

                - Basically the reverse of the previous case: check if
                  the new CONFLICTS conflicts with an installed
                  PROVIDES. If so, create a new FORCED_UPDATE rtp for
                  the installed package, so we can try to upgrade it
                  to a non-conflicting version. (If we can't, we'll
                  have a NEW_CONFLICT error.)

                - Check if the new CONFLICTS conflicts with a
                  to-be-installed PROVIDES. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed OBSOLETES:

                - Check if there's an installed package that PROVIDES
                  that property. If so, create an OBSOLETED rtp for
                  the installed package.

                - If not, check if there's a to-be-installed package
                  that PROVIDES that property. If so, we have a
                  CONTRADICTION error.

            - For each installed PROVIDES:

                - If the property is a valid package name (that is,
                  it's a package providing its own name), and it
                  matches the name of a new rtp with an unresolved
                  old_package, then set the rtp's old_package to point
                  to the package providing this property and clear the
                  appropriate bit in the system bit array.

            - For each to-be-removed PROVIDES:

                - If there's also an identical to-be-installed
                  PROVIDES, we're OK and can skip this.

                - Otherwise, for each installed REQUIRES of this
                  property:

                    - Look for some other installed or to-be-installed
                      property that satisfies the REQUIRES.

                    - If there isn't one, then for each installed
                      package in this REQUIRES's package list:

                        - If the PROVIDES was lost because the old
                          package was REMOVEd (not FORCED_UPDATE or
                          OBSOLETED), then create a new REMOVE rtp for
                          this package.

                        - Otherwise, create a new FORCED_UPDATE rtp
                          for this package.

                - (We don't need to look at to-be-installed REQUIRES
                  of this property, because if there are any, they
                  will cause a CONTRADICTION error when we try to
                  re-satisfy them the next time through.)
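
The overall shape of the algorithm above (an outer loop that runs
until no new rtps appear, around a parallel walk of three sorted
lists) might be sketched in Python like this. It is a skeleton under
stated assumptions, not Razor's implementation: entries are simplified
to (name, payload) pairs, the per-name work is delegated to a
hypothetical handle callback that returns any new rtps it creates, and
none of the individual rules are implemented here.

```python
from heapq import merge
from itertools import groupby


def walk_in_parallel(system_props, upstream_props, new_rtps):
    """Yield (name, system entries, upstream entries, rtp entries).

    Each input is a list of (name, payload) pairs sorted by name, so a
    single merge pass sees everything about one name together.
    """
    tagged = merge(
        ((name, 0, payload) for name, payload in system_props),
        ((name, 1, payload) for name, payload in upstream_props),
        ((name, 2, payload) for name, payload in new_rtps),
        key=lambda entry: (entry[0], entry[1]),
    )
    for name, group in groupby(tagged, key=lambda entry: entry[0]):
        buckets = ([], [], [])
        for _, source, payload in group:
            buckets[source].append(payload)
        yield (name,) + buckets


def solve(system_props, upstream_props, requested, handle):
    """Run rounds until a round produces no new rtps; return the count."""
    new_rtps = list(requested)
    rounds = 0
    while new_rtps:
        new_rtps.sort()
        produced = []
        for name, sys_e, up_e, rtp_e in walk_in_parallel(
                system_props, upstream_props, new_rtps):
            # handle() stands in for all the per-property cases above.
            produced.extend(handle(name, sys_e, up_e, rtp_e))
        new_rtps = produced
        rounds += 1
    return rounds
```

Because all three lists are sorted by name, each round is a single
linear pass, which is where the per-round cost estimate earlier in the
chapter comes from.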
  </para>
</chapter>