docs/solver.xml
author J. Ali Harlow <ali@juiblex.co.uk>
Thu Feb 09 20:45:27 2012 +0000 (2012-02-09)
changeset 418 33b825d3128d
permissions -rw-r--r--
Add transaction barriers
These allow packages to be installed and removed which have scripts
that depend on each other when atomic transactions are involved.
Note that yum supports pre, but not other requires flags. post will
need similar support to the post scripts themselves pulling in the
requires flags from the rpms. Likewise preun and postun will need
similar handling to those scripts since the requires flags will need
to be stored in the razor database.
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">

<chapter id="solver">
  <title>Dependency Solver</title>

  <para>
    At a very high level, yum's depsolver does something roughly
    equivalent to:

    - For each package being installed or removed

        - For each relevant property (provides, requires, conflicts,
          obsoletes):

            - Figure out what additional packages need to be added to
              or removed from the system to satisfy this property

    which ends up being roughly O(N^2 * M) where N is the total number of
    properties and M is the number of packages being acted on.

(I just figured that out off the top of my head, and I'm not totally
familiar with the yum code, so it may be wrong.)

Razor's depsolver is something like:

    - do {

        - For each property to be added to or removed from the system:

            - Figure out what packages need to be added to or removed
              from the system to satisfy this property

    - } until we stop adding/removing more packages

with the key being that it's very easy to find the PROVIDES
corresponding to a REQUIRES and vice versa, because the property
arrays are sorted, and so all properties with the same "name" will be
adjacent to one another in the array, allowing many dependencies to be
satisfied in essentially constant time. (Actually... we've been
calling it constant, but it's really O(log N) for heavily-depended-on
packages, because the more packages you have, the more variations on
"requires foo", "requires foo = 1.1", "requires foo &gt; 1.0", etc. you're
going to have to scan through.)
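
As a rough illustration of why sorted property arrays make this cheap
(this is a toy sketch, not razor's actual data structures; the tuple
layout and the name entries_for are made up for the example), a binary
search lands on the first entry for a name, and every variation of
that name sits right after it:

```python
import bisect

# Hypothetical flat property table of (name, relation, version) tuples.
# Sorting it puts every entry that shares a name next to the others.
properties = sorted([
    ("libc", "=", "2.5"),
    ("foo", "=", "1.2"),
    ("bar", "=", "2.0"),
    ("foo", "=", "1.1"),
])


def entries_for(table, name):
    """Return the contiguous run of entries in the sorted table with this name."""
    # A one-element tuple sorts just before any longer tuple starting with
    # the same name, so this finds the first entry for the name.
    start = bisect.bisect_left(table, (name,))
    end = start
    # Scan the adjacent variations ("foo = 1.1", "foo = 1.2", ...).
    while end != len(table) and table[end][0] == name:
        end += 1
    return table[start:end]


foo_entries = entries_for(properties, "foo")
```

The scan after the binary search is the part that grows for
heavily-depended-on names, which is the caveat in the parenthetical
above.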

Ideally though, each iteration of the inner loop body happens in
constant time, and thus the inner loop as a whole is O(N), and thus
the depsolver as a whole is O(N * M) (or at least, less than
O(N * M * log N)).
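
The do/until shape of the loop can be sketched like this. It is a
deliberately tiny model that only handles installs; the dictionaries
and the name close_over_requires are assumptions for the example, not
razor's API:

```python
def close_over_requires(requested, provides, requires):
    """Keep pulling in providers until a pass adds no new packages.

    provides: dict mapping a property name to the package providing it.
    requires: dict mapping a package name to the properties it requires.
    """
    selected = set(requested)
    pending = set(requested)
    while pending:
        added = set()
        for pkg in pending:
            for prop in requires.get(pkg, ()):
                provider = provides.get(prop)
                if provider is None:
                    raise LookupError("UNSATISFIABLE: " + prop)
                if provider not in selected:
                    selected.add(provider)
                    added.add(provider)
        # Only packages added in this pass need to be looked at next time.
        pending = added
    return selected


provides = {"libfoo": "foo-libs", "libc": "glibc"}
requires = {"app": ["libfoo"], "foo-libs": ["libc"], "glibc": []}
closure = close_over_requires(["app"], provides, requires)
```

Each pass only looks at what the previous pass added, so the loop
stops as soon as a pass adds nothing.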


FILE DEPENDENCIES
-----------------

Whenever we add a package with a file REQUIRES to a razor_set, we also
add a PROVIDES for that file to the package or packages which provide
that file. This means that if we later add another package that
requires the same file (e.g., /bin/sh or /usr/bin/perl), we can resolve
its file requirement exactly like we would resolve a property
requirement, in nearly constant time.

When adding a *new* file requirement (i.e., a requirement on a file that
no existing package depends on), we still have to search the
file tree, which is O(log N) in the number of files.
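
A minimal sketch of that idea, assuming a sorted list of
(path, package) pairs as a stand-in for the file tree and a plain
dict as a stand-in for the extra file PROVIDES (both are
simplifications made up for the example, as is the name
add_file_requires):

```python
import bisect


def add_file_requires(path, file_index, file_provides):
    """Record a file REQUIRES and return the packages that own the file.

    file_index: sorted list of (path, package) pairs (toy file tree).
    file_provides: dict mapping a path to its owning packages (toy stand-in
    for the PROVIDES entries added on behalf of file REQUIRES).
    """
    if path in file_provides:
        # Some earlier package already required this file: no search needed.
        return file_provides[path]
    owners = []
    i = bisect.bisect_left(file_index, (path,))  # O(log N) in the number of files
    while i != len(file_index) and file_index[i][0] == path:
        owners.append(file_index[i][1])
        i += 1
    if owners:
        # Remember the result so the next package requiring it is cheap.
        file_provides[path] = owners
    return owners


file_index = sorted([
    ("/usr/bin/perl", "perl"),
    ("/bin/sh", "bash"),
    ("/usr/bin/env", "coreutils"),
])
file_provides = {}
first = add_file_requires("/bin/sh", file_index, file_provides)
second = add_file_requires("/bin/sh", file_index, file_provides)
```

The second call for the same path is answered from the recorded
PROVIDES without touching the file index.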

(AFAICT, there's no reason yum couldn't do the same optimization.
Also, AFAICT, yum currently sticks property dependencies and file
dependencies into the same hash table, so that if any package in the
transaction has a file dependency, it causes *property* dependencies
to become slower to resolve as well...)


THE RULES
---------

This is what we have figured out for transaction-solving rules;
neither yum's nor rpm's algorithm seems to be explained in full
anywhere...

    1. Every requested install in the initial package set must be
       satisfied as either a new install or an update:

        - if the requested package name is the name of an upstream
          package:

            - if there is not a corresponding already-installed
              package, then install the upstream package

            - else if the upstream package is newer than the
              already-installed package, then update the package

            - else it's an error (UP_TO_DATE)

        - else if the requested package name is the name of an
          already-installed package:

            - if there is an upstream package that obsoletes the
              already-installed package, then behave as though the
              user had requested that the obsoleting package be
              installed instead.

            - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)

        - else it's an error (INSTALL_UNAVAILABLE)

    2. Every requested removal in the initial package set must be
       satisfied as a removal. If any requested package name is not
       the name of an installed package, it's an error
       (REMOVE_NOT_INSTALLED).

    REQUIRES processing:

    3. If a package being installed or updated-to REQUIRES a property
       that is not provided by any installed or to-be-installed
       package, we need to find an installable package that provides
       that property. If we find one, install/update it. If not, it's
       an error (UNSATISFIABLE). (If we find an upstream package
       providing the property that corresponds to a system package
       that's being removed, then it's a CONTRADICTION.)

    4. If an already-installed package REQUIRES a property which is
       only provided by a package that is being removed, then the
       requiring package needs to be removed as well.

    5. If an already-installed package REQUIRES a property which is
       only provided by a package that is being upgraded or obsoleted
       (to a new package which does not provide that property), then:

        - if there is an update for the installed package, then update
          the installed package

        - else if there is another installable package that provides
          the required property, then install that.

        - else it's an error (UNSATISFIABLE?)

    CONFLICTS processing:

    6. If a package being installed or updated-to CONFLICTS with a
       property provided by an installed package:

        - if there is an update for the installed package, which the
          new package does not conflict with, then update the
          installed package.

        - else it's an error (NEW_CONFLICT)

    7. If an already-installed package CONFLICTS with a property
       provided by a to-be-installed package:

        - if there is an update for the installed package, which does
          not conflict with the new package, then update the installed
          package.

        - else it's an error (NEW_CONFLICT)

    8. If a package being installed or updated-to CONFLICTS with a
       property provided by a to-be-installed package, then it's an
       error (CONTRADICTION).

    OBSOLETES processing. NOTE: OBSOLETES are only matched against
    package names, not against arbitrary provided properties.

    9. If a package being installed or updated-to OBSOLETES an
       installed package, then obsolete that package (i.e., remove it,
       but treat it as updated for purposes of dangling REQUIRES).

   10. If an already-installed package OBSOLETES a to-be-installed
       package, then it's an error (ALREADY_OBSOLETE).

   11. If a package being installed or updated-to OBSOLETES another
       package being installed or updated-to, then it's an error
       (CONTRADICTION).
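
To make the branching in rule 1 concrete, here is a small sketch. The
function name, the dictionaries and the use of tuples as versions are
all assumptions for the example; the error names are the ones used in
the rules, including the still-undecided UP_TO_DATE or
INSTALL_UNAVAILABLE case:

```python
def resolve_install(name, installed, upstream, obsoleted_by):
    """Classify one requested install following rule 1.

    installed, upstream: dicts mapping a package name to a comparable version.
    obsoleted_by: dict mapping an installed package name to the name of an
    upstream package that OBSOLETES it.
    Returns an (action, package name) pair, or raises ValueError on errors.
    """
    if name in upstream:
        if name not in installed:
            return ("install", name)
        if upstream[name] > installed[name]:
            return ("update", name)
        raise ValueError("UP_TO_DATE")
    if name in installed:
        if name in obsoleted_by:
            # Behave as though the obsoleting package had been requested.
            return resolve_install(obsoleted_by[name], installed, upstream,
                                   obsoleted_by)
        raise ValueError("UP_TO_DATE or INSTALL_UNAVAILABLE")
    raise ValueError("INSTALL_UNAVAILABLE")


installed = {"foo": (1, 0), "old": (1, 0), "cur": (2, 0)}
upstream = {"foo": (1, 1), "new": (1, 0), "cur": (2, 0), "bar": (0, 1)}
obsoleted_by = {"old": "new"}
```

With this data, requesting "old" resolves to installing "new", because
"new" obsoletes it and is not yet installed.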


THE DEPSOLVER
-------------

We start with two razor_sets, system and upstream, and a list of
requested installations and removals.

    FIXME: what about multiple upstream repos? Having to deal with
    arbitrary numbers of razor_sets is possible, but will probably be
    messy... It might be easier to either store all upstream repo data
    in a single .rzdb file, or else merge all upstream .rzdb files
    together into a single razor_set at startup. (Or some combination
    of those.)

We create a bit array of the packages in each set, indicating which
ones are installed; the system bitarray starts out all 1s, and the
upstream bitarray all 0s. Each bit is only allowed to change state
once during the transaction; an installed package can be removed, or
an uninstalled package installed, but trying to reinstall a removed
package, or uninstall a newly-installed package is an error. This
means the packages break down into four categories:

    - installed       (1 bit in the system bit array)
    - to-be-removed   (0 bit in the system bit array)
    - to-be-installed (1 bit in the upstream bit array)
    - installable     (0 bit in the upstream bit array)
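
The two bit arrays and the change-once rule could be modelled roughly
as follows. This is a sketch using Python lists of booleans rather
than real bit arrays; PackageBits and category are names invented for
the example:

```python
class PackageBits:
    """One flag per package; each flag may change state at most once."""

    def __init__(self, count, initial):
        self.bits = [initial] * count
        self.changed = [False] * count

    def flip(self, index):
        if self.changed[index]:
            raise RuntimeError("package state may only change once per transaction")
        self.bits[index] = not self.bits[index]
        self.changed[index] = True


def category(bits, index, is_system):
    """Map a bit to one of the four categories listed above."""
    if is_system:
        return "installed" if bits.bits[index] else "to-be-removed"
    return "to-be-installed" if bits.bits[index] else "installable"


system = PackageBits(2, True)      # system array starts out all 1s
upstream = PackageBits(2, False)   # upstream array starts out all 0s
system.flip(0)                     # remove system package 0
upstream.flip(1)                   # install upstream package 1
```

Flipping the same bit a second time raises, which is how the sketch
models the rule that a removed package cannot be reinstalled within
the same transaction.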


Depsolver algorithm:

    - Create new razor_transaction_packages ("rtp"s) for each
      requested install or remove. These will be "unresolved", because
      we haven't yet found the razor_packages that correspond to them.

    - while there are new rtps:

        - sort the new rtps

        - Walk the system property list, upstream property list, and
          new rtp list in parallel, and:

            - For each uninstalled PROVIDES:

                - If the property is a valid package name (that is,
                  either it's a package providing its own name, or it
                  has a matching OBSOLETES), and it matches the name
                  of a new rtp of type INSTALL or FORCED_UPDATE with
                  an unresolved new_package:

                    - If the upstream package has the same version as
                      the system package, we have an UP_TO_DATE error
                      (FIXME: not quite right. This doesn't deal with
                      the case where we try to update an application
                      because of a library update, and it turns out
                      there's no new version of the application, but
                      there IS a compat package containing the old
                      version of the library.)

                    - Otherwise, set the rtp's new_package to point to
                      the package providing this property and set the
                      appropriate bit in the upstream bit array.

            - For each to-be-installed non-file REQUIRES:

                - See if there's an installed or to-be-installed
                  package that PROVIDES that property.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                - If not, see if there's a to-be-removed package that
                  PROVIDES that property. (If we find such a package,
                  we have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed file REQUIRES:

                - (We create fake file PROVIDES to match file REQUIRES
                  when importing/merging razor sets, so if there is
                  already another installed package that REQUIRES this
                  file, there will be a PROVIDES listed for it as well.)

                - See if there's an installed package that PROVIDES
                  that file.

                - If not, do a binary search of the system file tree
                  looking to see if some installed package provides
                  that file but does not have a PROVIDES for it.

                - If not, see if there's an installable package that
                  PROVIDES that property, and create a new INSTALL rtp
                  for it if so.

                - (If we actually work with multiple upstream
                  razor_sets, then we will need to search the upstream
                  file trees at this point, because it's possible that
                  a package in one upstream repo would require a file
                  in another upstream repo. But if we merge the
                  multiple upstream repos into a single razor_set at
                  some point, then we would not need to do that,
                  because it would be guaranteed that we would have
                  already created a fake PROVIDES if any package
                  provides the file.)

                - If no installed or installable package provides the
                  file, see if there's a to-be-removed package that
                  provides the file. (If we find such a package, we
                  have a CONTRADICTION error.)

                - If none of the above, then we have an UNSATISFIABLE
                  error

            - For each to-be-installed PROVIDES:

                - Check if the new PROVIDES conflicts with an
                  installed CONFLICTS. If so, create a new
                  FORCED_UPDATE rtp for the installed package, so we
                  can try to upgrade it to a non-conflicting version.
                  (If we can't, we'll have an OLD_CONFLICT error.)

                - Check if the new PROVIDES conflicts with an
                  installed OBSOLETES *and* the PROVIDES property
                  corresponds to the name of its package. (That is,
                  OBSOLETES are only matched against package names,
                  not arbitrary provided properties.) If so, we have
                  an ALREADY_OBSOLETE error.

                - Check if the new PROVIDES conflicts with a
                  to-be-installed CONFLICTS. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed CONFLICTS:

                - Basically the reverse of the previous case: check if
                  the new CONFLICTS conflicts with an installed
                  PROVIDES. If so, create a new FORCED_UPDATE rtp for
                  the installed package, so we can try to upgrade it
                  to a non-conflicting version. (If we can't, we'll
                  have a NEW_CONFLICT error.)

                - Check if the new CONFLICTS conflicts with a
                  to-be-installed PROVIDES. If so, we have a
                  CONTRADICTION error.

            - For each to-be-installed OBSOLETES:

                - Check if there's an installed package that PROVIDES
                  that property. If so, create an OBSOLETED rtp for
                  the installed package.

                - If not, check if there's a to-be-installed package
                  that PROVIDES that property. If so, we have a
                  CONTRADICTION error.

            - For each installed PROVIDES:

                - If the property is a valid package name (that is,
                  it's a package providing its own name), and it
                  matches the name of a new rtp with an unresolved
                  old_package, then set the rtp's old_package to point
                  to the package providing this property and clear the
                  appropriate bit in the system bit array.

            - For each to-be-removed PROVIDES:

                - If there's also an identical to-be-installed
                  PROVIDES, we're ok and can skip this

                - Otherwise, for each installed REQUIRES of this
                  property:

                    - Look for some other installed or to-be-installed
                      property that satisfies the REQUIRES.

                    - If there isn't one, then for each installed
                      package in this REQUIRES's package list:

                        - If the PROVIDES was lost because the old
                          package was REMOVEd (not FORCED_UPDATE or
                          OBSOLETED), then create a new REMOVE rtp for
                          this package.

                        - Otherwise, create a new FORCED_UPDATE rtp
                          for this package.

                - (We don't need to look at to-be-installed REQUIRES
                  of this property, because if there are any, they
                  will cause a CONTRADICTION error when we try to
                  re-satisfy them the next time through.)
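
As one illustration of the walk above, the order of checks in the
"to-be-installed non-file REQUIRES" step might look like this. The
function name, the dictionaries and the string states are made up for
the example; the real code walks sorted property lists and bit arrays
instead:

```python
def resolve_requires(prop, providers, state):
    """Decide what to do about one to-be-installed non-file REQUIRES.

    providers: dict mapping a property to the list of packages providing it.
    state: dict mapping a package to "installed", "to-be-installed",
    "installable" or "to-be-removed".
    Returns an (outcome, package or None) pair.
    """
    candidates = providers.get(prop, [])
    # 1. Already satisfied by something installed or about to be installed?
    if any(state[p] in ("installed", "to-be-installed") for p in candidates):
        return ("satisfied", None)
    # 2. Otherwise pull in an installable provider via a new INSTALL rtp.
    for p in candidates:
        if state[p] == "installable":
            return ("new INSTALL rtp", p)
    # 3. The only provider is on its way out of the system.
    if any(state[p] == "to-be-removed" for p in candidates):
        return ("CONTRADICTION", None)
    # 4. Nobody provides it at all.
    return ("UNSATISFIABLE", None)


providers = {"libfoo": ["foo-libs"], "mta": ["sendmail", "postfix"],
             "libold": ["old-libs"]}
state = {"foo-libs": "installed", "sendmail": "to-be-removed",
         "postfix": "installable", "old-libs": "to-be-removed"}
```

Note that an installable provider is preferred over reporting a
CONTRADICTION, matching the order of the bullets in that step.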
  </para>
</chapter>