docs/DEPSOLVE.txt
author Kristian H?gsberg <krh@redhat.com>
Thu Jun 19 15:09:48 2008 -0400 (2008-06-19)
changeset 246 f92d8239324e
child 310 9a7691262ce6
permissions -rw-r--r--
Handle NULL dirnames when importing rpms into a set.
rhughes@241
     1
YUM vs RAZOR
rhughes@241
     2
------------
rhughes@241
     3
rhughes@241
     4
At a very high level, yum's depsolver does something roughly
rhughes@241
     5
equivalent to:
rhughes@241
     6
rhughes@241
     7
    - For each package being installed or removed
rhughes@241
     8
rhughes@241
     9
	- For each relevant property (provides, requires, conflicts,
rhughes@241
    10
          obsoletes):
rhughes@241
    11
rhughes@241
    12
	    - Figure out what additional packages need to be added to
rhughes@241
    13
	      or removed from the system to satisfy this property
rhughes@241
    14
rhughes@241
    15
which ends up being roughly O(N^2 * M) where N is the total number of
rhughes@241
    16
properties and M is the number of packages being acted on.
rhughes@241
    17
rhughes@241
    18
(I just figured that out off the top of my head, and I'm not totally
rhughes@241
    19
familiar with the yum code, so it may be wrong.)
rhughes@241
    20
rhughes@241
    21
Razor's depsolver is something like:
rhughes@241
    22
rhughes@241
    23
    - do {
rhughes@241
    24
rhughes@241
    25
	- For each property to be added to or removed from the system:
rhughes@241
    26
rhughes@241
    27
	    - Figure out what packages need to be added to or removed
rhughes@241
    28
	      from the system to satisfy this property
rhughes@241
    29
rhughes@241
    30
    - } until we stop adding/remove more packages
rhughes@241
    31
rhughes@241
    32
with the key being that it's very easy to find the PROVIDES
rhughes@241
    33
corresponding to a REQUIRES and vice versa, because the property
rhughes@241
    34
arrays are sorted, and so all properties with the same "name" will be
rhughes@241
    35
adjacent to one another in the array, allowing many dependencies to be
rhughes@241
    36
satisified in essentially constant time. (Actually... we've been
rhughes@241
    37
calling it constant, but it's really O(log N) for heavily-depended-on
rhughes@241
    38
packages, because the more packages you have, the more variations on
rhughes@241
    39
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
rhughes@241
    40
going to have to scan through.)
rhughes@241
    41
rhughes@241
    42
Ideally though, each iteration of the inner loop body happens in
rhughes@241
    43
constant time, and thus the inner loop as a whole is O(N), and thus
rhughes@241
    44
the depsolver as a whole is O(N * M) (or at least, less than
rhughes@241
    45
O(N * M * log N).
rhughes@241
    46
rhughes@241
    47
rhughes@241
    48
FILE DEPENDENCIES
rhughes@241
    49
-----------------
rhughes@241
    50
rhughes@241
    51
Whenever we add a package with a file REQUIRES to a razor_set, we also
rhughes@241
    52
add a PROVIDES for that file to the package or packages which provide
rhughes@241
    53
that file. This means that if we later add another package that
rhughes@241
    54
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
rhughes@241
    55
its file requirement exactly like we would resolve a property
rhughes@241
    56
requirement, in nearly constant time.
rhughes@241
    57
rhughes@241
    58
When adding a *new* file requirement (ie, a requirement on a file that
rhughes@241
    59
no existing package depends on), we still have to scan through the
rhughes@241
    60
file tree, which is O(log N) in the number of files.
rhughes@241
    61
rhughes@241
    62
(AFAICT, there's no reason yum couldn't do the same optimization.
rhughes@241
    63
Also, AFAICT, yum currently sticks property dependencies and file
rhughes@241
    64
dependencies into the same hash table, so that if any package in the
rhughes@241
    65
transaction has a file dependency, it causes *property* dependencies
rhughes@241
    66
to become slower to resolve as well...)
rhughes@241
    67
rhughes@241
    68
rhughes@241
    69
THE RULES
rhughes@241
    70
---------
rhughes@241
    71
rhughes@241
    72
This is what we have figured out for transaction-solving rules;
rhughes@241
    73
neither yum nor rpm's algorithm seems to be explained in full
rhughes@241
    74
anywhere...
rhughes@241
    75
rhughes@241
    76
    1. Every requested install in the initial package set must be
rhughes@241
    77
       satisfied as either a new install or an update:
rhughes@241
    78
rhughes@241
    79
	- if the requested package name is the name of an upstream
rhughes@241
    80
          package:
rhughes@241
    81
rhughes@241
    82
	    - if there is not a corresponding already-installed
rhughes@241
    83
              package, then install the upstream package
rhughes@241
    84
rhughes@241
    85
	    - else if the upstream package is newer than the
rhughes@241
    86
              already-installed package, then update the package
rhughes@241
    87
rhughes@241
    88
	    - else it's an error (UP_TO_DATE)
rhughes@241
    89
rhughes@241
    90
	- else if the requested package name is the name of an
rhughes@241
    91
          already-installed package:
rhughes@241
    92
rhughes@241
    93
	    - if there is an upstream package that obsoletes the
rhughes@241
    94
              already-installed package, then behave as though the
rhughes@241
    95
              user had requested that that package be installed
rhughes@241
    96
              instead.
rhughes@241
    97
rhughes@241
    98
	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
rhughes@241
    99
rhughes@241
   100
	- else it's an error (INSTALL_UNAVAILABLE)
rhughes@241
   101
rhughes@241
   102
    2. Every requested removal in the initial package set must be
rhughes@241
   103
       satisfied as a removal. If any requested package name is not
rhughes@241
   104
       the name of an installed package, it's an error
rhughes@241
   105
       (REMOVE_NOT_INSTALLED)
rhughes@241
   106
rhughes@241
   107
    REQUIRES processing:
rhughes@241
   108
rhughes@241
   109
    3. If a package being installed or updated-to REQUIRES a property
rhughes@241
   110
       that is not provided by any installed or to-be-installed
rhughes@241
   111
       package, we need to find an installable package that provides
rhughes@241
   112
       that property. If we find one, install/update it. If not, it's
rhughes@241
   113
       an error (UNSATISFIABLE). (If we find an upstream package
rhughes@241
   114
       providing the property that corresponds to a system package
rhughes@241
   115
       that's being removed, then it's a CONTRADICTION.)
rhughes@241
   116
rhughes@241
   117
    4. If an already-installed package REQUIRES a property which is
rhughes@241
   118
       only provided by a package that is being removed, then that
rhughes@241
   119
       package needs to be removed as well.
rhughes@241
   120
rhughes@241
   121
    5. If an already-installed package REQUIRES a property which is
rhughes@241
   122
       only provided by a package that is being upgraded or obsoleted
rhughes@241
   123
       (to a new package which does not provide that property), then:
rhughes@241
   124
rhughes@241
   125
	- if there is an update for the installed package, then update
rhughes@241
   126
          the installed package
rhughes@241
   127
rhughes@241
   128
	- else if there is another installable package that provides
rhughes@241
   129
          the required property, then install that.
rhughes@241
   130
rhughes@241
   131
	- else it's an error (UNSATISFIABLE?)
rhughes@241
   132
rhughes@241
   133
    CONFLICTS processing
rhughes@241
   134
rhughes@241
   135
    6. If a package being installed or updated-to CONFLICTS with a
rhughes@241
   136
       property provided by an installed package:
rhughes@241
   137
rhughes@241
   138
	- if there is an update for the installed package, which the
rhughes@241
   139
          new package does not conflict with, then update the
rhughes@241
   140
          installed package.
rhughes@241
   141
rhughes@241
   142
	- else it's an error (NEW_CONFLICT)
rhughes@241
   143
rhughes@241
   144
    7. If an already-installed package CONFLICTS with a property
rhughes@241
   145
       provided by a to-be-installed package:
rhughes@241
   146
rhughes@241
   147
	- if there is an update for the installed package, which does
rhughes@241
   148
          not conflict with the new package, then update the installed
rhughes@241
   149
          package.
rhughes@241
   150
rhughes@241
   151
	- else it's an error (NEW_CONFLICT)
rhughes@241
   152
rhughes@241
   153
    8. If a package being installed or updated-to CONFLICTS with a
rhughes@241
   154
       property provided by a to-be-installed package, then it's an
rhughes@241
   155
       error (CONTRADICTION).
rhughes@241
   156
rhughes@241
   157
    OBSOLETES processing. NOTE: OBSOLETES are only matched against
rhughes@241
   158
    package names, not against arbitrary provided properties
rhughes@241
   159
rhughes@241
   160
    9. If a package being installed or updated-to OBSOLETES an
rhughes@241
   161
       installed package, then obsolete that package. (ie, remove it,
rhughes@241
   162
       but treat it as updated for purposes of dangling REQUIRES).
rhughes@241
   163
rhughes@241
   164
   10. If an already-installed package OBSOLETES a to-be-installed
rhughes@241
   165
       package, then it's an error. (ALREADY_OBSOLETE)
rhughes@241
   166
rhughes@241
   167
   11. If a package being installed or updated-to OBSOLETES another
rhughes@241
   168
       package being installed or updated-to, then it's an error
rhughes@241
   169
       (CONTRADICTION).
rhughes@241
   170
rhughes@241
   171
rhughes@241
   172
rhughes@241
   173
THE DEPSOLVER
rhughes@241
   174
-------------
rhughes@241
   175
rhughes@241
   176
We start with two razor_sets, system and upstream, and a list of
rhughes@241
   177
requested installations and removals.
rhughes@241
   178
rhughes@241
   179
    FIXME: what about multiple upstream repos? Having to deal with
rhughes@241
   180
    arbitrary numbers of razor_sets is possible, but will probably be
rhughes@241
   181
    messy... It might be easier to either store all upstream repo data
rhughes@241
   182
    in a single .repo file, or else merge all upstream .repo files
rhughes@241
   183
    together into a single razor_set at startup. (Or some combination
rhughes@241
   184
    of those.)
rhughes@241
   185
rhughes@241
   186
We create a bit array of the packages in each set, indicating which
rhughes@241
   187
ones are installed; the system bitarray starts out all 1s, and the
rhughes@241
   188
upstream bitarray all 0s. Each bit is only allowed to change state
rhughes@241
   189
once during the transaction; an installed package can be removed, or
rhughes@241
   190
an uninstalled package installed, but trying to reinstall a removed
rhughes@241
   191
package, or uninstall a newly-installed package is an error. This
rhughes@241
   192
means the packages break down into four categories:
rhughes@241
   193
rhughes@241
   194
    - installed       (1 bit in the system bit array)
rhughes@241
   195
    - to-be-removed   (0 bit in the system bit array)
rhughes@241
   196
    - to-be-installed (1 bit in the upstream bit array)
rhughes@241
   197
    - installable     (0 bit in the upstream bit array)
rhughes@241
   198
rhughes@241
   199
rhughes@241
   200
Depsolver algorithm:
rhughes@241
   201
rhughes@241
   202
    - Create new razor_transaction_packages ("rtp"s) for each
rhughes@241
   203
      requested install or remove. These will be "unresolved", because
rhughes@241
   204
      we haven't yet found the razor_packages that correspond to them.
rhughes@241
   205
rhughes@241
   206
    - while there are new rtps:
rhughes@241
   207
rhughes@241
   208
	- sort the new rtps
rhughes@241
   209
rhughes@241
   210
	- Walk the system property list, upstream property list, and
rhughes@241
   211
          new rtp list in parallel, and:
rhughes@241
   212
rhughes@241
   213
	    - For each uninstalled PROVIDES:
rhughes@241
   214
rhughes@241
   215
		- If the property is a valid package name (that is,
rhughes@241
   216
                  either it's a package providing its own name, or it
rhughes@241
   217
                  has a matching OBSOLETES), and it matches the name
rhughes@241
   218
                  of a new rtp of type INSTALL or FORCED_UPDATE with
rhughes@241
   219
                  an unresolved new_package:
rhughes@241
   220
rhughes@241
   221
		    - If the upstream package has the same version as
rhughes@241
   222
		      the system package, we have an UP_TO_DATE error
rhughes@241
   223
		      (FIXME: not quite right. This doesn't deal with
rhughes@241
   224
		      the case where we try to update an application
rhughes@241
   225
		      because of a library update, and it turns out
rhughes@241
   226
		      there's no new version of the application, but
rhughes@241
   227
		      there IS a compat package containing the old
rhughes@241
   228
		      version of the library.)
rhughes@241
   229
rhughes@241
   230
		    - Otherwise, set the rtp's new_package to point to
rhughes@241
   231
		      the package providing this property and set the
rhughes@241
   232
		      appropriate bit in the upstream bit array.
rhughes@241
   233
rhughes@241
   234
	    - For each to-be-installed non-file REQUIRES:
rhughes@241
   235
rhughes@241
   236
		- See if there's an installed or to-be-installed
rhughes@241
   237
		  package that PROVIDES that property.
rhughes@241
   238
rhughes@241
   239
		- If not, see if there's an installable package that
rhughes@241
   240
		  PROVIDES that property, and create a new INSTALL rtp
rhughes@241
   241
		  for it if so.
rhughes@241
   242
rhughes@241
   243
		- If not, see if there's a to-be-removed package that
rhughes@241
   244
		  PROVIDES that property. (If we find such a package,
rhughes@241
   245
		  we have a CONTRADICTION error.)
rhughes@241
   246
rhughes@241
   247
		- If none of the above, then we have an UNSATISFIABLE
rhughes@241
   248
                  error
rhughes@241
   249
rhughes@241
   250
	    - For each to-be-installed file REQUIRES:
rhughes@241
   251
rhughes@241
   252
		- (We create fake file PROVIDES to match file REQUIRES
rhughes@241
   253
                  when importing/merging razor sets, so if there is
rhughes@241
   254
                  already another installed package that REQUIRES this
rhughes@241
   255
                  file, there will be a PROVIDES listed for it as well.)
rhughes@241
   256
rhughes@241
   257
		- See if there's an installed package that PROVIDES
rhughes@241
   258
                  that file.
rhughes@241
   259
rhughes@241
   260
		- If not, do a binary search of the system file tree
rhughes@241
   261
                  looking to see if some installed package provides
rhughes@241
   262
                  that file but does not have a PROVIDES for it.
rhughes@241
   263
rhughes@241
   264
		- If not, see if there's an installable package that
rhughes@241
   265
		  PROVIDES that property, and create a new INSTALL rtp
rhughes@241
   266
		  for it if so.
rhughes@241
   267
rhughes@241
   268
		- (If we actually work with multiple upstream
rhughes@241
   269
                  razor_sets, then we will need to search the upstream
rhughes@241
   270
                  file trees at this point, because it's possible that
rhughes@241
   271
                  a package in one upstream repo would require a file
rhughes@241
   272
                  in another upstream repo. But if we merge the
rhughes@241
   273
                  multiple upstream repos into a single razor_set at
rhughes@241
   274
                  some point, then we would not need to do that,
rhughes@241
   275
                  because it would be guaranteed that we would have
rhughes@241
   276
                  already created a fake PROVIDES if any package
rhughes@241
   277
                  provides the file.)
rhughes@241
   278
rhughes@241
   279
		- If no installed or installable package provides the
rhughes@241
   280
		  file, see if there's a to-be-removed package that
rhughes@241
   281
		  provides the file. (If we find such a package, we
rhughes@241
   282
		  have a CONTRADICTION error.)
rhughes@241
   283
rhughes@241
   284
		- If none of the above, then we have an UNSATISFIABLE
rhughes@241
   285
                  error
rhughes@241
   286
rhughes@241
   287
	    - For each to-be-installed PROVIDES:
rhughes@241
   288
rhughes@241
   289
		- Check if the new PROVIDES conflicts with an
rhughes@241
   290
		  installed CONFLICTS. If so, create a new
rhughes@241
   291
		  FORCED_UPDATE rtp for the installed package, so we
rhughes@241
   292
		  can try to upgrade it to a non-conflicting version.
rhughes@241
   293
		  (If we can't, we'll have an OLD_CONFLICT error.)
rhughes@241
   294
rhughes@241
   295
		- Check if the new PROVIDES conflicts with an
rhughes@241
   296
                  installed OBSOLETES *and* the PROVIDES property
rhughes@241
   297
                  corresponds to the name of its package. (That is,
rhughes@241
   298
                  OBSOLETES are only matched against package names,
rhughes@241
   299
                  not arbitrary provided properties.) If so, we have
rhughes@241
   300
                  an ALREADY_OBSOLETE error.
rhughes@241
   301
rhughes@241
   302
		- Check if the new PROVIDES conflicts with a
rhughes@241
   303
		  to-be-installed CONFLICTS. If so, we have a
rhughes@241
   304
		  CONTRADICTION error.
rhughes@241
   305
rhughes@241
   306
	    - For each to-be-installed CONFLICTS:
rhughes@241
   307
rhughes@241
   308
		- Basically the reverse of the previous case: check if
rhughes@241
   309
		  the new CONFLICTS conflicts with an installed
rhughes@241
   310
		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
rhughes@241
   311
		  the installed package, so we can try to upgrade it
rhughes@241
   312
		  to a non-conflicting version. (If we can't, we'll
rhughes@241
   313
		  have an NEW_CONFLICT error.)
rhughes@241
   314
rhughes@241
   315
		- Check if the new CONFLICTS conflicts with a
rhughes@241
   316
		  to-be-installed PROVIDES. If so, we have a
rhughes@241
   317
		  CONTRADICTION error.
rhughes@241
   318
rhughes@241
   319
	    - For each to-be-installed OBSOLETES:
rhughes@241
   320
rhughes@241
   321
		- Check if there's an installed package that PROVIDES
rhughes@241
   322
		  that property. If so, create an OBSOLETED rtp for
rhughes@241
   323
		  the installed package.
rhughes@241
   324
rhughes@241
   325
		- If not, check if there's a to-be-installed package
rhughes@241
   326
		  that PROVIDES that property. If so, we have a
rhughes@241
   327
		  CONTRADICTION error.
rhughes@241
   328
rhughes@241
   329
rhughes@241
   330
	    - For each installed PROVIDES:
rhughes@241
   331
rhughes@241
   332
		- If the property is a valid package name (that is,
rhughes@241
   333
                  it's a package providing its own name), and it
rhughes@241
   334
                  matches the name of a new rtp with an unresolved
rhughes@241
   335
                  old_package, then set the rtp's old_package to point
rhughes@241
   336
                  to the package providing this property and clear the
rhughes@241
   337
                  appropriate bit in the system bit array.
rhughes@241
   338
rhughes@241
   339
	    - For each to-be-removed PROVIDES:
rhughes@241
   340
rhughes@241
   341
		- If there's also an identical to-be-installed
rhughes@241
   342
		  PROVIDES, we're ok and can skip this
rhughes@241
   343
rhughes@241
   344
		- Otherwise, for each installed REQUIRES of this
rhughes@241
   345
                  property:
rhughes@241
   346
rhughes@241
   347
		    - Look for some other installed or to-be-installed
rhughes@241
   348
		      property that satisfies the REQUIRES.
rhughes@241
   349
rhughes@241
   350
		    - If there isn't one, then for each installed
rhughes@241
   351
		      package in this REQUIRES's package list:
rhughes@241
   352
rhughes@241
   353
			- If the PROVIDES was lost because the old
rhughes@241
   354
			  package was REMOVEd (not FORCED_UPDATE or
rhughes@241
   355
			  OBSOLETED), then create a new REMOVE rtp for
rhughes@241
   356
			  this package.
rhughes@241
   357
rhughes@241
   358
			- Otherwise, create a new FORCED_UPDATE rtp
rhughes@241
   359
                          for this package.
rhughes@241
   360
rhughes@241
   361
		- (We don't need to look at to-be-installed REQUIRES
rhughes@241
   362
		  of this property, because if there are any, they
rhughes@241
   363
		  will cause a CONTRADICTION error when we try to
rhughes@241
   364
		  re-satisfy them the next time through.)