DEPSOLVE.txt
author Kristian H?gsberg <krh@jiraiya.boston.redhat.com>
Wed Apr 09 02:41:03 2008 -0400 (2008-04-09)
changeset 211 cf0ca962262b
parent 174 ac1387f6aba4
permissions -rw-r--r--
Use the cpio headers instead of the rpm headers when unpacking.

The files in the cpio payload doesn't actually follow the file order
in the rpm headers, so we need to decode the cpio header and use the
information there.
danw@174
     1
YUM vs RAZOR
danw@174
     2
------------
danw@174
     3
danw@174
     4
At a very high level, yum's depsolver does something roughly
danw@174
     5
equivalent to:
danw@174
     6
danw@174
     7
    - For each package being installed or removed
danw@174
     8
danw@174
     9
	- For each relevant property (provides, requires, conflicts,
danw@174
    10
          obsoletes):
danw@174
    11
danw@174
    12
	    - Figure out what additional packages need to be added to
danw@174
    13
	      or removed from the system to satisfy this property
danw@174
    14
danw@174
    15
which ends up being roughly O(N^2 * M) where N is the total number of
danw@174
    16
properties and M is the number of packages being acted on.
danw@174
    17
danw@174
    18
(I just figured that out off the top of my head, and I'm not totally
danw@174
    19
familiar with the yum code, so it may be wrong.)
danw@174
    20
danw@174
    21
Razor's depsolver is something like:
danw@174
    22
danw@174
    23
    - do {
danw@174
    24
danw@174
    25
	- For each property to be added to or removed from the system:
danw@174
    26
danw@174
    27
	    - Figure out what packages need to be added to or removed
danw@174
    28
	      from the system to satisfy this property
danw@174
    29
danw@174
    30
    - } until we stop adding/remove more packages
danw@174
    31
danw@174
    32
with the key being that it's very easy to find the PROVIDES
danw@174
    33
corresponding to a REQUIRES and vice versa, because the property
danw@174
    34
arrays are sorted, and so all properties with the same "name" will be
danw@174
    35
adjacent to one another in the array, allowing many dependencies to be
danw@174
    36
satisified in essentially constant time. (Actually... we've been
danw@174
    37
calling it constant, but it's really O(log N) for heavily-depended-on
danw@174
    38
packages, because the more packages you have, the more variations on
danw@174
    39
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
danw@174
    40
going to have to scan through.)
danw@174
    41
danw@174
    42
Ideally though, each iteration of the inner loop body happens in
danw@174
    43
constant time, and thus the inner loop as a whole is O(N), and thus
danw@174
    44
the depsolver as a whole is O(N * M) (or at least, less than
danw@174
    45
O(N * M * log N).
danw@174
    46
danw@174
    47
danw@174
    48
FILE DEPENDENCIES
danw@174
    49
-----------------
danw@174
    50
danw@174
    51
Whenever we add a package with a file REQUIRES to a razor_set, we also
danw@174
    52
add a PROVIDES for that file to the package or packages which provide
danw@174
    53
that file. This means that if we later add another package that
danw@174
    54
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
danw@174
    55
its file requirement exactly like we would resolve a property
danw@174
    56
requirement, in nearly constant time.
danw@174
    57
danw@174
    58
When adding a *new* file requirement (ie, a requirement on a file that
danw@174
    59
no existing package depends on), we still have to scan through the
danw@174
    60
file tree, which is O(log N) in the number of files.
danw@174
    61
danw@174
    62
(AFAICT, there's no reason yum couldn't do the same optimization.
danw@174
    63
Also, AFAICT, yum currently sticks property dependencies and file
danw@174
    64
dependencies into the same hash table, so that if any package in the
danw@174
    65
transaction has a file dependency, it causes *property* dependencies
danw@174
    66
to become slower to resolve as well...)
danw@174
    67
danw@174
    68
danw@179
    69
THE RULES
danw@179
    70
---------
danw@179
    71
danw@179
    72
This is what we have figured out for transaction-solving rules;
danw@179
    73
neither yum nor rpm's algorithm seems to be explained in full
danw@179
    74
anywhere...
danw@179
    75
danw@179
    76
    1. Every requested install in the initial package set must be
danw@179
    77
       satisfied as either a new install or an update:
danw@179
    78
danw@179
    79
	- if the requested package name is the name of an upstream
danw@179
    80
          package:
danw@179
    81
danw@179
    82
	    - if there is not a corresponding already-installed
danw@179
    83
              package, then install the upstream package
danw@179
    84
danw@179
    85
	    - else if the upstream package is newer than the
danw@179
    86
              already-installed package, then update the package
danw@179
    87
danw@179
    88
	    - else it's an error (UP_TO_DATE)
danw@179
    89
danw@179
    90
	- else if the requested package name is the name of an
danw@179
    91
          already-installed package:
danw@179
    92
danw@179
    93
	    - if there is an upstream package that obsoletes the
danw@179
    94
              already-installed package, then behave as though the
danw@179
    95
              user had requested that that package be installed
danw@179
    96
              instead.
danw@179
    97
danw@179
    98
	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
danw@179
    99
danw@179
   100
	- else it's an error (INSTALL_UNAVAILABLE)
danw@179
   101
danw@179
   102
    2. Every requested removal in the initial package set must be
danw@179
   103
       satisfied as a removal. If any requested package name is not
danw@179
   104
       the name of an installed package, it's an error
danw@179
   105
       (REMOVE_NOT_INSTALLED)
danw@179
   106
danw@179
   107
    REQUIRES processing:
danw@179
   108
danw@179
   109
    3. If a package being installed or updated-to REQUIRES a property
danw@179
   110
       that is not provided by any installed or to-be-installed
danw@179
   111
       package, we need to find an installable package that provides
danw@179
   112
       that property. If we find one, install/update it. If not, it's
danw@179
   113
       an error (UNSATISFIABLE). (If we find an upstream package
danw@179
   114
       providing the property that corresponds to a system package
danw@179
   115
       that's being removed, then it's a CONTRADICTION.)
danw@179
   116
danw@179
   117
    4. If an already-installed package REQUIRES a property which is
danw@179
   118
       only provided by a package that is being removed, then that
danw@179
   119
       package needs to be removed as well.
danw@179
   120
danw@179
   121
    5. If an already-installed package REQUIRES a property which is
danw@179
   122
       only provided by a package that is being upgraded or obsoleted
danw@179
   123
       (to a new package which does not provide that property), then:
danw@179
   124
danw@179
   125
	- if there is an update for the installed package, then update
danw@179
   126
          the installed package
danw@179
   127
danw@179
   128
	- else if there is another installable package that provides
danw@179
   129
          the required property, then install that.
danw@179
   130
danw@179
   131
	- else it's an error (UNSATISFIABLE?)
danw@179
   132
danw@179
   133
    CONFLICTS processing
danw@179
   134
danw@179
   135
    6. If a package being installed or updated-to CONFLICTS with a
danw@179
   136
       property provided by an installed package:
danw@179
   137
danw@179
   138
	- if there is an update for the installed package, which the
danw@179
   139
          new package does not conflict with, then update the
danw@179
   140
          installed package.
danw@179
   141
danw@179
   142
	- else it's an error (NEW_CONFLICT)
danw@179
   143
danw@179
   144
    7. If an already-installed package CONFLICTS with a property
danw@179
   145
       provided by a to-be-installed package:
danw@179
   146
danw@179
   147
	- if there is an update for the installed package, which does
danw@179
   148
          not conflict with the new package, then update the installed
danw@179
   149
          package.
danw@179
   150
danw@179
   151
	- else it's an error (NEW_CONFLICT)
danw@179
   152
danw@179
   153
    8. If a package being installed or updated-to CONFLICTS with a
danw@179
   154
       property provided by a to-be-installed package, then it's an
danw@179
   155
       error (CONTRADICTION).
danw@179
   156
danw@179
   157
    OBSOLETES processing. NOTE: OBSOLETES are only matched against
danw@179
   158
    package names, not against arbitrary provided properties
danw@179
   159
danw@179
   160
    9. If a package being installed or updated-to OBSOLETES an
danw@179
   161
       installed package, then obsolete that package. (ie, remove it,
danw@179
   162
       but treat it as updated for purposes of dangling REQUIRES).
danw@179
   163
danw@179
   164
   10. If an already-installed package OBSOLETES a to-be-installed
danw@179
   165
       package, then it's an error. (ALREADY_OBSOLETE)
danw@179
   166
danw@179
   167
   11. If a package being installed or updated-to OBSOLETES another
danw@179
   168
       package being installed or updated-to, then it's an error
danw@179
   169
       (CONTRADICTION).
danw@179
   170
danw@179
   171
danw@179
   172
danw@174
   173
THE DEPSOLVER
danw@174
   174
-------------
danw@174
   175
danw@157
   176
We start with two razor_sets, system and upstream, and a list of
danw@157
   177
requested installations and removals.
danw@157
   178
danw@157
   179
    FIXME: what about multiple upstream repos? Having to deal with
danw@157
   180
    arbitrary numbers of razor_sets is possible, but will probably be
danw@157
   181
    messy... It might be easier to either store all upstream repo data
danw@157
   182
    in a single .repo file, or else merge all upstream .repo files
danw@157
   183
    together into a single razor_set at startup. (Or some combination
danw@157
   184
    of those.)
danw@157
   185
danw@157
   186
We create a bit array of the packages in each set, indicating which
danw@157
   187
ones are installed; the system bitarray starts out all 1s, and the
danw@157
   188
upstream bitarray all 0s. Each bit is only allowed to change state
danw@157
   189
once during the transaction; an installed package can be removed, or
danw@157
   190
an uninstalled package installed, but trying to reinstall a removed
danw@157
   191
package, or uninstall a newly-installed package is an error. This
danw@157
   192
means the packages break down into four categories:
danw@157
   193
danw@157
   194
    - installed       (1 bit in the system bit array)
danw@157
   195
    - to-be-removed   (0 bit in the system bit array)
danw@157
   196
    - to-be-installed (1 bit in the upstream bit array)
danw@157
   197
    - installable     (0 bit in the upstream bit array)
danw@157
   198
danw@157
   199
danw@174
   200
Depsolver algorithm:
danw@157
   201
danw@158
   202
    - Create new razor_transaction_packages ("rtp"s) for each
danw@161
   203
      requested install or remove. These will be "unresolved", because
danw@161
   204
      we haven't yet found the razor_packages that correspond to them.
danw@157
   205
danw@158
   206
    - while there are new rtps:
danw@157
   207
danw@161
   208
	- sort the new rtps
danw@157
   209
danw@161
   210
	- Walk the system property list, upstream property list, and
danw@161
   211
          new rtp list in parallel, and:
danw@157
   212
danw@161
   213
	    - For each uninstalled PROVIDES:
danw@161
   214
danw@161
   215
		- If the property is a valid package name (that is,
danw@161
   216
                  either it's a package providing its own name, or it
danw@161
   217
                  has a matching OBSOLETES), and it matches the name
danw@161
   218
                  of a new rtp of type INSTALL or FORCED_UPDATE with
danw@174
   219
                  an unresolved new_package:
danw@174
   220
danw@174
   221
		    - If the upstream package has the same version as
danw@174
   222
		      the system package, we have an UP_TO_DATE error
danw@174
   223
		      (FIXME: not quite right. This doesn't deal with
danw@174
   224
		      the case where we try to update an application
danw@174
   225
		      because of a library update, and it turns out
danw@174
   226
		      there's no new version of the application, but
danw@174
   227
		      there IS a compat package containing the old
danw@174
   228
		      version of the library.)
danw@174
   229
danw@174
   230
		    - Otherwise, set the rtp's new_package to point to
danw@174
   231
		      the package providing this property and set the
danw@174
   232
		      appropriate bit in the upstream bit array.
danw@157
   233
danw@158
   234
	    - For each to-be-installed non-file REQUIRES:
danw@157
   235
danw@158
   236
		- See if there's an installed or to-be-installed
danw@158
   237
		  package that PROVIDES that property.
danw@157
   238
danw@158
   239
		- If not, see if there's an installable package that
danw@158
   240
		  PROVIDES that property, and create a new INSTALL rtp
danw@158
   241
		  for it if so.
danw@157
   242
danw@158
   243
		- If not, see if there's a to-be-removed package that
danw@158
   244
		  PROVIDES that property. (If we find such a package,
danw@158
   245
		  we have a CONTRADICTION error.)
danw@157
   246
danw@158
   247
		- If none of the above, then we have an UNSATISFIABLE
danw@158
   248
                  error
danw@157
   249
danw@158
   250
	    - For each to-be-installed file REQUIRES:
danw@157
   251
danw@158
   252
		- (We create fake file PROVIDES to match file REQUIRES
danw@158
   253
                  when importing/merging razor sets, so if there is
danw@158
   254
                  already another installed package that REQUIRES this
danw@158
   255
                  file, there will be a PROVIDES listed for it as well.)
danw@157
   256
danw@158
   257
		- See if there's an installed package that PROVIDES
danw@158
   258
                  that file.
danw@157
   259
danw@158
   260
		- If not, do a binary search of the system file tree
danw@158
   261
                  looking to see if some installed package provides
danw@158
   262
                  that file but does not have a PROVIDES for it.
danw@157
   263
danw@158
   264
		- If not, see if there's an installable package that
danw@158
   265
		  PROVIDES that property, and create a new INSTALL rtp
danw@158
   266
		  for it if so.
danw@157
   267
danw@158
   268
		- (If we actually work with multiple upstream
danw@158
   269
                  razor_sets, then we will need to search the upstream
danw@158
   270
                  file trees at this point, because it's possible that
danw@158
   271
                  a package in one upstream repo would require a file
danw@158
   272
                  in another upstream repo. But if we merge the
danw@158
   273
                  multiple upstream repos into a single razor_set at
danw@158
   274
                  some point, then we would not need to do that,
danw@158
   275
                  because it would be guaranteed that we would have
danw@158
   276
                  already created a fake PROVIDES if any package
danw@158
   277
                  provides the file.)
danw@157
   278
danw@158
   279
		- If no installed or installable package provides the
danw@158
   280
		  file, see if there's a to-be-removed package that
danw@158
   281
		  provides the file. (If we find such a package, we
danw@158
   282
		  have a CONTRADICTION error.)
danw@157
   283
danw@158
   284
		- If none of the above, then we have an UNSATISFIABLE
danw@158
   285
                  error
danw@157
   286
danw@158
   287
	    - For each to-be-installed PROVIDES:
danw@157
   288
danw@158
   289
		- Check if the new PROVIDES conflicts with an
danw@158
   290
		  installed CONFLICTS. If so, create a new
danw@158
   291
		  FORCED_UPDATE rtp for the installed package, so we
danw@158
   292
		  can try to upgrade it to a non-conflicting version.
danw@158
   293
		  (If we can't, we'll have an OLD_CONFLICT error.)
danw@157
   294
danw@158
   295
		- Check if the new PROVIDES conflicts with an
danw@158
   296
                  installed OBSOLETES *and* the PROVIDES property
danw@158
   297
                  corresponds to the name of its package. (That is,
danw@158
   298
                  OBSOLETES are only matched against package names,
danw@158
   299
                  not arbitrary provided properties.) If so, we have
danw@158
   300
                  an ALREADY_OBSOLETE error.
danw@157
   301
danw@158
   302
		- Check if the new PROVIDES conflicts with a
danw@158
   303
		  to-be-installed CONFLICTS. If so, we have a
danw@158
   304
		  CONTRADICTION error.
danw@157
   305
danw@158
   306
	    - For each to-be-installed CONFLICTS:
danw@157
   307
danw@158
   308
		- Basically the reverse of the previous case: check if
danw@158
   309
		  the new CONFLICTS conflicts with an installed
danw@158
   310
		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
danw@158
   311
		  the installed package, so we can try to upgrade it
danw@158
   312
		  to a non-conflicting version. (If we can't, we'll
danw@158
   313
		  have an NEW_CONFLICT error.)
danw@157
   314
danw@158
   315
		- Check if the new CONFLICTS conflicts with a
danw@158
   316
		  to-be-installed PROVIDES. If so, we have a
danw@158
   317
		  CONTRADICTION error.
danw@157
   318
danw@158
   319
	    - For each to-be-installed OBSOLETES:
danw@157
   320
danw@158
   321
		- Check if there's an installed package that PROVIDES
danw@158
   322
		  that property. If so, create an OBSOLETED rtp for
danw@158
   323
		  the installed package.
danw@157
   324
danw@158
   325
		- If not, check if there's a to-be-installed package
danw@158
   326
		  that PROVIDES that property. If so, we have a
danw@158
   327
		  CONTRADICTION error.
danw@157
   328
danw@161
   329
danw@161
   330
	    - For each installed PROVIDES:
danw@161
   331
danw@161
   332
		- If the property is a valid package name (that is,
danw@161
   333
                  it's a package providing its own name), and it
danw@161
   334
                  matches the name of a new rtp with an unresolved
danw@161
   335
                  old_package, then set the rtp's old_package to point
danw@161
   336
                  to the package providing this property and clear the
danw@161
   337
                  appropriate bit in the system bit array.
danw@161
   338
danw@158
   339
	    - For each to-be-removed PROVIDES:
danw@157
   340
danw@158
   341
		- If there's also an identical to-be-installed
danw@158
   342
		  PROVIDES, we're ok and can skip this
danw@157
   343
danw@158
   344
		- Otherwise, for each installed REQUIRES of this
danw@158
   345
                  property:
danw@157
   346
danw@158
   347
		    - Look for some other installed or to-be-installed
danw@158
   348
		      property that satisfies the REQUIRES.
danw@157
   349
danw@158
   350
		    - If there isn't one, then for each installed
danw@158
   351
		      package in this REQUIRES's package list:
danw@157
   352
danw@158
   353
			- If the PROVIDES was lost because the old
danw@158
   354
			  package was REMOVEd (not FORCED_UPDATE or
danw@158
   355
			  OBSOLETED), then create a new REMOVE rtp for
danw@158
   356
			  this package.
danw@157
   357
danw@158
   358
			- Otherwise, create a new FORCED_UPDATE rtp
danw@158
   359
                          for this package.
danw@157
   360
danw@158
   361
		- (We don't need to look at to-be-installed REQUIRES
danw@158
   362
		  of this property, because if there are any, they
danw@158
   363
		  will cause a CONTRADICTION error when we try to
danw@158
   364
		  re-satisfy them the next time through.)