docs/DEPSOLVE.txt
author Kristian H?gsberg <krh@redhat.com>
Fri Jun 20 21:56:43 2008 -0400 (2008-06-20)
changeset 254 ccb1c11968ab
child 310 9a7691262ce6
permissions -rw-r--r--
Introduce install/remove iterators.

These iterator constructors lets you pass in two sets and creates
an iterator for the packages to remove or the packages to install.
The iterators will step through the packages in a sequence that respects
the pre, post, preun and postun modifiers.

Right now, the install order isn't actually implemented, this patch just
implements the API changes and updates the applications.
rhughes@241
     1
YUM vs RAZOR
rhughes@241
     2
------------
rhughes@241
     3
rhughes@241
     4
At a very high level, yum's depsolver does something roughly
rhughes@241
     5
equivalent to:
rhughes@241
     6
rhughes@241
     7
    - For each package being installed or removed
rhughes@241
     8
rhughes@241
     9
	- For each relevant property (provides, requires, conflicts,
rhughes@241
    10
          obsoletes):
rhughes@241
    11
rhughes@241
    12
	    - Figure out what additional packages need to be added to
rhughes@241
    13
	      or removed from the system to satisfy this property
rhughes@241
    14
rhughes@241
    15
which ends up being roughly O(N^2 * M) where N is the total number of
rhughes@241
    16
properties and M is the number of packages being acted on.
rhughes@241
    17
rhughes@241
    18
(I just figured that out off the top of my head, and I'm not totally
rhughes@241
    19
familiar with the yum code, so it may be wrong.)
rhughes@241
    20
rhughes@241
    21
Razor's depsolver is something like:
rhughes@241
    22
rhughes@241
    23
    - do {
rhughes@241
    24
rhughes@241
    25
	- For each property to be added to or removed from the system:
rhughes@241
    26
rhughes@241
    27
	    - Figure out what packages need to be added to or removed
rhughes@241
    28
	      from the system to satisfy this property
rhughes@241
    29
rhughes@241
    30
    - } until we stop adding/remove more packages
rhughes@241
    31
rhughes@241
    32
with the key being that it's very easy to find the PROVIDES
rhughes@241
    33
corresponding to a REQUIRES and vice versa, because the property
rhughes@241
    34
arrays are sorted, and so all properties with the same "name" will be
rhughes@241
    35
adjacent to one another in the array, allowing many dependencies to be
rhughes@241
    36
satisified in essentially constant time. (Actually... we've been
rhughes@241
    37
calling it constant, but it's really O(log N) for heavily-depended-on
rhughes@241
    38
packages, because the more packages you have, the more variations on
rhughes@241
    39
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
rhughes@241
    40
going to have to scan through.)
rhughes@241
    41
rhughes@241
    42
Ideally though, each iteration of the inner loop body happens in
rhughes@241
    43
constant time, and thus the inner loop as a whole is O(N), and thus
rhughes@241
    44
the depsolver as a whole is O(N * M) (or at least, less than
rhughes@241
    45
O(N * M * log N).
rhughes@241
    46
rhughes@241
    47
rhughes@241
    48
FILE DEPENDENCIES
rhughes@241
    49
-----------------
rhughes@241
    50
rhughes@241
    51
Whenever we add a package with a file REQUIRES to a razor_set, we also
rhughes@241
    52
add a PROVIDES for that file to the package or packages which provide
rhughes@241
    53
that file. This means that if we later add another package that
rhughes@241
    54
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
rhughes@241
    55
its file requirement exactly like we would resolve a property
rhughes@241
    56
requirement, in nearly constant time.
rhughes@241
    57
rhughes@241
    58
When adding a *new* file requirement (ie, a requirement on a file that
rhughes@241
    59
no existing package depends on), we still have to scan through the
rhughes@241
    60
file tree, which is O(log N) in the number of files.
rhughes@241
    61
rhughes@241
    62
(AFAICT, there's no reason yum couldn't do the same optimization.
rhughes@241
    63
Also, AFAICT, yum currently sticks property dependencies and file
rhughes@241
    64
dependencies into the same hash table, so that if any package in the
rhughes@241
    65
transaction has a file dependency, it causes *property* dependencies
rhughes@241
    66
to become slower to resolve as well...)
rhughes@241
    67
rhughes@241
    68
rhughes@241
    69
THE RULES
rhughes@241
    70
---------
rhughes@241
    71
rhughes@241
    72
This is what we have figured out for transaction-solving rules;
rhughes@241
    73
neither yum nor rpm's algorithm seems to be explained in full
rhughes@241
    74
anywhere...
rhughes@241
    75
rhughes@241
    76
    1. Every requested install in the initial package set must be
rhughes@241
    77
       satisfied as either a new install or an update:
rhughes@241
    78
rhughes@241
    79
	- if the requested package name is the name of an upstream
rhughes@241
    80
          package:
rhughes@241
    81
rhughes@241
    82
	    - if there is not a corresponding already-installed
rhughes@241
    83
              package, then install the upstream package
rhughes@241
    84
rhughes@241
    85
	    - else if the upstream package is newer than the
rhughes@241
    86
              already-installed package, then update the package
rhughes@241
    87
rhughes@241
    88
	    - else it's an error (UP_TO_DATE)
rhughes@241
    89
rhughes@241
    90
	- else if the requested package name is the name of an
rhughes@241
    91
          already-installed package:
rhughes@241
    92
rhughes@241
    93
	    - if there is an upstream package that obsoletes the
rhughes@241
    94
              already-installed package, then behave as though the
rhughes@241
    95
              user had requested that that package be installed
rhughes@241
    96
              instead.
rhughes@241
    97
rhughes@241
    98
	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
rhughes@241
    99
rhughes@241
   100
	- else it's an error (INSTALL_UNAVAILABLE)
rhughes@241
   101
rhughes@241
   102
    2. Every requested removal in the initial package set must be
rhughes@241
   103
       satisfied as a removal. If any requested package name is not
rhughes@241
   104
       the name of an installed package, it's an error
rhughes@241
   105
       (REMOVE_NOT_INSTALLED)
rhughes@241
   106
rhughes@241
   107
    REQUIRES processing:
rhughes@241
   108
rhughes@241
   109
    3. If a package being installed or updated-to REQUIRES a property
rhughes@241
   110
       that is not provided by any installed or to-be-installed
rhughes@241
   111
       package, we need to find an installable package that provides
rhughes@241
   112
       that property. If we find one, install/update it. If not, it's
rhughes@241
   113
       an error (UNSATISFIABLE). (If we find an upstream package
rhughes@241
   114
       providing the property that corresponds to a system package
rhughes@241
   115
       that's being removed, then it's a CONTRADICTION.)
rhughes@241
   116
rhughes@241
   117
    4. If an already-installed package REQUIRES a property which is
rhughes@241
   118
       only provided by a package that is being removed, then that
rhughes@241
   119
       package needs to be removed as well.
rhughes@241
   120
rhughes@241
   121
    5. If an already-installed package REQUIRES a property which is
rhughes@241
   122
       only provided by a package that is being upgraded or obsoleted
rhughes@241
   123
       (to a new package which does not provide that property), then:
rhughes@241
   124
rhughes@241
   125
	- if there is an update for the installed package, then update
rhughes@241
   126
          the installed package
rhughes@241
   127
rhughes@241
   128
	- else if there is another installable package that provides
rhughes@241
   129
          the required property, then install that.
rhughes@241
   130
rhughes@241
   131
	- else it's an error (UNSATISFIABLE?)
rhughes@241
   132
rhughes@241
   133
    CONFLICTS processing
rhughes@241
   134
rhughes@241
   135
    6. If a package being installed or updated-to CONFLICTS with a
rhughes@241
   136
       property provided by an installed package:
rhughes@241
   137
rhughes@241
   138
	- if there is an update for the installed package, which the
rhughes@241
   139
          new package does not conflict with, then update the
rhughes@241
   140
          installed package.
rhughes@241
   141
rhughes@241
   142
	- else it's an error (NEW_CONFLICT)
rhughes@241
   143
rhughes@241
   144
    7. If an already-installed package CONFLICTS with a property
rhughes@241
   145
       provided by a to-be-installed package:
rhughes@241
   146
rhughes@241
   147
	- if there is an update for the installed package, which does
rhughes@241
   148
          not conflict with the new package, then update the installed
rhughes@241
   149
          package.
rhughes@241
   150
rhughes@241
   151
	- else it's an error (NEW_CONFLICT)
rhughes@241
   152
rhughes@241
   153
    8. If a package being installed or updated-to CONFLICTS with a
rhughes@241
   154
       property provided by a to-be-installed package, then it's an
rhughes@241
   155
       error (CONTRADICTION).
rhughes@241
   156
rhughes@241
   157
    OBSOLETES processing. NOTE: OBSOLETES are only matched against
rhughes@241
   158
    package names, not against arbitrary provided properties
rhughes@241
   159
rhughes@241
   160
    9. If a package being installed or updated-to OBSOLETES an
rhughes@241
   161
       installed package, then obsolete that package. (ie, remove it,
rhughes@241
   162
       but treat it as updated for purposes of dangling REQUIRES).
rhughes@241
   163
rhughes@241
   164
   10. If an already-installed package OBSOLETES a to-be-installed
rhughes@241
   165
       package, then it's an error. (ALREADY_OBSOLETE)
rhughes@241
   166
rhughes@241
   167
   11. If a package being installed or updated-to OBSOLETES another
rhughes@241
   168
       package being installed or updated-to, then it's an error
rhughes@241
   169
       (CONTRADICTION).
rhughes@241
   170
rhughes@241
   171
rhughes@241
   172
rhughes@241
   173
THE DEPSOLVER
rhughes@241
   174
-------------
rhughes@241
   175
rhughes@241
   176
We start with two razor_sets, system and upstream, and a list of
rhughes@241
   177
requested installations and removals.
rhughes@241
   178
rhughes@241
   179
    FIXME: what about multiple upstream repos? Having to deal with
rhughes@241
   180
    arbitrary numbers of razor_sets is possible, but will probably be
rhughes@241
   181
    messy... It might be easier to either store all upstream repo data
rhughes@241
   182
    in a single .repo file, or else merge all upstream .repo files
rhughes@241
   183
    together into a single razor_set at startup. (Or some combination
rhughes@241
   184
    of those.)
rhughes@241
   185
rhughes@241
   186
We create a bit array of the packages in each set, indicating which
rhughes@241
   187
ones are installed; the system bitarray starts out all 1s, and the
rhughes@241
   188
upstream bitarray all 0s. Each bit is only allowed to change state
rhughes@241
   189
once during the transaction; an installed package can be removed, or
rhughes@241
   190
an uninstalled package installed, but trying to reinstall a removed
rhughes@241
   191
package, or uninstall a newly-installed package is an error. This
rhughes@241
   192
means the packages break down into four categories:
rhughes@241
   193
rhughes@241
   194
    - installed       (1 bit in the system bit array)
rhughes@241
   195
    - to-be-removed   (0 bit in the system bit array)
rhughes@241
   196
    - to-be-installed (1 bit in the upstream bit array)
rhughes@241
   197
    - installable     (0 bit in the upstream bit array)
rhughes@241
   198
rhughes@241
   199
rhughes@241
   200
Depsolver algorithm:
rhughes@241
   201
rhughes@241
   202
    - Create new razor_transaction_packages ("rtp"s) for each
rhughes@241
   203
      requested install or remove. These will be "unresolved", because
rhughes@241
   204
      we haven't yet found the razor_packages that correspond to them.
rhughes@241
   205
rhughes@241
   206
    - while there are new rtps:
rhughes@241
   207
rhughes@241
   208
	- sort the new rtps
rhughes@241
   209
rhughes@241
   210
	- Walk the system property list, upstream property list, and
rhughes@241
   211
          new rtp list in parallel, and:
rhughes@241
   212
rhughes@241
   213
	    - For each uninstalled PROVIDES:
rhughes@241
   214
rhughes@241
   215
		- If the property is a valid package name (that is,
rhughes@241
   216
                  either it's a package providing its own name, or it
rhughes@241
   217
                  has a matching OBSOLETES), and it matches the name
rhughes@241
   218
                  of a new rtp of type INSTALL or FORCED_UPDATE with
rhughes@241
   219
                  an unresolved new_package:
rhughes@241
   220
rhughes@241
   221
		    - If the upstream package has the same version as
rhughes@241
   222
		      the system package, we have an UP_TO_DATE error
rhughes@241
   223
		      (FIXME: not quite right. This doesn't deal with
rhughes@241
   224
		      the case where we try to update an application
rhughes@241
   225
		      because of a library update, and it turns out
rhughes@241
   226
		      there's no new version of the application, but
rhughes@241
   227
		      there IS a compat package containing the old
rhughes@241
   228
		      version of the library.)
rhughes@241
   229
rhughes@241
   230
		    - Otherwise, set the rtp's new_package to point to
rhughes@241
   231
		      the package providing this property and set the
rhughes@241
   232
		      appropriate bit in the upstream bit array.
rhughes@241
   233
rhughes@241
   234
	    - For each to-be-installed non-file REQUIRES:
rhughes@241
   235
rhughes@241
   236
		- See if there's an installed or to-be-installed
rhughes@241
   237
		  package that PROVIDES that property.
rhughes@241
   238
rhughes@241
   239
		- If not, see if there's an installable package that
rhughes@241
   240
		  PROVIDES that property, and create a new INSTALL rtp
rhughes@241
   241
		  for it if so.
rhughes@241
   242
rhughes@241
   243
		- If not, see if there's a to-be-removed package that
rhughes@241
   244
		  PROVIDES that property. (If we find such a package,
rhughes@241
   245
		  we have a CONTRADICTION error.)
rhughes@241
   246
rhughes@241
   247
		- If none of the above, then we have an UNSATISFIABLE
rhughes@241
   248
                  error
rhughes@241
   249
rhughes@241
   250
	    - For each to-be-installed file REQUIRES:
rhughes@241
   251
rhughes@241
   252
		- (We create fake file PROVIDES to match file REQUIRES
rhughes@241
   253
                  when importing/merging razor sets, so if there is
rhughes@241
   254
                  already another installed package that REQUIRES this
rhughes@241
   255
                  file, there will be a PROVIDES listed for it as well.)
rhughes@241
   256
rhughes@241
   257
		- See if there's an installed package that PROVIDES
rhughes@241
   258
                  that file.
rhughes@241
   259
rhughes@241
   260
		- If not, do a binary search of the system file tree
rhughes@241
   261
                  looking to see if some installed package provides
rhughes@241
   262
                  that file but does not have a PROVIDES for it.
rhughes@241
   263
rhughes@241
   264
		- If not, see if there's an installable package that
rhughes@241
   265
		  PROVIDES that property, and create a new INSTALL rtp
rhughes@241
   266
		  for it if so.
rhughes@241
   267
rhughes@241
   268
		- (If we actually work with multiple upstream
rhughes@241
   269
                  razor_sets, then we will need to search the upstream
rhughes@241
   270
                  file trees at this point, because it's possible that
rhughes@241
   271
                  a package in one upstream repo would require a file
rhughes@241
   272
                  in another upstream repo. But if we merge the
rhughes@241
   273
                  multiple upstream repos into a single razor_set at
rhughes@241
   274
                  some point, then we would not need to do that,
rhughes@241
   275
                  because it would be guaranteed that we would have
rhughes@241
   276
                  already created a fake PROVIDES if any package
rhughes@241
   277
                  provides the file.)
rhughes@241
   278
rhughes@241
   279
		- If no installed or installable package provides the
rhughes@241
   280
		  file, see if there's a to-be-removed package that
rhughes@241
   281
		  provides the file. (If we find such a package, we
rhughes@241
   282
		  have a CONTRADICTION error.)
rhughes@241
   283
rhughes@241
   284
		- If none of the above, then we have an UNSATISFIABLE
rhughes@241
   285
                  error
rhughes@241
   286
rhughes@241
   287
	    - For each to-be-installed PROVIDES:
rhughes@241
   288
rhughes@241
   289
		- Check if the new PROVIDES conflicts with an
rhughes@241
   290
		  installed CONFLICTS. If so, create a new
rhughes@241
   291
		  FORCED_UPDATE rtp for the installed package, so we
rhughes@241
   292
		  can try to upgrade it to a non-conflicting version.
rhughes@241
   293
		  (If we can't, we'll have an OLD_CONFLICT error.)
rhughes@241
   294
rhughes@241
   295
		- Check if the new PROVIDES conflicts with an
rhughes@241
   296
                  installed OBSOLETES *and* the PROVIDES property
rhughes@241
   297
                  corresponds to the name of its package. (That is,
rhughes@241
   298
                  OBSOLETES are only matched against package names,
rhughes@241
   299
                  not arbitrary provided properties.) If so, we have
rhughes@241
   300
                  an ALREADY_OBSOLETE error.
rhughes@241
   301
rhughes@241
   302
		- Check if the new PROVIDES conflicts with a
rhughes@241
   303
		  to-be-installed CONFLICTS. If so, we have a
rhughes@241
   304
		  CONTRADICTION error.
rhughes@241
   305
rhughes@241
   306
	    - For each to-be-installed CONFLICTS:
rhughes@241
   307
rhughes@241
   308
		- Basically the reverse of the previous case: check if
rhughes@241
   309
		  the new CONFLICTS conflicts with an installed
rhughes@241
   310
		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
rhughes@241
   311
		  the installed package, so we can try to upgrade it
rhughes@241
   312
		  to a non-conflicting version. (If we can't, we'll
rhughes@241
   313
		  have an NEW_CONFLICT error.)
rhughes@241
   314
rhughes@241
   315
		- Check if the new CONFLICTS conflicts with a
rhughes@241
   316
		  to-be-installed PROVIDES. If so, we have a
rhughes@241
   317
		  CONTRADICTION error.
rhughes@241
   318
rhughes@241
   319
	    - For each to-be-installed OBSOLETES:
rhughes@241
   320
rhughes@241
   321
		- Check if there's an installed package that PROVIDES
rhughes@241
   322
		  that property. If so, create an OBSOLETED rtp for
rhughes@241
   323
		  the installed package.
rhughes@241
   324
rhughes@241
   325
		- If not, check if there's a to-be-installed package
rhughes@241
   326
		  that PROVIDES that property. If so, we have a
rhughes@241
   327
		  CONTRADICTION error.
rhughes@241
   328
rhughes@241
   329
rhughes@241
   330
	    - For each installed PROVIDES:
rhughes@241
   331
rhughes@241
   332
		- If the property is a valid package name (that is,
rhughes@241
   333
                  it's a package providing its own name), and it
rhughes@241
   334
                  matches the name of a new rtp with an unresolved
rhughes@241
   335
                  old_package, then set the rtp's old_package to point
rhughes@241
   336
                  to the package providing this property and clear the
rhughes@241
   337
                  appropriate bit in the system bit array.
rhughes@241
   338
rhughes@241
   339
	    - For each to-be-removed PROVIDES:
rhughes@241
   340
rhughes@241
   341
		- If there's also an identical to-be-installed
rhughes@241
   342
		  PROVIDES, we're ok and can skip this
rhughes@241
   343
rhughes@241
   344
		- Otherwise, for each installed REQUIRES of this
rhughes@241
   345
                  property:
rhughes@241
   346
rhughes@241
   347
		    - Look for some other installed or to-be-installed
rhughes@241
   348
		      property that satisfies the REQUIRES.
rhughes@241
   349
rhughes@241
   350
		    - If there isn't one, then for each installed
rhughes@241
   351
		      package in this REQUIRES's package list:
rhughes@241
   352
rhughes@241
   353
			- If the PROVIDES was lost because the old
rhughes@241
   354
			  package was REMOVEd (not FORCED_UPDATE or
rhughes@241
   355
			  OBSOLETED), then create a new REMOVE rtp for
rhughes@241
   356
			  this package.
rhughes@241
   357
rhughes@241
   358
			- Otherwise, create a new FORCED_UPDATE rtp
rhughes@241
   359
                          for this package.
rhughes@241
   360
rhughes@241
   361
		- (We don't need to look at to-be-installed REQUIRES
rhughes@241
   362
		  of this property, because if there are any, they
rhughes@241
   363
		  will cause a CONTRADICTION error when we try to
rhughes@241
   364
		  re-satisfy them the next time through.)