DEPSOLVE.txt
author Dan Winship <danw@gnome.org>
Thu Mar 13 08:29:05 2008 -0400 (2008-03-13)
changeset 178 1efd1945b81a
parent 161 7762db2848bf
child 179 a6dfb81e3484
permissions -rw-r--r--
Merge branch 'master' of git://people.freedesktop.org/~krh/razor
danw@174
     1
YUM vs RAZOR
danw@174
     2
------------
danw@174
     3
danw@174
     4
At a very high level, yum's depsolver does something roughly
danw@174
     5
equivalent to:
danw@174
     6
danw@174
     7
    - For each package being installed or removed
danw@174
     8
danw@174
     9
	- For each relevant property (provides, requires, conflicts,
danw@174
    10
          obsoletes):
danw@174
    11
danw@174
    12
	    - Figure out what additional packages need to be added to
danw@174
    13
	      or removed from the system to satisfy this property
danw@174
    14
danw@174
    15
which ends up being roughly O(N^2 * M) where N is the total number of
danw@174
    16
properties and M is the number of packages being acted on.
danw@174
    17
danw@174
    18
(I just figured that out off the top of my head, and I'm not totally
danw@174
    19
familiar with the yum code, so it may be wrong.)
danw@174
    20
danw@174
    21
Razor's depsolver is something like:
danw@174
    22
danw@174
    23
    - do {
danw@174
    24
danw@174
    25
	- For each property to be added to or removed from the system:
danw@174
    26
danw@174
    27
	    - Figure out what packages need to be added to or removed
danw@174
    28
	      from the system to satisfy this property
danw@174
    29
danw@174
    30
    - } until we stop adding/remove more packages
danw@174
    31
danw@174
    32
with the key being that it's very easy to find the PROVIDES
danw@174
    33
corresponding to a REQUIRES and vice versa, because the property
danw@174
    34
arrays are sorted, and so all properties with the same "name" will be
danw@174
    35
adjacent to one another in the array, allowing many dependencies to be
danw@174
    36
satisified in essentially constant time. (Actually... we've been
danw@174
    37
calling it constant, but it's really O(log N) for heavily-depended-on
danw@174
    38
packages, because the more packages you have, the more variations on
danw@174
    39
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
danw@174
    40
going to have to scan through.)
danw@174
    41
danw@174
    42
Ideally though, each iteration of the inner loop body happens in
danw@174
    43
constant time, and thus the inner loop as a whole is O(N), and thus
danw@174
    44
the depsolver as a whole is O(N * M) (or at least, less than
danw@174
    45
O(N * M * log N).
danw@174
    46
danw@174
    47
danw@174
    48
FILE DEPENDENCIES
danw@174
    49
-----------------
danw@174
    50
danw@174
    51
Whenever we add a package with a file REQUIRES to a razor_set, we also
danw@174
    52
add a PROVIDES for that file to the package or packages which provide
danw@174
    53
that file. This means that if we later add another package that
danw@174
    54
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
danw@174
    55
its file requirement exactly like we would resolve a property
danw@174
    56
requirement, in nearly constant time.
danw@174
    57
danw@174
    58
When adding a *new* file requirement (ie, a requirement on a file that
danw@174
    59
no existing package depends on), we still have to scan through the
danw@174
    60
file tree, which is O(log N) in the number of files.
danw@174
    61
danw@174
    62
(AFAICT, there's no reason yum couldn't do the same optimization.
danw@174
    63
Also, AFAICT, yum currently sticks property dependencies and file
danw@174
    64
dependencies into the same hash table, so that if any package in the
danw@174
    65
transaction has a file dependency, it causes *property* dependencies
danw@174
    66
to become slower to resolve as well...)
danw@174
    67
danw@174
    68
danw@174
    69
THE DEPSOLVER
danw@174
    70
-------------
danw@174
    71
danw@157
    72
We start with two razor_sets, system and upstream, and a list of
danw@157
    73
requested installations and removals.
danw@157
    74
danw@157
    75
    FIXME: what about multiple upstream repos? Having to deal with
danw@157
    76
    arbitrary numbers of razor_sets is possible, but will probably be
danw@157
    77
    messy... It might be easier to either store all upstream repo data
danw@157
    78
    in a single .repo file, or else merge all upstream .repo files
danw@157
    79
    together into a single razor_set at startup. (Or some combination
danw@157
    80
    of those.)
danw@157
    81
danw@157
    82
We create a bit array of the packages in each set, indicating which
danw@157
    83
ones are installed; the system bitarray starts out all 1s, and the
danw@157
    84
upstream bitarray all 0s. Each bit is only allowed to change state
danw@157
    85
once during the transaction; an installed package can be removed, or
danw@157
    86
an uninstalled package installed, but trying to reinstall a removed
danw@157
    87
package, or uninstall a newly-installed package is an error. This
danw@157
    88
means the packages break down into four categories:
danw@157
    89
danw@157
    90
    - installed       (1 bit in the system bit array)
danw@157
    91
    - to-be-removed   (0 bit in the system bit array)
danw@157
    92
    - to-be-installed (1 bit in the upstream bit array)
danw@157
    93
    - installable     (0 bit in the upstream bit array)
danw@157
    94
danw@157
    95
danw@174
    96
Depsolver algorithm:
danw@157
    97
danw@158
    98
    - Create new razor_transaction_packages ("rtp"s) for each
danw@161
    99
      requested install or remove. These will be "unresolved", because
danw@161
   100
      we haven't yet found the razor_packages that correspond to them.
danw@157
   101
danw@158
   102
    - while there are new rtps:
danw@157
   103
danw@161
   104
	- sort the new rtps
danw@157
   105
danw@161
   106
	- Walk the system property list, upstream property list, and
danw@161
   107
          new rtp list in parallel, and:
danw@157
   108
danw@161
   109
	    - For each uninstalled PROVIDES:
danw@161
   110
danw@161
   111
		- If the property is a valid package name (that is,
danw@161
   112
                  either it's a package providing its own name, or it
danw@161
   113
                  has a matching OBSOLETES), and it matches the name
danw@161
   114
                  of a new rtp of type INSTALL or FORCED_UPDATE with
danw@174
   115
                  an unresolved new_package:
danw@174
   116
danw@174
   117
		    - If the upstream package has the same version as
danw@174
   118
		      the system package, we have an UP_TO_DATE error
danw@174
   119
		      (FIXME: not quite right. This doesn't deal with
danw@174
   120
		      the case where we try to update an application
danw@174
   121
		      because of a library update, and it turns out
danw@174
   122
		      there's no new version of the application, but
danw@174
   123
		      there IS a compat package containing the old
danw@174
   124
		      version of the library.)
danw@174
   125
danw@174
   126
		    - Otherwise, set the rtp's new_package to point to
danw@174
   127
		      the package providing this property and set the
danw@174
   128
		      appropriate bit in the upstream bit array.
danw@157
   129
danw@158
   130
	    - For each to-be-installed non-file REQUIRES:
danw@157
   131
danw@158
   132
		- See if there's an installed or to-be-installed
danw@158
   133
		  package that PROVIDES that property.
danw@157
   134
danw@158
   135
		- If not, see if there's an installable package that
danw@158
   136
		  PROVIDES that property, and create a new INSTALL rtp
danw@158
   137
		  for it if so.
danw@157
   138
danw@158
   139
		- If not, see if there's a to-be-removed package that
danw@158
   140
		  PROVIDES that property. (If we find such a package,
danw@158
   141
		  we have a CONTRADICTION error.)
danw@157
   142
danw@158
   143
		- If none of the above, then we have an UNSATISFIABLE
danw@158
   144
                  error
danw@157
   145
danw@158
   146
	    - For each to-be-installed file REQUIRES:
danw@157
   147
danw@158
   148
		- (We create fake file PROVIDES to match file REQUIRES
danw@158
   149
                  when importing/merging razor sets, so if there is
danw@158
   150
                  already another installed package that REQUIRES this
danw@158
   151
                  file, there will be a PROVIDES listed for it as well.)
danw@157
   152
danw@158
   153
		- See if there's an installed package that PROVIDES
danw@158
   154
                  that file.
danw@157
   155
danw@158
   156
		- If not, do a binary search of the system file tree
danw@158
   157
                  looking to see if some installed package provides
danw@158
   158
                  that file but does not have a PROVIDES for it.
danw@157
   159
danw@158
   160
		- If not, see if there's an installable package that
danw@158
   161
		  PROVIDES that property, and create a new INSTALL rtp
danw@158
   162
		  for it if so.
danw@157
   163
danw@158
   164
		- (If we actually work with multiple upstream
danw@158
   165
                  razor_sets, then we will need to search the upstream
danw@158
   166
                  file trees at this point, because it's possible that
danw@158
   167
                  a package in one upstream repo would require a file
danw@158
   168
                  in another upstream repo. But if we merge the
danw@158
   169
                  multiple upstream repos into a single razor_set at
danw@158
   170
                  some point, then we would not need to do that,
danw@158
   171
                  because it would be guaranteed that we would have
danw@158
   172
                  already created a fake PROVIDES if any package
danw@158
   173
                  provides the file.)
danw@157
   174
danw@158
   175
		- If no installed or installable package provides the
danw@158
   176
		  file, see if there's a to-be-removed package that
danw@158
   177
		  provides the file. (If we find such a package, we
danw@158
   178
		  have a CONTRADICTION error.)
danw@157
   179
danw@158
   180
		- If none of the above, then we have an UNSATISFIABLE
danw@158
   181
                  error
danw@157
   182
danw@158
   183
	    - For each to-be-installed PROVIDES:
danw@157
   184
danw@158
   185
		- Check if the new PROVIDES conflicts with an
danw@158
   186
		  installed CONFLICTS. If so, create a new
danw@158
   187
		  FORCED_UPDATE rtp for the installed package, so we
danw@158
   188
		  can try to upgrade it to a non-conflicting version.
danw@158
   189
		  (If we can't, we'll have an OLD_CONFLICT error.)
danw@157
   190
danw@158
   191
		- Check if the new PROVIDES conflicts with an
danw@158
   192
                  installed OBSOLETES *and* the PROVIDES property
danw@158
   193
                  corresponds to the name of its package. (That is,
danw@158
   194
                  OBSOLETES are only matched against package names,
danw@158
   195
                  not arbitrary provided properties.) If so, we have
danw@158
   196
                  an ALREADY_OBSOLETE error.
danw@157
   197
danw@158
   198
		- Check if the new PROVIDES conflicts with a
danw@158
   199
		  to-be-installed CONFLICTS. If so, we have a
danw@158
   200
		  CONTRADICTION error.
danw@157
   201
danw@158
   202
	    - For each to-be-installed CONFLICTS:
danw@157
   203
danw@158
   204
		- Basically the reverse of the previous case: check if
danw@158
   205
		  the new CONFLICTS conflicts with an installed
danw@158
   206
		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
danw@158
   207
		  the installed package, so we can try to upgrade it
danw@158
   208
		  to a non-conflicting version. (If we can't, we'll
danw@158
   209
		  have an NEW_CONFLICT error.)
danw@157
   210
danw@158
   211
		- Check if the new CONFLICTS conflicts with a
danw@158
   212
		  to-be-installed PROVIDES. If so, we have a
danw@158
   213
		  CONTRADICTION error.
danw@157
   214
danw@158
   215
	    - For each to-be-installed OBSOLETES:
danw@157
   216
danw@158
   217
		- Check if there's an installed package that PROVIDES
danw@158
   218
		  that property. If so, create an OBSOLETED rtp for
danw@158
   219
		  the installed package.
danw@157
   220
danw@158
   221
		- If not, check if there's a to-be-installed package
danw@158
   222
		  that PROVIDES that property. If so, we have a
danw@158
   223
		  CONTRADICTION error.
danw@157
   224
danw@161
   225
danw@161
   226
	    - For each installed PROVIDES:
danw@161
   227
danw@161
   228
		- If the property is a valid package name (that is,
danw@161
   229
                  it's a package providing its own name), and it
danw@161
   230
                  matches the name of a new rtp with an unresolved
danw@161
   231
                  old_package, then set the rtp's old_package to point
danw@161
   232
                  to the package providing this property and clear the
danw@161
   233
                  appropriate bit in the system bit array.
danw@161
   234
danw@158
   235
	    - For each to-be-removed PROVIDES:
danw@157
   236
danw@158
   237
		- If there's also an identical to-be-installed
danw@158
   238
		  PROVIDES, we're ok and can skip this
danw@157
   239
danw@158
   240
		- Otherwise, for each installed REQUIRES of this
danw@158
   241
                  property:
danw@157
   242
danw@158
   243
		    - Look for some other installed or to-be-installed
danw@158
   244
		      property that satisfies the REQUIRES.
danw@157
   245
danw@158
   246
		    - If there isn't one, then for each installed
danw@158
   247
		      package in this REQUIRES's package list:
danw@157
   248
danw@158
   249
			- If the PROVIDES was lost because the old
danw@158
   250
			  package was REMOVEd (not FORCED_UPDATE or
danw@158
   251
			  OBSOLETED), then create a new REMOVE rtp for
danw@158
   252
			  this package.
danw@157
   253
danw@158
   254
			- Otherwise, create a new FORCED_UPDATE rtp
danw@158
   255
                          for this package.
danw@157
   256
danw@158
   257
		- (We don't need to look at to-be-installed REQUIRES
danw@158
   258
		  of this property, because if there are any, they
danw@158
   259
		  will cause a CONTRADICTION error when we try to
danw@158
   260
		  re-satisfy them the next time through.)