DEPSOLVE.txt
author Dan Winship <danw@gnome.org>
Thu Mar 13 08:29:05 2008 -0400 (2008-03-13)
changeset 178 1efd1945b81a
parent 161 7762db2848bf
child 179 a6dfb81e3484
permissions -rw-r--r--
Merge branch 'master' of git://people.freedesktop.org/~krh/razor
     1 YUM vs RAZOR
     2 ------------
     3 
     4 At a very high level, yum's depsolver does something roughly
     5 equivalent to:
     6 
     7     - For each package being installed or removed
     8 
     9 	- For each relevant property (provides, requires, conflicts,
    10           obsoletes):
    11 
    12 	    - Figure out what additional packages need to be added to
    13 	      or removed from the system to satisfy this property
    14 
    15 which ends up being roughly O(N^2 * M) where N is the total number of
    16 properties and M is the number of packages being acted on.
    17 
    18 (I just figured that out off the top of my head, and I'm not totally
    19 familiar with the yum code, so it may be wrong.)
    20 
    21 Razor's depsolver is something like:
    22 
    23     - do {
    24 
    25 	- For each property to be added to or removed from the system:
    26 
    27 	    - Figure out what packages need to be added to or removed
    28 	      from the system to satisfy this property
    29 
    30     - } until we stop adding/remove more packages
    31 
    32 with the key being that it's very easy to find the PROVIDES
    33 corresponding to a REQUIRES and vice versa, because the property
    34 arrays are sorted, and so all properties with the same "name" will be
    35 adjacent to one another in the array, allowing many dependencies to be
    36 satisified in essentially constant time. (Actually... we've been
    37 calling it constant, but it's really O(log N) for heavily-depended-on
    38 packages, because the more packages you have, the more variations on
    39 "requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
    40 going to have to scan through.)
    41 
    42 Ideally though, each iteration of the inner loop body happens in
    43 constant time, and thus the inner loop as a whole is O(N), and thus
    44 the depsolver as a whole is O(N * M) (or at least, less than
    45 O(N * M * log N).
    46 
    47 
    48 FILE DEPENDENCIES
    49 -----------------
    50 
    51 Whenever we add a package with a file REQUIRES to a razor_set, we also
    52 add a PROVIDES for that file to the package or packages which provide
    53 that file. This means that if we later add another package that
    54 requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
    55 its file requirement exactly like we would resolve a property
    56 requirement, in nearly constant time.
    57 
    58 When adding a *new* file requirement (ie, a requirement on a file that
    59 no existing package depends on), we still have to scan through the
    60 file tree, which is O(log N) in the number of files.
    61 
    62 (AFAICT, there's no reason yum couldn't do the same optimization.
    63 Also, AFAICT, yum currently sticks property dependencies and file
    64 dependencies into the same hash table, so that if any package in the
    65 transaction has a file dependency, it causes *property* dependencies
    66 to become slower to resolve as well...)
    67 
    68 
    69 THE DEPSOLVER
    70 -------------
    71 
    72 We start with two razor_sets, system and upstream, and a list of
    73 requested installations and removals.
    74 
    75     FIXME: what about multiple upstream repos? Having to deal with
    76     arbitrary numbers of razor_sets is possible, but will probably be
    77     messy... It might be easier to either store all upstream repo data
    78     in a single .repo file, or else merge all upstream .repo files
    79     together into a single razor_set at startup. (Or some combination
    80     of those.)
    81 
    82 We create a bit array of the packages in each set, indicating which
    83 ones are installed; the system bitarray starts out all 1s, and the
    84 upstream bitarray all 0s. Each bit is only allowed to change state
    85 once during the transaction; an installed package can be removed, or
    86 an uninstalled package installed, but trying to reinstall a removed
    87 package, or uninstall a newly-installed package is an error. This
    88 means the packages break down into four categories:
    89 
    90     - installed       (1 bit in the system bit array)
    91     - to-be-removed   (0 bit in the system bit array)
    92     - to-be-installed (1 bit in the upstream bit array)
    93     - installable     (0 bit in the upstream bit array)
    94 
    95 
    96 Depsolver algorithm:
    97 
    98     - Create new razor_transaction_packages ("rtp"s) for each
    99       requested install or remove. These will be "unresolved", because
   100       we haven't yet found the razor_packages that correspond to them.
   101 
   102     - while there are new rtps:
   103 
   104 	- sort the new rtps
   105 
   106 	- Walk the system property list, upstream property list, and
   107           new rtp list in parallel, and:
   108 
   109 	    - For each uninstalled PROVIDES:
   110 
   111 		- If the property is a valid package name (that is,
   112                   either it's a package providing its own name, or it
   113                   has a matching OBSOLETES), and it matches the name
   114                   of a new rtp of type INSTALL or FORCED_UPDATE with
   115                   an unresolved new_package:
   116 
   117 		    - If the upstream package has the same version as
   118 		      the system package, we have an UP_TO_DATE error
   119 		      (FIXME: not quite right. This doesn't deal with
   120 		      the case where we try to update an application
   121 		      because of a library update, and it turns out
   122 		      there's no new version of the application, but
   123 		      there IS a compat package containing the old
   124 		      version of the library.)
   125 
   126 		    - Otherwise, set the rtp's new_package to point to
   127 		      the package providing this property and set the
   128 		      appropriate bit in the upstream bit array.
   129 
   130 	    - For each to-be-installed non-file REQUIRES:
   131 
   132 		- See if there's an installed or to-be-installed
   133 		  package that PROVIDES that property.
   134 
   135 		- If not, see if there's an installable package that
   136 		  PROVIDES that property, and create a new INSTALL rtp
   137 		  for it if so.
   138 
   139 		- If not, see if there's a to-be-removed package that
   140 		  PROVIDES that property. (If we find such a package,
   141 		  we have a CONTRADICTION error.)
   142 
   143 		- If none of the above, then we have an UNSATISFIABLE
   144                   error
   145 
   146 	    - For each to-be-installed file REQUIRES:
   147 
   148 		- (We create fake file PROVIDES to match file REQUIRES
   149                   when importing/merging razor sets, so if there is
   150                   already another installed package that REQUIRES this
   151                   file, there will be a PROVIDES listed for it as well.)
   152 
   153 		- See if there's an installed package that PROVIDES
   154                   that file.
   155 
   156 		- If not, do a binary search of the system file tree
   157                   looking to see if some installed package provides
   158                   that file but does not have a PROVIDES for it.
   159 
   160 		- If not, see if there's an installable package that
   161 		  PROVIDES that property, and create a new INSTALL rtp
   162 		  for it if so.
   163 
   164 		- (If we actually work with multiple upstream
   165                   razor_sets, then we will need to search the upstream
   166                   file trees at this point, because it's possible that
   167                   a package in one upstream repo would require a file
   168                   in another upstream repo. But if we merge the
   169                   multiple upstream repos into a single razor_set at
   170                   some point, then we would not need to do that,
   171                   because it would be guaranteed that we would have
   172                   already created a fake PROVIDES if any package
   173                   provides the file.)
   174 
   175 		- If no installed or installable package provides the
   176 		  file, see if there's a to-be-removed package that
   177 		  provides the file. (If we find such a package, we
   178 		  have a CONTRADICTION error.)
   179 
   180 		- If none of the above, then we have an UNSATISFIABLE
   181                   error
   182 
   183 	    - For each to-be-installed PROVIDES:
   184 
   185 		- Check if the new PROVIDES conflicts with an
   186 		  installed CONFLICTS. If so, create a new
   187 		  FORCED_UPDATE rtp for the installed package, so we
   188 		  can try to upgrade it to a non-conflicting version.
   189 		  (If we can't, we'll have an OLD_CONFLICT error.)
   190 
   191 		- Check if the new PROVIDES conflicts with an
   192                   installed OBSOLETES *and* the PROVIDES property
   193                   corresponds to the name of its package. (That is,
   194                   OBSOLETES are only matched against package names,
   195                   not arbitrary provided properties.) If so, we have
   196                   an ALREADY_OBSOLETE error.
   197 
   198 		- Check if the new PROVIDES conflicts with a
   199 		  to-be-installed CONFLICTS. If so, we have a
   200 		  CONTRADICTION error.
   201 
   202 	    - For each to-be-installed CONFLICTS:
   203 
   204 		- Basically the reverse of the previous case: check if
   205 		  the new CONFLICTS conflicts with an installed
   206 		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
   207 		  the installed package, so we can try to upgrade it
   208 		  to a non-conflicting version. (If we can't, we'll
   209 		  have an NEW_CONFLICT error.)
   210 
   211 		- Check if the new CONFLICTS conflicts with a
   212 		  to-be-installed PROVIDES. If so, we have a
   213 		  CONTRADICTION error.
   214 
   215 	    - For each to-be-installed OBSOLETES:
   216 
   217 		- Check if there's an installed package that PROVIDES
   218 		  that property. If so, create an OBSOLETED rtp for
   219 		  the installed package.
   220 
   221 		- If not, check if there's a to-be-installed package
   222 		  that PROVIDES that property. If so, we have a
   223 		  CONTRADICTION error.
   224 
   225 
   226 	    - For each installed PROVIDES:
   227 
   228 		- If the property is a valid package name (that is,
   229                   it's a package providing its own name), and it
   230                   matches the name of a new rtp with an unresolved
   231                   old_package, then set the rtp's old_package to point
   232                   to the package providing this property and clear the
   233                   appropriate bit in the system bit array.
   234 
   235 	    - For each to-be-removed PROVIDES:
   236 
   237 		- If there's also an identical to-be-installed
   238 		  PROVIDES, we're ok and can skip this
   239 
   240 		- Otherwise, for each installed REQUIRES of this
   241                   property:
   242 
   243 		    - Look for some other installed or to-be-installed
   244 		      property that satisfies the REQUIRES.
   245 
   246 		    - If there isn't one, then for each installed
   247 		      package in this REQUIRES's package list:
   248 
   249 			- If the PROVIDES was lost because the old
   250 			  package was REMOVEd (not FORCED_UPDATE or
   251 			  OBSOLETED), then create a new REMOVE rtp for
   252 			  this package.
   253 
   254 			- Otherwise, create a new FORCED_UPDATE rtp
   255                           for this package.
   256 
   257 		- (We don't need to look at to-be-installed REQUIRES
   258 		  of this property, because if there are any, they
   259 		  will cause a CONTRADICTION error when we try to
   260 		  re-satisfy them the next time through.)