TODO
author Kristian Høgsberg <krh@redhat.com>
Fri Jun 20 22:26:41 2008 -0400 (2008-06-20)
changeset 255 5cd6aa72bbd5
parent 213 8f8b782b7a0e
child 299 d4f7f167b8bb
permissions -rw-r--r--
Return if we fail to open root.
     1 Towards replacing rpm + yum (0.1):
     2 
     3 - drop the filelists from the main package set file, split out to a
     4   secondary file.  for package sets that depend on other package sets,
     5   we need to be able to generate properties with owning packages that
     6   are in another set.  this way, a package that requires a file, will
     7   look up the provides in the set and find the package that owns it
     8   and then try to mark that for update.
     9 
    10 - installer part:
    11 
    12    - pre-install check: check that dirs can be created (no files where
    13      we want to create dirs), move config files according to file
    14      flags (.rpmnew etc).
    15 
    16    - store rpm headers for installed packages.
    17 
    18    - implement rpm uninstall and update.
    19 
    20    - triggers? just say no?
    21 
    22 - rpm seems to consider glibc > 2.6.90 to mean greater than
    23   2.6.90-anything.  That is, a comparison that doesn't mention the
    24   release field, shouldn't regard the release field of pkgs it
    25   compares against.  glibc-common-2.6.90 has
    26 
    27 	conflicts: glibc < 2.6.90, glibc > 2.6.90
    28 
    29   since rpm doesn't let you do glibc != 2.6.90, and
    30 
    31 	requires: glibc = 2.6.90
    32 
    33   will always pull in glibc.  But even with a != relation, would
    34   glibc-2.6.90-16 be equal to 2.6.90?  glibc 2.7.90-8 dropped it in
    35   favor of requires = 2.7.90-8 (#225806).
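
A rough sketch of the comparison rule described in this item, assuming
"version-release" strings and a much simpler ordering than rpm's real
version comparison (digit segments only). The names next_segment,
compare_segments and version_cmp are made up for illustration and are
not part of razor or rpm.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Read one numeric segment and step past the separator that follows it.
 * Simplification: non-digit characters count as empty (zero) segments. */
static unsigned long
next_segment(const char **s, const char *end)
{
	unsigned long n = 0;

	while (*s < end && isdigit((unsigned char) **s))
		n = n * 10 + (unsigned long) (*(*s)++ - '0');
	if (*s < end)
		(*s)++;
	return n;
}

/* Compare two strings such as "2.6.90" segment by segment. */
static int
compare_segments(const char *a, size_t alen, const char *b, size_t blen)
{
	const char *a_end = a + alen, *b_end = b + blen;

	while (a < a_end || b < b_end) {
		unsigned long na = next_segment(&a, a_end);
		unsigned long nb = next_segment(&b, b_end);

		if (na != nb)
			return na < nb ? -1 : 1;
	}
	return 0;
}

/* Compare an installed "version[-release]" against a required one.
 * If the requirement has no release, the installed release is ignored. */
int
version_cmp(const char *installed, const char *required)
{
	size_t ilen, rlen;
	const char *irel, *rrel;
	int c;

	assert(installed != NULL && required != NULL);
	ilen = strcspn(installed, "-");
	rlen = strcspn(required, "-");
	c = compare_segments(installed, ilen, required, rlen);
	if (c != 0 || required[rlen] == '\0')
		return c;

	irel = installed[ilen] ? installed + ilen + 1 : installed + ilen;
	rrel = required + rlen + 1;
	return compare_segments(irel, strlen(irel), rrel, strlen(rrel));
}
```

Under this rule glibc-2.6.90-16 compares equal to a bare 2.6.90, so
neither "glibc < 2.6.90" nor "glibc > 2.6.90" matches it.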
    36 
    37 - signed packages
    38 
    39 - space calculation before transaction, but ideally, do a number of
    40   smaller transactions.
    41 
    42 - pre-link changing binaries and libs on disk screwing up checksum?
    43 
    44 - pipelined download and install; topo-sort packages in update set,
    45   pick one with all deps in the current set, add it to the current set
    46   and satisfy deps against update set => result: minimal update
    47   transaction.  Queue download and install/update transaction for the
    48   packages in the minimal set, start over.  This also makes the
    49   installation phase much more interruptible, basically just stop
    50   after a sub-transaction finishes.  As we keep the update set around
    51   as a target, we can restart if needed.  Probably don't need to, can
    52   just do a new update.  During a sub-transaction we should keep the
    53   target set (i.e. the current set to be) around as a lock file
    54   (system.repo.lock or so, see git) so that razor updates are
    55   prevented if the system crashes during an update.
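
A minimal sketch of the "pick one with all deps in the current set"
loop, assuming packages are numbered 0..31 and each package's
dependencies are a bitmask of other package numbers. That
representation and the function names are invented for the example;
the real thing would match requires against provides.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PACKAGES 32

/* Return the index of a pending package whose dependencies are all in
 * `current`, or -1 if nothing is installable yet. */
static int
pick_installable(const uint32_t *deps, int count,
		 uint32_t current, uint32_t pending)
{
	for (int i = 0; i < count; i++) {
		uint32_t bit = UINT32_C(1) << i;

		if ((pending & bit) && (deps[i] & ~current) == 0)
			return i;
	}
	return -1;
}

/* Fill `order` with one package per sub-transaction and return how many
 * packages could be scheduled.  Packages stuck in a dependency cycle
 * are left out. */
int
plan_sub_transactions(const uint32_t *deps, int count,
		      uint32_t current, int *order)
{
	uint32_t pending;
	int scheduled = 0, i;

	assert(count >= 0 && count <= MAX_PACKAGES);
	pending = count == MAX_PACKAGES ?
		UINT32_MAX : (UINT32_C(1) << count) - 1;
	pending &= ~current;

	while ((i = pick_installable(deps, count, current, pending)) >= 0) {
		order[scheduled++] = i;
		current |= UINT32_C(1) << i;	/* now part of the current set */
		pending &= ~(UINT32_C(1) << i);
	}
	return scheduled;
}
```

Each entry in `order` is a point where the install can stop cleanly;
a real implementation would queue download and install per entry.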
    56 
    57 - implement depsolving between multiple package sets by creating an
    58   iterator that has a sorted list of all installed pkgs from all sets,
    59   all installed requires from all sets, all installed provides from
    60   all sets etc.  could be a list of tuples (pkgs index, set index).
    61   should simplify even the two-set depsolving a bit since we can
    62   pretend there's just one set.  this should also be useful for the
    63   'overlay set' idea where the system set is actually made up of a
    64   number of sets, but typically a read-only set from a read-only fs
    65   and a read-write set from a r/w fs.
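
One way the (pkgs index, set index) tuple list could look, sketched
for package names only. It assumes every set already keeps its names
sorted; struct name_set, struct merged_entry and merge_sets are
hypothetical names, not existing razor types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SETS 8

struct name_set {
	const char **names;	/* sorted package names */
	size_t count;
};

struct merged_entry {
	size_t pkg_index;	/* index within its own set */
	size_t set_index;	/* which set it came from */
};

/* Merge the sorted sets into one sorted list of tuples.  `out` must
 * have room for the total number of names.  Returns the entry count. */
size_t
merge_sets(const struct name_set *sets, size_t set_count,
	   struct merged_entry *out)
{
	size_t pos[MAX_SETS] = { 0 };
	size_t n = 0;

	assert(set_count <= MAX_SETS);
	for (;;) {
		size_t best = set_count;

		/* Find the set whose next unread name sorts first. */
		for (size_t s = 0; s < set_count; s++) {
			if (pos[s] == sets[s].count)
				continue;
			if (best == set_count ||
			    strcmp(sets[s].names[pos[s]],
				   sets[best].names[pos[best]]) < 0)
				best = s;
		}
		if (best == set_count)
			break;	/* every set is exhausted */

		out[n].pkg_index = pos[best]++;
		out[n].set_index = best;
		n++;
	}
	return n;
}
```

Equal names from different sets end up next to each other, lower set
index first, so a depsolver can walk the result as if it were one set.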
    66 
    67 - locking: we use advisory file locking on the system set
    68   (/var/lib/razor/system.repo) to indicate a transaction is in
    69   progress.  The locking algorithm is as follows:
    70 
    71     1. obtain advisory lock on system set.  if this is already taken,
    72        we know that a process is actively modifying the system set and
    73        we have to wait.  there's a fcntl that lets you block for the
    74        lock to go away.
    75 
    76     2. if a system-next.repo file already exists, an earlier razor
    77        process was interrupted or crashed and we may want to clean
    78        that up.  the system-next.repo file will record what the
    79        previous instance was trying to do and we can just replay that
    80        to clean up.
    81 
    82     3. create the new package set whichever way and write it to
    83        system-next.repo, then start installing/removing rpms.
    84 
    85     4. When the update is complete, rename system-next.repo to
    86        system.repo and remove the advisory lock.
    87 
    88   we should probably introduce a new object that encapsulates this
    89   sequence, the filename conventions, rpm cache, e.g. struct
    90   razor_image, with operations such as
    91 
    92 	#define RAZOR_IMAGE_READ	0x01
    93 	#define RAZOR_IMAGE_WRITE	0x02
    94 
    95 	struct razor_image *
    96 	razor_image_open(const char *root, unsigned int flags);
    97 
    98 	int
    99 	razor_image_begin_transaction(struct razor_image *image,
   100 				      struct razor_set *target);
   101 
   102 	int
   103 	razor_image_finish_transaction(struct razor_image *image);
   104 
   105   the transaction pipelining described above sits on top of this,
   106   since each step there needs to complete a full transaction that
   107   writes out a new package set.
   108 
   109   for overlay package sets we could do something like
   110 
   111 	struct razor_image *
   112 	razor_image_open_with_base(const char *root, unsigned int flags,
   113 				   struct razor_image *base);
   114 
   115   where base specifies the r/o package set it's layered on.  this
   116   allows for stacking several layers of images.
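
The numbered locking steps earlier in this item could be sketched
with plain POSIX calls roughly as follows. The helper names are made
up, the replay/cleanup logic of step 2 is left out, and nothing here
is the proposed razor_image API itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Step 1: block until we hold an exclusive advisory lock on the
 * system set.  Returns the locked descriptor, or -1 on error. */
int
lock_system_set(const char *system_path)
{
	struct flock lock = { 0 };
	int fd = open(system_path, O_RDWR | O_CREAT, 0644);

	if (fd < 0)
		return -1;
	lock.l_type = F_WRLCK;
	lock.l_whence = SEEK_SET;	/* start 0, len 0: the whole file */
	if (fcntl(fd, F_SETLKW, &lock) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Step 2: a leftover system-next.repo means an earlier run did not
 * finish.  Replaying it is not shown here. */
int
has_interrupted_transaction(const char *next_path)
{
	return access(next_path, F_OK) == 0;
}

/* Step 3: write the target package set to system-next.repo. */
int
write_next_set(const char *next_path, const void *data, size_t size)
{
	FILE *f = fopen(next_path, "wb");

	if (f == NULL)
		return -1;
	if (fwrite(data, 1, size, f) != size) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Step 4: rename system-next.repo over system.repo, then drop the
 * lock.  Closing the descriptor releases the fcntl lock. */
int
finish_transaction(int lock_fd, const char *next_path,
		   const char *system_path)
{
	int ret;

	assert(lock_fd >= 0);
	ret = rename(next_path, system_path);
	close(lock_fd);
	return ret;
}
```

Note that after the rename the lock was held on the old file, so a
waiting process would have to reopen system.repo once it gets the lock.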
   117 
   118 
   119 Package set file format items:
   120 
   121 - drop the 4k section alignment
   122 
   123 - just use strings for header identifiers, make the string pool
   124   section have a fixed string (maybe make "strings" always the first
   125   string so its index is 0), or maybe just require that it's the first
   126   section in the file.
   127 
   128 - nail down byte-order of repo file.
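
  Whichever order is picked, reading and writing through small helpers
  keeps the file independent of the host CPU.  The sketch below picks
  little-endian purely as an example; the choice and the helper names
  are assumptions, not a decision recorded here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value least significant byte first. */
void
put_le32(unsigned char *out, uint32_t value)
{
	assert(out != NULL);
	out[0] = (unsigned char) (value & 0xff);
	out[1] = (unsigned char) ((value >> 8) & 0xff);
	out[2] = (unsigned char) ((value >> 16) & 0xff);
	out[3] = (unsigned char) ((value >> 24) & 0xff);
}

/* Load a 32-bit value stored least significant byte first. */
uint32_t
get_le32(const unsigned char *in)
{
	assert(in != NULL);
	return (uint32_t) in[0] |
		((uint32_t) in[1] << 8) |
		((uint32_t) in[2] << 16) |
		((uint32_t) in[3] << 24);
}
```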
   129 
   130 - version the sections in the file, put the element size in the header
   131   so we can add stuff to elements in a backwards compatible way.
   132   maybe not necessary, we can just add sections that augment the
   133   sections we want to add to (similar to how rpm has added versioned
   134   deps).
   135 
   136 
   137 Misc ideas:
   138 
   139 - keep history of installed packages/journal of package transaction,
   140   so we can roll back to yesterday, or see what got installed in the
   141   latest yum update.
   142 
   143 - transactions, proper recovery, make sure we don't poop our package
   144   database (no more rm /var/lib/rpm/__cache*).
   145 
   146 - use hash table for package and property lists so we only store
   147   unique lists (like for string pool).
   148 
   149 - use existing, running system as repo; eg
   150 
   151 	razor update razor://other-box.local evince
   152 
   153   to pull eg the latest evince and dependencies from another box.  We
   154   should be able to regenerate a rzr pkg from the system so we can
   155   reuse the signature from the originating repo.
   156 
   157 - Ok, maybe the fastest package set merge method in the end is to use
   158   the razor_importer, but use a hash table for the properties.  This
   159   way we can assign them unique IDs immediately (like tokenizing
   160   strings).
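
  A small sketch of handing out unique IDs on first sight, shown for
  strings; properties would need a key made of name, relation and
  version.  The fixed-size open-addressing table, the FNV-1a hash and
  all names below are choices made for the example, not what
  razor_importer does today.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 1024	/* power of two, fixed for the sketch */

struct intern_table {
	const char *keys[TABLE_SIZE];	/* NULL marks an empty slot */
	uint32_t ids[TABLE_SIZE];
	uint32_t next_id;
};

/* 32-bit FNV-1a hash. */
static uint32_t
hash_string(const char *s)
{
	uint32_t h = UINT32_C(2166136261);

	while (*s != '\0') {
		h ^= (unsigned char) *s++;
		h *= UINT32_C(16777619);
	}
	return h;
}

/* Return the ID for `key`, assigning the next free ID the first time
 * the key is seen.  The table must start out zeroed and the caller
 * keeps the key string alive. */
uint32_t
intern(struct intern_table *t, const char *key)
{
	uint32_t slot = hash_string(key) & (TABLE_SIZE - 1);

	while (t->keys[slot] != NULL) {
		if (strcmp(t->keys[slot], key) == 0)
			return t->ids[slot];
		slot = (slot + 1) & (TABLE_SIZE - 1);	/* linear probing */
	}

	/* Always keep one slot empty so the probe loop terminates. */
	assert(t->next_id < TABLE_SIZE - 1);
	t->keys[slot] = key;
	t->ids[slot] = t->next_id;
	return t->next_id++;
}
```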
   161 
   162 - test suite should be easy, just keep .repo files around and test
   163   different types of upgrades that way (obsoletes, conflicts, file
   164   conflicts, file/dir problems etc).  Or maybe just keep a simple file
   165   format and use a custom importer to create the .repo files.
   166 
   167 - try to clean up the
   168 
   169 	do { ... } while (((e++)->name & RAZOR_ENTRY_LAST) == 0);
   170 
   171   idiom for iteration of directories.
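
  One possible cleanup is to hide the flag test behind a "next"
  helper and a for-each macro.  The entry layout and the flag value
  below are stand-ins (assumed, not copied from razor), and like the
  do/while form this assumes a directory has at least one entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bit; stands in for RAZOR_ENTRY_LAST. */
#define ENTRY_LAST UINT32_C(0x80000000)

/* Stand-in for the real entry struct; only `name` matters here. */
struct entry {
	uint32_t name;
};

/* Return the entry after `e`, or NULL if `e` was the last one. */
static const struct entry *
entry_next(const struct entry *e)
{
	return (e->name & ENTRY_LAST) ? NULL : e + 1;
}

#define entry_for_each(e, first) \
	for ((e) = (first); (e) != NULL; (e) = entry_next(e))

/* Example user: count the entries in one directory. */
size_t
count_entries(const struct entry *first)
{
	const struct entry *e;
	size_t n = 0;

	assert(first != NULL);
	entry_for_each(e, first)
		n++;
	return n;
}
```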
   172 
   173 - overlay package sets?  mount a read-only /usr over nfs or from the
   174   virt-host and have a local package set overlaid over the read-only
   175   one.  shouldn't need new features in the core package set data
   176   structure, but should be just conventions on top.  we have the base
   177   package set from the r/o system, the overlay set from the local
   178   system and we can have an effective package set which is the merge
   179   of everything from the overlay into the base set.  the effective set
   180   is easy to compute and we could do it on the fly or cache it.  or
   181   maybe the effective set is the on-disk representation and the
   182   overlay can be computed when needed, we just keep a link back to the
   183   base.
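
  Computing the effective set could be a single pass over two sorted
  lists, sketched here under the assumption that an overlay package
  simply replaces a base package of the same name.  Removals and
  anything beyond name and version are ignored; struct pkg and
  effective_set are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct pkg {
	const char *name;
	const char *version;
};

/* Merge sorted `base` and `overlay` into `out`; on equal names the
 * overlay entry wins.  `out` needs room for nbase + nover entries.
 * Returns the number of packages in the effective set. */
size_t
effective_set(const struct pkg *base, size_t nbase,
	      const struct pkg *overlay, size_t nover,
	      struct pkg *out)
{
	size_t b = 0, o = 0, n = 0;

	assert(out != NULL);
	while (b < nbase || o < nover) {
		int c;

		if (b == nbase)
			c = 1;		/* only overlay entries left */
		else if (o == nover)
			c = -1;		/* only base entries left */
		else
			c = strcmp(base[b].name, overlay[o].name);

		if (c < 0) {
			out[n++] = base[b++];
		} else {
			out[n++] = overlay[o++];
			if (c == 0)
				b++;	/* base entry is shadowed */
		}
	}
	return n;
}
```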
   184 
   185 - incremental rawhide repo updates? instead of downloading 10MB zipped
   186   repo every time, download a diff repo?  Should be pretty small,
   187   especially if we don't have file checksums in metadata.  Filenames
   188   and properties are for the most part already present, typically just
   189   a version bump plus maybe tweaking a couple requires.  The upstream
   190   repo can store multiple incremental updates in one big file and
   191   provide an index file that maps updates for a given date (we should
   192   use repo-file checksums though) to a range in the file: Download the
   193   index file, search for a match for your latest rawhide.repo file,
   194   download range of updates that brings it up to date.
   195 
   196 - use hash tables for dirs when importing files to avoid qsorting all
   197   files in rawhide.
   198 
   199 Bugs:
   200 
   201 - eliminate duplicate entries in package property lists.
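
  If the property lists are kept as sorted arrays of property indices
  (an assumption made for this sketch), duplicates can be squeezed out
  in place in one pass.  unique_sorted is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Remove adjacent duplicates from a sorted list in place and return
 * the new length. */
size_t
unique_sorted(uint32_t *list, size_t count)
{
	size_t out = 0;

	assert(list != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (out == 0 || list[out - 1] != list[i])
			list[out++] = list[i];
	}
	return out;
}
```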