TODO
author Kristian Høgsberg <krh@redhat.com>
Fri Sep 07 00:09:18 2007 -0400 (2007-09-07)
changeset 12 71a410830f3d
Sort on version as second order key for provides and requires.
- keep a history of installed packages/a journal of package
  transactions, so we can roll back to yesterday, or see what got
  installed in the latest yum update.

- we build a cache of the currently installed set to service
  dependency inquiries fast:

	map from property to pkg (as hash) providing it
	map from property to pkgs requiring it
	map from pkg name to manifest
	map from string to string pool index

	no implicit provides? not even pkgname?

- properties are strings, stored in a string table

- on-disk maps are binary files of (string table index, hash) pairs
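
  a minimal sketch of one such map and a lookup in it, assuming the
  pairs are kept sorted by string table index and that both fields are
  32 bits wide (neither is settled above).  struct map_entry and
  map_lookup() are made-up names, and byte order is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk record: one (string table index, hash) pair.
 * Fixed-width fields keep the layout independent of the compiler's
 * idea of int. */
struct map_entry {
	uint32_t property;	/* index into the string table */
	uint32_t pkg_hash;	/* hash identifying the package */
};

/* Return the first entry whose property equals `property`, or NULL.
 * Assumes `map` is sorted by property.  Entries with the same property
 * are adjacent, so the caller can walk forward from the result. */
static const struct map_entry *
map_lookup(const struct map_entry *map, size_t count, uint32_t property)
{
	size_t lo = 0, hi = count;

	assert(map != NULL || count == 0);

	/* Binary search for the first entry with property >= key. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (map[mid].property < property)
			lo = mid + 1;
		else
			hi = mid;
	}

	if (lo < count && map[lo].property == property)
		return &map[lo];

	return NULL;
}
```

  since the file is just an array of these records, the mmapped file
  can be handed to map_lookup() as is.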

- at run time, we mmap the map, and keep changes in memory in a splay
  tree or similar.  if searching the splay tree fails we fall back to
  the mmapped map.  once the transaction is done, we merge the map and
  the splay tree and write the result back out.
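
  a rough sketch of the lookup and merge described above.  a small
  sorted array stands in for the splay tree (the "or similar" case),
  the base array stands in for the mmapped file, and each property maps
  to a single hash; removals are not handled.  all names are
  hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct map_entry {
	uint32_t property;
	uint32_t pkg_hash;
};

/* Read-only base map (as mmap would hand it to us) plus a small
 * in-memory overlay holding this transaction's changes. */
struct map {
	const struct map_entry *base;
	size_t base_count;
	struct map_entry overlay[64];
	size_t overlay_count;
};

/* Index of the first entry with property >= key. */
static size_t
lower_bound(const struct map_entry *e, size_t count, uint32_t key)
{
	size_t lo = 0, hi = count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (e[mid].property < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Record a change in memory; the base map is never written. */
static void
map_set(struct map *m, uint32_t property, uint32_t pkg_hash)
{
	size_t i = lower_bound(m->overlay, m->overlay_count, property);

	if (i == m->overlay_count || m->overlay[i].property != property) {
		assert(m->overlay_count < 64);
		/* Open a slot at i, keeping the overlay sorted. */
		memmove(&m->overlay[i + 1], &m->overlay[i],
			(m->overlay_count - i) * sizeof m->overlay[0]);
		m->overlay_count++;
		m->overlay[i].property = property;
	}
	m->overlay[i].pkg_hash = pkg_hash;
}

/* Look in the overlay first and fall back to the base map on a miss. */
static const struct map_entry *
map_get(const struct map *m, uint32_t property)
{
	size_t i = lower_bound(m->overlay, m->overlay_count, property);

	if (i < m->overlay_count && m->overlay[i].property == property)
		return &m->overlay[i];

	i = lower_bound(m->base, m->base_count, property);
	if (i < m->base_count && m->base[i].property == property)
		return &m->base[i];

	return NULL;
}

/* Merge base and overlay into `out`, which must have room for
 * base_count + overlay_count entries.  Overlay entries win on equal
 * keys.  Returns the number of entries written. */
static size_t
map_merge(const struct map *m, struct map_entry *out)
{
	size_t b = 0, o = 0, n = 0;

	while (b < m->base_count || o < m->overlay_count) {
		if (o == m->overlay_count ||
		    (b < m->base_count &&
		     m->base[b].property < m->overlay[o].property)) {
			out[n++] = m->base[b++];
		} else {
			if (b < m->base_count &&
			    m->base[b].property == m->overlay[o].property)
				b++;	/* superseded by the overlay */
			out[n++] = m->overlay[o++];
		}
	}
	return n;
}
```

  the merged array is what would be written out as the new map file
  at the end of the transaction.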

- the on-disk string pool is sorted and we keep a list of indices into
  the string pool in sorted order so we can bsearch the list with a
  string to get its string pool index.  maybe a hash table is better:
  less I/O, as we would expect to find the string within the block the
  hash function points us to.
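
  a minimal sketch of the bsearch lookup, assuming the pool is a run of
  NUL-terminated strings and the index list holds 32-bit byte offsets
  into it.  struct string_pool, struct pool_key and
  string_pool_lookup() are made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: `pool` is a run of NUL-terminated strings and
 * `sorted` lists their byte offsets in strcmp() order. */
struct string_pool {
	const char *pool;
	const uint32_t *sorted;
	size_t count;
};

/* bsearch() gives the comparator no context argument, so the key
 * carries the pool pointer along with the string we are looking for. */
struct pool_key {
	const char *pool;
	const char *string;
};

static int
compare_key(const void *key, const void *element)
{
	const struct pool_key *k = key;
	const uint32_t *offset = element;

	return strcmp(k->string, k->pool + *offset);
}

/* Return the pool offset of `string`, or -1 if it is not in the pool. */
static long
string_pool_lookup(const struct string_pool *sp, const char *string)
{
	struct pool_key key;
	const uint32_t *hit;

	assert(sp != NULL && string != NULL);

	key.pool = sp->pool;
	key.string = string;
	hit = bsearch(&key, sp->sorted, sp->count, sizeof sp->sorted[0],
		      compare_key);

	return hit ? (long) *hit : -1;
}
```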

- represent all files as a breadth-first traversal of the tree of all
  files.  each entry has its name (string pool index), the number of
  immediate children, the total number of children, and the owning
  package.  for files both counts are zero.  a file is identified by
  its index in this flattened tree.

  to get the file name from an index, we search through the list.  by
  summing up the number of children, we know when to skip a directory
  and when to descend into one.  as we go we accumulate the path
  elements.

  hmm, dropping the number of immediate children and using a sentinel
  saves a word in every entry.
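
  a sketch of the index-to-path walk.  it assumes each directory is
  immediately followed by its whole subtree, since that is what lets
  the total count skip a directory in one step; the note above says
  breadth first, so the exact layout is an assumption here.  names are
  plain pointers rather than string pool indices to keep it short, and
  struct file_entry and file_path() are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry.  For regular files both counts are zero. */
struct file_entry {
	const char *name;	/* a string pool index on disk */
	uint32_t immediate;	/* number of direct children */
	uint32_t total;		/* number of entries in the whole subtree */
	uint32_t package;	/* owning package */
};

/* Build the path of entries[index] into buf.  Entry 0 is the root
 * directory, whose path is the empty string.  Returns 0 on success,
 * -1 if buf is too small. */
static int
file_path(const struct file_entry *entries, uint32_t index,
	  char *buf, size_t size)
{
	uint32_t dir = 0;	/* directory we are currently inside */
	size_t len = 0;

	assert(index <= entries[0].total);

	if (size == 0)
		return -1;
	buf[0] = '\0';

	while (dir != index) {
		uint32_t child = dir + 1;	/* first child follows its parent */
		size_t n;

		/* Skip whole subtrees until we reach the one holding index. */
		while (index > child + entries[child].total)
			child += entries[child].total + 1;

		/* Accumulate this path element: '/' + name + NUL. */
		n = strlen(entries[child].name);
		if (len + 1 + n + 1 > size)
			return -1;
		buf[len++] = '/';
		memcpy(buf + len, entries[child].name, n + 1);
		len += n;

		dir = child;	/* descend */
	}

	return 0;
}
```

  note that this walk never reads the immediate-children count.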

- signed pkgs

- gzip repository of look-aside pkg xml files somehow?

- transactions, proper recovery, make sure we don't corrupt our package
  database (no more rm /var/lib/rpm/__cache*).
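
  one common way to get that for a single file is to write the new
  contents to a temporary file, fsync it, and rename() it over the old
  one.  a minimal POSIX sketch, not a full transaction mechanism;
  write_file_atomically() and the ".tmp" suffix are made up for
  illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Replace `path` with `size` bytes from `data` so that a crash leaves
 * either the old file or the new one, never a half-written mix.
 * Returns 0 on success, -1 on error. */
static int
write_file_atomically(const char *path, const void *data, size_t size)
{
	char tmp[4096];
	const char *p = data;
	int fd, len;

	assert(path != NULL);

	len = snprintf(tmp, sizeof tmp, "%s.tmp", path);
	if (len < 0 || (size_t) len >= sizeof tmp)
		return -1;

	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	while (size > 0) {
		ssize_t n = write(fd, p, size);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		size -= (size_t) n;
	}

	/* Make sure the new contents are on disk before the rename can
	 * make them visible under the real name. */
	if (fsync(fd) < 0)
		goto fail;
	if (close(fd) < 0) {
		unlink(tmp);
		return -1;
	}

	/* rename() replaces the old file in one step. */
	if (rename(tmp, path) < 0) {
		unlink(tmp);
		return -1;
	}

	return 0;

fail:
	close(fd);
	unlink(tmp);
	return -1;
}
```

  a fuller version would also fsync the containing directory so the
  rename itself survives a crash.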

- no external dependencies, forget about bdb, sqlite.  It's *simple*
  and we need to control the on-disk format for these tools.

- 20740 requires, 2246 unique... hmm.