TODO
author Kristian Høgsberg <krh@redhat.com>
Wed Sep 12 05:31:07 2007 -0400 (2007-09-12)
changeset 19 d2a716dd92bd
parent 8 7820b7d94662
child 40 305cd8657bc8
permissions -rw-r--r--
Clean up writing/reading package sets a bit.
- keep a history of installed packages (a journal of package
  transactions), so we can roll back to yesterday, or see what got
  installed in the latest yum update.

- we build a cache of the currently installed set to service
  dependency inquiries fast:

	map from property to pkg (as hash) providing it
	map from property to pkgs requiring it
	map from pkg name to manifest
	map from string to string pool index

	no implicit provides? not even pkgname?
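
  A rough sketch of the four maps listed above, using plain in-memory
  dicts.  The tuple layout, the hash strings and the name build_cache
  are made up for illustration; nothing here is the actual cache format.

```python
from collections import defaultdict

def build_cache(packages):
    """packages: iterable of (name, pkg_hash, provides, requires, manifest)."""
    strings = {}                    # string -> string pool index
    providers = {}                  # property -> hash of the pkg providing it
    requirers = defaultdict(list)   # property -> hashes of pkgs requiring it
    manifests = {}                  # pkg name -> manifest (list of files)

    def intern(s):
        # Hand out the next free index the first time a string is seen.
        return strings.setdefault(s, len(strings))

    for name, pkg_hash, provides, requires, manifest in packages:
        intern(name)
        manifests[name] = manifest
        # No implicit provides in this sketch: the pkg name is not added
        # to providers unless the package lists it explicitly.
        for prop in provides:
            intern(prop)
            providers[prop] = pkg_hash
        for prop in requires:
            intern(prop)
            requirers[prop].append(pkg_hash)
    return strings, providers, requirers, manifests

# Hypothetical sample data.
pkgs = [("glibc", "a1", ["libc.so.6"], [], ["/lib/libc.so.6"]),
        ("bash", "b2", ["/bin/sh"], ["libc.so.6"], ["/bin/bash"])]
strings, providers, requirers, manifests = build_cache(pkgs)
```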

- properties are strings, stored in a string table

- on-disk maps are binary files of (string table index, hash) pairs
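
  One possible encoding of such a pair file, as a sketch.  The field
  widths (two little-endian 32-bit words) and the choice to sort by
  index are assumptions, not something these notes specify.

```python
import struct

PAIR = struct.Struct("<II")   # assumed record: (string table index, hash)

def pack_map(pairs):
    """Serialize (index, hash) pairs, sorted by index so readers can bsearch."""
    return b"".join(PAIR.pack(idx, h) for idx, h in sorted(pairs))

def unpack_map(data):
    """Read all pairs back from a bytes-like object."""
    return [PAIR.unpack_from(data, off)
            for off in range(0, len(data), PAIR.size)]

blob = pack_map([(7, 0xDEADBEEF), (2, 0x1234)])
pairs = unpack_map(blob)
```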

- at run time, we mmap the map, and keep changes in memory in a splay
  tree or similar.  if searching the splay tree fails we punt to the
  mmap.  once the transaction is done, we merge the map and the splay
  tree and write it back out.
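
  A minimal sketch of the overlay lookup and the final merge.  A dict
  stands in for the splay tree, the record layout is the assumed
  (index, hash) pair of 32-bit words, and OverlayMap is a hypothetical
  name.  The rewrite in commit() is done in place, so it is not
  crash-safe as written.

```python
import mmap
import os
import struct
import tempfile

PAIR = struct.Struct("<II")   # assumed record: (string index, hash)

class OverlayMap:
    def __init__(self, path):
        self.path = path
        self._open()

    def _open(self):
        # Note: mmap refuses empty files, so the base map must be non-empty.
        self.file = open(self.path, "rb")
        self.base = mmap.mmap(self.file.fileno(), 0, access=mmap.ACCESS_READ)
        self.changes = {}     # stand-in for the splay tree

    def _base_lookup(self, key):
        # Binary search over the sorted records in the mmap.
        lo, hi = 0, len(self.base) // PAIR.size
        while lo < hi:
            mid = (lo + hi) // 2
            k, v = PAIR.unpack_from(self.base, mid * PAIR.size)
            if k == key:
                return v
            if k < key:
                lo = mid + 1
            else:
                hi = mid
        return None

    def get(self, key):
        if key in self.changes:           # in-memory changes win
            return self.changes[key]
        return self._base_lookup(key)     # otherwise punt to the mmap

    def set(self, key, value):
        self.changes[key] = value

    def commit(self):
        # Merge the base map with the changes and write the result back out.
        merged = dict(PAIR.unpack_from(self.base, off)
                      for off in range(0, len(self.base), PAIR.size))
        merged.update(self.changes)
        self.base.close()
        self.file.close()
        with open(self.path, "wb") as f:
            for k in sorted(merged):
                f.write(PAIR.pack(k, merged[k]))
        self._open()

path = os.path.join(tempfile.mkdtemp(), "provides.map")   # hypothetical name
with open(path, "wb") as f:
    f.write(PAIR.pack(1, 100) + PAIR.pack(5, 500))
m = OverlayMap(path)
m.set(3, 300)
m.set(5, 555)
before = (m.get(1), m.get(3), m.get(5), m.get(9))
m.commit()
after = (m.get(1), m.get(3), m.get(5), m.get(9))
```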

- the on-disk string pool is sorted and we keep a list of indices into
  the string pool in sorted order so we can bsearch the list with a
  string to get its string pool index.  maybe a hash table is better:
  less I/O, as we expect to find the string within the block we
  look up with the hash function.
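
  The bsearch variant could look roughly like this.  The sketch assumes
  the pool is a run of NUL-terminated strings and that a "string pool
  index" is a byte offset into it; both are guesses, and the function
  names are made up.

```python
def build_pool(strings):
    """Return (pool bytes, list of offsets in sorted string order)."""
    pool = bytearray()
    offsets = []
    for s in sorted(set(strings)):
        offsets.append(len(pool))
        pool += s.encode() + b"\0"
    return bytes(pool), offsets

def string_at(pool, offset):
    """Decode the NUL-terminated string starting at offset."""
    end = pool.index(b"\0", offset)
    return pool[offset:end].decode()

def lookup(pool, offsets, s):
    """Binary search the offset list; return the pool offset of s or None."""
    lo, hi = 0, len(offsets)
    while lo < hi:
        mid = (lo + hi) // 2
        cur = string_at(pool, offsets[mid])
        if cur == s:
            return offsets[mid]
        if cur < s:
            lo = mid + 1
        else:
            hi = mid
    return None

pool, offsets = build_pool(["libc.so.6", "/bin/sh", "bash", "/bin/sh"])
```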

- represent all files as a breadth first traversal of the tree of all
  files.  each entry has its name (string pool index), the number of
  immediate children, total number of children, and owning package.
  for files both these numbers are zero.  a file is identified by its
  index in this flattened tree.

  to get the file name from an index, we search through the list.  by
  summing up the number of children, we know when to skip a directory
  and when to descend into one.  as we go we accumulate the path
  elements.

  hmm, dropping number of immediate children and using a sentinel drops
  a word from every entry.
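
  A sketch of the index-to-path walk.  It assumes each directory's
  entries follow it contiguously (a preorder layout), since that is what
  lets the total child count skip a whole directory; the note above says
  breadth first, so treat the layout here as an assumption.  Names are
  plain strings rather than string pool indices to keep it short.

```python
from collections import namedtuple

# children = immediate children, total = all entries below this one.
Entry = namedtuple("Entry", "name children total owner")

def path_of(entries, index):
    """Rebuild the path of entries[index] by walking from the start."""
    parts = []
    i = 0
    while i < len(entries):
        e = entries[i]
        if i == index:
            parts.append(e.name)
            return "/" + "/".join(parts)
        if index <= i + e.total:
            parts.append(e.name)   # target lies inside this directory: descend
            i += 1
        else:
            i += e.total + 1       # target lies elsewhere: skip the subtree
    raise IndexError(index)

# Hypothetical tree: /bin/{bash,sh} and /usr/lib/{libc.so.6,libm.so.6}
entries = [
    Entry("bin", 2, 2, None),
    Entry("bash", 0, 0, "bash"),
    Entry("sh", 0, 0, "bash"),
    Entry("usr", 1, 3, None),
    Entry("lib", 2, 2, None),
    Entry("libc.so.6", 0, 0, "glibc"),
    Entry("libm.so.6", 0, 0, "glibc"),
]
```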

- signed pkgs

- gzip repository of look-aside pkg xml files somehow?

- transactions, proper recovery, make sure we don't poop our package
  database (no more rm /var/lib/rpm/__cache*).
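
  One common way to keep a database file from being left half-written
  is to write a temporary file and rename it over the old one.  This is
  a generic sketch of that technique, not necessarily the recovery
  scheme intended here; atomic_write and the file name are hypothetical.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace path with data so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to work.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the data to disk before renaming
        os.replace(tmp, path)      # atomic on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

db = os.path.join(tempfile.mkdtemp(), "system.repo")
atomic_write(db, b"old set")
atomic_write(db, b"new set")
```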

- no external dependencies, forget about bdb, sqlite.  It's *simple*
  and we need to control the on-disk format for these tools.

- 20740 requires, 2246 unique... hmm.

- diff from one package set to another answers: "what changed in
  rawhide since yesterday?"
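
  A small sketch of such a diff, treating a package set as a mapping
  from package name to version.  The representation and the sample
  versions are invented for illustration.

```python
def diff_sets(old, new):
    """Yield (name, old_version, new_version) for every package that changed.

    A version of None means the package is absent from that set.
    """
    for name in sorted(old.keys() | new.keys()):
        a, b = old.get(name), new.get(name)
        if a != b:
            yield name, a, b

yesterday = {"bash": "3.2-9", "glibc": "2.6-4", "hal": "0.5.9"}
today = {"bash": "3.2-9", "glibc": "2.6.90-1", "foo": "0.1"}
changes = list(diff_sets(yesterday, today))
```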

- rewrite qsort and bsearch so they don't require a global context var
  and can output a map describing the permutation.
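
  The idea, sketched in Python rather than C: the key function receives
  an explicit context argument instead of reading a global, and the sort
  returns the permutation alongside the sorted items.  The names
  sort_with_map and invert are hypothetical.

```python
def sort_with_map(items, key, context):
    """Sort items by key(item, context) and also return the permutation.

    perm[new_position] == old_position, so result[i] == items[perm[i]].
    """
    perm = sorted(range(len(items)), key=lambda i: key(items[i], context))
    return [items[i] for i in perm], perm

def invert(perm):
    """Map old position -> new position, e.g. for rewriting stored indices."""
    inverse = [0] * len(perm)
    for new, old in enumerate(perm):
        inverse[old] = new
    return inverse

# Sort string pool indices by the strings they refer to; the pool is the
# context that would otherwise have to be a global.
pool = ["zlib", "bash", "glibc"]
ordered, perm = sort_with_map([0, 1, 2], lambda idx, p: p[idx], pool)
remap = invert(perm)
```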

- move package lists out of the property pool, remap properties and
  packages in the two pools by just iterating through the pools.

- use hash table for package and property lists so we only store
  unique lists (like for string pool).
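
  A sketch of storing each distinct list only once, with a dict playing
  the role of the hash table.  ListPool is a made-up name and the lists
  hold arbitrary integer ids.

```python
class ListPool:
    """Stores each distinct list once and hands out its index."""

    def __init__(self):
        self.lists = []    # index -> tuple of entries
        self.index = {}    # tuple of entries -> index

    def add(self, entries):
        key = tuple(entries)
        if key not in self.index:
            self.index[key] = len(self.lists)
            self.lists.append(key)
        return self.index[key]

lp = ListPool()
a = lp.add([3, 7])
b = lp.add([1])
c = lp.add([3, 7])    # same contents as the first list, so same index
```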