TODO
author Kristian Høgsberg <krh@redhat.com>
Sun Sep 30 00:18:20 2007 -0400 (2007-09-30)
changeset 43 d37d57c99cac
parent 40 305cd8657bc8
child 47 b3c8d19f743e
permissions -rw-r--r--
Split command line interface out into main.c.
- keep a history of installed packages / journal of package
  transactions, so we can roll back to yesterday, or see what got
  installed in the latest yum update.

- we build a cache of the currently installed set to service
  dependency inquiries fast:

	map from property to pkg (as hash) providing it
	map from property to pkgs requiring it
	map from pkg name to manifest
	map from string to string pool index

	no implicit provides? not even pkgname?

- properties are strings, stored in a string table

- on-disk maps are binary files of (string table index, hash) pairs

- at run time, we mmap the map, and keep changes in memory in a splay
  tree or similar.  if searching the splay tree fails we punt to the
  mmap.  once the transaction is done, we merge the map and the splay
  tree and write it back out.
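
  a minimal sketch of that lookup order and the final merge, assuming
  a map entry is a pair of 32-bit words and the map is sorted by string
  table index.  struct map_entry and the function names are made up
  for illustration, a small sorted array stands in for the splay tree,
  a plain array stands in for the mmap'ed file, and removals are left
  out:

```c
#include <stddef.h>
#include <stdint.h>

/* One on-disk record: (string table index, hash).  Field widths are assumed. */
struct map_entry {
	uint32_t name;
	uint32_t hash;
};

/* Binary search a name-sorted entry array; NULL if name is absent. */
static const struct map_entry *
map_find(const struct map_entry *map, size_t count, uint32_t name)
{
	size_t lo = 0, hi = count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (map[mid].name == name)
			return &map[mid];
		if (map[mid].name < name)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Look in the pending changes first, then punt to the base map. */
static const struct map_entry *
lookup(const struct map_entry *overlay, size_t n_overlay,
       const struct map_entry *base, size_t n_base, uint32_t name)
{
	const struct map_entry *e = map_find(overlay, n_overlay, name);

	return e ? e : map_find(base, n_base, name);
}

/*
 * Merge both sorted arrays into out, which needs room for
 * n_overlay + n_base entries.  On equal names the overlay entry wins.
 * Returns the number of entries written.
 */
static size_t
merge(const struct map_entry *overlay, size_t n_overlay,
      const struct map_entry *base, size_t n_base,
      struct map_entry *out)
{
	size_t i = 0, j = 0, n = 0;

	while (i < n_overlay || j < n_base) {
		if (j == n_base ||
		    (i < n_overlay && overlay[i].name <= base[j].name)) {
			if (j < n_base && overlay[i].name == base[j].name)
				j++;	/* overlay replaces the base entry */
			out[n++] = overlay[i++];
		} else {
			out[n++] = base[j++];
		}
	}
	return n;
}
```

  the merged array is what would be written back out as the new map.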

- the on-disk string pool is sorted and we keep a list of indices into
  the string pool in sorted order so we can bsearch the list with a
  string to get its string pool index.  maybe a hash table is better:
  less I/O, as we expect to find the string within the block the hash
  function points us to.
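
  a sketch of the bsearch variant, assuming the pool is a run of
  NUL-terminated strings and a "string pool index" is a byte offset
  into it.  pool_lookup is a made-up name; the search is hand-rolled
  because the comparison needs the pool as context, which the libc
  bsearch() comparator cannot receive without a global:

```c
#include <stddef.h>
#include <string.h>

/*
 * pool:   NUL-terminated strings packed back to back
 * sorted: offsets into pool, ordered by strcmp() of the strings they name
 * Returns the pool offset of key, or -1 if key is not in the pool.
 */
static long
pool_lookup(const char *pool, const unsigned *sorted, size_t count,
	    const char *key)
{
	size_t lo = 0, hi = count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(key, pool + sorted[mid]);

		if (cmp == 0)
			return (long) sorted[mid];
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}
```

  each probe touches both the offset list and the pool, which is the
  extra I/O the hash table idea is meant to avoid.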

- represent all files as a breadth first traversal of the tree of all
  files.  each entry has its name (string pool index), the number of
  immediate children, total number of children, and owning package.
  for plain files both child counts are zero.  a file is identified by
  its index in this flattened tree.

  to get the file name from an index, we search through the list.  by
  summing up the number of children, we know when to skip a directory
  and when to descend into one.  as we go we accumulate the path
  elements.

  hmm, dropping number of immediate children and using a sentinel drops
  a word from every entry.
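
  a sketch of the index-to-name walk under one reading of the notes
  above: it assumes a directory's whole subtree sits directly after
  its entry, so the total child count is enough to skip it (a strict
  breadth first layout would need different arithmetic).  struct entry,
  sample_tree and index_to_path are made up for illustration, and a
  plain C string stands in for the string pool index:

```c
#include <stdio.h>
#include <string.h>

struct entry {
	const char *name;	/* stand-in for a string pool index */
	unsigned total;		/* number of entries below this one */
};

/* Example data: /, /etc, /etc/passwd, /usr, /usr/bin, /usr/bin/razor */
static const struct entry sample_tree[] = {
	{ "", 5 }, { "etc", 1 }, { "passwd", 0 },
	{ "usr", 2 }, { "bin", 1 }, { "razor", 0 },
};

/*
 * Write the path of entry number target into buf.
 * Returns 0 on success, -1 if target is out of range, the tree is
 * malformed, or buf is too small.
 */
static int
index_to_path(const struct entry *e, unsigned count, unsigned target,
	      char *buf, size_t size)
{
	unsigned i = 0;		/* directory we are currently inside */
	size_t len = 0;

	if (target >= count || size < 2)
		return -1;
	while (i != target) {
		unsigned child = i + 1;
		unsigned end = i + e[i].total;	/* last entry below i */
		int n;

		/* skip every child whose subtree ends before target */
		while (child <= end && child + e[child].total < target)
			child += e[child].total + 1;
		if (child > end || child >= count)
			return -1;

		/* descend, accumulating the path element */
		n = snprintf(buf + len, size - len, "/%s", e[child].name);
		if (n < 0 || (size_t) n >= size - len)
			return -1;
		len += (size_t) n;
		i = child;
	}
	if (len == 0)
		strcpy(buf, "/");	/* target is the root itself */
	return 0;
}
```

  with this layout the end of a subtree already bounds the loop over
  children, so no separate immediate child count is consulted.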

- signed pkgs

- gzip repository of look-aside pkg xml files somehow?

- transactions, proper recovery, make sure we don't poop our package
  database (no more rm /var/lib/rpm/__cache*).

- no external dependencies, forget about bdb, sqlite.  It's *simple*
  and we need to control the on-disk format for these tools.

- diff from one package set to another answers: "what changed in
  rawhide since yesterday?"
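
  a sketch of such a diff, assuming each package set is available as
  a strcmp()-sorted array of "name-version" strings.  struct change
  and set_diff are made up for illustration; an upgrade shows up as
  one removal plus one addition:

```c
#include <stddef.h>
#include <string.h>

struct change {
	const char *pkg;
	int added;	/* 1: only in the new set, 0: only in the old set */
};

/*
 * Walk both sorted lists in step and record entries found in only one
 * of them.  out needs room for n_before + n_after changes.
 * Returns the number of changes written.
 */
static size_t
set_diff(const char **before, size_t n_before,
	 const char **after, size_t n_after, struct change *out)
{
	size_t i = 0, j = 0, n = 0;

	while (i < n_before || j < n_after) {
		int cmp;

		if (i == n_before)
			cmp = 1;	/* old set exhausted: rest is new */
		else if (j == n_after)
			cmp = -1;	/* new set exhausted: rest is gone */
		else
			cmp = strcmp(before[i], after[j]);

		if (cmp == 0) {
			i++;
			j++;
		} else if (cmp < 0) {
			out[n].pkg = before[i++];
			out[n++].added = 0;
		} else {
			out[n].pkg = after[j++];
			out[n++].added = 1;
		}
	}
	return n;
}
```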

- rewrite qsort and bsearch so they don't require a global context var
  and so the sort can output a map describing the permutation.
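
  a sketch of what such a sort could look like.  sort_with_map,
  compare_with_ctx and compare_offsets are made-up names, and a merge
  sort is used here only because it is short; it leaves the input
  untouched and returns just the permutation map, with the context
  passed through to the comparator:

```c
#include <stdlib.h>
#include <string.h>

typedef int (*compare_with_ctx)(const void *a, const void *b, void *ctx);

/* Sort the indices in map[0..n) by the elements of base they refer to. */
static void
merge_sort(unsigned *map, unsigned *tmp, size_t n,
	   const char *base, size_t size, compare_with_ctx cmp, void *ctx)
{
	size_t half = n / 2, i = 0, j = half, k = 0;

	if (n < 2)
		return;
	merge_sort(map, tmp, half, base, size, cmp, ctx);
	merge_sort(map + half, tmp, n - half, base, size, cmp, ctx);

	/* take from the right half only when strictly smaller: stable */
	while (i < half && j < n) {
		if (cmp(base + map[j] * size, base + map[i] * size, ctx) < 0)
			tmp[k++] = map[j++];
		else
			tmp[k++] = map[i++];
	}
	while (i < half)
		tmp[k++] = map[i++];
	while (j < n)
		tmp[k++] = map[j++];
	memcpy(map, tmp, n * sizeof *map);
}

/*
 * Returns a malloc'ed map where map[k] is the original index of the
 * k-th smallest element, or NULL on allocation failure.
 */
static unsigned *
sort_with_map(const void *base, size_t n, size_t size,
	      compare_with_ctx cmp, void *ctx)
{
	unsigned *map = malloc((n ? n : 1) * sizeof *map);
	unsigned *tmp = malloc((n ? n : 1) * sizeof *tmp);
	size_t i;

	if (map == NULL || tmp == NULL) {
		free(map);
		free(tmp);
		return NULL;
	}
	for (i = 0; i < n; i++)
		map[i] = (unsigned) i;
	merge_sort(map, tmp, n, base, size, cmp, ctx);
	free(tmp);
	return map;
}

/* Example comparator: elements are offsets into the pool passed as ctx. */
static int
compare_offsets(const void *a, const void *b, void *ctx)
{
	const char *pool = ctx;

	return strcmp(pool + *(const unsigned *) a,
		      pool + *(const unsigned *) b);
}
```

  sorting string pool offsets with the pool as context is the case
  that otherwise needs a global.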

- use hash table for package and property lists so we only store
  unique lists (like for string pool).

- use existing, running system as repo; e.g.

	razor update razor://other-box.local evince

  to pull e.g. the latest evince and dependencies from another box.  We
  should be able to regenerate a rzr pkg from the system so we can
  reuse the signature from the originating repo.

- Ok, maybe the fastest package set merge method in the end is to use
  the razor_importer, but use a hash table for the properties.  This
  way we can assign them unique IDs immediately (like tokenizing
  strings).
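
  a sketch of the tokenizing idea only, not of razor_importer itself:
  a fixed-size open-addressing table that hands out the next free ID
  the first time it sees a string and the same ID on every later
  sighting.  struct interner, tokenize, hash_string and SLOTS are
  made-up names, and a real table would grow instead of filling up:

```c
#include <string.h>

#define SLOTS 1024	/* power of two; assumed size for the sketch */

struct interner {
	const char *strings[SLOTS];	/* id -> string, in order seen */
	unsigned slots[SLOTS];		/* id + 1 stored here; 0 = empty */
	unsigned count;
};

static unsigned
hash_string(const char *s)
{
	unsigned h = 5381;

	while (*s)
		h = h * 33 + (unsigned char) *s++;
	return h;
}

/*
 * Return the ID for s, assigning the next free one on first sight.
 * Returns -1 when the table is full.  The table must start out zeroed
 * and s is not copied, so it has to outlive the table.
 */
static int
tokenize(struct interner *in, const char *s)
{
	unsigned i = hash_string(s) & (SLOTS - 1);

	/* linear probing: stop at a match or at the first empty slot */
	while (in->slots[i] != 0) {
		unsigned id = in->slots[i] - 1;

		if (strcmp(in->strings[id], s) == 0)
			return (int) id;
		i = (i + 1) & (SLOTS - 1);
	}
	if (in->count == SLOTS - 1)
		return -1;	/* keep one slot empty so probing ends */
	in->strings[in->count] = s;
	in->slots[i] = in->count + 1;
	return (int) in->count++;
}
```

  since IDs come out in order of first sight, they are usable right
  away during import, before any sorting has happened.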