krh@1: - keep history of installed packages/journal of package transaction,
krh@1:   so we can roll back to yesterday, or see what got installed in the
krh@1:   latest yum update.
krh@1:
krh@1: - we build a cache of the currently installed set to service
krh@1:   dependency inquiries fast:
krh@1:
krh@1:         map from property to pkg (as hash) providing it
krh@1:         map from property to pkgs requiring it
krh@1:         map from pkg name to manifest
krh@1:         map from string to string pool index
krh@1:
krh@1:   no implicit provides?  not even pkgname?
krh@1:
krh@1: - properties are strings, stored in a string table
krh@1:
krh@1: - on disk maps are binary files of (string table index, hash) pairs
krh@1:
krh@1: - at run time, we mmap the map, and keep changes in memory in a splay
krh@1:   tree or similar.  if searching the splay tree fails we punt to the
krh@1:   mmap.  once the transaction is done, we merge the map and the splay
krh@1:   tree and write it back out.
krh@1:
krh@1: - the on-disk string pool is sorted and we keep a list of indices into
krh@1:   the string pool in sorted order so we can bsearch the list with a
krh@1:   string to get its string pool index.  maybe a hash table is better,
krh@1:   less I/O as we will expect to find the string within the block we
krh@1:   look up with the hash function.
krh@1:
krh@5: - represent all files as a breadth first traversal of the tree of all
krh@5:   files.  each entry has its name (string pool index), the number of
krh@5:   immediate children, total number of children, and owning package.
krh@5:   for files both these numbers are zero.  a file is identified by its
krh@5:   index in this flattened tree.
krh@5:
krh@5:   to get the file name from an index, we search through the list.  by
krh@5:   summing up the number of children, we know when to skip a directory
krh@5:   and when to descend into one.  as we go we accumulate the path
krh@5:   elements.
krh@5:
krh@5:   hmm, dropping number of immediate children and using a sentinel drops
krh@5:   a word from every entry.
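
A rough sketch of one way to read the flattened file tree note, not the actual razor layout. It assumes entry 0 is the root and that, in breadth-first order, the children of entry j start at 1 plus the sum of the immediate-child counts of all earlier entries. The struct and the names file_entry, file_parent and file_path are made up for illustration, and the total-children count from the note is left out. The lookup walks from the entry up to the root, which is simpler than the top-down skip/descend walk the note describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical entry layout; field names are illustrative only. */
struct file_entry {
	uint32_t name;      /* string pool index of this path element */
	uint32_t nchildren; /* immediate children; 0 for plain files */
	uint32_t owner;     /* owning package; unused in this sketch */
};

/* Find the parent of entry `index` by summing immediate-child counts.
 * Returns the parent's index, or -1 for the root or a malformed list. */
long file_parent(const struct file_entry *e, size_t count, size_t index)
{
	size_t first_child = 1; /* entry 0 is the root; its children follow */
	size_t j;

	for (j = 0; j < count && first_child <= index; j++) {
		if (index < first_child + e[j].nchildren)
			return (long) j;
		first_child += e[j].nchildren;
	}
	return -1;
}

/* Write the full path of entry `index` into buf.  `pool` stands in for
 * the string pool.  Returns 0 on success, -1 on error. */
int file_path(const struct file_entry *e, size_t count,
	      const char *const *pool, size_t index,
	      char *buf, size_t size)
{
	size_t chain[64]; /* arbitrary depth limit for the sketch */
	size_t depth = 0, len = 0;
	long i;

	assert(size > 0);
	if (index >= count)
		return -1;

	/* Collect the entry and its ancestors, leaf first. */
	for (i = (long) index; i > 0; i = file_parent(e, count, (size_t) i)) {
		if (depth == 64)
			return -1;
		chain[depth++] = (size_t) i;
	}
	if (i < 0)
		return -1; /* never reached the root: malformed list */

	if (depth == 0) {
		/* The root itself; needs room for "/" and the terminator. */
		return snprintf(buf, size, "/") < (int) size ? 0 : -1;
	}

	/* Emit path elements from the top-most ancestor down to the leaf. */
	while (depth-- > 0) {
		int n = snprintf(buf + len, size - len, "/%s",
				 pool[e[chain[depth]].name]);
		if (n < 0 || (size_t) n >= size - len)
			return -1; /* path did not fit in buf */
		len += (size_t) n;
	}
	return 0;
}
```

With a six-entry list holding root, usr, etc, bin, passwd and ls in breadth-first order, file_path() on index 5 gives "/usr/bin/ls". Each file_parent() call rescans the list, so a lookup costs roughly depth times list length; that is fine for a sketch but probably not for the real thing.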
krh@5:
krh@1: - signed pkgs
krh@8:
krh@8: - gzip repository of look-aside pkg xml files somehow?
krh@8:
krh@8: - transactions, proper recovery, make sure we don't poop our package
krh@8:   database (no more rm /var/lib/rpm/__cache*).
krh@8:
krh@8: - no external dependencies, forget about bdb, sqlite.  It's *simple*
krh@8:   and we need to control the on-disk format for these tools.
krh@8:
krh@18: - diff from one package set to another answers: "what changed in
krh@18:   rawhide since yesterday?"
krh@18:
krh@18: - rewrite qsort and bsearch so they don't require a global context var
krh@18:   and can output a map describing the permutation.
krh@18:
krh@18: - use hash table for package and property lists so we only store
krh@18:   unique lists (like for string pool).
krh@40:
krh@40: - use existing, running system as repo; eg
krh@40:
krh@40:         razor update razor://other-box.local evince
krh@40:
krh@40:   to pull eg the latest evince and dependencies from another box.  We
krh@40:   should be able to regenerate a rzr pkg from the system so we can
krh@40:   reuse the signature from the originating repo.
krh@41:
krh@41: - Ok, maybe the fastest package set merge method in the end is to use
krh@41:   the razor_importer, but use a hash table for the properties.  This
krh@41:   way we can assign them unique IDs immediately (like tokenizing
krh@41:   strings).
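
A minimal sketch of the "assign unique IDs immediately" idea from the last item, assuming a fixed-size open-addressing hash table with FNV-1a hashing. None of this is razor_importer code; struct interner, interner_init and interner_tokenize are made-up names, and a real table would grow instead of failing when full.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STRINGS 1024              /* fixed for the sketch, no resizing */
#define NUM_SLOTS   (MAX_STRINGS * 2) /* power of two, at most half full */

struct interner {
	char *strings[MAX_STRINGS]; /* id -> string, in order of first use */
	uint32_t slots[NUM_SLOTS];  /* hash slot -> id + 1, 0 means empty */
	uint32_t count;             /* number of ids handed out so far */
};

void interner_init(struct interner *in)
{
	memset(in, 0, sizeof *in);
}

/* 32-bit FNV-1a string hash. */
static uint32_t hash_string(const char *s)
{
	uint32_t h = 2166136261u; /* FNV offset basis */

	for (; *s; s++) {
		h ^= (unsigned char) *s;
		h *= 16777619u;   /* FNV prime */
	}
	return h;
}

/* Return the id for s, assigning the next free id the first time s is
 * seen.  Returns UINT32_MAX if the table is full or malloc fails. */
uint32_t interner_tokenize(struct interner *in, const char *s)
{
	uint32_t mask = NUM_SLOTS - 1;
	uint32_t i;
	size_t len;
	char *copy;

	assert(in != NULL && s != NULL);

	/* Linear probing; an empty slot always exists because the table
	 * never holds more than MAX_STRINGS entries. */
	for (i = hash_string(s) & mask; in->slots[i] != 0; i = (i + 1) & mask) {
		uint32_t id = in->slots[i] - 1;

		if (strcmp(in->strings[id], s) == 0)
			return id; /* seen before: same id as last time */
	}

	if (in->count == MAX_STRINGS)
		return UINT32_MAX;

	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return UINT32_MAX;
	memcpy(copy, s, len);

	in->strings[in->count] = copy;
	in->slots[i] = in->count + 1;
	return in->count++;
}
```

Ids come out in order of first appearance, so the strings array doubles as the id-to-string table. The same trick would presumably work for the unique package and property lists by hashing the list contents instead of a string.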