The repo file format / razor_set data structure
-----------------------------------------------

The repo starts with a header, containing some number of sections,
terminated by a section with type 0:

    struct razor_set_header {
        uint32_t magic;
        uint32_t version;
        struct razor_set_section sections[0];
    };

    struct razor_set_section {
        uint32_t type;
        uint32_t offset;
        uint32_t size;
    };

razor_set_open() mmaps the repo file, and creates a struct razor_set:

    struct razor_set {
        struct array string_pool;
        struct array packages;
        struct array properties;
        struct array files;
        struct array package_pool;
        struct array property_pool;
        struct array file_pool;
        struct razor_set_header *header;
    };

by finding the sections with those IDs and creating "struct array"s
pointing to the right places in the mmapped data. (This is the only
processing needed when reading in the file; everything else is used
exactly as-is.)


The sections
------------

RAZOR_STRING_POOL

    Stores one copy of each string that appears in the repo. (At
    the moment, this is: package names, package versions, property
    names, property versions, and (basenames of) filenames.) The
    strings are arbitrarily-sized, 0-terminated, and not in any
    particular order (although the empty string always ends up
    being at offset 0).

RAZOR_PACKAGES

    Array of struct razor_package; one for each package in the
    set, sorted by name.

RAZOR_PROPERTIES

    Array of struct razor_property; one for each unique property
    in the set, sorted by type, then name, then relation type (eg,
    "<" or ">="), then version. (Properties with no version have
    relation type RAZOR_VERSION_EQUAL, and version "".)

RAZOR_FILES

    Array of struct razor_entry; one for each file owned by any
    package in the set. The current sort order (which is subject
    to change) is breadth-first, sorted by basename. So eg: /, /bin,
    /dev, /etc, /bin/false, /bin/true, /dev/null, /etc/passwd.

RAZOR_PACKAGE_POOL

    Array of struct list, with each list item containing the index
    of a struct razor_package in the packages section. See the
    discussion of lists below.

RAZOR_PROPERTY_POOL

    Array of struct list, with each list item containing the index
    of a struct razor_property in the properties section. See the
    discussion of lists below.

RAZOR_FILE_POOL

    Array of struct list, with each list item containing the index
    of a struct razor_entry in the files section. See the
    discussion of lists below.


Data types
----------
Note that the exact layout of bits involves some historical accidents.
(Particularly the fact that the "name" field in most structs loses its
high bits to a flags field.)
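
The 24/8 split described in the note can be illustrated with explicit
shifts and masks. This is only a sketch: it assumes the 24-bit field
occupies the low bits of a 32-bit word and the flags occupy the high 8
bits, which matches the wording of the note but is not something C
guarantees for bitfields in general. The helper names (word_low24,
word_flags, word_pack) are hypothetical and do not come from the razor
sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; the bit positions are an assumption (see above). */

/* Extract the 24-bit index (name, data or list_ptr) from the low bits. */
static inline uint32_t word_low24(uint32_t w)
{
    return w & 0x00ffffffu;
}

/* Extract the 8 flag bits from the top of the word. */
static inline uint8_t word_flags(uint32_t w)
{
    return (uint8_t)(w >> 24);
}

/* Combine a 24-bit index and 8 flag bits into one 32-bit word. */
static inline uint32_t word_pack(uint32_t low24, uint8_t flags)
{
    assert(low24 <= 0x00ffffffu); /* the index must fit in 24 bits */
    return low24 | ((uint32_t)flags << 24);
}
```

One consequence of this layout is that any index stored in such a field
is limited to 2^24 distinct values.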

struct list_head
    uint list_ptr : 24;
    uint flags    : 8;

struct list
    uint data  : 24;
    uint flags : 8;

    Used to store lists of package, property, or file IDs. "struct
    list_head" stores the head of the list, which points to one or
    more "struct list"s in the appropriate "pool" section.
    ("struct list" should probably be called "struct list_item".)

    "list_first(&head, &pool)" returns a "struct list *" pointing
    to the first element of the list (or NULL for an empty list),
    and "list_next(list)" will return successive elements, until
    NULL is returned. Each "list->data" contains the index of a
    package, property, or file in the corresponding section of the
    set.

    Peeking underneath the abstraction, a list_head's "flags" is
    0xff if the list is empty, 0x80 if it contains a single
    element, or 0x00 if it contains more than one element. In the
    single-element case, that element is actually stored in the
    list_head directly rather than being stored in a pool (and so
    list_first() just casts the list_head* to a list* and returns
    it). For multi-element lists, list_ptr is the index in the
    pool of the first element of this list; the list continues
    through successive elements of the pool until one with
    non-zero flags is reached, indicating the end of the list.

struct razor_package
    uint name    : 24;
    uint flags   : 8;
    uint version : 32;
    struct list_head properties;
    struct list_head files;

    name and version are indexes into string_pool.
    properties is a list of all of the package's properties, and
    files is a list of its files. flags is currently only used
    during razor_set merging, to keep track of which set a package
    came from.

struct razor_property
    uint name     : 24;
    uint flags    : 6;
    uint type     : 2;
    uint relation : 32;
    uint version  : 32;
    struct list_head packages;

    name and version are indexes into string_pool. type is an enum
    razor_property_type (eg, RAZOR_PROPERTY_REQUIRES), and
    relation is an enum razor_version_relation (eg,
    RAZOR_VERSION_GREATER_OR_EQUAL). packages is a list of the
    packages that have this property. flags is currently unused.

struct razor_entry
    uint name  : 24;
    uint flags : 8;
    uint start : 32;
    struct list_head packages;

    name is an index into string_pool, giving the basename of the
    file. start is either 0, or an index pointing to another
    razor_entry that is the first child of this entry (for a
    non-empty directory). (Entry 0 is always the root of the tree,
    so no entry could have entry 0 as a child.) flags is 0x80
    (RAZOR_ENTRY_LAST) if an entry is the last entry in its
    directory. Otherwise it is 0.

    Note that given a pointer to a struct razor_entry (eg, from a
    package's "files" list), there is no way to reconstruct its
    full name without walking the entire files array up to that
    point. Because of this and other problems (fix_file_map()), it
    seems like razor_entry should be modified to include a pointer
    to its parent.
    (Storing full paths instead of just basenames would also fix
    this problem, but that would use much more memory.)
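
To make the cost of the missing parent pointer concrete, here is a
minimal sketch of rebuilding an entry's full path by searching down
from the root. It is not the razor implementation: "struct entry" is a
simplified, hypothetical stand-in for struct razor_entry (the name is a
plain C string instead of a string_pool index, and the packages list is
omitted), and find_path(), entry_path() and demo_entries are names
invented for this example. It assumes, as the description of start and
RAZOR_ENTRY_LAST suggests, that the children of a directory are stored
contiguously.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ENTRY_LAST 0x80 /* same value as RAZOR_ENTRY_LAST in the text */

/* Simplified, hypothetical stand-in for struct razor_entry. */
struct entry {
    const char *name;  /* basename (a string_pool index in the real struct) */
    uint8_t flags;     /* ENTRY_LAST on the last entry of a directory */
    uint32_t start;    /* index of the first child, or 0 for none */
};

/* The example tree from the RAZOR_FILES description, breadth-first.
 * The root's own name and flags are assumptions; the code ignores them. */
static const struct entry demo_entries[] = {
    { "",       ENTRY_LAST, 1 }, /* 0: /           */
    { "bin",    0,          4 }, /* 1: /bin        */
    { "dev",    0,          6 }, /* 2: /dev        */
    { "etc",    ENTRY_LAST, 7 }, /* 3: /etc        */
    { "false",  0,          0 }, /* 4: /bin/false  */
    { "true",   ENTRY_LAST, 0 }, /* 5: /bin/true   */
    { "null",   ENTRY_LAST, 0 }, /* 6: /dev/null   */
    { "passwd", ENTRY_LAST, 0 }, /* 7: /etc/passwd */
};

/* Search the directory whose first child is `first` (and, recursively,
 * its subdirectories) for entry `target`. Path components are written
 * into buf starting at offset `used`. Returns 1 if found, 0 otherwise. */
static int find_path(const struct entry *e, uint32_t first, uint32_t target,
                     char *buf, size_t len, size_t used)
{
    uint32_t i = first;

    for (;;) {
        int n = snprintf(buf + used, len - used, "/%s", e[i].name);
        if (n < 0 || (size_t)n >= len - used)
            return 0; /* buffer too small */
        if (i == target)
            return 1;
        if (e[i].start != 0 &&
            find_path(e, e[i].start, target, buf, len, used + (size_t)n))
            return 1;
        if (e[i].flags & ENTRY_LAST)
            break; /* end of this directory */
        i++;
    }
    buf[used] = '\0'; /* drop the component tried at this level */
    return 0;
}

/* Write the full path of entry `target` into buf. Returns 1 on success. */
static int entry_path(const struct entry *e, uint32_t target,
                      char *buf, size_t len)
{
    assert(len >= 2);
    if (target == 0) { /* entry 0 is always the root */
        buf[0] = '/';
        buf[1] = '\0';
        return 1;
    }
    buf[0] = '\0';
    if (e[0].start == 0)
        return 0; /* empty tree */
    return find_path(e, e[0].start, target, buf, len, 0);
}
```

With the table above, entry_path(demo_entries, 5, buf, sizeof buf)
fills buf with "/bin/true". The search may visit most of the tree
before it reaches the target, which is the cost a parent pointer would
avoid.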