docs/REPO.txt
author Kristian Høgsberg <krh@redhat.com>
Mon Jun 30 13:28:59 2008 -0400 (2008-06-30)
changeset 306 cd3954499086
Get rid of razor_set_get_package().

This was always a silly little helper function, not general enough for
real world applications. Use an iterator to search through the set to
find the package of interest.
The repo file format / razor_set data structure
-----------------------------------------------

The repo starts with a header, containing some number of sections,
terminated by a section with type 0:

	struct razor_set_header {
		uint32_t magic;
		uint32_t version;
		struct razor_set_section sections[0];
	};

	struct razor_set_section {
		uint32_t type;
		uint32_t offset;
		uint32_t size;
	};

razor_set_open() mmaps the repo file, and creates a struct razor_set:

	struct razor_set {
		struct array string_pool;
		struct array packages;
		struct array properties;
		struct array files;
		struct array package_pool;
		struct array property_pool;
		struct array file_pool;
		struct razor_set_header *header;
	};

by finding the sections with those IDs and creating "struct array"s
pointing to the right places in the mmapped data. (This is the only
processing needed when reading in the file; everything else is used
exactly as-is.)

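The section lookup described above can be sketched roughly as follows.
This is only an illustration, not the actual razor_set_open() code:
"struct array_view" and map_section() are hypothetical names, the real
"struct array" layout is not shown in this document, and the sketch
assumes that "offset" and "size" are byte counts measured from the
start of the file.

```c
#include <stddef.h>
#include <stdint.h>

struct razor_set_section {
	uint32_t type;
	uint32_t offset;
	uint32_t size;
};

struct razor_set_header {
	uint32_t magic;
	uint32_t version;
	struct razor_set_section sections[];	/* ends with a type-0 section */
};

/* Hypothetical stand-in for "struct array": a pointer into the
 * mapped file plus a size in bytes. */
struct array_view {
	const void *data;
	uint32_t size;
};

/* Walk the section table until the type-0 terminator. On a match,
 * point "out" at the section's data inside the mapping and return 0;
 * return -1 if no section has the requested type.
 * Assumption: offsets are in bytes, relative to "base". */
static int
map_section(const void *base, uint32_t type, struct array_view *out)
{
	const struct razor_set_header *header = base;
	const struct razor_set_section *s;

	for (s = header->sections; s->type != 0; s++) {
		if (s->type == type) {
			out->data = (const char *) base + s->offset;
			out->size = s->size;
			return 0;
		}
	}

	return -1;
}
```

Nothing is copied here: the view just points into the mapped file,
which is why opening a repo needs no other processing.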

The sections
------------

RAZOR_STRING_POOL

	Stores one copy of each string that appears in the repo. (At
	the moment, this is: package names, package versions, property
	names, property versions, and (basenames of) filenames.) The
	strings are arbitrarily-sized, NUL-terminated, and not in any
	particular order (although the empty string always ends up
	being at offset 0).

RAZOR_PACKAGES

	Array of struct razor_package; one for each package in the
	set, sorted by name.

RAZOR_PROPERTIES

	Array of struct razor_property; one for each unique property
	in the set, sorted by type, then name, then relation type
	(e.g., "<" or ">="), then version. (Properties with no version
	have relation type RAZOR_VERSION_EQUAL, and version "".)

RAZOR_FILES

	Array of struct razor_entry; one for each file owned by any
	package in the set. The current sort order (which is subject
	to change) is breadth-first, sorted by basename. For example:
	/, /bin, /dev, /etc, /bin/false, /bin/true, /dev/null,
	/etc/passwd.

RAZOR_PACKAGE_POOL

	Array of struct list, with each list item containing the index
	of a struct razor_package in the packages section. See the
	discussion of lists below.

RAZOR_PROPERTY_POOL

	Array of struct list, with each list item containing the index
	of a struct razor_property in the properties section. See the
	discussion of lists below.

RAZOR_FILE_POOL

	Array of struct list, with each list item containing the index
	of a struct razor_entry in the files section. See the
	discussion of lists below.


Data types
----------

Note that the exact layout of bits involves some historical accidents.
(In particular, the "name" field in most structs loses its high bits
to a flags field.)

struct list_head
	uint list_ptr : 24;
	uint flags    : 8;

struct list
	uint data  : 24;
	uint flags : 8;

	Used to store lists of package, property, or file IDs. "struct
	list_head" stores the head of the list, which points to one or
	more "struct list"s in the appropriate "pool" section.
	("struct list" should probably be called "struct list_item".)

	"list_first(&head, &pool)" returns a "struct list *" pointing
	to the first element of the list (or NULL for an empty list),
	and "list_next(list)" will return successive elements, until
	NULL is returned. Each "list->data" contains the index of a
	package, property, or file in the corresponding section of the
	set.

	Peeking underneath the abstraction, a list_head's "flags" is
	0xff if the list is empty, 0x80 if it contains a single
	element, or 0x00 if it contains more than one element. In the
	single-element case, that element is actually stored in the
	list_head directly rather than being stored in a pool (and so
	list_first() just casts the list_head* to a list* and returns
	it). For multi-element lists, list_ptr is the index in the
	pool of the first element of this list; the list continues
	through successive elements of the pool until one with
	non-zero flags is reached, indicating the end of the list.

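The three list encodings can be illustrated with a small sketch. This
is not the real list_first()/list_next() code: list_collect() and the
LIST_* names are hypothetical, the pool is passed as a plain pointer
rather than a "struct array", and the single-element case reads the
head's low 24 bits directly instead of casting the head to a
"struct list *". It also relies on the compiler accepting uint32_t
bit-fields, as GCC and Clang do.

```c
#include <stddef.h>
#include <stdint.h>

#define LIST_EMPTY  0xff	/* head flags: no elements */
#define LIST_SINGLE 0x80	/* head flags: one element, stored in the head */

struct list_head {
	uint32_t list_ptr : 24;
	uint32_t flags    : 8;
};

struct list {
	uint32_t data  : 24;
	uint32_t flags : 8;
};

/* Copy up to "max" IDs from the list into "out" and return how many
 * were copied. "pool" is the pool section matching the list's type. */
static size_t
list_collect(const struct list_head *head, const struct list *pool,
	     uint32_t *out, size_t max)
{
	const struct list *item;
	size_t n = 0;

	if (head->flags == LIST_EMPTY)
		return 0;

	if (head->flags == LIST_SINGLE) {
		/* The only element lives in the head itself. */
		if (max > 0)
			out[n++] = head->list_ptr;
		return n;
	}

	/* Multi-element list: consecutive pool entries, ending with
	 * the first entry whose flags are non-zero (inclusive). */
	for (item = &pool[head->list_ptr]; n < max; item++) {
		out[n++] = item->data;
		if (item->flags != 0)
			break;
	}

	return n;
}
```

Storing a lone element in the head keeps it out of the pool entirely,
which matters when most lists (for example, the packages owning a
given file) have exactly one member.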
struct razor_package
	uint name    : 24;
	uint flags   : 8;
	uint version : 32;
	struct list_head properties;
	struct list_head files;

	name and version are indexes into string_pool. properties is a
	list of all of the package's properties, and files is a list
	of its files. flags is currently only used during razor_set
	merging, to keep track of which set a package came from.

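As a rough illustration of how name and version resolve through the
string pool, here is a minimal sketch. package_label() is a
hypothetical helper, not part of razor, and it assumes the pool is
available as a plain "const char *" and that the indexes are byte
offsets into it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct list_head {
	uint32_t list_ptr : 24;
	uint32_t flags    : 8;
};

struct razor_package {
	uint32_t name  : 24;
	uint32_t flags : 8;
	uint32_t version;
	struct list_head properties;
	struct list_head files;
};

/* Hypothetical helper: write "name-version" into buf. Both fields
 * are treated as byte offsets of NUL-terminated strings in the pool.
 * Returns what snprintf() returns. */
static int
package_label(const struct razor_package *pkg, const char *string_pool,
	      char *buf, size_t size)
{
	return snprintf(buf, size, "%s-%s",
			string_pool + pkg->name,
			string_pool + pkg->version);
}
```

Because the pool stores each string once, two packages with the same
version string would simply carry the same version offset.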
struct razor_property
	uint name     : 24;
	uint flags    : 6;
	uint type     : 2;
	uint relation : 32;
	uint version  : 32;
	struct list_head packages;

	name and version are indexes into string_pool. type is an enum
	razor_property_type (e.g., RAZOR_PROPERTY_REQUIRES), and
	relation is an enum razor_version_relation (e.g.,
	RAZOR_VERSION_GREATER_OR_EQUAL). packages is a list of the
	packages that have this property. flags is currently unused.

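To make the packing of the first word concrete, the sketch below
unpacks it with shifts and masks. It ASSUMES the compiler allocates
bit-fields starting from the least significant bit (as GCC does on
little-endian targets); the C standard leaves this to the
implementation, so other compilers may lay the fields out
differently. "struct property_word" and unpack_property_word() are
hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Unpacked view of the first 32-bit word of a struct razor_property. */
struct property_word {
	uint32_t name;	/* 24 bits: index into string_pool */
	uint32_t flags;	/* 6 bits: currently unused */
	uint32_t type;	/* 2 bits: enum razor_property_type */
};

/* Assumption: fields are packed from the least significant bit up,
 * in declaration order (name, then flags, then type). */
static struct property_word
unpack_property_word(uint32_t word)
{
	struct property_word w;

	w.name = word & 0xffffff;
	w.flags = (word >> 24) & 0x3f;
	w.type = word >> 30;

	return w;
}
```

One consequence of the 24-bit name field is that a name index can
never exceed 0xffffff, whereas version has the full 32 bits.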
struct razor_entry
	uint name  : 24;
	uint flags : 8;
	uint start : 32;
	struct list_head packages;

	name is an index into string_pool, giving the basename of the
	file. start is either 0, or an index pointing to another
	razor_entry that is the first child of this entry (for a
	non-empty directory). (Entry 0 is always the root of the tree,
	so no entry could have entry 0 as a child.) flags is 0x80
	(RAZOR_ENTRY_LAST) if an entry is the last entry in its
	directory. Otherwise it is 0.

	Note that given a pointer to a struct razor_entry (e.g., from a
	package's "files" list), there is no way to reconstruct its
	full name without walking the entire files array up to that
	point. Because of this and other problems (fix_file_map()), it
	seems like razor_entry should be modified to include a pointer
	to its parent. (Storing full paths instead of just basenames
	would also fix this problem, but that would use much more
	memory.)
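
The cost of the missing parent pointer shows up in a sketch like the
one below, which rebuilds a full path by scanning for each ancestor in
turn. It is only an illustration: "struct entry" is a simplified
stand-in for struct razor_entry (with the basename already resolved
from string_pool), find_parent() and build_path() are hypothetical
helpers, and the root entry is assumed to have the empty string as
its name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define RAZOR_ENTRY_LAST 0x80

/* Simplified stand-in for struct razor_entry. */
struct entry {
	const char *name;	/* basename, already looked up */
	uint8_t flags;
	uint32_t start;		/* index of first child, or 0 */
};

/* Return the index of the directory containing entry "index", or -1
 * for the root. Every directory's run of children (from "start" to
 * the entry marked RAZOR_ENTRY_LAST) may have to be scanned. */
static long
find_parent(const struct entry *entries, size_t count, uint32_t index)
{
	size_t dir;
	uint32_t child;

	for (dir = 0; dir < count; dir++) {
		if (entries[dir].start == 0)
			continue;
		for (child = entries[dir].start; child < count; child++) {
			if (child == index)
				return (long) dir;
			if (entries[child].flags & RAZOR_ENTRY_LAST)
				break;
		}
	}

	return -1;
}

/* Write the full path of entry "index" (which must be < count) into
 * buf. Returns the path length, or -1 if an ancestor's path did not
 * fit. Assumes a well-formed, breadth-first files array. */
static int
build_path(const struct entry *entries, size_t count, uint32_t index,
	   char *buf, size_t size)
{
	long parent = find_parent(entries, count, index);
	int len;

	if (parent < 0)
		return snprintf(buf, size, "%s", entries[index].name);

	len = build_path(entries, count, (uint32_t) parent, buf, size);
	if (len < 0 || (size_t) len >= size)
		return -1;

	return len + snprintf(buf + len, size - len, "/%s",
			      entries[index].name);
}
```

Each level of the path costs another scan over the array, which is
the overhead a stored parent index would avoid.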