1.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000
1.2 +++ b/docs/package-set.xml Wed Jul 02 14:37:38 2008 -0400
1.3 @@ -0,0 +1,239 @@
1.4 +<?xml version="1.0" encoding="utf-8"?>
1.5 +<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
1.6 +
1.7 +<chapter id="file-format">
1.8 + <title>Package Set File Format</title>
1.9 +
1.10 + <sect2 id="file-header">
1.11 + <title>File header</title>
1.12 +
1.13 + <para>
1.14 + The repo file starts with a header: a magic number, a version,
1.15 + and an array of section descriptors terminated by one with type 0:
1.16 + </para>
1.17 +
1.18 + <programlisting><![CDATA[
1.19 +struct razor_set_header {
1.20 + uint32_t magic;
1.21 + uint32_t version;
1.22 + struct razor_set_section sections[0];
1.23 +};
1.24 +
1.25 +struct razor_set_section {
1.26 + uint32_t type;
1.27 + uint32_t offset;
1.28 + uint32_t size;
1.29 +};
1.30 +]]></programlisting>
1.31 +
1.32 + <para>
1.33 + razor_set_open() mmaps the repo file, and creates a struct razor_set:
1.34 + </para>
1.35 +
1.36 + <programlisting><![CDATA[
1.37 +struct razor_set {
1.38 + struct array string_pool;
1.39 + struct array packages;
1.40 + struct array properties;
1.41 + struct array files;
1.42 + struct array package_pool;
1.43 + struct array property_pool;
1.44 + struct array file_pool;
1.45 + struct razor_set_header *header;
1.46 +};
1.47 +]]></programlisting>
1.48 +
1.49 + <para>
1.50 + razor_set_open() fills in each struct array by finding the section
1.51 + with the matching type and pointing the array at the right place in
1.52 + the mmapped data. (This is the only processing needed when reading
1.53 + the file; everything else is used exactly as is.)
1.54 + </para>
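+
+ <para>
+ As a rough illustration of the lookup described above, the
+ following sketch scans the section descriptors for a given type
+ and points an array at the matching data. It is not the actual
+ razor_set_open() code: struct example_array, the EXAMPLE_* type
+ values, and example_find_section() are hypothetical, the header
+ is declared without its trailing sections member, and "offset"
+ is assumed to be a byte offset from the start of the file.
+ </para>
+
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor layout as documented above. */
struct razor_set_section {
	uint32_t type;
	uint32_t offset;
	uint32_t size;
};

/* Header fields as documented above; the descriptors follow directly. */
struct razor_set_header {
	uint32_t magic;
	uint32_t version;
};

/* Hypothetical section type values, for illustration only. */
enum { EXAMPLE_STRING_POOL = 1, EXAMPLE_PACKAGES = 2 };

/* Hypothetical stand-in for struct array: a pointer plus a size. */
struct example_array {
	const void *data;
	uint32_t size;
};

/*
 * Scan the descriptor table for `type`, stopping at the type-0
 * terminator. On success, point `out` into the mapped file and return 1;
 * otherwise return 0. Assumes `offset` is a byte offset from the start
 * of the file.
 */
static int
example_find_section(const void *map, uint32_t type,
		     struct example_array *out)
{
	const struct razor_set_header *header = map;
	const struct razor_set_section *s;

	assert(map != NULL && out != NULL);
	s = (const struct razor_set_section *)(header + 1);
	for (; s->type != 0; s++) {
		if (s->type == type) {
			out->data = (const char *)map + s->offset;
			out->size = s->size;
			return 1;
		}
	}
	return 0;
}
```
+]]></programlisting>
+
+ <para>
+ Calling a helper like this once per section type would be enough
+ to fill in a structure along the lines of struct razor_set.
+ </para>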
1.55 +
1.56 + </sect2>
1.57 +
1.58 + <sect2 id="sections">
1.59 + <title>The sections</title>
1.60 +
1.61 + <itemizedlist>
1.62 + <listitem>
1.63 + <para>
1.64 + <emphasis>RAZOR_STRING_POOL</emphasis> Stores one copy of
1.65 + each string that appears in the repo. At the moment, these
1.66 + are package names, package versions, property names,
1.67 + property versions, and the basenames of filenames. The
1.68 + strings are arbitrarily sized, NUL-terminated, and in no
1.69 + particular order (although the empty string always ends up
1.70 + at offset 0).
1.71 + </para>
1.72 + </listitem>
1.73 +
1.74 + <listitem>
1.75 + <para>
1.76 + <emphasis>RAZOR_PACKAGES</emphasis> Array of struct
1.77 + razor_package; one for each package in the set, sorted by
1.78 + name.
1.79 + </para>
1.80 + </listitem>
1.81 +
1.82 + <listitem>
1.83 + <para>
1.84 + <emphasis>RAZOR_PROPERTIES</emphasis> Array of struct
1.85 + razor_property; one for each unique property in the set,
1.86 + sorted by type, then name, then relation type (eg, "&lt;" or
1.87 + "&gt;="), then version. (Properties with no version have
1.88 + relation type RAZOR_VERSION_EQUAL, and version "".)
1.89 + </para>
1.90 + </listitem>
1.91 +
1.92 + <listitem>
1.93 + <para>
1.94 + <emphasis>RAZOR_FILES</emphasis> Array of struct
1.95 + razor_entry; one for each file owned by any package in the
1.96 + set. The current sort order (which is subject to change)
1.97 + is breadth-first, sorted by basename within each directory. For
1.98 + example: /, /bin, /dev, /etc, /bin/false, /bin/true, /dev/null, /etc/passwd.
1.99 + </para>
1.100 + </listitem>
1.101 +
1.102 + <listitem>
1.103 + <para>
1.104 + <emphasis>RAZOR_PACKAGE_POOL</emphasis> Array of struct
1.105 + list, with each list item containing the index of a struct
1.106 + razor_package in the packages section. See the discussion
1.107 + of lists below.
1.108 + </para>
1.109 + </listitem>
1.110 +
1.111 + <listitem>
1.112 + <para>
1.113 + <emphasis>RAZOR_PROPERTY_POOL</emphasis> Array of struct
1.114 + list, with each list item containing the index of a struct
1.115 + razor_property in the properties section. See the
1.116 + discussion of lists below.
1.117 + </para>
1.118 + </listitem>
1.119 +
1.120 + <listitem>
1.121 + <para>
1.122 + <emphasis>RAZOR_FILE_POOL</emphasis> Array of struct list,
1.123 + with each list item containing the index of a struct
1.124 + razor_entry in the files section. See the discussion of
1.125 + lists below.
1.126 + </para>
1.127 + </listitem>
1.128 + </itemizedlist>
1.129 + </sect2>
1.130 +
1.131 + <sect2 id="data-types">
1.132 + <title>Data types</title>
1.133 +
1.134 + <para>
1.135 + The exact bit layout reflects some historical accidents,
1.136 + particularly the fact that the "name" field in most structs
1.137 + gives up its high 8 bits to a flags field.
1.138 + </para>
1.139 +
1.140 + <programlisting><![CDATA[
1.141 +struct list_head
1.142 + uint list_ptr : 24;
1.143 + uint flags : 8;
1.144 +
1.145 +struct list
1.146 + uint data : 24;
1.147 + uint flags : 8;
1.148 +]]></programlisting>
1.149 +
1.150 + <para>
1.151 + These are used to store lists of package, property, or file IDs.
1.152 + "struct list_head" stores the head of the list, which points to
1.153 + one or more "struct list"s in the appropriate "pool" section.
1.154 + ("struct list" should probably be called "struct list_item".)
1.155 + </para>
1.156 +
1.157 + <para>
1.158 + "list_first(&amp;head, &amp;pool)" returns a "struct list *"
1.159 + pointing to the first element of the list (or NULL for an empty
1.160 + list), and "list_next(list)" will return successive elements,
1.161 + until NULL is returned. Each "list->data" contains the index of
1.162 + a package, property, or file in the corresponding section of the
1.163 + set.
1.164 + </para>
1.165 +
1.166 + <para>
1.167 + Peeking underneath the abstraction, a list_head's "flags" is
1.168 + 0xff if the list is empty, 0x80 if it contains a single element,
1.169 + or 0x00 if it contains more than one element. In the
1.170 + single-element case, that element is actually stored in the
1.171 + list_head directly rather than being stored in a pool (and so
1.172 + list_first() just casts the list_head* to a list* and returns
1.173 + it). For multi-element lists, list_ptr is the index in the pool
1.174 + of the first element of this list; the list continues through
1.175 + successive elements of the pool up to and including the first one
1.176 + with non-zero flags, which marks the last element of the list.
1.177 + </para>
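+
+ <para>
+ The encoding above can be sketched roughly as follows. This is
+ an illustration rather than the real list_first() and
+ list_next(): the example_* names and the EXAMPLE_* constants are
+ made up, the pool is passed as a bare "struct list *" instead of
+ a struct array, and the sketch assumes that a pool element with
+ non-zero flags is itself the last element of its list.
+ </para>
+
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Layouts as documented above, with unsigned int standing in for uint. */
struct list_head {
	unsigned int list_ptr : 24;
	unsigned int flags    : 8;
};

struct list {
	unsigned int data  : 24;
	unsigned int flags : 8;
};

/* Flag values taken from the description above; the names are made up. */
enum { EXAMPLE_EMPTY = 0xff, EXAMPLE_SINGLE = 0x80 };

/* Return the first element of the list, or NULL if it is empty. */
static struct list *
example_list_first(struct list_head *head, struct list *pool)
{
	assert(head != NULL);
	if (head->flags == EXAMPLE_EMPTY)
		return NULL;
	if (head->flags == EXAMPLE_SINGLE)
		return (struct list *)head;	/* element lives in the head */
	return pool + head->list_ptr;
}

/*
 * Return the next element, or NULL at the end. Assumes an element with
 * non-zero flags is the last element of its list.
 */
static struct list *
example_list_next(struct list *list)
{
	return list->flags ? NULL : list + 1;
}

/* Sum the indexes stored in a list; only here to exercise the walk. */
static unsigned int
example_list_sum(struct list_head *head, struct list *pool)
{
	unsigned int sum = 0;
	struct list *l;

	for (l = example_list_first(head, pool); l; l = example_list_next(l))
		sum += l->data;
	return sum;
}
```
+]]></programlisting>
+
+ <para>
+ Because a single-element head carries flags 0x80, the same
+ end-of-list test works whether the element came from the head or
+ from the pool.
+ </para>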
1.178 +
1.179 + <programlisting><![CDATA[
1.180 +struct razor_package
1.181 + uint name : 24;
1.182 + uint flags : 8;
1.183 + uint version : 32;
1.184 + struct list_head properties;
1.185 + struct list_head files;
1.186 +]]></programlisting>
1.187 +
1.188 + <para>
1.189 + name and version are indexes into string_pool. properties is a
1.190 + list of all of the package's properties, and files is a list of
1.191 + its files. flags is currently only used during razor_set
1.192 + merging, to keep track of which set a package came from.
1.193 + </para>
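+
+ <para>
+ Since the packages section is sorted by name, a lookup by name
+ can in principle be a binary search, along the lines of the
+ sketch below. The struct is a simplified stand-in (the two list
+ heads are left out), example_find_package() is a hypothetical
+ helper, and the sketch assumes the sort order is plain strcmp()
+ order, which this document does not actually specify.
+ </para>
+
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct razor_package (list heads omitted). */
struct example_package {
	unsigned int name  : 24;	/* offset into the string pool */
	unsigned int flags : 8;
	uint32_t version;		/* offset into the string pool */
};

/*
 * Binary-search `count` packages, assumed sorted by name in strcmp()
 * order. Returns the index of the first package with that name, or -1
 * if there is none.
 */
static long
example_find_package(const struct example_package *pkgs, size_t count,
		     const char *strings, const char *name)
{
	size_t lo = 0, hi = count;

	assert(pkgs != NULL || count == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(strings + pkgs[mid].name, name) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo < count && strcmp(strings + pkgs[lo].name, name) == 0)
		return (long)lo;
	return -1;
}
```
+]]></programlisting>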
1.194 +
1.195 + <programlisting><![CDATA[
1.196 +struct razor_property
1.197 + uint name : 24;
1.198 + uint flags : 6;
1.199 + uint type : 2;
1.200 + uint relation : 32;
1.201 + uint version : 32;
1.202 + struct list_head packages;
1.203 +]]></programlisting>
1.204 +
1.205 + <para>
1.206 + name and version are indexes into string_pool. type is an enum
1.207 + razor_property_type (eg, RAZOR_PROPERTY_REQUIRES), and relation
1.208 + is an enum razor_version_relation (eg,
1.209 + RAZOR_VERSION_GREATER_OR_EQUAL). packages is a list of the
1.210 + packages that have this property. flags is currently unused.
1.211 + </para>
1.212 +
1.213 + <programlisting><![CDATA[
1.214 +struct razor_entry
1.215 + uint name : 24;
1.216 + uint flags : 8;
1.217 + uint start : 32;
1.218 + struct list_head packages;
1.219 +]]></programlisting>
1.220 +
1.221 + <para>
1.222 + name is an index into string_pool, giving the basename of the
1.223 + file. start is either 0, or an index pointing to another
1.224 + razor_entry that is the first child of this entry (for a
1.225 + non-empty directory). (Entry 0 is always the root of the tree,
1.226 + so no entry could have entry 0 as a child.) flags is 0x80
1.227 + (RAZOR_ENTRY_LAST) if an entry is the last entry in its
1.228 + directory. Otherwise it is 0.
1.229 + </para>
1.230 +
1.231 + <para>
1.232 + Note that given a pointer to a struct razor_entry (eg, from a
1.233 + package's "files" list), there is no way to reconstruct its full
1.234 + name without walking the entire files array up to that
1.235 + point. Because of this and other problems (fix_file_map()), it
1.236 + seems like razor_entry should be modified to include a pointer
1.237 + to its parent. (Storing full paths instead of just basenames
1.238 + would also fix this problem, but that would use much more
1.239 + memory.)
1.240 + </para>
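+
+ <para>
+ To make the cost of that walk concrete, here is a sketch that
+ recovers an entry's full name by searching down from the root.
+ It is only an illustration: struct example_entry leaves out the
+ packages list, example_find_path() is a hypothetical helper, and
+ the sketch assumes that the children of a directory are stored
+ contiguously, starting at "start" and ending at the entry
+ flagged as last.
+ </para>
+
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct razor_entry (packages list omitted). */
struct example_entry {
	unsigned int name  : 24;	/* offset into the string pool */
	unsigned int flags : 8;
	uint32_t start;			/* index of first child, or 0 */
};

#define EXAMPLE_ENTRY_LAST 0x80	/* value given in the text above */

/*
 * Search below directory `dir` for entry `target`, appending
 * "/basename" components to `path`, which already holds `len` bytes and
 * has room for `size` bytes in total (len must be less than size).
 * Returns 1 with the full name in `path` if the target is found, or 0
 * with `path` cut back to its first `len` bytes.
 */
static int
example_find_path(const struct example_entry *entries, const char *strings,
		  uint32_t dir, uint32_t target,
		  char *path, size_t len, size_t size)
{
	uint32_t i = entries[dir].start;

	if (i == 0)
		return 0;	/* not a directory, or an empty one */
	for (;; i++) {
		const char *base = strings + entries[i].name;
		size_t n = strlen(base) + 1;	/* '/' plus the basename */

		if (n >= size - len)
			break;	/* give up: the path would not fit */
		path[len] = '/';
		memcpy(path + len + 1, base, n);	/* copies the NUL too */
		if (i == target ||
		    example_find_path(entries, strings, i, target,
				      path, len + n, size))
			return 1;
		if (entries[i].flags & EXAMPLE_ENTRY_LAST)
			break;
	}
	path[len] = '\0';
	return 0;
}
```
+]]></programlisting>
+
+ <para>
+ In the worst case this visits every entry in the files section,
+ whereas a parent index stored in each entry would let the name
+ be rebuilt by following one link per path component.
+ </para>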
1.241 + </sect2>
1.242 +</chapter>