docs/reference/implementation_defined_behaviour.xml
author ali <ali@juiblex.co.uk>
Wed Oct 10 22:58:48 2012 +0100 (2012-10-10)
changeset 1 fe592b4168f3
<?xml version="1.0"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
               "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"
[
]>
<chapter id="implementation-defined-behaviour">
  <title>Implementation Defined Behaviour</title>

  <para>
    The specification of the XEXPR language is laid out in a W3C note of
    <ulink url="http://www.w3.org/TR/2000/NOTE-xexpr-20001121">21 November
    2000</ulink>. However, the specification leaves quite a bit of
    information to be deduced from the examples and leaves other parts of
    the language loosely specified. This chapter documents the way that
    libxexpr implements the language and gives the rationale for each
    decision taken.
  </para>

  <sect1 id="idb-numbers">
    <title>Numbers</title>

    <para>
      Numbers are defined in pseudo-BNF as:
      <programlisting>
number         : whitespace sign simple-number whitespace
               ;

whitespace     : [ \t\n]*
               ;

sign           : [+-]?
               ;

simple-number  : 0x[0-9A-Fa-f]+
               | [0-9]+
               | [0-9]+\.[0-9]+
               | [0-9]+\.[0-9]+[eE][+-][0-9]+
               ;<!--
   --></programlisting>
      that is, they have an optional leading sign and may be surrounded by
      whitespace.
    </para>
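
    <para>
      As an informal illustration, the grammar above can be transcribed into
      a single regular expression. The Python sketch below is not taken from
      libxexpr; the names NUMBER and is_number are hypothetical and exist
      only for this example.
    </para>

```python
import re

# Hypothetical transcription of the pseudo-BNF above; not libxexpr's own code.
NUMBER = re.compile(
    r"[ \t\n]*"                          # whitespace
    r"[+-]?"                             # sign
    r"(?:0x[0-9A-Fa-f]+"                 # hexadecimal integer
    r"|[0-9]+\.[0-9]+[eE][+-][0-9]+"     # float with a signed exponent
    r"|[0-9]+\.[0-9]+"                   # float
    r"|[0-9]+)"                          # decimal integer
    r"[ \t\n]*"                          # whitespace
)


def is_number(text):
    """Return True if the whole of text matches the number production."""
    return NUMBER.fullmatch(text) is not None


print(is_number("+01 "))       # True
print(is_number(" 14.0e-1"))   # True
print(is_number("1e5"))        # False: the grammar requires a fractional part
print(is_number("Hello "))     # False
```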

    <simplesect id="idb-numbers-rationale">
      <title>Rationale</title>

      <para>
        While negative numbers can be created using &lt;subtract&gt; without
        the need for any signs, this seems overly cumbersome.
      </para>

      <para>
        The examples in the specification make it clear that where two numbers
        are separated by a space, this should be parsed as just two numbers
        and not as two numbers plus an intervening string, which is what a
        strict reading of the specification would imply.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-bindings">
    <title>Bindings</title>

    <para>
      Bindings are parsed as integers, floats or strings, in that order (i.e.,
      the first type that matches is used). Thus the two expressions in each
      of the following pairs are equivalent:
      <programlisting>
&lt;func x="+01 "/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;integer&gt;1&lt;/integer&gt;&lt;/define&gt;
&lt;/func&gt;

&lt;func x=" 14.0e-1"/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;float&gt;1.4&lt;/float&gt;&lt;/define&gt;
&lt;/func&gt;

&lt;func x="Hello "/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;string&gt;Hello &lt;/string&gt;&lt;/define&gt;
&lt;/func&gt;<!--
   --></programlisting>
    </para>
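
    <para>
      The order of the tests can be sketched as follows. This is a minimal
      Python model, assuming that a binding counts as a number only if the
      whole of it matches the number grammar from the Numbers section;
      parse_binding is a hypothetical helper and not part of libxexpr's API.
    </para>

```python
import re

# Hypothetical helper illustrating "integer, then float, then string".
# The patterns restate the number grammar from the Numbers section.
WS = r"[ \t\n]*"
INTEGER = re.compile(WS + r"([+-]?)(0x[0-9A-Fa-f]+|[0-9]+)" + WS)
FLOAT = re.compile(WS + r"([+-]?[0-9]+\.[0-9]+(?:[eE][+-][0-9]+)?)" + WS)


def parse_binding(value):
    """Return an int, a float or the unchanged string, whichever matches first."""
    match = INTEGER.fullmatch(value)
    if match:
        sign, digits = match.groups()
        base = 16 if digits.startswith("0x") else 10
        magnitude = int(digits, base)
        return -magnitude if sign == "-" else magnitude
    match = FLOAT.fullmatch(value)
    if match:
        return float(match.group(1))
    # Neither numeric form matched: keep the string, whitespace included.
    return value


print(repr(parse_binding("+01 ")))      # 1
print(repr(parse_binding(" 14.0e-1")))  # 1.4
print(repr(parse_binding("Hello ")))    # 'Hello '
```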

    <simplesect id="idb-bindings-rationale">
      <title>Rationale</title>

      <para>
        This seems to satisfy the principle of least surprise.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-pcdata">
    <title>Parsing of PCDATA</title>

    <para>
      When numbers and strings are mixed in PCDATA, any whitespace surrounding
      the numbers is taken to be part of the numbers rather than the strings.
      Thus the following two expressions are equivalent:
      <programlisting>
&lt;foo&gt;This is the 0xdeadbeef constant.&lt;/foo&gt;

&lt;foo&gt;
  &lt;string&gt;This is the&lt;/string&gt;
  &lt;integer&gt;0xdeadbeef&lt;/integer&gt;
  &lt;string&gt;constant.&lt;/string&gt;
&lt;/foo&gt;<!--
   --></programlisting>
    </para>
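
    <para>
      One way to picture this rule is a tokenizer that lets each number absorb
      the whitespace around it. The Python sketch below is a model under
      stated assumptions rather than libxexpr's parser: split_pcdata is a
      hypothetical name, and the sketch treats any match of the number
      grammar as a number, including digits embedded in a word, which this
      chapter does not specify.
    </para>

```python
import re

# Hypothetical tokenizer; the pattern restates the number grammar from the
# Numbers section, including the optional surrounding whitespace. The longer
# floating-point forms come before the plain integer form because the
# pattern is used for searching rather than for matching a whole string.
NUMBER = re.compile(
    r"([ \t\n]*[+-]?"
    r"(?:0x[0-9A-Fa-f]+|[0-9]+\.[0-9]+[eE][+-][0-9]+|[0-9]+\.[0-9]+|[0-9]+)"
    r"[ \t\n]*)"
)


def split_pcdata(text):
    """Split text into string and number tokens, dropping empty strings."""
    tokens = []
    # re.split with one capturing group puts the matches at odd indices.
    for index, piece in enumerate(NUMBER.split(text)):
        if index % 2 == 1:
            tokens.append(("number", piece.strip(" \t\n")))
        elif piece:
            tokens.append(("string", piece))
    return tokens


print(split_pcdata("This is the 0xdeadbeef constant."))
# [('string', 'This is the'), ('number', '0xdeadbeef'), ('string', 'constant.')]
print(split_pcdata("1 2"))
# [('number', '1'), ('number', '2')]
```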

    <simplesect id="idb-pcdata-rationale">
      <title>Rationale</title>

      <para>
        This seems more consistent with the decision not to parse spaces
        between numbers as strings than the alternative would be.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-define">
    <title>The &lt;define&gt; Function</title>

    <para>
      The &lt;define&gt; function creates a new function in the environment in
      which it is invoked. This is different from the &lt;set&gt; function,
      which modifies the definition of an existing function if one exists.
      Only if no such function is defined in any of the active environments
      does &lt;set&gt; create a new function (and then in the outermost, or
      global, environment).
    </para>
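
    <para>
      The difference between the two functions can be modelled as a chain of
      environments. The Python sketch below is illustrative only: the
      Environment class is hypothetical, and it assumes that when several
      active environments define the same name, &lt;set&gt; modifies the
      innermost one, which the text above does not state.
    </para>

```python
class Environment:
    """Hypothetical scope chain; libxexpr's real data structures may differ."""

    def __init__(self, parent=None):
        self.parent = parent
        self.functions = {}

    def define(self, name, value):
        # define always creates the function in the current environment.
        self.functions[name] = value

    def set(self, name, value):
        # set updates an existing definition, searching outwards
        # (assumption: the innermost definition wins).
        env = self
        outermost = self
        while env is not None:
            if name in env.functions:
                env.functions[name] = value
                return
            outermost = env
            env = env.parent
        # No definition anywhere: create one in the global environment.
        outermost.functions[name] = value


global_env = Environment()
inner = Environment(parent=global_env)

global_env.define("x", 1)
inner.set("x", 2)        # modifies the existing global definition
inner.set("y", 3)        # no y anywhere, so it is created globally
inner.define("x", 4)     # creates a new x that shadows the global one

print(global_env.functions)  # {'x': 2, 'y': 3}
print(inner.functions)       # {'x': 4}
```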

    <simplesect id="idb-define-rationale">
      <title>Rationale</title>

      <para>
        We know from the examples in the specification (e.g., in
        <ulink url="http://www.w3.org/TR/xexpr/#id-0045">section 45</ulink>)
        that &lt;subtract&gt; changes the definition of its first argument
        in at least the grandparent environment. It makes sense that
        &lt;set&gt; should do the same. When we come to &lt;define&gt;,
        however, we know from
        <ulink url="http://www.w3.org/TR/xexpr/#id-0003">section 3</ulink>
        that it is equivalent to an attribute on the parent element, and so
        it makes sense that it should create a variable in the parent
        environment.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-get">
    <title>The &lt;get&gt; Function</title>

    <para>
      The following two expressions are equivalent:
      <programlisting>
&lt;get name="x"/&gt;
&lt;get&gt;x&lt;/get&gt;<!--
   --></programlisting>
      The expression &lt;x/&gt; has the same effect except in the case of
      &lt;add&gt; and &lt;subtract&gt;, where these two expressions are
      different:
      <programlisting>
&lt;add&gt;&lt;x/&gt;1&lt;/add&gt;
&lt;add&gt;&lt;get&gt;x&lt;/get&gt;1&lt;/add&gt;<!--
   --></programlisting>
      The first changes the definition of &lt;x&gt;; the second does not.
    </para>

    <para>
      Note that IDs are allowed to start with a dot (.) or a hyphen (-),
      neither of which is valid as the first character of an XML element
      name. Thus &lt;get&gt; must be used in the following:
      <programlisting>
&lt;expr&gt;
  &lt;define name=".net"&gt;4.5.50709&lt;/define&gt;
  &lt;print&gt;&lt;get&gt;.net&lt;/get&gt;&lt;/print&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>

    <para>
      Since &lt;get&gt; returns a function definition (just like
      &lt;define&gt;), it is possible to define functions with such names
      that take arguments, and even to invoke them in a somewhat circuitous
      manner:
      <programlisting>
&lt;expr&gt;
  &lt;define name=".product" args="a b c d"&gt;
    &lt;add&gt;
      &lt;multiply&gt;
        &lt;a/&gt;
        &lt;b/&gt;
      &lt;/multiply&gt;
      &lt;multiply&gt;
        &lt;c/&gt;
        &lt;d/&gt;
      &lt;/multiply&gt;
    &lt;/add&gt;
  &lt;/define&gt;

  &lt;expr&gt;
    &lt;define name="closure"/&gt;
    &lt;set name="closure"&gt;
      &lt;get&gt;.product&lt;/get&gt;
    &lt;/set&gt;
    &lt;closure&gt;1 2 3 4&lt;/closure&gt;
  &lt;/expr&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>

    <simplesect id="idb-get-rationale">
      <title>Rationale</title>

      <para>
        <ulink url="http://www.w3.org/TR/xexpr/#id-0014">Section 14</ulink>
        tells us that &lt;get&gt;x&lt;/get&gt; and &lt;x/&gt; have the same
        effect in most cases (and thus presumably not in all cases), and it
        would seem surprising if &lt;get&gt; were not to insulate a
        function in this manner.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-arithmetic">
    <title>Arithmetic Operators</title>

    <para>
      The empty arithmetic operators (&lt;add/&gt;, &lt;subtract/&gt;,
      &lt;multiply/&gt; and &lt;divide/&gt;) all evaluate to &lt;nil/&gt;.
    </para>

    <para>
      The &lt;add&gt; and &lt;subtract&gt; operators change their first
      argument in some circumstances, as in this example from the
      specification:
      <programlisting>
&lt;while&gt;
  &lt;gt&gt;&lt;x/&gt; 0&lt;/gt&gt;
  &lt;expr&gt;
    &lt;print newline="true"&gt;&lt;x/&gt;&lt;/print&gt;
    &lt;subtract&gt;&lt;x/&gt; 1&lt;/subtract&gt;
  &lt;/expr&gt;
&lt;/while&gt;<!--
   --></programlisting>
    </para>

    <para>
      In general, the first argument will be modified if it is a function
      invocation that has no bindings and no arguments. Thus the following
      will print 9:
      <programlisting>
&lt;define name="x"&gt;&lt;multiply&gt;2 3&lt;/multiply&gt;&lt;/define&gt;
&lt;add&gt;&lt;x/&gt;3&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
      whereas this will print 6:
      <programlisting>
&lt;define name="x"&gt;&lt;multiply&gt;2 3&lt;/multiply&gt;&lt;/define&gt;
&lt;add&gt;&lt;x unused=""/&gt;3&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
    </para>

    <para>
      Where arguments are modified, this occurs as the arguments are being
      evaluated. Thus this expression:
      <programlisting>
&lt;add&gt;&lt;x/&gt;&lt;x/&gt;&lt;x/&gt;&lt;/add&gt;<!--
   --></programlisting>
      will multiply &lt;x&gt; by 4 rather than by 3.
    </para>
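
    <para>
      The effect of modifying the first argument during evaluation can be
      checked with a small model. The Python sketch below is hypothetical
      (add_in_place and add_snapshot are not libxexpr functions) and assumes
      that every argument is a bare invocation of a numeric function. It
      contrasts the behaviour described above with the alternative of
      evaluating all of the arguments before storing the result.
    </para>

```python
def add_in_place(env, first, others):
    """Model of add when its first argument is a bare invocation.

    env maps function names to numeric values. Each remaining argument is
    looked up only after the running total has been stored back into first.
    """
    for name in others:
        env[first] = env[first] + env[name]
    return env[first]


def add_snapshot(env, first, others):
    """The alternative: read every argument first and modify nothing."""
    return sum(env[name] for name in [first] + others)


env = {"x": 5}
print(add_snapshot(env, "x", ["x", "x"]))  # 15, three times the old value
print(add_in_place(env, "x", ["x", "x"]))  # 20, four times the old value
print(env["x"])                            # 20
```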

    <simplesect id="idb-arithmetic-rationale">
      <title>Rationale</title>

      <para>
        The examples in the specification imply that &lt;add&gt; and
        &lt;subtract&gt; modify their first argument when it is a
        variable. The iterative example in
        <ulink url="http://www.w3.org/TR/xexpr/#id-0008">section 8</ulink>
        wouldn't work if &lt;multiply&gt; worked the same way (and the
        definition of &lt;2pi&gt; in
        <ulink url="http://www.w3.org/TR/xexpr/#id-0007">section 7</ulink>
        would not be expected to modify the definition of &lt;pi&gt; each
        time it is called). Note that the latter example is itself
        erroneous: IDs can't contain digits.
      </para>

      <para>
        It seems undesirable to modify functions that take arguments.
        Making the decision based on the invocation rather than the
        function definition makes expressions much easier to read.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-comparison">
    <title>Comparison Functions</title>

    <para>
      The empty comparison functions (&lt;eq/&gt;, &lt;neq/&gt;, &lt;leq/&gt;,
      &lt;geq/&gt;, &lt;lt/&gt; and &lt;gt/&gt;) and comparison functions with
      exactly one argument all evaluate to &lt;true/&gt;.
    </para>

    <para>
      The ordered comparison functions (&lt;leq&gt;, &lt;geq&gt;, &lt;lt&gt;
      and &lt;gt&gt;) act as if the equivalent mathematical operator were
      inserted between each pair of adjacent arguments. Thus:
      <programlisting>
&lt;lt&gt;
  1 2 3
&lt;/lt&gt;<!--
   --></programlisting>
      is equivalent to the mathematical expression:
      <screen>1 &lt; 2 &lt; 3</screen>
      and:
      <programlisting>
&lt;leq&gt;
  1 2 3
&lt;/leq&gt;<!--
   --></programlisting>
      is equivalent to the mathematical expression:
      <screen>1 ≤ 2 ≤ 3</screen>
    </para>
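
    <para>
      The chaining rule amounts to testing every adjacent pair of arguments.
      The Python sketch below is a minimal model under that reading; chained
      is a hypothetical helper, and the arguments are assumed to be numbers
      that have already been evaluated.
    </para>

```python
import operator


def chained(op, args):
    """Return True if op holds between every adjacent pair of args.

    With fewer than two arguments there are no pairs to test, so the result
    is True, which agrees with the behaviour described for empty and
    one-argument comparison functions.
    """
    return all(op(a, b) for a, b in zip(args, args[1:]))


print(chained(operator.lt, [1, 2, 3]))   # True
print(chained(operator.le, [1, 2, 2]))   # True
print(chained(operator.lt, [1, 3, 2]))   # False
print(chained(operator.lt, []))          # True
print(chained(operator.lt, [7]))         # True
```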

    <para>
      When comparing objects of different types:
      <itemizedlist>
        <listitem>
          <para>
            Numbers will be implicitly cast between &lt;float&gt; and
            &lt;integer&gt; where that involves no loss of precision.
          </para>
        </listitem>
        <listitem>
          <para>
            Strings will be compared byte-by-byte as UTF-8 encoded strings.
          </para>
        </listitem>
        <listitem>
          <para>
            Functions will always be completely evaluated.
          </para>
        </listitem>
        <listitem>
          <para>
            Invocations of the constant functions are ordered as
            &lt;false/&gt; &lt; &lt;nil/&gt; &lt; &lt;true/&gt;.
          </para>
        </listitem>
      </itemizedlist>
    </para>

    <simplesect id="idb-comparison-rationale">
      <title>Rationale</title>

      <para>
        The empty comparison functions evaluate to &lt;true/&gt; by analogy
        with the comparison functions.
      </para>

      <para>
        The specification is inconsistent about the ordering of the
        comparison functions: the examples for &lt;lt&gt; and &lt;gt&gt;
        agree with libxexpr's behaviour, while the examples for &lt;leq&gt;
        and &lt;geq&gt; do the opposite. The choice was arbitrary.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-redefining-builtins">
    <title>Redefining Builtin Functions</title>

    <para>
      Attempting to redefine a builtin function results in an error.
    </para>

    <simplesect id="idb-redefining-builtins-rationale">
      <title>Rationale</title>

      <para>
        While this could be implemented, there is no mention of it in the
        specification and it would complicate the implementation with no
        obvious benefit.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-namespaces">
    <title>Namespaces</title>

    <para>
      libxexpr considers an element's namespace to be part of its name,
      and thus elements in a namespace other than the XEXPR namespace
      are always distinct from functions defined by XEXPR. In addition,
      functions defined using &lt;define&gt; are defined in the
      XEXPR namespace.
    </para>

    <para>
      libxexpr provides hooks for extending the XEXPR language by allowing
      handlers to be installed for other namespaces.
    </para>

    <para>
      Elements which are in no namespace are treated as if they were in the
      XEXPR namespace.
    </para>

    <simplesect id="idb-namespaces-rationale">
      <title>Rationale</title>

      <para>
        Being able to extend the XEXPR language is vital for it to be useful,
        and namespaces are the obvious way to do this.
      </para>
    </simplesect>
  </sect1>
</chapter>