<?xml version="1.0"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
               "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"
[
]>
<chapter id="implementation-defined-behaviour">
  <title>Implementation Defined Behaviour</title>

  <para>
    The specification of the XEXPR language is laid out in a W3C note of
    <ulink url="http://www.w3.org/TR/2000/NOTE-xexpr-20001121">21 November
    2000</ulink>. However, the specification leaves quite a bit of
    information to be deduced from the examples and leaves other parts of
    the language loosely specified. This chapter documents the way that
    libxexpr implements the language and gives the rationale for each
    decision taken.
  </para>

  <sect1 id="idb-numbers">
    <title>Numbers</title>

    <para>
      Numbers are defined in pseudo-BNF as:
      <programlisting>
number         : whitespace sign simple-number whitespace
	       ;

whitespace     : [ \t\n]*
	       ;

sign           : [+-]?
	       ;

simple-number  : 0x[0-9A-Fa-f]+
	       | [0-9]+
	       | [0-9]+\.[0-9]+
	       | [0-9]+\.[0-9]+[eE][+-][0-9]+
	       ;<!--
   --></programlisting>
      that is: they have an optional leading sign and may be surrounded with
      whitespace.
    </para>

    <simplesect id="idb-numbers-rationale">
      <title>Rationale</title>

      <para>
        While negative numbers can be created using &lt;subtract&gt; without
	the need for any signs, this seems overly cumbersome.
      </para>

      <para>
	The examples in the specification make it clear that where two numbers
	are separated by a space, this should be parsed as just two numbers
	and not as two numbers plus an intervening string, as a strict reading
	of the specification would imply.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-bindings">
    <title>Bindings</title>

    <para>
      Bindings are parsed as integers, floats or strings in that order (i.e.,
      the first type that matches will be used). Thus the following pairs
      of expressions are equivalent:
      <programlisting>
&lt;func x="+01 "/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;integer&gt;1&lt;/integer&gt;&lt;/define&gt;
&lt;/func&gt;

&lt;func x=" 14.0e-1"/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;float&gt;1.4&lt;/float&gt;&lt;/define&gt;
&lt;/func&gt;

&lt;func x="Hello "/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;string&gt;Hello &lt;/string&gt;&lt;/define&gt;
&lt;/func&gt;<!--
   --></programlisting>
    </para>
    <simplesect id="idb-bindings-rationale">
      <title>Rationale</title>

      <para>
	This seems to satisfy the principle of least surprise.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-pcdata">
    <title>Parsing of PCDATA</title>

    <para>
      When numbers and strings are mixed in PCDATA, any whitespace surrounding
      the numbers is taken to be part of the numbers rather than the strings.
      Thus the following two expressions are equivalent:
      <programlisting>
&lt;foo&gt;This is the 0xdeadbeef constant.&lt;/foo&gt;

&lt;foo&gt;
  &lt;string&gt;This is the&lt;/string&gt;
  &lt;integer&gt;0xdeadbeef&lt;/integer&gt;
  &lt;string&gt;constant.&lt;/string&gt;
&lt;/foo&gt;<!--
   --></programlisting>
    </para>
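
    <para>
      The same rule covers adjacent numbers. As a sketch, assuming the
      behaviour described in the rationale for numbers (whitespace between
      two numbers yields no intervening string), these two expressions would
      be expected to be equivalent:
      <programlisting>
&lt;foo&gt;1 2.5&lt;/foo&gt;

&lt;foo&gt;
  &lt;integer&gt;1&lt;/integer&gt;
  &lt;float&gt;2.5&lt;/float&gt;
&lt;/foo&gt;<!--
   --></programlisting>
    </para>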

    <simplesect id="idb-pcdata-rationale">
      <title>Rationale</title>

      <para>
	This seems more consistent with spaces between numbers not being
	parsed as strings than the alternative.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-define">
    <title>The &lt;define&gt; Function</title>

    <para>
      The &lt;define&gt; function creates a new function in the environment in
      which it is invoked. This differs from the &lt;set&gt; function,
      which modifies the definition of an existing function if one exists.
      Only if no such function is defined in any of the active environments will
      &lt;set&gt; create a new function (and then in the outermost, or global,
      environment).
    </para>
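
    <para>
      A sketch of the difference, assuming that each nested &lt;expr&gt;
      introduces its own environment and that &lt;set&gt; takes a name
      attribute as in the examples elsewhere in this chapter (the names x, y
      and z are illustrative only):
      <programlisting>
&lt;expr&gt;
  &lt;define name="x"&gt;1&lt;/define&gt;
  &lt;expr&gt;
    &lt;!-- x exists in an enclosing environment: that definition is changed --&gt;
    &lt;set name="x"&gt;2&lt;/set&gt;
    &lt;!-- y exists nowhere: it is created in the outermost (global) environment --&gt;
    &lt;set name="y"&gt;3&lt;/set&gt;
    &lt;!-- z is created in the inner environment only --&gt;
    &lt;define name="z"&gt;4&lt;/define&gt;
  &lt;/expr&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>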

    <simplesect id="idb-define-rationale">
      <title>Rationale</title>

      <para>
	We know from the examples in the specification (e.g., in
	<ulink url="http://www.w3.org/TR/xexpr/#id-0045">section 45</ulink>)
	that &lt;subtract&gt; changes the definition of its first argument
	in at least the grandfather environment. It makes sense that &lt;set&gt;
	should do the same. When we come to &lt;define&gt;, however, we
	know from <ulink url="http://www.w3.org/TR/xexpr/#id-0003">section
	3</ulink> that it is equivalent to an attribute on the parent element
	and so it makes sense that it should create a variable in the parent
	environment.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-get">
    <title>The &lt;get&gt; Function</title>

    <para>
      The following two expressions are equivalent:
      <programlisting>
&lt;get name="x"/&gt;
&lt;get&gt;x&lt;/get&gt;<!--
   --></programlisting>
      The expression &lt;x/&gt; has the same effect except in the case of
      &lt;add&gt; and &lt;subtract&gt; where these two expressions are
      different:
      <programlisting>
&lt;add&gt;&lt;x/&gt;1&lt;/add&gt;
&lt;add&gt;&lt;get&gt;x&lt;/get&gt;1&lt;/add&gt;<!--
   --></programlisting>
      The first changes the definition of &lt;x&gt;, the second does not.
    </para>

    <para>
      Note that IDs are allowed to start with the dot (.) and hyphen (-)
      characters, which are not valid as the first character of an XML
      element name. Thus &lt;get&gt; must be used in the following:
      <programlisting>
&lt;expr&gt;
  &lt;define name=".net"&gt;4.5.50709&lt;/define&gt;
  &lt;print&gt;&lt;get&gt;.net&lt;/get&gt;&lt;/print&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>

    <para>
      Since &lt;get&gt; returns a function definition (just like
      &lt;define&gt;), it is possible to define functions of this type that
      take arguments and even invoke them in a somewhat circuitous manner:
      <programlisting>
&lt;expr&gt;
  &lt;define name=".product" args="a b c d"&gt;
    &lt;add&gt;
      &lt;multiply&gt;
        &lt;a/&gt;
        &lt;b/&gt;
      &lt;/multiply&gt;
      &lt;multiply&gt;
        &lt;c/&gt;
        &lt;d/&gt;
      &lt;/multiply&gt;
    &lt;/add&gt;
  &lt;/define&gt;

  &lt;expr&gt;
    &lt;define name="closure"/&gt;
    &lt;set name="closure"&gt;
      &lt;get&gt;.product&lt;/get&gt;
    &lt;/set&gt;
    &lt;closure&gt;1 2 3 4&lt;/closure&gt;
  &lt;/expr&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>

    <simplesect id="idb-get-rationale">
      <title>Rationale</title>

      <para>
	<ulink url="http://www.w3.org/TR/xexpr/#id-0014">Section 14</ulink>
	tells us that &lt;get&gt;x&lt;/get&gt; and &lt;x/&gt; have the same
	effect in most cases (and thus presumably not all cases) and it
	would seem surprising if &lt;get&gt; were not to insulate a
	function in this manner.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-arithmetic">
    <title>Arithmetic Operators</title>

    <para>
      The empty arithmetic operators (&lt;add/&gt;, &lt;subtract/&gt;,
      &lt;multiply/&gt; and &lt;divide/&gt;) all evaluate to &lt;nil/&gt;.
    </para>

    <para>
      The &lt;add&gt; and &lt;subtract&gt; operators change their first
      argument in some circumstances as in this example from the specification:
      <programlisting>
&lt;while&gt;
  &lt;gt&gt;&lt;x/&gt; 0&lt;/gt&gt;
  &lt;expr&gt;
    &lt;print newline="true"&gt;&lt;x/&gt;&lt;/print&gt;
    &lt;subtract&gt;&lt;x/&gt; 1&lt;/subtract&gt;
  &lt;/expr&gt;
&lt;/while&gt;<!--
   --></programlisting>
    </para>

    <para>
      In general, the first argument will be modified if it is a function
      invocation that has no bindings and no arguments. Thus the following
      will print 9:
      <programlisting>
&lt;define name="x"&gt;&lt;multiply&gt;2 3&lt;/multiply&gt;&lt;/define&gt;
&lt;add&gt;&lt;x/&gt;3&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
      whereas this will print 6:
      <programlisting>
&lt;define name="x"&gt;&lt;multiply&gt;2 3&lt;/multiply&gt;&lt;/define&gt;
&lt;add&gt;&lt;x unused=""/&gt;3&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
    </para>

    <para>
      Where arguments are modified, this occurs as the arguments are being
      evaluated. Thus this expression:
      <programlisting>
&lt;add&gt;&lt;x/&gt;&lt;x/&gt;&lt;x/&gt;&lt;/add&gt;<!--
   --></programlisting>
      will multiply &lt;x&gt; by 4 rather than by 3.
    </para>
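
    <para>
      One evaluation order consistent with this result (a worked sketch, not
      a description of libxexpr's internals), with &lt;x&gt; initially
      defined as 5:
      <screen>after argument 1: sum = 5,             x = 5
after argument 2: sum = 5 + 5 = 10,    x = 10
after argument 3: sum = 10 + 10 = 20,  x = 20 (4 × 5)</screen>
      The third argument is evaluated after &lt;x&gt; has already been
      updated to 10, which is why the final value is four times the original
      and not three.
    </para>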

    <simplesect id="idb-arithmetic-rationale">
      <title>Rationale</title>

      <para>
	The examples in the specification imply that &lt;add&gt; and
	&lt;subtract&gt; modify their first argument when it is a
	variable. The iterative example in
	<ulink url="http://www.w3.org/TR/xexpr/#id-0008">section 8</ulink>
	wouldn't work if &lt;multiply&gt; worked the same way (and the
	definition of &lt;2pi&gt; in <ulink
	url="http://www.w3.org/TR/xexpr/#id-0007">section 7</ulink> would
	not be expected to modify the definition of &lt;pi&gt; each time
	it is called). Note that the &lt;2pi&gt; example is itself erroneous:
	IDs can't contain digits.
      </para>

      <para>
	It seems undesirable to modify functions that take arguments.
	Making the decision based on the invocation rather than the
	function definition makes expressions much easier to read.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-comparison">
    <title>Comparison Functions</title>

    <para>
      The empty comparison functions (&lt;eq/&gt;, &lt;neq/&gt;, &lt;leq/&gt;,
      &lt;geq/&gt;, &lt;lt/&gt; and &lt;gt/&gt;) and comparison functions with
      exactly one argument all evaluate to &lt;true/&gt;.
    </para>

    <para>
      The ordered comparison functions (&lt;leq&gt;, &lt;geq&gt;, &lt;lt&gt;
      and &lt;gt&gt;) act as if the equivalent mathematical operator were
      inserted between their arguments. Thus:
      <programlisting>
&lt;lt&gt;
  1 2 3
&lt;/lt&gt;<!--
   --></programlisting>
      is equivalent to the mathematical expression:
      <screen>1 &lt; 2 &lt; 3</screen>
      and:
      <programlisting>
&lt;leq&gt;
  1 2 3
&lt;/leq&gt;<!--
   --></programlisting>
      is equivalent to the mathematical expression:
      <screen>1 ≤ 2 ≤ 3</screen>
    </para>
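
    <para>
      Under this reading every adjacent pair of arguments must satisfy the
      operator. As a sketch following from the rule above, this expression:
      <programlisting>
&lt;lt&gt;
  1 3 2
&lt;/lt&gt;<!--
   --></programlisting>
      corresponds to the mathematical expression:
      <screen>1 &lt; 3 &lt; 2</screen>
      which does not hold, and would therefore be expected to evaluate to
      &lt;false/&gt;.
    </para>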

    <para>
      When comparing objects of different types:
      <itemizedlist>
	<listitem>
	  <para>
	    Numbers will be implicitly cast between &lt;float&gt; and
	    &lt;integer&gt; where that involves no loss of precision
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Strings will be compared byte-by-byte as UTF-8 encoded strings
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Functions will always be completely evaluated
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Invocations of the constant functions are ordered as &lt;false/&gt;
	    &lt; &lt;nil/&gt; &lt; &lt;true/&gt;
	  </para>
	</listitem>
      </itemizedlist>
    </para>
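
    <para>
      As a sketch of two of these rules (the expected results are deduced
      from the list above and are not taken from the specification), both of
      the following would be expected to evaluate to &lt;true/&gt;:
      <programlisting>
&lt;!-- constant functions: false is ordered before nil, nil before true --&gt;
&lt;lt&gt;&lt;false/&gt;&lt;nil/&gt;&lt;true/&gt;&lt;/lt&gt;

&lt;!-- the integer 1 can be cast to the float 1.0 without loss of precision --&gt;
&lt;eq&gt;1 1.0&lt;/eq&gt;<!--
   --></programlisting>
    </para>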

    <simplesect id="idb-comparison-rationale">
      <title>Rationale</title>

      <para>
	The empty comparison functions evaluate to &lt;true/&gt; by analogy
	with the comparison functions.
      </para>

      <para>
	The ordering of the comparison functions is confused in the
	specification, with the examples for &lt;lt&gt; and &lt;gt&gt; agreeing
	with libxexpr's behaviour and the examples for &lt;leq&gt; and
	&lt;geq&gt; doing the opposite. The choice was arbitrary.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-redefining-builtins">
    <title>Redefining Builtin Functions</title>

    <para>
      Attempting to redefine a builtin function results in an error.
    </para>
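
    <para>
      For instance (a sketch only; how the error is reported is not
      described here), a definition such as the following would be rejected
      because &lt;add&gt; is a builtin:
      <programlisting>
&lt;define name="add" args="a b"&gt;
  &lt;multiply&gt;&lt;a/&gt;&lt;b/&gt;&lt;/multiply&gt;
&lt;/define&gt;<!--
   --></programlisting>
    </para>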

    <simplesect id="idb-redefining-builtins-rationale">
      <title>Rationale</title>

      <para>
	While this could be implemented, there is no mention of it in the
	specification and it would complicate the implementation with no
	obvious benefit.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-namespaces">
    <title>Namespaces</title>

    <para>
      libxexpr considers an element's namespace to be part of its name,
      and thus elements in a namespace other than the XEXPR namespace
      are always distinct from functions defined by XEXPR. In addition,
      functions defined using &lt;define&gt; are defined in the
      XEXPR namespace.
    </para>

    <para>
      libxexpr provides hooks for extending the XEXPR language by allowing
      handlers to be installed for other namespaces.
    </para>

    <para>
      Elements which are in no namespace are treated as if they were in the
      XEXPR namespace.
    </para>
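
    <para>
      A sketch of how these rules combine. Both namespace URIs below are
      placeholders invented for illustration: urn:example:xexpr stands in
      for the real XEXPR namespace and urn:example:ext for a namespace
      served by an installed handler.
      <programlisting>
&lt;x:expr xmlns:x="urn:example:xexpr" xmlns:ext="urn:example:ext"&gt;
  &lt;!-- in the (placeholder) XEXPR namespace: the builtin add --&gt;
  &lt;x:add&gt;1 2&lt;/x:add&gt;
  &lt;!-- in no namespace: treated as if it were the XEXPR add --&gt;
  &lt;add&gt;1 2&lt;/add&gt;
  &lt;!-- in another namespace: distinct from add, left to the handler for that namespace --&gt;
  &lt;ext:add&gt;1 2&lt;/ext:add&gt;
&lt;/x:expr&gt;<!--
   --></programlisting>
    </para>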

    <simplesect id="idb-namespaces-rationale">
      <title>Rationale</title>

      <para>
	Being able to extend the XEXPR language is vital for it to be useful
	and namespaces are the obvious way to do this.
      </para>
    </simplesect>
  </sect1>
</chapter>
