<?xml version="1.0"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
               "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"
[
]>
<!--
  docs/reference/implementation_defined_behaviour.xml
  author: ali <ali@juiblex.co.uk>
  date: Wed Oct 10 22:58:48 2012 +0100 (2012-10-10)
  changeset 1 fe592b4168f3
-->
<chapter id="implementation-defined-behaviour">
  <title>Implementation Defined Behaviour</title>

  <para>
    The specification of the XEXPR language is laid out in a W3C note of
    <ulink url="http://www.w3.org/TR/2000/NOTE-xexpr-20001121">21 November
    2000</ulink>. However, the specification leaves quite a bit of
    information to be deduced from the examples and leaves other parts of
    the language loosely specified. This chapter documents the way that
    libxexpr implements the language and gives the rationale for each
    decision taken.
  </para>

  <sect1 id="idb-numbers">
    <title>Numbers</title>

    <para>
      Numbers are defined in pseudo-BNF as:
      <programlisting>
number         : whitespace sign simple-number whitespace
               ;

whitespace     : [ \t\n]*
               ;

sign           : [+-]?
               ;

simple-number  : 0x[0-9A-Fa-f]+
               | [0-9]+
               | [0-9]+\.[0-9]+
               | [0-9]+\.[0-9]+[eE][+-][0-9]+
               ;<!--
   --></programlisting>
      that is, they have an optional leading sign and may be surrounded by
      whitespace.
    </para>
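
    <para>
      As an illustration only, and reading the pseudo-BNF above literally,
      each of the following binding values would match the number rule
      (&lt;func&gt; is just a placeholder name):
      <programlisting>
&lt;func x="42"/&gt;
&lt;func x=" -7 "/&gt;
&lt;func x="+0x1F"/&gt;
&lt;func x="3.25"/&gt;
&lt;func x="14.0e-1"/&gt;<!--
   --></programlisting>
      whereas these values would not, because the grammar as written has no
      rule for a missing integer or fractional part, for an upper-case
      hexadecimal prefix or for an exponent without a sign:
      <programlisting>
&lt;func x=".5"/&gt;
&lt;func x="5."/&gt;
&lt;func x="0X1F"/&gt;
&lt;func x="1.5e3"/&gt;<!--
   --></programlisting>
      This is a sketch derived from the grammar alone, not a statement about
      every input that libxexpr's parser accepts in practice.
    </para>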

    <simplesect id="idb-numbers-rationale">
      <title>Rationale</title>

      <para>
        While negative numbers can be created using &lt;subtract&gt; without
        the need for any signs, this seems overly cumbersome.
      </para>

      <para>
        The examples in the specification make it clear that where two numbers
        are separated by a space, this should be parsed as just two numbers
        and not as two numbers plus an intervening string, which a strict
        reading of the specification would imply.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-bindings">
    <title>Bindings</title>

    <para>
      Bindings are parsed as integers, floats or strings in that order (i.e.,
      the first type that matches is used). Thus the following pairs
      of expressions are equivalent:
      <programlisting>
&lt;func x="+01 "/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;integer&gt;1&lt;/integer&gt;&lt;/define&gt;
&lt;/func&gt;

&lt;func x=" 14.0e-1"/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;float&gt;1.4&lt;/float&gt;&lt;/define&gt;
&lt;/func&gt;

&lt;func x="Hello "/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;string&gt;Hello &lt;/string&gt;&lt;/define&gt;
&lt;/func&gt;<!--
   --></programlisting>
    </para>
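
    <para>
      A further sketch of the same rule, assuming that the whole attribute
      value must match the number grammar before it is treated as a number:
      a value that merely begins with digits would then fall through to the
      string case, so the following pair would be expected to be equivalent:
      <programlisting>
&lt;func x="12 apples"/&gt;

&lt;func&gt;
  &lt;define name="x"&gt;&lt;string&gt;12 apples&lt;/string&gt;&lt;/define&gt;
&lt;/func&gt;<!--
   --></programlisting>
    </para>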
    <simplesect id="idb-bindings-rationale">
      <title>Rationale</title>

      <para>
        This seems to satisfy the principle of least surprise.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-pcdata">
    <title>Parsing of PCDATA</title>

    <para>
      When numbers and strings are mixed in PCDATA, any whitespace surrounding
      the numbers is taken to be part of the numbers rather than the strings.
      Thus the following two expressions are equivalent:
      <programlisting>
&lt;foo&gt;This is the 0xdeadbeef constant.&lt;/foo&gt;

&lt;foo&gt;
  &lt;string&gt;This is the&lt;/string&gt;
  &lt;integer&gt;0xdeadbeef&lt;/integer&gt;
  &lt;string&gt;constant.&lt;/string&gt;
&lt;/foo&gt;<!--
   --></programlisting>
    </para>
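
    <para>
      Combining this rule with the treatment of adjacent numbers described
      under Numbers, PCDATA holding two numbers followed by a word would be
      expected to parse as two integers and one string, with no whitespace-only
      strings in between. The following pair is a sketch of that reading
      rather than an example taken from the specification:
      <programlisting>
&lt;foo&gt;1 2 three&lt;/foo&gt;

&lt;foo&gt;
  &lt;integer&gt;1&lt;/integer&gt;
  &lt;integer&gt;2&lt;/integer&gt;
  &lt;string&gt;three&lt;/string&gt;
&lt;/foo&gt;<!--
   --></programlisting>
    </para>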

    <simplesect id="idb-pcdata-rationale">
      <title>Rationale</title>

      <para>
        This seems more consistent than the alternative with the rule that
        spaces between numbers are not parsed as strings.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-define">
    <title>The &lt;define&gt; Function</title>

    <para>
      The &lt;define&gt; function creates a new function in the environment in
      which it is invoked. This is different from the &lt;set&gt; function,
      which will modify the definition of an existing function if one exists.
      Only if no such function is defined in any of the active environments will
      &lt;set&gt; create a new function (and then in the outermost, or global,
      environment).
    </para>
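
    <para>
      The difference can be sketched as follows, assuming that each
      &lt;expr&gt; introduces its own environment and that the inner
      environment is discarded when the inner &lt;expr&gt; ends. Under those
      assumptions the final &lt;print&gt; would be expected to show 2, and
      &lt;y&gt; would remain defined globally:
      <programlisting>
&lt;expr&gt;
  &lt;define name="x"&gt;1&lt;/define&gt;
  &lt;expr&gt;
    &lt;!-- x exists in an active environment, so it is modified there --&gt;
    &lt;set name="x"&gt;2&lt;/set&gt;
    &lt;!-- y exists nowhere, so it is created in the global environment --&gt;
    &lt;set name="y"&gt;3&lt;/set&gt;
    &lt;!-- creates a new x in the inner environment only --&gt;
    &lt;define name="x"&gt;4&lt;/define&gt;
  &lt;/expr&gt;
  &lt;print&gt;&lt;x/&gt;&lt;/print&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>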

    <simplesect id="idb-define-rationale">
      <title>Rationale</title>

      <para>
        We know from the examples in the specification (e.g., in
        <ulink url="http://www.w3.org/TR/xexpr/#id-0045">section 45</ulink>)
        that &lt;subtract&gt; changes the definition of its first argument
        in at least the grandparent environment. It makes sense that &lt;set&gt;
        should do the same. When we come to &lt;define&gt;, however, we
        know from <ulink url="http://www.w3.org/TR/xexpr/#id-0003">section
        3</ulink> that it is equivalent to an attribute on the parent element,
        and so it makes sense that it should create a variable in the parent
        environment.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-get">
    <title>The &lt;get&gt; Function</title>

    <para>
      The following two expressions are equivalent:
      <programlisting>
&lt;get name="x"/&gt;
&lt;get&gt;x&lt;/get&gt;<!--
   --></programlisting>
      The expression &lt;x/&gt; has the same effect except in the case of
      &lt;add&gt; and &lt;subtract&gt;, where these two expressions are
      different:
      <programlisting>
&lt;add&gt;&lt;x/&gt;1&lt;/add&gt;
&lt;add&gt;&lt;get&gt;x&lt;/get&gt;1&lt;/add&gt;<!--
   --></programlisting>
      The first changes the definition of &lt;x&gt;; the second does not.
    </para>

    <para>
      Note that IDs are allowed to start with the dot (.) and hyphen (-)
      characters, which are not valid as the first character of an XML
      element name. Thus &lt;get&gt; must be used in the following:
      <programlisting>
&lt;expr&gt;
  &lt;define name=".net"&gt;4.5.50709&lt;/define&gt;
  &lt;print&gt;&lt;get&gt;.net&lt;/get&gt;&lt;/print&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>

    <para>
      Since &lt;get&gt; returns a function definition (just like
      &lt;define&gt;), it is possible to define functions with such names that
      take arguments and even to invoke them, in a somewhat circuitous manner:
      <programlisting>
&lt;expr&gt;
  &lt;define name=".product" args="a b c d"&gt;
    &lt;add&gt;
      &lt;multiply&gt;
        &lt;a/&gt;
        &lt;b/&gt;
      &lt;/multiply&gt;
      &lt;multiply&gt;
        &lt;c/&gt;
        &lt;d/&gt;
      &lt;/multiply&gt;
    &lt;/add&gt;
  &lt;/define&gt;

  &lt;expr&gt;
    &lt;define name="closure"/&gt;
    &lt;set name="closure"&gt;
      &lt;get&gt;.product&lt;/get&gt;
    &lt;/set&gt;
    &lt;closure&gt;1 2 3 4&lt;/closure&gt;
  &lt;/expr&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>

    <simplesect id="idb-get-rationale">
      <title>Rationale</title>

      <para>
        <ulink url="http://www.w3.org/TR/xexpr/#id-0014">Section 14</ulink>
        tells us that &lt;get&gt;x&lt;/get&gt; and &lt;x/&gt; have the same
        effect in most cases (and thus presumably not all cases) and it
        would seem surprising if &lt;get&gt; were not to insulate a
        function in this manner.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-arithmetic">
    <title>Arithmetic Operators</title>

    <para>
      The empty arithmetic operators (&lt;add/&gt;, &lt;subtract/&gt;,
      &lt;multiply/&gt; and &lt;divide/&gt;) all evaluate to &lt;nil/&gt;.
    </para>

    <para>
      The &lt;add&gt; and &lt;subtract&gt; operators change their first
      argument in some circumstances, as in this example from the specification:
      <programlisting>
&lt;while&gt;
  &lt;gt&gt;&lt;x/&gt; 0&lt;/gt&gt;
  &lt;expr&gt;
    &lt;print newline="true"&gt;&lt;x/&gt;&lt;/print&gt;
    &lt;subtract&gt;&lt;x/&gt; 1&lt;/subtract&gt;
  &lt;/expr&gt;
&lt;/while&gt;<!--
   --></programlisting>
    </para>

    <para>
      In general, the first argument will be modified if it is a function
      invocation that has no bindings and no arguments. Thus the following
      will print 9:
      <programlisting>
&lt;define name="x"&gt;&lt;multiply&gt;2 3&lt;/multiply&gt;&lt;/define&gt;
&lt;add&gt;&lt;x/&gt;3&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
      whereas this will print 6:
      <programlisting>
&lt;define name="x"&gt;&lt;multiply&gt;2 3&lt;/multiply&gt;&lt;/define&gt;
&lt;add&gt;&lt;x unused=""/&gt;3&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
    </para>

    <para>
      Where arguments are modified, this occurs as the arguments are being
      evaluated. Thus this expression:
      <programlisting>
&lt;add&gt;&lt;x/&gt;&lt;x/&gt;&lt;x/&gt;&lt;/add&gt;<!--
   --></programlisting>
      will multiply &lt;x&gt; by 4 rather than by 3.
    </para>
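
    <para>
      As a worked instance, suppose &lt;x&gt; is first defined as 5. One
      reading consistent with the rule above is that &lt;x&gt; becomes
      5 + 5 = 10 once the second argument has been added, and 10 + 10 = 20
      once the third has, because the third argument already sees the modified
      &lt;x&gt;. The following would therefore be expected to print 20 (that
      is, 4 × 5) rather than 15:
      <programlisting>
&lt;define name="x"&gt;5&lt;/define&gt;
&lt;add&gt;&lt;x/&gt;&lt;x/&gt;&lt;x/&gt;&lt;/add&gt;
&lt;print&gt;&lt;x/&gt;&lt;/print&gt;<!--
   --></programlisting>
      The intermediate values are an interpretation of the rule; only the
      overall factor of 4 is stated above.
    </para>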

    <simplesect id="idb-arithmetic-rationale">
      <title>Rationale</title>

      <para>
        The examples in the specification imply that &lt;add&gt; and
        &lt;subtract&gt; modify their first argument when it is a
        variable. The iterative example in
        <ulink url="http://www.w3.org/TR/xexpr/#id-0008">section 8</ulink>
        wouldn't work if &lt;multiply&gt; worked the same way (and the
        definition of &lt;2pi&gt; in <ulink
        url="http://www.w3.org/TR/xexpr/#id-0007">section 7</ulink> would
        not be expected to modify the definition of &lt;pi&gt; each time
        it is called). Note that the section 7 example is itself erroneous:
        IDs can't contain numbers.
      </para>

      <para>
        It seems undesirable to modify functions that take arguments.
        Making the decision based on the invocation rather than the
        function definition makes expressions much easier to read.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-comparison">
    <title>Comparison Functions</title>

    <para>
      The empty comparison functions (&lt;eq/&gt;, &lt;neq/&gt;, &lt;leq/&gt;,
      &lt;geq/&gt;, &lt;lt/&gt; and &lt;gt/&gt;) and comparison functions with
      exactly one argument all evaluate to &lt;true/&gt;.
    </para>

    <para>
      The ordered comparison functions (&lt;leq&gt;, &lt;geq&gt;, &lt;lt&gt;
      and &lt;gt&gt;) act as if the equivalent mathematical operator were
      inserted between their arguments. Thus:
      <programlisting>
&lt;lt&gt;
  1 2 3
&lt;/lt&gt;<!--
   --></programlisting>
      is equivalent to the mathematical expression:
      <screen>1 &lt; 2 &lt; 3</screen>
      and:
      <programlisting>
&lt;leq&gt;
  1 2 3
&lt;/leq&gt;<!--
   --></programlisting>
      is equivalent to the mathematical expression:
      <screen>1 ≤ 2 ≤ 3</screen>
    </para>
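
    <para>
      By the same reading, a chain in which any adjacent pair fails the test
      would be expected to evaluate to &lt;false/&gt;. As a sketch, this
      expression:
      <programlisting>
&lt;lt&gt;
  1 3 2
&lt;/lt&gt;<!--
   --></programlisting>
      corresponds to the mathematical expression:
      <screen>1 &lt; 3 &lt; 2</screen>
      which does not hold, because 3 is not less than 2, even though the
      first and last arguments are in ascending order.
    </para>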

    <para>
      When comparing objects of different types:
      <itemizedlist>
        <listitem>
          <para>
            Numbers will be implicitly cast between &lt;float&gt; and
            &lt;integer&gt; where that involves no loss of precision.
          </para>
        </listitem>
        <listitem>
          <para>
            Strings will be compared byte-by-byte as UTF-8 encoded strings.
          </para>
        </listitem>
        <listitem>
          <para>
            Functions will always be completely evaluated.
          </para>
        </listitem>
        <listitem>
          <para>
            Invocations of the constant functions are ordered as &lt;false/&gt;
            &lt; &lt;nil/&gt; &lt; &lt;true/&gt;.
          </para>
        </listitem>
      </itemizedlist>
    </para>
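
    <para>
      Two sketches of how these rules might combine with the chained
      comparisons described earlier. Both expressions below would be expected
      to evaluate to &lt;true/&gt;: the first because the constant functions
      appear in their stated order, the second on the assumption that the
      integer 1 and the float 1.0 convert to one another without loss of
      precision:
      <programlisting>
&lt;lt&gt;&lt;false/&gt;&lt;nil/&gt;&lt;true/&gt;&lt;/lt&gt;
&lt;eq&gt;1 1.0&lt;/eq&gt;<!--
   --></programlisting>
    </para>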

    <simplesect id="idb-comparison-rationale">
      <title>Rationale</title>

      <para>
        The empty comparison functions evaluate to &lt;true/&gt; by analogy
        with the comparison functions.
      </para>

      <para>
        The ordering of the comparison functions is confused in the
        specification, with the examples for &lt;lt&gt; and &lt;gt&gt; agreeing
        with libxexpr's behaviour and the examples for &lt;leq&gt; and
        &lt;geq&gt; doing the opposite. The choice was arbitrary.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-redefining-builtins">
    <title>Redefining Built-in Functions</title>

    <para>
      Attempting to redefine a built-in function results in an error.
    </para>

    <simplesect id="idb-redefining-builtins-rationale">
      <title>Rationale</title>

      <para>
        While this could be implemented, there is no mention of it in the
        specification and it would complicate the implementation with no
        obvious benefit.
      </para>
    </simplesect>
  </sect1>

  <sect1 id="idb-namespaces">
    <title>Namespaces</title>

    <para>
      libxexpr considers an element's namespace to be part of its name,
      and thus elements in a namespace other than the XEXPR namespace
      are always distinct from functions defined by XEXPR. In addition,
      functions defined using &lt;define&gt; are defined in the
      XEXPR namespace.
    </para>

    <para>
      libxexpr provides hooks for extending the XEXPR language by allowing
      handlers to be installed for other namespaces.
    </para>

    <para>
      Elements which are in no namespace are treated as if they were in the
      XEXPR namespace.
    </para>
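
    <para>
      A minimal sketch of the distinction, in which the
      <quote>ext</quote> prefix and its namespace URI are purely hypothetical.
      The first &lt;print&gt; below is in no namespace and so is treated as the
      XEXPR function of that name, whereas &lt;ext:print&gt; is a different
      name altogether and would only have meaning if a handler had been
      installed for its namespace:
      <programlisting>
&lt;expr xmlns:ext="http://example.org/ext"&gt;
  &lt;print&gt;Hello&lt;/print&gt;
  &lt;ext:print&gt;Hello&lt;/ext:print&gt;
&lt;/expr&gt;<!--
   --></programlisting>
    </para>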

    <simplesect id="idb-namespaces-rationale">
      <title>Rationale</title>

      <para>
        Being able to extend the XEXPR language is vital for it to be useful
        and namespaces are the obvious way to do this.
      </para>
    </simplesect>
  </sect1>
</chapter>