diff -r 000000000000 -r fe592b4168f3 docs/reference/implementation_defined_behaviour.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/docs/reference/implementation_defined_behaviour.xml Wed Oct 10 22:58:48 2012 +0100 @@ -0,0 +1,415 @@ + + + + Implementation Defined Behaviour + + + The specification of the XEXPR language is laid out in a W3C note of + 21 November + 2000. However, the specification leaves quite a bit of + information to be deduced from the examples and leaves other parts of + the language loosely specified. This chapter documents the way that + libxexpr implements the language and gives the rationale for each + decision taken. + + + + Numbers + + + Numbers are defined in pseudo-BNF as: + +number : whitespace sign simple-number whitespace + ; + +whitespace : [ \t\n]* + ; + +sign : [+-]? + ; + +simple-number : 0x[0-9A-Fa-f]+ + | [0-9]+ + | [0-9]+\.[0-9]+ + | [0-9]+\.[0-9]+[eE][+-][0-9]+ + ; + that is: they have an optional leading sign and may be surrounded by + whitespace. + + + + Rationale + + + While negative numbers can be created using <subtract> without + the need for any signs, this seems overly cumbersome. + + + + The examples in the specification make it clear that where two numbers + are separated by a space, this should be parsed as just two numbers + and not two numbers plus an intervening string, which a strict reading + of the specification would imply. + + + + + + Bindings + + + Bindings are parsed as integers, floats or strings in that order (i.e., + the first type that matches will be used). Thus the following pairs + of expressions are equivalent: + +<func x="+01 "/> + +<func> + <define name="x"><integer>1</integer></define> +</func> + +<func x=" 14.0e-1"/> + +<func> + <define name="x"><float>1.4</float></define> +</func> + +<func x="Hello "/> + +<func> + <define name="x"><string>Hello </string></define> +</func> + + + Rationale + + + This seems to satisfy the principle of least surprise. 
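The integer, float, string ordering described for bindings can be sketched in a few lines of Python. This is only an illustration of the pseudo-BNF given under Numbers, not libxexpr's actual code: the function name `parse_binding` and the tuple it returns are assumptions made for this example.

```python
import re

# Optional surrounding whitespace, as in the pseudo-BNF "whitespace" rule.
_WS = r"[ \t\n]*"

# Integer forms: hexadecimal (0x...) or plain decimal digits.
_INT = re.compile(rf"{_WS}([+-]?)(0x[0-9A-Fa-f]+|[0-9]+){_WS}")

# Float forms: digits '.' digits, optionally followed by a signed exponent.
_FLOAT = re.compile(rf"{_WS}([+-]?[0-9]+\.[0-9]+(?:[eE][+-][0-9]+)?){_WS}")


def parse_binding(text):
    """Classify a binding value (hypothetical helper, not a libxexpr API).

    Tries integer first, then float, and falls back to string, so the
    first type that matches the whole value is used.
    """
    match = _INT.fullmatch(text)
    if match:
        sign, digits = match.groups()
        base = 16 if digits.startswith("0x") else 10
        value = int(digits, base)
        return ("integer", -value if sign == "-" else value)
    match = _FLOAT.fullmatch(text)
    if match:
        return ("float", float(match.group(1)))
    # Strings keep their whitespace; only numbers absorb it.
    return ("string", text)
```

With this sketch, `parse_binding("+01 ")` yields an integer, `parse_binding(" 14.0e-1")` a float, and `parse_binding("Hello ")` a string that retains its trailing space, matching the three equivalent pairs shown above.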
+ + + + + + Parsing of PCDATA + + + When numbers and strings are mixed in PCDATA, any whitespace surrounding + the numbers is taken to be part of the numbers rather than the strings. + Thus the following two expressions are equivalent: + +<foo>This is the 0xdeadbeef constant.</foo> + +<foo> + <string>This is the</string> + <integer>0xdeadbeef</integer> + <string>constant.</string> +</foo> + + + + Rationale + + + This seems more consistent with spaces between numbers not being + parsed as strings than the alternative. + + + + + + The <define> Function + + + The <define> function creates a new function in the environment in + which it is invoked. This is different from the <set> function, + which will modify the definition of an existing function if one exists. + Only if no such function is defined in any of the active environments will + <set> create a new function (and then in the outermost, or global, + environment). + + + + Rationale + + + We know from the examples in the specification (e.g., in + section 45) + that <subtract> changes the definition of its first argument + in at least the grandfather environment. It makes sense that <set> + should do the same. When we come to <define>, however, we + know from section + 3 that it is equivalent to an attribute on the parent element + and so it makes sense that it should create a variable in the parent + environment. + + + + + + The <get> Function + + + The following two expressions are equivalent: + +<get name="x"/> +<get>x</get> + The expression <x/> has the same effect except in the case of + <add> and <subtract>, where these two expressions are + different: + +<add><x/>1</add> +<add><get>x</get>1</add> + The first changes the definition of <x>; the second does not. + + + + Note that IDs are allowed to start with the dot (.) and hyphen (-) + characters, which are not valid as the first character in XML tags. 
+ Thus <get> must be used in the following: + +<expr> + <define name=".net">4.5.50709</define> + <print><get>.net</get></print> +</expr> + + + + Since <get> returns a function definition (just like + <define>), it is possible to define functions of this type that + take arguments and even invoke them in a somewhat circuitous manner: + +<expr> + <define name=".product" args="a b c d"> + <add> + <multiply> + <a/> + <b/> + </multiply> + <multiply> + <c/> + <d/> + </multiply> + </add> + </define> + + <expr> + <define name="closure"/> + <set name="closure"> + <get>.product</get> + </set> + <closure>1 2 3 4</closure> + </expr> +</expr> + + + + Rationale + + + Section 14 + tells us that <get>x</get> and <x/> have the same + effect in most cases (and thus presumably not all cases) and it + would seem surprising if <get> were not to insulate a + function in this manner. + + + + + + Arithmetic Operators + + + The empty arithmetic operators (<add/>, <subtract/>, + <multiply/> and <divide/>) all evaluate to <nil/>. + + + + The <add> and <subtract> operators change their first + argument in some circumstances, as in this example from the specification: + +<while> + <gt><x/> 0</gt> + <expr> + <print newline="true"><x/></print> + <subtract><x/> 1</subtract> + </expr> +</while> + + + + In general, the first argument will be modified if it is a function + invocation that has no bindings and no arguments. Thus the following + will print 9: + +<define name="x"><multiply>2 3</multiply></define> +<add><x/>3</add> +<print><x/></print> + whereas this will print 6: + +<define name="x"><multiply>2 3</multiply></define> +<add><x unused=""/>3</add> +<print><x/></print> + + + + Where arguments are modified, this occurs as the arguments are being + evaluated. Thus this expression: + +<add><x/><x/><x/></add> + will multiply <x> by 4 rather than by 3. + + + + Rationale + + + The examples in the specification imply that <add> and + <subtract> modify their first argument when it is a + variable. 
The iterative example in + section 8 + wouldn't work if <multiply> worked the same way (and the + definition of <2pi> in section 7 would + not be expected to modify the definition of <pi> each time + it is called). Note that this example is erroneous: IDs can't contain + numbers. + + + + It seems undesirable to modify functions that take arguments. + Making the decision based on the invocation rather than the + function definition makes expressions much easier to read. + + + + + + Comparison Functions + + + The empty comparison functions (<eq/>, <neq/>, <leq/>, + <geq/>, <lt/> and <gt/>) and comparison functions with + exactly one argument all evaluate to <true/>. + + + + The ordered comparison functions (<leq>, <geq>, <lt> + and <gt>) act as if the equivalent mathematical operator were + inserted between their arguments. Thus: + +<lt> + 1 2 3 +</lt> + is equivalent to the mathematical expression: + 1 < 2 < 3 + and: + +<leq> + 1 2 3 +</leq> + is equivalent to the mathematical expression: + 1 ≤ 2 ≤ 3 + + + + When comparing objects of different types: + + + Numbers will be implicitly cast between <float> and + <integer> where that involves no loss of precision + + + Strings will be compared byte-by-byte as UTF-8 encoded strings + + + Functions will always be completely evaluated + + + Invocations of the constant functions are ordered as <false/> + < <nil/> < <true/> + + + + + + Rationale + + + The empty comparison functions evaluate to <true> by analogy + with the comparison functions. + + + + The ordering of the comparison functions is confused in the + specification, with the examples for <lt> and <gt> agreeing + with libxexpr's behaviour and the examples for <leq> and + <geq> doing the opposite. The choice was arbitrary. + + + + + + Redefining Builtin Functions + + + Attempting to redefine a builtin function results in an error. 
+ + + + Rationale + + + While this could be implemented, there is no mention of it in the + specification and it would complicate the implementation with no + obvious benefit. + + + + + + Namespaces + + + libxexpr considers an element's namespace to be part of its name + and thus elements in a namespace other than the XEXPR namespace + are always distinct from functions defined by XEXPR. In addition, + functions defined using <define> are defined in the + XEXPR namespace. + + + + libxexpr provides hooks for extending the XEXPR language by allowing + handlers to be installed for other namespaces. + + + + Elements which are in no namespace are treated as if they were in the + XEXPR namespace. + + + + Rationale + + + Being able to extend the XEXPR language is vital for it to be useful + and namespaces are the obvious way to do this. + + + +
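The namespace rule can be illustrated with a short Python sketch that treats a name as a (namespace, local name) pair. It does not use libxexpr; the helper `qualified_name` is hypothetical, and both URIs below are placeholders chosen for the example rather than the real XEXPR namespace URI.

```python
import xml.etree.ElementTree as ET

# Placeholder only: an assumed stand-in for the real XEXPR namespace URI.
XEXPR_NS = "urn:example:xexpr"


def qualified_name(element):
    """Return (namespace, local name) for an element (hypothetical helper).

    ElementTree spells namespaced tags as "{uri}local". Elements in no
    namespace are mapped to the XEXPR namespace, as described above.
    """
    tag = element.tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return (namespace, local)
    return (XEXPR_NS, tag)


doc = ET.fromstring(
    '<expr xmlns:ext="urn:example:extension">'
    "<add>1 2</add><ext:add>1 2</ext:add>"
    "</expr>"
)
# The two <add> elements resolve to different names because the
# namespace is part of the name.
names = [qualified_name(child) for child in doc]
```

Under these assumptions, an interpreter would dispatch the first `add` to the builtin and hand the second to whatever handler is installed for the extension namespace.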