XMLProbe 1.5 provides XPath variables with both global and local scope.
Depending on the size and nature of the XML documents evaluated using XMLProbe, performance may be enhanced by using global XPath variables.
Performance degradation in (unoptimised) XPath processors often occurs where large node-sets are evaluated against one another, for example:
//foo[ . = //bar[@blort] ]
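In this case, the abbreviation // (short for /descendant-or-self::node()/) occurs both in the main part of the expression and in the predicate. XPath expressions of this sort can exhibit quadratic behaviour, because an unoptimised processor may re-evaluate the predicate's node-set for every candidate node.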
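The cost difference can be illustrated outside XMLProbe. The following is a minimal Python sketch using only the standard library; it is not XMLProbe's implementation, and the sample document and function names are invented for illustration. It contrasts re-scanning the tree for bar[@blort] once per foo with evaluating that node-set a single time and reusing it.

```python
import xml.etree.ElementTree as ET

# Invented sample document, for illustration only.
DOC = (
    "<root><foo>a</foo><foo>b</foo>"
    "<bar blort='1'>b</bar><bar>a</bar><bar blort='2'>c</bar></root>"
)

def string_value(el):
    """Approximate the XPath string-value of an element."""
    return "".join(el.itertext())

def naive(root):
    """Re-scan the whole tree for bar[@blort] once per foo (quadratic)."""
    hits = []
    for foo in root.iter("foo"):
        bars = [b for b in root.iter("bar") if "blort" in b.attrib]
        if any(string_value(foo) == string_value(b) for b in bars):
            hits.append(foo)
    return hits

def cached(root):
    """Evaluate bar[@blort] once, then reuse the result for every foo."""
    bar_values = {
        string_value(b) for b in root.iter("bar") if "blort" in b.attrib
    }
    return [foo for foo in root.iter("foo") if string_value(foo) in bar_values]

root = ET.fromstring(DOC)
naive_hits = [string_value(f) for f in naive(root)]
cached_hits = [string_value(f) for f in cached(root)]
```

Both functions select the same foo elements; only the number of tree traversals differs.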
The syntax to declare an XPath variable in XMLProbe is as follows (expressed in DTD syntax):
<!ELEMENT probe:variable (probe:name, (probe:literal|probe:eval))>
<!ELEMENT probe:name (#PCDATA)>
<!ELEMENT probe:literal (#PCDATA)>
<!ELEMENT probe:eval (#PCDATA)>
The name of a variable should be a valid XPath name.
The eval element should contain a valid XPath expression. If the expression evaluates successfully, its result is assigned to the variable.
The literal element is provided as a convenience for those cases where a literal string is required for the value of a variable.
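To make the content model concrete, here is a small Python sketch that reads probe:variable declarations and checks them against the DTD fragment above: a name followed by exactly one of literal or eval. The namespace URI, the wrapping rules element and the variable named greeting are hypothetical placeholders; this section does not give XMLProbe's real namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; the real probe namespace URI is not given here.
PROBE_NS = "urn:example:probe"

SAMPLE = f"""
<rules xmlns:probe="{PROBE_NS}">
  <probe:variable>
    <probe:name>bar-blorts</probe:name>
    <probe:eval>//bar[@blort]</probe:eval>
  </probe:variable>
  <probe:variable>
    <probe:name>greeting</probe:name>
    <probe:literal>hello</probe:literal>
  </probe:variable>
</rules>
"""

def qname(local):
    """Build the ElementTree form of a namespaced tag: {uri}local."""
    return f"{{{PROBE_NS}}}{local}"

def parse_variable(elem):
    """Return (name, kind, value) or raise ValueError if the model is violated."""
    children = [(c.tag, (c.text or "").strip()) for c in elem]
    if len(children) != 2 or children[0][0] != qname("name"):
        raise ValueError("expected probe:name followed by one value element")
    tag, value = children[1]
    if tag == qname("literal"):
        kind = "literal"
    elif tag == qname("eval"):
        kind = "eval"
    else:
        raise ValueError("value must be probe:literal or probe:eval")
    return children[0][1], kind, value

root = ET.fromstring(SAMPLE)
variables = [parse_variable(v) for v in root.iter(qname("variable"))]
```

Each declaration is reduced to a (name, kind, value) tuple, so a caller can tell a literal string from an expression that still has to be evaluated.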
Global XPath variables may be included in the ruleset anywhere the SILCN grammar allows, except within a silcn:expression (a variable instruction in this location is evaluated as a local variable). Each variable must have a valid XPath name and either an XPath expression or a literal value. For example:
<silcn:set-criterion>
  <silcn:id>variable-test</silcn:id>
  <silcn:expression>//foo[ . = $bar-blorts ]</silcn:expression>
  <probe:variable>
    <probe:name>bar-blorts</probe:name>
    <probe:eval>//bar[@blort]</probe:eval>
  </probe:variable>
  <probe:message>found a foo equal to a bar/@blort</probe:message>
</silcn:set-criterion>
XMLProbe evaluates all global variables and caches the results before evaluating the individual QA rules. Global variables may be referenced from within any rule in the ruleset.
As of XMLProbe 1.5, it is an error to declare more than one global variable with the same name.
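The evaluate-once-then-cache behaviour and the duplicate-name rule can be modelled in a few lines. This is a rough Python sketch, not XMLProbe's code: the function name evaluate_globals is invented, and ElementTree's findall stands in for a full XPath engine (it supports only a limited subset of XPath, hence the relative .// paths).

```python
import xml.etree.ElementTree as ET

def evaluate_globals(root, declarations):
    """Evaluate each (name, path) pair once and cache the resulting node list.

    Raises ValueError if two declarations share a name, mirroring the
    rule that duplicate global variable names are an error.
    """
    cache = {}
    for name, path in declarations:
        if name in cache:
            raise ValueError(f"duplicate global variable: {name}")
        cache[name] = root.findall(path)
    return cache

# Invented sample document, for illustration only.
doc = ET.fromstring("<root><bar blort='1'>x</bar><bar>y</bar></root>")

# All globals are evaluated up front; rules would then read from the cache.
globals_cache = evaluate_globals(doc, [("bar-blorts", ".//bar[@blort]")])

try:
    evaluate_globals(doc, [("v", ".//bar"), ("v", ".//foo")])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Because the cache is filled before any rule runs, every rule that refers to the same name sees the same pre-computed node list.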
When employed to improve efficiency, global variables are best used where large node-sets are reused within a ruleset. While it is tempting to include XPath variables wherever possible throughout a ruleset, every global variable is evaluated and cached before any rule runs, so a point may be reached where this kind of optimisation is no longer beneficial.
Local variables are XPath variables which are local in scope to the XPath expression being evaluated. They are useful in writing expressions where a reference is made to something evaluated in an earlier part of an expression.
For example, take this rule:
"Locate any element whose ID is referenced by the target attribute of an xref element, and whose string value is the same as that of the xref element which refers to it."
When couched in a single XPath expression, this rule is complicated by the call to id(), which shifts the evaluation context to the referenced element. Expressions like this (and often those with similar calls to document()) reach a dead-end when one is compelled to write:
//xref[ id( @target )[ . = . ] ]
The comparison ". = ." will always evaluate true, because the context node (.) is now the referenced element, and not the xref.
Using a local variable here allows the earlier string value – that of the xref – to be stored for evaluation against the string value of the referenced element:
//xref
<probe:variable>
  <probe:name>val</probe:name>
  <probe:eval>.</probe:eval>
</probe:variable>
[ id( @target )[ $val = . ] ]
In this case, the variable ($val) is evaluated against each node in the most recently evaluated node-set (//xref). The remainder of the expression ([ id( @target )[ $val = . ] ]) is then evaluated as though it had immediately followed //xref, and the comparison of values works as intended.
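The effect of binding $val before the context shifts can be mimicked in plain Python. This is an illustrative sketch only: the sample document is invented, and an attribute literally named id is assumed to play the role of the DTD-declared ID that id() would consult.

```python
import xml.etree.ElementTree as ET

# Invented sample document; "id" is assumed to be the ID attribute.
DOC = """
<doc>
  <sect id="s1">Introduction</sect>
  <sect id="s2">Methods</sect>
  <xref target="s1">Introduction</xref>
  <xref target="s2">See methods</xref>
</doc>
"""

def string_value(el):
    """Approximate the XPath string-value of an element."""
    return "".join(el.itertext())

root = ET.fromstring(DOC)

# Stand-in for id(): map each ID value to its element.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

matches = []
for xref in root.iter("xref"):
    # $val is bound while the context node is still the xref ...
    val = string_value(xref)
    # ... then id(@target) moves the context to the referenced element.
    target = by_id.get(xref.get("target"))
    if target is not None and val == string_value(target):
        matches.append(xref.get("target"))
```

Only the first xref matches, because its text equals that of the element it points to. Without the saved value, the only comparison available inside the inner step would be the target against itself.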
Local variables are in scope for the silcn:set-criterion in which they are declared.
It is an error for a local variable to have the same name as a global variable.
It is an error for multiple local variables within the same silcn:set-criterion to share the same name.
Local variables which are in scope for the current silcn:set-criterion may be referenced in probe:eval statements in constructing a probe:message. For example:
<probe:message>my local variable=<probe:eval>$var</probe:eval></probe:message>
The same syntax as for global variables should be used to declare a local variable.
Note that it is an error if a local variable is declared in final position within the silcn:expression.
The probe:variable element must be located at a point within a silcn:expression where the expression assembled so far is self-contained and valid, and also evaluates to a node-set.
For instance, the following use of a local variable is incorrect because the expression before the variable declaration - "//para[" - is incomplete:
<!--INCORRECT!-->
//para[
<probe:variable>
  <probe:name>bad-variable</probe:name>
  <probe:eval>.</probe:eval>
</probe:variable>
starts-with( $bad-variable, 'Incorrect' ) ]
This next example is also wrong, because the expression "string( //para )", against which the variable $another-bad-variable is to be evaluated, evaluates to a string rather than a node-set:
<!--INCORRECT AGAIN!-->
string( //para )
<probe:variable>
  <probe:name>another-bad-variable</probe:name>
  <probe:eval>.</probe:eval>
</probe:variable>
[ $another-bad-variable = 'so very wrong' ]
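Two of the placement rules above lend themselves to a mechanical check. The Python sketch below is a rough heuristic, not XMLProbe's validator: it flags a variable whose preceding expression text has unbalanced brackets or parentheses, and a variable declared in final position. It does not detect the string( //para ) case, since that requires knowing the type the expression evaluates to. The namespace URIs and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the real ones are not given in this section.
NS = {"silcn": "urn:example:silcn", "probe": "urn:example:probe"}

VAR = (
    "<probe:variable><probe:name>v</probe:name>"
    "<probe:eval>.</probe:eval></probe:variable>"
)

def wrap(body):
    """Wrap mixed XPath/XML content in a silcn:expression element."""
    return (
        f'<silcn:expression xmlns:silcn="{NS["silcn"]}" '
        f'xmlns:probe="{NS["probe"]}">{body}</silcn:expression>'
    )

def balanced(text):
    """Crude completeness test: bracket and parenthesis counts match."""
    return (text.count("[") == text.count("]")
            and text.count("(") == text.count(")"))

def check_placement(expression_xml):
    """Return a list of placement problems found in one silcn:expression."""
    expr = ET.fromstring(expression_xml)
    variables = expr.findall("probe:variable", NS)
    problems = []
    prefix = expr.text or ""  # expression text before the first variable
    for var in variables:
        if not balanced(prefix):
            problems.append("expression before variable is incomplete")
        prefix += var.tail or ""  # text between this variable and the next
    if variables and not (variables[-1].tail or "").strip():
        problems.append("variable declared in final position")
    return problems

ok = check_placement(wrap("//xref" + VAR + "[ id( @target )[ $v = . ] ]"))
incomplete = check_placement(
    wrap("//para[ " + VAR + " starts-with( $v, 'Incorrect' ) ]")
)
final = check_placement(wrap("//para" + VAR))
```

Counting brackets is deliberately simple and would be fooled by brackets inside string literals; a real check would need to parse the XPath.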