Class TXQueryEngine

Unit

Declaration

type TXQueryEngine = class(TObject)

Description

This is the XPath/XQuery engine.

You can use this class to evaluate an XPath/XQuery expression on a document tree.
For example, TXQueryEngine.evaluateStaticXPath2('expression', nil) returns the value of the evaluation of expression.

A simple functional interface is provided by the function query.

Syntax of a XQuery / XPath / Pseudo-XPath-Expression

This query engine currently supports XPath 2.0, XQuery 1.0 and JSONiq, with some extensions and minor deviations, as well as parts of XPath 3.0 and XQuery 3.0.

A formal syntax definition of these languages is given at: http://www.w3.org/TR/xpath20/ , http://www.w3.org/TR/xquery/ , http://www.jsoniq.org/ , http://www.w3.org/TR/xpath-30/ and http://www.w3.org/TR/xquery-30/ .

Some very basic, standard XPath examples, for people who have never seen XPath before:

  • "something" or 'something'
    This returns the string 'something'.

  • $var
    This returns the value of the variable var.

  • a + b
    This returns the numerical sum of a and b.
    Instead of +, you can also use one of the operators -, *, div, idiv, =, !=, <, >, <=, >=, to, or, and, eq, ne, lt, gt, le, ge.

  • 1245.567
    This returns the number 1245.567

  • concat("a","b","c")
    This concatenates the strings a, b and c.
    There are many more functions than concat; a list of extension functions is given below. Standard functions are described at http://www.w3.org/TR/xquery-operators/ and http://www.w3.org/TR/xpath-functions-30/

  • (1,2,3)
    This returns a sequence (1,2,3).
    Sequences cannot be nested.

  • (1,2,3)[. mod 2 = 1]
    This returns the sequence (1,3) of all odd numbers.

  • @attrib
    This is the value of the attribute attrib of the current tag

  • text()
    This returns a sequence of all direct text node children of the current tag (see below)

  • comment()
    This returns a sequence of all direct comment children of the current tag

  • xyz
    This returns a sequence of all direct children of the current tag whose node name is xyz.

  • .//xyz
    This returns a sequence of all descendants of the current tag whose node name is xyz.

  • abc/def
    This returns a sequence of all elements named def whose parent is named abc and is a direct child of the current tag.

  • abc//def
    This returns a sequence of all elements named def that are descendants of an element named abc which is a direct child of the current tag.

  • /html
    This returns the html root tag.

  • /html//text()
    This returns all text nodes in the current html document

  • /html//.[condition]/text()
    This returns all text nodes in the current html document whose parent satisfies condition

  • for $x in seq return $x + 1
    This adds 1 to all elements in the sequence seq

  • some $x in seq satisfies condition
    This returns true iff at least one element of seq satisfies condition.

  • every $x in seq satisfies condition
    This returns true iff every element of seq satisfies condition

  • if (condition) then $x else $y
    This returns $x if condition is true, and $y otherwise

  • function ($a, $b) { $a + $b }
    This returns an anonymous function which adds two numbers.

A complete list of supported functions is given at http://www.benibela.de/documentation/internettools/xpath-functions.html
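
As a minimal sketch, a few of the basic expressions above can be evaluated from Free Pascal with the class method mentioned in the description. The unit name xquery is taken from the rename note further down; the program name and the choice of expressions are only illustrative.

```pascal
program BasicXPathExamples;
{$mode objfpc}{$H+}

uses
  xquery; // unit name assumed from the rename note (xquery.pas)

begin
  // Arithmetic on two numbers, converted to a Pascal integer.
  writeln(TXQueryEngine.evaluateStaticXPath2('1 + 2', nil).toInt64);

  // String concatenation with a standard function.
  writeln(TXQueryEngine.evaluateStaticXPath2('concat("a","b","c")', nil).toString);

  // Filter a sequence with a predicate, then join the remaining items for printing.
  writeln(TXQueryEngine.evaluateStaticXPath2(
    'string-join(for $x in (1,2,3)[. mod 2 = 1] return string($x), ",")', nil).toString);

  // Conditional expression.
  writeln(TXQueryEngine.evaluateStaticXPath2(
    'if (1 < 2) then "yes" else "no"', nil).toString);
end.
```

By XPath semantics the four expressions evaluate to 3, abc, 1,3 and yes. Double quotes are used inside the expressions so that they do not clash with Pascal's single-quoted string literals.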

Differences between this implementation and standard XPath/XQuery (most differences can be turned off with the respective option or the field in the default StaticContext):

Extended syntax:

  • x"something{$var}{1+2+3}..."
    If a string is prefixed with an x, all expressions within {..} braces are evaluated and concatenated to the raw text, similarly to the value of an XQuery direct attribute constructor. (option: extended-strings)

  • var:=value
    This assigns value to the global variable var and returns value.
    For example, ((a := 2) + 3) evaluates to 5 and leaves a variable $a with the value 2.
    $a := 2 is also allowed.
    It can also be used to change object properties, array elements and sequences: $a("property")(1)("foo")[] := 17 appends 17 to {"property": [{"foo": THIS }]}. (Remember that everything is immutable, so this makes a copy, except for objects, which are shared.)

  • All string comparisons are case insensitive, and "clever", e.g. '9xy' = '9XY' < '10XY' < 'xy',
    unless you use collations.

  • The default type system is more weakly typed: most values are automatically converted if necessary, e.g. "1" + 2 returns 3.
    (option: strict-type-checking)

  • If a namespace prefix is unknown, the namespace is resolved using the current context item.
    This basically allows you to do namespace prefix only matching. (option: use-local-namespaces)

  • JSON-objects: JSON/JSONiq objects are supported. (option: json)
    Arrays can be created with [a,b,c]
    Like a sequence they store a list of values, but they can be nested with each other and within sequences.

    Objects can be created with {"foobar": 123, "hallo": "world!", ...}
    They store a set of values as an associative map. The values can be accessed similarly to a function call, e.g. {"name": value, ...}("name"), as documented in the JSONiq extension standard.
    This implementation also provides an alternative property syntax, where these properties can be accessed with the usual OOP property dot syntax, i.e. {"name": 123}.name will evaluate to 123 (can be changed with the option property-dot-notation).
    If an object is assigned to a variable, you can append the dot to the variable name, e.g. let $obj := {"name": 123} return $obj.name. (Drawback: variable names are not allowed to contain dots if this extension is enabled. If the option is set to "unambiguous", the dot operator can only be used in cases where no confusion with variables that have dots in their name can occur, e.g. ($a).b, $a .b or $a."b".)
    Objects are immutable, but the properties of objects that are global variables can seemingly be changed with $obj.property := newvalue. This creates a new object with the name $obj that has all the properties of the old object plus the changed properties.

    Objects can be assigned to each other (e.g. obj1 := {}, obj2 := {}, obj2.prop := 123, obj1.sub := obj2 ).
    Then obj1.sub.prop = 123, but changing obj1.sub.prop won't change obj2.prop (i.e. the objects are always copied, there are no pointers).
    An alternative, older object creation syntax is the object-function (see below).
    By default all values are allowed as object properties. If the option pure-json-objects is enabled, property values are converted to pure JSON types (empty sequence => null, sequences => array, nodes => string).


    The additional module xquery_json implements all JSONiq functions, except JSONiq update and roundtrip-serialization.
    Using it also activates the JSONiq literal mode, in which true, false, null evaluate to true(), false(), jn:null(). (option: json-literals).

  • Element tests based on XML schema types are not supported (since the engine cannot read schemas).

  • Regex remarks (it might be changed to standard compatible matching in future):

    • The usual s/i/m/x-flags are allowed, and you can also use '-g' to disable greedy matching.

    • In replacement strings, $0 and $& stand for the whole match, and $i or ${i} is substituted with the i-th submatch, for any integer i. Therefore $12 is submatch 12, while ${1}2 is submatch 1 followed by the digit 2.

  • Most of these extensions can be disabled with 'declare option pxp:respective-option "off"' (that there are syntax-modifying options is another extension).
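
The extensions listed above can be tried out with a short program such as the following sketch. It assumes that the extensions are enabled by default for the static evaluation functions, which the list suggests but does not state explicitly; the expressions themselves are only illustrative.

```pascal
program ExtensionExamples;
{$mode objfpc}{$H+}

uses
  xquery; // unit name assumed from the rename note (xquery.pas)

begin
  // Extended string: expressions inside {...} are evaluated and concatenated.
  writeln(TXQueryEngine.evaluateStaticXPath2('x"sum = {1+2+3}"', nil).toString);

  // Assignment expression from the list above: assigns 2 to $a, then adds 3.
  writeln(TXQueryEngine.evaluateStaticXPath2('((a := 2) + 3)', nil).toInt64);

  // Weak typing: the string "1" is converted to a number before the addition.
  writeln(TXQueryEngine.evaluateStaticXPath2('"1" + 2', nil).toInt64);

  // Replacement reference: ${1}2 is submatch 1 followed by the digit 2.
  writeln(TXQueryEngine.evaluateStaticXPath2(
    'replace("abc", "(b)", "${1}2")', nil).toString);
end.
```

Going by the descriptions above, the results should be sum = 6, 5, 3 and ab2c. If one of the extensions has been disabled through its option, the corresponding expression is expected to raise an error instead.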

You can look at the unit tests in the tests directory to see many (> 5000) examples.

Using the class in FPC

The simplest way to use it is with the function query and the defaultQueryEngine.
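
A minimal sketch of the functional interface follows. It assumes that query takes the expression as a string and returns an IXQValue, which this page implies but does not spell out.

```pascal
program QueryFunctionExample;
{$mode objfpc}{$H+}

uses
  xquery; // unit name assumed from the rename note (xquery.pas)

begin
  // query is assumed to evaluate the expression with the shared defaultQueryEngine.
  writeln(query('1 + 2').toInt64);
  writeln(query('concat("Hello", " ", "world")').toString);
end.
```

Because all calls go through one shared engine, this form suits quick scripts; code that needs its own settings would rather create a separate TXQueryEngine instance.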

You can evaluate XQuery/XPath expressions by calling the class methods, e.g. TXQueryEngine.evaluateStaticXPath3('expression', nil), which returns the value of the expression, or TXQueryEngine.evaluateStaticXPath2('expression', nil).toInt64, which returns it converted to the corresponding type.
If you want to process an HTML/XML document, you have to pass the root TTreeNode (obtained from a TTreeParser) instead of nil.


If you call TXQueryEngine.evaluateStaticXPath3('expression', nil) without a following toType call, you obtain the result as an IXQValue (see IXQValue on how to use it).
With a toType call it is converted to the corresponding type, e.g. toInt64 returns an int64, toString a string, toNode a TTreeNode and toFloat an extended.
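
The following sketch shows how a parsed document and the toType calls might fit together. The unit name simplehtmltreeparser, the method parseTree and the assumption that the parser owns the trees it creates are not stated on this page and should be checked against the TTreeParser documentation.

```pascal
program DocumentExample;
{$mode objfpc}{$H+}

uses
  simplehtmltreeparser, // assumed unit providing TTreeParser and TTreeNode
  xquery;

var
  parser: TTreeParser;
  root: TTreeNode;
  value: IXQValue;
begin
  parser := TTreeParser.Create;
  try
    // parseTree is assumed to return the root node of the parsed document.
    root := parser.parseTree('<html><body><p class="x">42</p></body></html>');

    // Without a toType call the result is an IXQValue.
    value := TXQueryEngine.evaluateStaticXPath2('/html/body/p', root);
    writeln(value.toString);

    // With a toType call the value is converted to a Pascal type.
    value := TXQueryEngine.evaluateStaticXPath2('count(//p)', root);
    writeln(value.toInt64);

    value := TXQueryEngine.evaluateStaticXPath2('/html/body/p/@class', root);
    writeln(value.toString);
  finally
    value := nil; // release the value before the tree it may refer to is freed
    parser.Free;  // assumption: the parser owns the trees it created
  end;
end.
```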

You can also create a TXQueryEngine instance and then call parseXPath2('expression') followed by evaluate(), or call evaluateXPath2('expression') to do both in one step.
This is not as easy, but gives you more options.
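
A minimal sketch of the instance-based approach, using only methods from the method list on this page (create, parseXPath2, evaluate and evaluateXPath2); the expressions are illustrative.

```pascal
program EngineInstanceExample;
{$mode objfpc}{$H+}

uses
  xquery; // unit name assumed from the rename note (xquery.pas)

var
  engine: TXQueryEngine;
begin
  engine := TXQueryEngine.create;
  try
    // Parse once; the engine remembers the parsed query (see LastQuery).
    engine.parseXPath2('sum(for $x in (1,2,3) return $x + 1)');

    // Evaluate the previously parsed query without a context tree.
    writeln(engine.evaluate().toInt64);

    // Parse and evaluate in a single call.
    writeln(engine.evaluateXPath2('2 * 21').toInt64);
  finally
    engine.Free;
  end;
end.
```

Separating parsing from evaluation lets the same parsed query be evaluated repeatedly, for example against several trees, and the instance gives access to fields such as StaticContext and ParsingOptions before parsing.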

The unit simpleinternet provided a simpler procedural interface which is now deprecated.



Compatibility with previous versions
The following major breaking changes occurred to make it more standard compatible:

  • Language changes:

    • Function and type names are now case sensitive.

    • The function pxp:filter has been removed to avoid confusion with the function fn:filter. (The replacement pxp:extract has existed for years)

    • Declarations in XQuery need to be separated by ; and conflicting declarations or non-module queries containing only declarations are forbidden

    • Variables are no longer replaced inside "-strings. Instead x"-strings were added. All old uses of "$var;" therefore have to be replaced by x"{$var}"

    • All string comparisons are now (non-localized ascii) case-insensitive, not only equal comparisons (as always mentioned in the documentation)

    • Variables defined by a PXPath expression inside a PXPath eval call are exported to the outside

    • == is no longer allowed as alias to =

    • the function deepNodeText is now called deep-text

    • regex flag s defaults to off

  • API changes to previous versions:

    • IXQValue.getChild was renamed to get, TXQValueSequence.addChild to add and addChildMerging to addOrdered. "child" never made any sense here

    • ParentElement/RootElement/TextElement have been moved from TXQueryEngine to TXQEvaluationContext. Avoid using them, just pass the element to evaluate.

    • Callbacks for external variables/functions have been changed to ask for a namespace URI instead of a namespace object with URI/prefix (for XPath/XQuery 3.0 EQNames, which do not have a prefix)

    • The parsing-modifying Allow* properties have been moved into a ParsingOptions record, since they were becoming too confusing

    • everything has been renamed, pseudoxpath.pas => xquery.pas, TPseudoXPathParser => TXQueryEngine, TPXPValue => IXQValue

    • The TPXPValue class has been replaced by an interface => memory deallocation has become implicit and .free must not be called.
      => functions like toString and asString became identical, and the latter has been removed. Similarly, functions like getValueAsString() are no longer needed and have been removed as well

    • TPXPValue is now a class with subclasses instead of a case record

    • Some things have been renamed, the new names should be obvious

    • The evaluate functions return now a TPXPValue instead of a string, since they may return a typed value or sequence.

Hierarchy

  • TObject
  • TXQueryEngine

Overview

Fields

Public CurrentDateTime: TDateTime;
Public ImplicitTimezoneInMinutes: Integer;
Public StaticContext: TXQStaticContext;
Public VariableChangelog: TXQVariableChangeLog;
Public OnDeclareExternalVariable: TXQDeclareExternalVariableEvent;
Public OnDeclareExternalFunction: TXQDeclareExternalFunctionEvent;
Public OnImportModule: TXQImportModuleEvent;
Public OnTrace: TXQTraceEvent;
Public OnCollection: TXQEvaluateVariableEvent;
Public OnUriCollection: TXQEvaluateVariableEvent;
Public OnParseDoc: TXQParseDocEvent;
Public ParsingOptions: TXQParsingOptions;
Public GlobalNamespaces: TNamespaceList;
Public AutomaticallyRegisterParsedModules: boolean;
Public DefaultParser: TTreeParser;

Methods

Public procedure clear;
Public function parseXPath2(s:string; sharedContext: TXQStaticContext = nil): IXQuery;
Public function parseXQuery1(s:string; sharedContext: TXQStaticContext = nil): IXQuery;
Public function parseXPath3(s:string; sharedContext: TXQStaticContext = nil): IXQuery;
Public function parseXQuery3(s:string; sharedContext: TXQStaticContext = nil): IXQuery;
Public function parseCSS3(s:string): IXQuery;
Public function parseQuery(s:string; model: TXQParsingModel; sharedContext: TXQStaticContext = nil): IXQuery;
Public function evaluate(var context: TXQEvaluationContext): IXQValue;
Public function evaluate(const contextItem: IXQValue): IXQValue;
Public function evaluate(tree:TTreeNode = nil): IXQValue;
Public constructor create;
Public destructor Destroy; override;
Public function evaluateXPath2(expression: string; tree:TTreeNode = nil): IXQValue;
Public function evaluateXPath2(expression: string; const contextItem: IXQValue): IXQValue;
Public function evaluateXQuery1(expression: string; tree:TTreeNode = nil): IXQValue;
Public function evaluateXQuery1(expression: string; const contextItem: IXQValue): IXQValue;
Public function evaluateXPath3(expression: string; tree:TTreeNode = nil): IXQValue;
Public function evaluateXPath3(expression: string; const contextItem: IXQValue): IXQValue;
Public function evaluateXQuery3(expression: string; tree:TTreeNode = nil): IXQValue;
Public function evaluateXQuery3(expression: string; const contextItem: IXQValue): IXQValue;
Public function evaluateCSS3(expression: string; tree:TTreeNode = nil): IXQValue;
Public function evaluateCSS3(expression: string; const contextItem: IXQValue): IXQValue;
Public class function evaluateStaticXPath2(expression: string; tree:TTreeNode = nil): IXQValue;
Public class function evaluateStaticXPath2(expression: string; const contextItem: IXQValue): IXQValue;
Public class function evaluateStaticXPath3(expression: string; tree:TTreeNode = nil): IXQValue;
Public class function evaluateStaticXPath3(expression: string; const contextItem: IXQValue): IXQValue;
Public class function evaluateStaticXQuery1(expression: string; tree:TTreeNode = nil): IXQValue;
Public class function evaluateStaticXQuery1(expression: string; const contextItem: IXQValue): IXQValue;
Public class function evaluateStaticXQuery3(expression: string; tree:TTreeNode = nil): IXQValue;
Public class function evaluateStaticXQuery3(expression: string; const contextItem: IXQValue): IXQValue;
Public class function evaluateStaticCSS3(expression: string; tree:TTreeNode = nil): IXQValue;
Public procedure registerModule(module: IXQuery);
Public function findModule(const namespaceURL: string): TXQuery;
Public class function findNativeModule(const ns: string): TXQNativeModule;
Public class procedure registerCollation(const collation: TXQCollation);
Public class function getCollation(id:string; base: string; errCode: string = 'FOCH0002'): TXQCollation;
Public class procedure registerNativeModule(const module: TXQNativeModule);
Public class function collationsInternal: TStringList;
Public function getEvaluationContext(staticContextOverride: TXQStaticContext = nil): TXQEvaluationContext;
Public function findNamespace(const nsprefix: string): INamespace;
Public class function findOperator(const pos: pchar): TXQOperatorInfo;

Properties

Public property ExternalDocumentsCacheInternal: TStringList read FExternalDocuments write FExternalDocuments;
Public property LastQuery: IXQuery read FLastQuery;

Description

Fields

Public CurrentDateTime: TDateTime;

Current time

Public ImplicitTimezoneInMinutes: Integer;

Local timezone (high(integer) = unknown, 0 = utc).

Public StaticContext: TXQStaticContext;

XQuery static context, defining various default values.

Public VariableChangelog: TXQVariableChangeLog;

All global variables that have been set (if a variable was overridden, it stores the old and new value)

Public OnDeclareExternalVariable: TXQDeclareExternalVariableEvent;
 
Public OnDeclareExternalFunction: TXQDeclareExternalFunctionEvent;

Event called to import a function that is declared as "declare function ... external" in an XQuery expression.

Public OnImportModule: TXQImportModuleEvent;

Event called to import an XQuery module that has not previously been defined

Public OnTrace: TXQTraceEvent;

Event called by fn:trace

Public OnCollection: TXQEvaluateVariableEvent;

Event called by fn:collection

Public OnUriCollection: TXQEvaluateVariableEvent;

Event called by fn:uri-collection

Public OnParseDoc: TXQParseDocEvent;

Event called by fn:doc (if nil, a default xml parser is used)

Public ParsingOptions: TXQParsingOptions;
 
Public GlobalNamespaces: TNamespaceList;

Globally defined namespaces

Public AutomaticallyRegisterParsedModules: boolean;
 
Public DefaultParser: TTreeParser;
 

Methods

Public procedure clear;

Clears all data.

Public function parseXPath2(s:string; sharedContext: TXQStaticContext = nil): IXQuery;

Parses a new XPath 2.0 expression and stores it in tokenized form.

Public function parseXQuery1(s:string; sharedContext: TXQStaticContext = nil): IXQuery;

Parses a new XQuery 1.0 expression and stores it in tokenized form.

Public function parseXPath3(s:string; sharedContext: TXQStaticContext = nil): IXQuery;

Parses a new XPath 3.0 expression and stores it in tokenized form. Work in progress: only a small set of 3.0 statements is supported.

Public function parseXQuery3(s:string; sharedContext: TXQStaticContext = nil): IXQuery;

Parses a new XQuery 3.0 expression and stores it in tokenized form. Work in progress: only a small set of 3.0 statements is supported.

Public function parseCSS3(s:string): IXQuery;

Parses a new CSS 3.0 Selector expression and stores it in tokenized form.

Public function parseQuery(s:string; model: TXQParsingModel; sharedContext: TXQStaticContext = nil): IXQuery;

Parses a new expression and stores it in tokenized form.

Public function evaluate(var context: TXQEvaluationContext): IXQValue;
 
Public function evaluate(const contextItem: IXQValue): IXQValue;

Evaluates a previously parsed query and returns its value as IXQValue

Public function evaluate(tree:TTreeNode = nil): IXQValue;

Evaluates a previously parsed query and returns its value as IXQValue

Public constructor create;
 
Public destructor Destroy; override;
 
Public function evaluateXPath2(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates an XPath 2.0 expression with a certain tree element as current node.

Public function evaluateXPath2(expression: string; const contextItem: IXQValue): IXQValue;
 
Public function evaluateXQuery1(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates an XQuery 1.0 expression with a certain tree element as current node.

Public function evaluateXQuery1(expression: string; const contextItem: IXQValue): IXQValue;
 
Public function evaluateXPath3(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates an XPath 3.0 expression with a certain tree element as current node. Work in progress: only a small set of 3.0 statements is supported.

Public function evaluateXPath3(expression: string; const contextItem: IXQValue): IXQValue;
 
Public function evaluateXQuery3(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates an XQuery 3.0 expression with a certain tree element as current node. Work in progress: only a small set of 3.0 statements is supported.

Public function evaluateXQuery3(expression: string; const contextItem: IXQValue): IXQValue;
 
Public function evaluateCSS3(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates a CSS 3 Selector expression with a certain tree element as current node.
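
As a hedged illustration, a CSS selector could be evaluated against a parsed document as sketched below. The unit simplehtmltreeparser and the method parseTree are assumptions that are not documented on this page; the selector and markup are made up for the example.

```pascal
program CssSelectorExample;
{$mode objfpc}{$H+}

uses
  simplehtmltreeparser, // assumed unit providing TTreeParser and TTreeNode
  xquery;

var
  parser: TTreeParser;
  engine: TXQueryEngine;
  root: TTreeNode;
  value: IXQValue;
begin
  parser := TTreeParser.Create;
  engine := TXQueryEngine.create;
  try
    // parseTree is assumed to return the root node of the parsed document.
    root := parser.parseTree(
      '<html><body><p class="x">first</p><p>second</p></body></html>');

    // Select the p elements carrying the class x, starting from the root node.
    value := engine.evaluateCSS3('p.x', root);
    writeln(value.toString);
  finally
    value := nil; // release the value before the tree it may refer to is freed
    engine.Free;
    parser.Free;  // assumption: the parser owns the trees it created
  end;
end.
```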

Public function evaluateCSS3(expression: string; const contextItem: IXQValue): IXQValue;
 
Public class function evaluateStaticXPath2(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates an expression with a certain tree element as current node.

Public class function evaluateStaticXPath2(expression: string; const contextItem: IXQValue): IXQValue;
 
Public class function evaluateStaticXPath3(expression: string; tree:TTreeNode = nil): IXQValue;
 
Public class function evaluateStaticXPath3(expression: string; const contextItem: IXQValue): IXQValue;
 
Public class function evaluateStaticXQuery1(expression: string; tree:TTreeNode = nil): IXQValue;
 
Public class function evaluateStaticXQuery1(expression: string; const contextItem: IXQValue): IXQValue;
 
Public class function evaluateStaticXQuery3(expression: string; tree:TTreeNode = nil): IXQValue;
 
Public class function evaluateStaticXQuery3(expression: string; const contextItem: IXQValue): IXQValue;
 
Public class function evaluateStaticCSS3(expression: string; tree:TTreeNode = nil): IXQValue;

Evaluates an expression with a certain tree element as current node.

Public procedure registerModule(module: IXQuery);

Registers an XQuery module. An XQuery module is created by parsing (not evaluating) an XQuery expression that contains a "module" declaration.
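
A sketch of how registration might look in practice: a library module is parsed with parseXQuery1, registered, and then imported by a second query. The namespace URI and the function m:double are hypothetical names invented for this example, and it is an assumption that an import without a location hint resolves to the registered module.

```pascal
program RegisterModuleExample;
{$mode objfpc}{$H+}

uses
  xquery; // unit name assumed from the rename note (xquery.pas)

var
  engine: TXQueryEngine;
begin
  engine := TXQueryEngine.create;
  try
    // Parsing (not evaluating) a query with a module declaration yields the module.
    engine.registerModule(engine.parseXQuery1(
      'module namespace m = "http://example.org/m"; ' +
      'declare function m:double($x) { 2 * $x };'));

    // A later query imports the module by its namespace URI and calls the function.
    writeln(engine.evaluateXQuery1(
      'import module namespace m = "http://example.org/m"; m:double(21)').toInt64);
  finally
    engine.Free;
  end;
end.
```

If AutomaticallyRegisterParsedModules is enabled, the explicit registerModule call may be redundant; it is kept here to show the method itself.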

Public function findModule(const namespaceURL: string): TXQuery;

Finds a certain registered XQuery module

Public class function findNativeModule(const ns: string): TXQNativeModule;

Finds a native module.

Public class procedure registerCollation(const collation: TXQCollation);

Registers a collation for custom string comparisons

Public class function getCollation(id:string; base: string; errCode: string = 'FOCH0002'): TXQCollation;

Returns the collation for an url id

Public class procedure registerNativeModule(const module: TXQNativeModule);
 
Public class function collationsInternal: TStringList;
 
Public function getEvaluationContext(staticContextOverride: TXQStaticContext = nil): TXQEvaluationContext;
 
Public function findNamespace(const nsprefix: string): INamespace;
 
Public class function findOperator(const pos: pchar): TXQOperatorInfo;
 

Properties

Public property ExternalDocumentsCacheInternal: TStringList read FExternalDocuments write FExternalDocuments;
 
Public property LastQuery: IXQuery read FLastQuery;

Last parsed query


Generated by PasDoc 0.14.0.