Unit extendedhtmlparser


Description

This unit contains a template-based HTML parser named THtmlTemplateParser.

Overview

Classes, Interfaces, Objects and Records

Name Description
Class ETemplateParseException  
Class EHTMLParseException  
Class EHTMLParseMatchingException  
Class THtmlTemplateParser This is the template processor class, which can apply a template to one or more HTML documents.
Class TXQTerm_VisitorFindWeirdGlobalVariableDeclarations  

Functions and Procedures

function guessExtractionKind(e: string): TExtractionKind;

Types

TTemplateElementType = (...);
TTemplateElementFlag = (...);
TTemplateElementFlags = set of TTemplateElementFlag;
TReplaceFunction = procedure (variable: string; var value:string) of object;
TStringAttributeList = tStringList;
TTrimTextNodes = (...);
TKeepPreviousVariables = (...);
TXQTermVariableArray = array of TXQTermVariable;
TExtractionKind = (...);

Constants

HTMLPARSER_NAMESPACE_URL = 'http://www.benibela.de/2011/templateparser';

Description

Functions and Procedures

function guessExtractionKind(e: string): TExtractionKind;
 

Types

TTemplateElementType = (...);

These are all possible template commands. They are intended for internal use.

Values
  • tetIgnore: useless thing
  • tetHTMLOpen: normal HTML opening tag, searched in the processed document
  • tetHTMLClose: normal HTML closing tag, searched in the processed document
  • tetHTMLText: text node, searched in the processed document
  • tetMatchElementOpen:  
  • tetMatchElementClose:  
  • tetMatchText:  
  • tetCommandMeta: <template:meta> command to specify how strings are compared
  • tetCommandMetaAttribute: <template:meta-attribute> command to specify how attributes are compared
  • tetCommandRead: <template:read> command to set a variable
  • tetCommandShortRead: <template:s> command to execute an XQuery expression
  • tetCommandLoopOpen: <template:loop> command to repeat something as long as possible
  • tetCommandLoopClose:  
  • tetCommandIfOpen: <template:if> command to skip something
  • tetCommandIfClose:  
  • tetCommandElseOpen: <template:else> command to skip something
  • tetCommandElseClose:  
  • tetCommandSwitchOpen: <template:switch> command to branch
  • tetCommandSwitchClose:  
  • tetCommandSwitchPrioritizedOpen: <template:switch-prioritized> command to branch
  • tetCommandSwitchPrioritizedClose:  
  • tetCommandSiblingsOpen:  
  • tetCommandSiblingsClose:  
  • tetCommandSiblingsHeaderOpen:  
  • tetCommandSiblingsHeaderClose:  
TTemplateElementFlag = (...);
 
Values
  • tefOptional:  
  • tefSwitchChild:  
TTemplateElementFlags = set of TTemplateElementFlag;
 
TReplaceFunction = procedure (variable: string; var value:string) of object;

Possible callback for getting the value of a variable
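
The following is a minimal sketch of a method whose signature matches TReplaceFunction. The type is redeclared locally so the program compiles without the unit; the class TVariableSource, its Lookup method, and the variable name 'user' are hypothetical and only illustrate the shape of such a callback, not how THtmlTemplateParser invokes it.

```pascal
program ReplaceCallbackSketch;
{$mode objfpc}{$H+}

type
  // Redeclared from this unit so the sketch is self-contained.
  TReplaceFunction = procedure (variable: string; var value: string) of object;

  // Hypothetical class that knows the values of some variables.
  TVariableSource = class
    procedure Lookup(variable: string; var value: string);
  end;

procedure TVariableSource.Lookup(variable: string; var value: string);
begin
  // Only overwrite the value for names this source knows about;
  // unknown names leave the passed-in value unchanged.
  if variable = 'user' then
    value := 'alice';
end;

var
  source: TVariableSource;
  callback: TReplaceFunction;
  v: string;
begin
  source := TVariableSource.Create;
  try
    // "of object" means the callback must be a method bound to an instance.
    callback := @source.Lookup;
    v := '';
    callback('user', v);
    WriteLn(v); // prints the value assigned by Lookup
  finally
    source.Free;
  end;
end.
```

Because the type is declared "of object", a plain global procedure cannot be assigned to it; the callback has to be a method of some object instance.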

TStringAttributeList = tStringList;
 
TTrimTextNodes = (...);

Specifies when the text of text nodes is trimmed. Each value removes strictly more whitespace than the previous ones.

Values
  • ttnNever: Whitespace is never trimmed; all whitespace is kept
  • ttnForMatching: Whitespace is ignored when comparing two text nodes, but reading text still returns all whitespace
  • ttnWhenLoadingEmptyOnly:  
  • ttnWhenLoading: All starting/ending whitespace is unconditionally removed from all text nodes
TKeepPreviousVariables = (...);

This specifies the handling of the variables read in the previous document

In every case, all node variables are converted to strings, because the nodes point to elements of the previous document, which will be deleted.

Values
  • kpvForget: Old variables are deleted
  • kpvKeepValues: Old variables are moved from the property variableChangelog to the property oldVariableChangelog
  • kpvKeepInNewChangeLog: Old variables stay where they are (i.e. in the variableChangelog property merged with the new ones)
TXQTermVariableArray = array of TXQTermVariable;
 
TExtractionKind = (...);
 
Values
  • ekAuto:  
  • ekXPath2:  
  • ekXPath3:  
  • ekPatternHTML:  
  • ekPatternXML:  
  • ekCSS:  
  • ekXQuery1:  
  • ekXQuery3:  
  • ekMultipage:  

Constants

HTMLPARSER_NAMESPACE_URL = 'http://www.benibela.de/2011/templateparser';

XML-compatible namespace URL used to define new template prefixes.
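
As a sketch of how the namespace URL could bind a custom prefix, the program below builds a template string in which the prefix "t" stands in for "template". The constant is redeclared locally so the program compiles on its own. The prefix "t", the variable name "item", and the var/source attributes on the read command are assumptions made for illustration; the documentation above only confirms the command names and the namespace URL.

```pascal
program NamespacePrefixSketch;
{$mode objfpc}{$H+}

const
  // Redeclared from this unit so the sketch is self-contained.
  HTMLPARSER_NAMESPACE_URL = 'http://www.benibela.de/2011/templateparser';

var
  template: string;
begin
  // Bind the hypothetical prefix "t" to the template namespace, then use
  // it in place of "template:" for the loop and read commands.
  template :=
    '<ul xmlns:t="' + HTMLPARSER_NAMESPACE_URL + '">' +
      '<t:loop>' +
        // The var/source attributes are assumed, not taken from this page.
        '<li><t:read var="item" source="text()"/></li>' +
      '</t:loop>' +
    '</ul>';
  WriteLn(template);
end.
```

The resulting string would then be handed to the template processor; how that is done is described in the THtmlTemplateParser class documentation rather than here.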

Author


Generated by PasDoc 0.14.0.