Portable EXPath Extension Functions

Adam Retter

adam@evolvedbinary.com
@adamretter

Adam Retter

  • Consultant

    • Scala / Java

    • Concurrency

    • XQuery, XSLT

  • Open Source Hacker

    • Predominantly NoSQL Database Internals

    • e.g. eXist, RocksDB, Shadoop (Hadoop M/R framework)

  • W3C Invited Expert for XQuery WG

  • Author of  the "eXist" book for O'Reilly

  • XML Summer School Faculty (13/09/15)

A talk about incompatibility...

TODO...

  1. The Portability Problem

  2. Previous Efforts

  3. Processor Varieties

  4. Our Solution

Context

  • XPDL

    • XPath Derived Language e.g. XQuery/XSLT/XProc/XForms

    • Typically uses F+O as Standard Library

  • Assumption: We want to write apps in XPDLs

    • Less code/impedance-mismatch

      • ~67% reduction in LoC vs Java 1

    • Serve/Process the Web

    • Process structure/semi-structured data

    • Process mixed-content

1 Developing an Enterprise Web Application in XQuery
http://download.28msec.com/sausalito/technical_reading/enterprise_webapps.pdf

The Portability Problem

XPDLs are typically specified as open standards

...however...

Applications written in XPDLs are rarely useable across implementations

Vendor Extensions are EVIL!

  • Seem like a good idea at the time

    • Easy/Quick to get something done

  • Many Types

    • Syntax extensions

      • e.g xquery "1.0-ml";

    • Data Type Extensions

      • e.g xs:binary-document

    • Deviation from Standards

      • e.g fn:matches($input*, $pattern)

    • Indexes, Triggers, etc.

    • Extension Functions

XPDL Extension Functions

  • Our focus, due to their impact

    • Disguised by standard function call interface

      • FunctionCall ::= EQName ArgumentList

    • Distributed throughout an XPDL code-base

  • XPDL Extension Functions

    • Typically implemented in lower-level language

      • C / C++ / Java / .NET etc.

    • Vendor/Processor specific

      • Consistent across processor versions?

    • EXPath

      • Requires reimplementation for every processor

      • Not supported by all processors

Impact of Extension Functions

Impact of Extension Functions

Vendor Extensions ultimately:

  • Introduce Hurdles to Portability

  • Restrict user freedom

    • Vendor lock-in

    • Lesson the impact of frameworks

  • Fragment the XPDL community

    • Create knowledge/skills silos

    • Reduce code-sharing

    • Limit code-reuse

    • Reduce collaboration

    • XPDL Processor specific forks of XPDL apps

Other Efforts to Improve Portability

  • XSLT 1.1 (2000)

    • Stated primary goal - " improve stylesheet portability"

    • Adds xsl:script for extension functions

    • Highly contentious. Abandoned!

  • EXSLT (2001)

    • Extended the XSLT 1.0 Standard Library

    • Just a Specification

    • Each vendor implemented for own processor

Other Efforts to Improve Portability

  • FunctX (2006)

    • A Library of >150 useful common functions

    • Implementations in both XQuery and XSLT

  • EXQuery (2008)

    • Just one specification to date: RESTXQ

    • Common implementation in Java

  • EXPath (2009)

    • Standards for extension functions

    • Some common implementations in Java

Lessons Learnt

  • Standards are nice, but require implementations

    • Really need >50% of market-share to implement

  • Vendors are lazy/limited

    • Standards are often retrospective!

  • Implementation Type Mapping (XSLT 1.1)

    • Showed great promise for integration

    • Must be implementation language agnostic

  • No single language for low-level implementation

    • Won't be accepted by developers

    • Won't be accepted by vendors

Lessons Learnt

  • XPDL Processors are surprisingly similar!

interface StandardFunc {
  Item item(QueryContext qc, InputInfo ii) throws QueryException;
}
interface BasicFunction {
  Sequence eval(Sequence[] args, Sequence contextSequence)
    throws XPathException;
}
interface ExtensionFunctionCall {
  SequenceIterator call(SequenceIterator[] arguments, XPathContext context)
    throws XPathException;
}
class XQFunction {
  public:
    Sequence createSequence(DynamicContext* context, int flags=0) const;
};

Processor Varieties

  • We want to support XPDL Extension Functions

    • For all XPDL processors

    • What XPDL procesor implementations exist?

Our Requirements

  • Focus on Extension Function Implementation

    • Standardisation is alive in W3C and EXPath

    • Ideally implement just once (ever!)

    • Ideally compatible with any XPDL processor

  • Polyglot

    • Must support at least Java and C++ implementations

    • Ideally also C for integration with other languages

  • Specify an Implementation Type Mapping

    • XDM types to/from XPDL processor implementation language types

Our Solution

  • Source-to-source Compilation

    • Using the Haxe cross-platform tookit

    • Haxe Lang for high-level implementation

      • Similar to ECMAScript

    • Haxe cross-compiler for target implementation

  • XDM Implementation Type Mapping to Haxe Lang Interfaces

  • Function Implementation Type Mapping to Haxe Lang Interfaces

    • Based on: XPath 3.0 Function Call

    • Based on: XQuery 3.0 Function Declaration

interface Item {
  public function stringValue() : xpdl.xdm.String;
}

interface AnyType {}

interface AnyAtomicType extends Item extends AnyType {}

class Boolean implements AnyAtomicType {
  var value: Bool;
  public function new(value) {
    this.value = value;
  }
  public function stringValue() {
    return new xpdl.xdm.String(Std.string(value));
  }
  public function haxe() {
    return value;
  }
}

class String implements AnyAtomicType {
  var value: HString;
  public function new(value) {
    this.value = value;
  }
  public function stringValue() {
    return this;
  }
  public function haxe() {
    return value;
  }
}

Haxe XDM Impl. Type Mapping

Haxe Function Implementation Type Mapping

interface Function {

    public function signature() : FunctionSignature;

    public function eval(arguments: Array<Argument>, context: Context) : Sequence;
}

class FunctionSignature {
    var name: QName;
    var returnType: SequenceType;
    var paramLists: Array<Array<Param>>;

    public function new(name, returnType, paramLists) {
        this.name = name;
        this.returnType = returnType;
        this.paramLists = paramLists;
    }
}

Proof-of-concept

  • Implementation of EXPath File Module

    • Implemented in Haxe Lang

    • Coded to XDM Implementation Type Mapping Interfaces

  • Focused on just file:exists function

    •  

    • Function Call Type + Function Implementation Type

    • xs:string

    • xs:boolean

  • Status

    • Runnable on any processor that supports Haxe Implementation Type Mapping

file:exists($path as xs:string) as xs:boolean

file:exists in Haxe

class ExistsFunction implements Function {

    private static var sig = new FunctionSignature(
        new QName("exists", FileModule.NAMESPACE, FileModule.PREFIX),
        new SequenceType(Some(new ItemOccurrence(Boolean))),
        [
            [
                new Param(new QName("path"),
                new SequenceType(Some(new ItemOccurrence(xpdl.xdm.Item.String))))
            ]
        ]
    );

    public function new() {}

    public function signature() {
        return sig;
    }

    public function eval(arguments : Array<Argument>, context: Context) {
        var path = arguments[0].getArgument().iterator().next().stringValue().haxe();
        var exists = FileSystem.exists(path);
        return new ArraySequence( [ new Boolean(exists) ] );
    }
}

Proof-of-concept: Processor

  • Added support to eXist

    • Static mapping of Haxe XDM types

    • Dynamic mapping of Haxe function call interfaces

      • Bytecode generation of classes and objects: cglib

    • Currently ~300 lines of Java code

  • Status

Conclusion

  • Implement Once

  • Cross-Compile and Compile Once

  • Supports any processor

    • Requires Vendor to ( just once) implement:

      • XDM Implementation Type Mapping

      • Function Implementation Type Mapping

Win!

XPDL Extension
Function in
Haxe

XQuery

XSLT

XPath

XProc

XForms