Comparison with JAXB

JAXB is the standard Java-to-XML binding API. Its philosophy (like that of many other Java-to-XML binding APIs) is the exact opposite of the X-O lite philosophy: automate all the parsing so that it is transparent to the Java objects. JAXB is one of the most advanced binding APIs: starting from either existing Java classes or an XML schema, it can generate everything you need (i.e. the classes from the schema or the schema from the classes) and perform automatic marshalling/unmarshalling. JAXB is also highly configurable: with annotations in your Java classes or in your XML schema (depending on the starting point you've chosen), you can override the default tags, XML types, internal structures, etc. used in the Java-to-XML binding.

To compare JAXB to X-O lite, I rewrote the recursive model and model extension examples with JAXB. The aim of the comparison is not performance, as both JAXB and X-O lite are based on SAX and add only a small overhead on top of it. So both should have performance comparable to bare SAX (which, by the way, is more than enough for most XML parsing use cases). The comparison focuses instead on the amount of code you have to write, and on the fine-grained control you have over your model structure and over the XML used to serialize it.

With JAXB, the XML schema you use is tightly coupled to the structure of your model. For example, using an element with maxOccurs>1 in the schema will force you to have a List somewhere in your code (unless you write some 'XmlAdapter' objects by hand). So, to explore how bindings are done with JAXB, I wrote the examples with 3 XML schema flavors:

  • Using abstract XML types: this is the default way JAXB represents polymorphism (when you use the schemagen tool to generate an XML schema from your sources).
  • Using substitution groups: this is the way it is done in the X-O lite version of the example. JAXB supports XML substitution groups, but they have some impact on your model.
  • Using a choice inside a sequence: this allows a schema to model the same XML (as the one defined with substitution groups) using only 'standard' features.

Source files

The complete source code for the rewrite of the 2 previous examples (recursive expressions & extension) in each of the 3 XML schema flavors (abstract type, substitution group & choice) is located in the following directories inside the xo-lite-1.0-all.zip or xo-lite-1.0-all.tgz distribution archive:

  • xo-lite-1.0/src-projects/xo-lite-examples/src/main/java/net/sf/xolite/expression/jaxb_abstr
  • xo-lite-1.0/src-projects/xo-lite-examples/src/main/java/net/sf/xolite/expression/jaxb_subst
  • xo-lite-1.0/src-projects/xo-lite-examples/src/main/java/net/sf/xolite/expression/jaxb_choice
  • xo-lite-1.0/src-projects/xo-lite-examples/src/main/java/net/sf/xolite/extension/jaxb_abstr
  • xo-lite-1.0/src-projects/xo-lite-examples/src/main/java/net/sf/xolite/extension/jaxb_subst

(Note: the combination choice+extension is not possible.)

To build and run all the examples:

  • Extract the distribution archive in any directory.
  • Ensure you have the latest version of Maven installed and configured on your computer.
  • Issue the "mvn test" shell command from the .../xo-lite-1.0/src-projects/xo-lite-examples directory.

Implementation Description

TODO.

Analysis of the JAXB examples

First, JAXB tends to force your model to match your schema one-to-one if you want to keep things simple and the bindings as transparent as possible (which is usually your aim when you use such a mapping tool). So, for all the schema flavors, I modified the model as follows:

  • The variables of the ExpressionContext are stored in a List instead of a Map, which forced me to create a new VariableDefinition class. Keeping the Map would imply providing an 'XmlAdapter' object to JAXB to help it parse the Map. In that case, JAXB would still parse the XML into a JAXB-compliant structure (so a List of VariableDefinition is used anyway), and the provided XmlAdapter would then transform it into the structure we want. That means a lot of 'plumbing' code just to use a Map as the internal representation.
  • The initial model object ConstantExpression can have two values (true or false) and is mapped to two XML elements (<true/> and <false/>). This is difficult to bind with JAXB because the binding is not one-to-one. To solve that problem, I split the class ConstantExpression into two classes, TrueExpression and FalseExpression, mapped respectively to the <true/> and <false/> XML elements.
  • The types of the schema are mapped to classes, so JAXB tends to force you to write one schema type per class, generating extra types in the schema definition. For instance, 'and' and 'or' have exactly the same type as far as XML is concerned, but the Java objects are different because the behaviour of the two operators is different. JAXB requires the distinct schema types to perform the mapping.
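
To give an idea of the 'plumbing' needed to keep a Map, here is a minimal sketch of the conversion logic such an adapter would contain. It is not taken from the example sources. The fields of VariableDefinition, the String-to-Boolean shape of the Map and the class name VariableMapAdapter are assumptions made for illustration. The class only mirrors the marshal/unmarshal method pair of JAXB's XmlAdapter so that it compiles without JAXB on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Assumed shape of the list entry; a real JAXB-bound class would also
// need a no-arg constructor and mapping annotations.
class VariableDefinition {
    String name;
    boolean value;

    VariableDefinition(String name, boolean value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical adapter: same method names as XmlAdapter (marshal/unmarshal),
// but deliberately not extending it to stay dependency-free.
class VariableMapAdapter {

    // XML-side structure (List) -> model-side structure (Map).
    Map<String, Boolean> unmarshal(List<VariableDefinition> definitions) {
        Map<String, Boolean> variables = new LinkedHashMap<>();
        for (VariableDefinition def : definitions) {
            variables.put(def.name, def.value);
        }
        return variables;
    }

    // Model-side structure (Map) -> XML-side structure (List).
    List<VariableDefinition> marshal(Map<String, Boolean> variables) {
        List<VariableDefinition> definitions = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : variables.entrySet()) {
            definitions.add(new VariableDefinition(entry.getKey(), entry.getValue()));
        }
        return definitions;
    }
}
```

With real JAXB, a class like this would extend XmlAdapter and be attached to the Map property with the @XmlJavaTypeAdapter annotation. The conversion code itself would stay much the same, which is exactly the hand-written part the List-based model avoids.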

In addition to those 'global' changes, each of the three schema flavors has some direct implications for the code or the XML.

JAXB binding with abstract XML element

In this case, the schema (BooleanExpression_abstr.xsd) defines an abstract 'booleanExpression' type and a set of concrete types extending it (named 'and', 'or', 'not' ...). The objects containing an expression or a list of expressions simply declare that they contain <expression> element(s) of type 'booleanExpression', so in the XML any of the concrete sub-types can be used in place of that abstract type.

The advantage of this solution is that the java code is not affected by the mapping.

The disadvantage is on the XML side:

  • This solution really requires a one-to-one mapping between Java types and XML types.
  • You have to use the <expression> element for all expressions and each time use the 'xsi:type' meta-attribute in your XML to define what exact sub-type of expression you use.

Unfortunately, the resulting XML is much less readable. What used to be:

<?xml version="1.0"?>
<or xmlns="xo-lite.sf.net/examples/expression">
    <and>
        <variable name="ga" />
        <or>
            <true />
            <variable name="bu" />
        </or>
    </and>
    ...

becomes

<?xml version="1.0"?>
<expression xsi:type="or" 
            xmlns="xo-lite.sf.net/examples/expression/jaxb_abstr" 
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <expression xsi:type="and">
        <expression xsi:type="variable" name="ga" />
        <expression xsi:type="or">
            <expression xsi:type="true" />
            <expression xsi:type="variable" name="bu" />
        </expression>
    </expression>
    ...
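
To make the role of 'xsi:type' concrete: in this flavor the element name no longer says which kind of expression you are looking at, only the attribute does. The following sketch is not part of the examples. It uses the JDK's DOM parser rather than JAXB, and the class name XsiTypeLister is hypothetical. It collects the 'xsi:type' value of every element in document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class XsiTypeLister {

    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    // Returns the xsi:type value of every element that carries one,
    // in document order.
    static List<String> listTypes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is needed for the *NS lookups below.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            List<String> types = new ArrayList<>();
            // "*" matches any namespace and any local name.
            NodeList all = doc.getElementsByTagNameNS("*", "*");
            for (int i = 0; i < all.getLength(); i++) {
                Element element = (Element) all.item(i);
                if (element.hasAttributeNS(XSI_NS, "type")) {
                    types.add(element.getAttributeNS(XSI_NS, "type"));
                }
            }
            return types;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Every element visited has the local name 'expression'. The list returned by listTypes is the only place where 'or', 'and', 'variable' ... show up, which is the information the element names carried in the original XML.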
 

JAXB with substitution groups

As JAXB supports substitution groups, it is possible to keep a schema equivalent to the one used for the X-O lite example. In this case, the resulting XML stays exactly the same, but unfortunately the side effects now show up on the code side.

If you use substitution groups with JAXB, your model is forced to use JAXBElement objects to hold the related objects. Hence, JAXB-specific classes pollute your code and the binding is not transparent anymore.

Of course, it's easy to write additional code (special getters and setters) doing on-the-fly transformation between the JAXBElement objects and their contained values (and JAXB generates an ObjectFactory to help you do that). But it means additional coding (not done in the example).
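
As a rough sketch of what such special getters and setters could look like (again, this is not done in the example): the field keeps the wrapper, while the accessors expose only the contained value. To stay compilable without JAXB, the sketch uses a small stand-in class, ElementHolder, in place of JAXBElement. NotExpression, its 'operand' field and the namespace string are assumptions made for illustration.

```java
import javax.xml.namespace.QName;

// Stand-in for javax.xml.bind.JAXBElement<T>, reduced to the parts this
// sketch needs (an element name plus a value). Not the real class.
class ElementHolder<T> {
    private final QName name;
    private final T value;

    ElementHolder(QName name, T value) {
        this.name = name;
        this.value = value;
    }

    QName getName() { return name; }
    T getValue() { return value; }
}

interface BooleanExpression {
    boolean evaluate();
}

class TrueExpression implements BooleanExpression {
    public boolean evaluate() { return true; }
}

// Hypothetical model class using the wrapper internally only.
class NotExpression implements BooleanExpression {

    // Placeholder namespace, not the one used by the real schema.
    static final String NS = "urn:example:expression";

    // The field the binding layer would read and write.
    private ElementHolder<? extends BooleanExpression> operandElement;

    // Getter hiding the wrapper from the rest of the model.
    BooleanExpression getOperand() {
        return operandElement == null ? null : operandElement.getValue();
    }

    // Setter re-wrapping the value; the caller supplies the element name,
    // a job the generated ObjectFactory would do in real JAXB code.
    void setOperand(String elementName, BooleanExpression operand) {
        operandElement = new ElementHolder<>(new QName(NS, elementName), operand);
    }

    public boolean evaluate() {
        return !getOperand().evaluate();
    }
}
```

The model's callers only ever see BooleanExpression, but every class holding a substitutable element needs a getter/setter pair of this kind.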

JAXB with choices

A way to avoid XML substitution groups is to embed a <choice> in a <sequence>. The limitation here is that the choice must enumerate all the possible XML elements. It means that this option doesn't allow extension (as defined in the model extension example). But, again, the resulting XML stays exactly the same as in the X-O lite example.

As <sequence> and <choice> are regular XML schema constructs, JAXB supports them without problems and without requiring any special JAXBElement. The only drawback in the code is that a <choice> with maxOccurs="1" is mapped by JAXB to the set of all possible Java attributes (one per alternative), leaving it to the implementer to ensure that only one of them holds a value at any time.

Again, it's easy to write additional code (special getters and setters) doing on-the-fly casting to one single internal element (not done in the example). But it means, again, that the parsing is not transparent and you are obliged to write code by hand to get the object model you want.
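
A minimal sketch of such hand-written accessors, assuming a holder with one field per alternative of the <choice>. All class and field names here are hypothetical and only two alternatives are shown. The setter clears every alternative before setting the matching one, so at most one field is ever non-null.

```java
// Hypothetical model types for this sketch.
interface ExpressionNode { }

class VariableExpression implements ExpressionNode {
    final String name;

    VariableExpression(String name) {
        this.name = name;
    }
}

class FalseExpression implements ExpressionNode { }

// Shaped the way a <choice> with maxOccurs="1" ends up in the code:
// one field per alternative.
class NotOperandChoice {

    // Fields the binding layer would populate.
    private VariableExpression variable;
    private FalseExpression falseConstant;

    // Returns the single alternative that is set, or null if none is.
    ExpressionNode getOperand() {
        if (variable != null) {
            return variable;
        }
        return falseConstant;
    }

    // Clears all alternatives, then stores the operand in the matching field.
    void setOperand(ExpressionNode operand) {
        variable = null;
        falseConstant = null;
        if (operand instanceof VariableExpression) {
            variable = (VariableExpression) operand;
        } else if (operand instanceof FalseExpression) {
            falseConstant = (FalseExpression) operand;
        } else if (operand != null) {
            // The choice enumerates its alternatives: anything else is rejected.
            throw new IllegalArgumentException(
                    "Unsupported operand: " + operand.getClass().getName());
        }
    }
}
```

The final branch of setOperand also shows why this flavor rules out extension: an expression type unknown to the holder has no field to go into.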

Extensions

JAXB is a very capable API. Rewriting the extension example was possible with both the abstract-type schema and the substitution-group schema, without additional impact on the model.

Summary

While for very simple objects the advantage of automatic Java-to-XML binding tools like JAXB is evident, for more complex cases like the examples tested here the advantage of JAXB is far less obvious. I would not say that in this case X-O lite is better than JAXB. What you perceive as the winner will depend on the compromises you are ready to make to get automated bindings.

Good point for JAXB: it is possible to use it to implement the complex examples of recursive model and extension.

Bad point for JAXB: for those examples there is no easy solution (easy meaning not writing plumbing code by hand) that preserves both the 'ideal' model and the 'ideal' XML.

I think that those examples are a Java-to-XML binding problem at the limit of complexity for JAXB, where a generic automated binding starts to impose too many constraints and hence requires as much effort as a hand-written solution like X-O lite. It's up to you to evaluate whether your specific case is simpler or more complex than that. There are a lot of simpler cases, but we can also imagine more complex ones, like supporting several schemas with the same data structures, or parsing complex graphs without 'refids' attributes as in the last X-O lite example.