X-O lite

Graph Model Example

This document explains how to use the X-O lite API to serialize and deserialize a complex graph of Java objects.

Serializing and deserializing graphs of objects is a common problem. Because the graph must be serialized to an XML tree, you must at some point write references to objects that are serialized elsewhere in the XML. When parsing the XML back, you want to obtain exactly the same object graph, so you have to resolve these reference elements to the correct objects.

To serialize and deserialize such graphs, you have to answer the following questions:

How do you uniquely identify objects?
How do you represent references to an object in the serialized XML?
How do you ensure that only one instance of each object is created when deserializing?
What do you do when a reference to an object is found before the object itself has been parsed?

Let's see how to do this with X-O lite using a simple graph example. In this example, we will serialize and deserialize a graph of Person objects. A person has relations to:

Its father
Its mother
Its children

This example is quite simple because there is only one node type in the graph, but the topology of the graph is complex enough to demonstrate the capabilities of X-O lite.
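
To make that topology concrete, here is a minimal sketch of such a node type in plain Java. It is not the Person class shipped with the examples: the names PersonNode and FamilyGraphSketch, and the choice to keep the birth date as text, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative stand-in for the example's Person class (names are assumptions).
class PersonNode {
    private final String firstName;
    private final String lastName;
    private final String birthDate; // kept as text, e.g. "1970-09-17", to keep the sketch short
    private PersonNode father;
    private PersonNode mother;
    private final List<PersonNode> children = new ArrayList<>();

    PersonNode(String firstName, String lastName, String birthDate) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.birthDate = birthDate;
    }

    void setFather(PersonNode father) { this.father = father; }
    void setMother(PersonNode mother) { this.mother = mother; }
    void addChild(PersonNode child) { children.add(child); }

    PersonNode getFather() { return father; }
    PersonNode getMother() { return mother; }
    List<PersonNode> getChildren() { return Collections.unmodifiableList(children); }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getBirthDate() { return birthDate; }
}

class FamilyGraphSketch {
    // Builds a three-person family and returns the father.
    static PersonNode buildSample() {
        PersonNode john = new PersonNode("John", "Smith", "1970-09-17");
        PersonNode mary = new PersonNode("Mary", "Smith", "1972-04-03");
        PersonNode anna = new PersonNode("Anna", "Smith", "1998-03-02");
        anna.setFather(john);
        anna.setMother(mary);
        john.addChild(anna); // john -> anna -> john: a cycle
        mary.addChild(anna); // anna is now reachable from both parents
        return john;
    }
}
```

Even this tiny family contains a cycle and a node with two incoming links, which is why the graph cannot be written as a plain XML tree without references.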

Source files

The complete source code of this example is located in the:

xo-lite-1.0/src-projects/xo-lite-examples/src/main/java/net/sf/xolite/graph

directory inside the xo-lite-1.0-all.zip or xo-lite-1.0-all.tgz distribution archive.

To build and run all the examples:

Extract the distribution archive into any directory.
Ensure that you have a recent version of Maven installed and configured on your computer.
Run the "mvn compile exec:java" shell command from the .../xo-lite-1.0/src-projects/xo-lite-examples directory.

Object identity and references

With X-O lite you can choose to identify an object with:

A business key: an attribute (or a set of attributes) of the object whose value is unique across the whole graph. Attributes like a name, a user id or a social security number are good candidates.
A generated technical key: a unique key, generated for the purpose of the serialization, that is associated with each object. In Java, such a key can be derived from the utility method System.identityHashCode(Object obj), although identity hash codes are not guaranteed to be distinct for distinct objects. This technical key is dropped once the graph has been deserialized (hence a different key will be used at each serialization).

The object key only has to be unique and invariant within the scope of one XML serialization/deserialization process.
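
As an illustration of the technical key option, the sketch below hands out sequential keys and remembers them per object instance. It is plain Java and not part of X-O lite; the class name TechnicalKeyRegistry and the "n1", "n2", ... key format are assumptions. A counter is used here rather than System.identityHashCode because identity hash codes may collide.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper: assigns one generated key per object instance.
class TechnicalKeyRegistry {
    // IdentityHashMap compares keys with ==, so two equal but distinct objects get distinct keys
    private final Map<Object, String> keys = new IdentityHashMap<>();
    private int nextId = 1;

    // Returns the key already assigned to this instance, or assigns the next free one.
    String keyFor(Object node) {
        String key = keys.get(node);
        if (key == null) {
            key = "n" + nextId++;
            keys.put(node, key);
        }
        return key;
    }
}
```

The registry only needs to live for the duration of one serialization, which matches the requirement that the key be unique and invariant within that scope only.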

For this example, we will use the following set of attributes: firstName, lastName and birthDate. Used together, they uniquely identify a Person. Note that ensuring that this combination of 3 fields is really unique is not an X-O lite concern; it should be ensured by your object graph implementation (using, for example, database uniqueness constraints).
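
One possible way to handle the three attributes as a single key is a small value class, sketched below. The class PersonKey is hypothetical and does not exist in the example sources; the birth date is kept as text for brevity.

```java
import java.util.Objects;

// Hypothetical composite business key: equal when all three attributes are equal.
final class PersonKey {
    private final String firstName;
    private final String lastName;
    private final String birthDate; // textual form, e.g. "1970-09-17"

    PersonKey(String firstName, String lastName, String birthDate) {
        this.firstName = Objects.requireNonNull(firstName);
        this.lastName = Objects.requireNonNull(lastName);
        this.birthDate = Objects.requireNonNull(birthDate);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PersonKey)) {
            return false;
        }
        PersonKey that = (PersonKey) other;
        return firstName.equals(that.firstName)
                && lastName.equals(that.lastName)
                && birthDate.equals(that.birthDate);
    }

    @Override
    public int hashCode() {
        return Objects.hash(firstName, lastName, birthDate);
    }

    @Override
    public String toString() {
        return firstName + " " + lastName + " (" + birthDate + ")";
    }
}
```

Because equals and hashCode are defined over the three fields, such a key can be used directly as a map key. A concatenated string works as well, provided the separator cannot appear inside the attribute values.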

With X-O lite, the references can be written as you want. There is no need to combine them in one XML field or to use special attribute types like "xsd:ID" or "xsd:IDREF". However, it greatly facilitates the XML parsing if the data composing your key is defined as attributes of your XML elements. This means that when you encounter the start of a serialized object or object reference, you directly know its key (because all attributes are accessible from the startElement method invocation).

So, in our example, a person will be represented in XML as:

    <person firstName="John" lastName="Smith" birthDate="1970-09-17">
        (... other person data ...)
    </person>

and a reference to the same person as:

    <ref_tag firstName="John" lastName="Smith" birthDate="1970-09-17" />

where ref_tag will be "father", "mother" or "child", depending on the type of reference.

With those definitions, the serialization of the graph is already straightforward to write.
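
The sketch below shows the shape of that output using plain string building. It deliberately does not use the X-O lite serialization API, and the class and method names are assumptions. It also performs no XML escaping, so it is only suitable for illustrating the layout of definitions and references.

```java
// Illustration only: builds the XML text by hand, without escaping attribute values.
class PersonXmlSketch {

    // The key attributes are written identically on definitions and references.
    static String keyAttributes(String firstName, String lastName, String birthDate) {
        return "firstName=\"" + firstName + "\" lastName=\"" + lastName
                + "\" birthDate=\"" + birthDate + "\"";
    }

    // A reference is an empty element named after the relation: father, mother or child.
    static String reference(String refTag, String firstName, String lastName, String birthDate) {
        return "<" + refTag + " " + keyAttributes(firstName, lastName, birthDate) + " />";
    }

    // A definition carries the same key attributes plus its nested elements.
    static String definition(String firstName, String lastName, String birthDate,
                             String... nestedElements) {
        StringBuilder xml = new StringBuilder();
        xml.append("<person ").append(keyAttributes(firstName, lastName, birthDate)).append(">\n");
        for (String nested : nestedElements) {
            xml.append("    ").append(nested).append("\n");
        }
        xml.append("</person>");
        return xml.toString();
    }
}
```

For example, definition("John", "Smith", "1970-09-17", reference("child", "Anna", "Smith", "1998-03-02")) yields a person element for John containing a child reference to Anna, while Anna's own person element can appear anywhere else in the document.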

Resolving references at deserialization

In X-O lite, the object that ensures that references and definitions resolve to the same object is the XMLObjectFactory. You will have to write your own XMLObjectFactory that always returns the same Person object for the same business key.

This is quite simple, as the XMLObjectFactory interface defines only one method and this method can access the attributes of the XML element that is currently being parsed.

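import java.util.HashMap;
import java.util.Map;

// X-O lite imports omitted (XMLObjectFactory, XMLSerializable, XMLEventParser, XMLParseException, Attributes)
public class PersonObjectFactory implements XMLObjectFactory {

    // All persons created so far, indexed by their business key
    private Map<String, Person> allPersons = new HashMap<String, Person>();

    public XMLSerializable createObject(String namespaceUri, String localName, XMLEventParser parser) throws XMLParseException {
        // The key attributes are available as soon as the element starts
        String firstName = Attributes.getMandatoryString(Person.FIRST_NAME_ATTRIBUTE, parser);
        String lastName = Attributes.getMandatoryString(Person.LAST_NAME_ATTRIBUTE, parser);
        String dateString = Attributes.getMandatoryString(Person.BIRTH_DATE_ATTRIBUTE, parser);
        String key = firstName + "|" + lastName + "|" + dateString;
        // Reuse the existing instance if this key was already seen (as a definition or as a reference)
        Person p = allPersons.get(key);
        if (p == null) {
            p = new Person();
            allPersons.put(key, p);
        }
        // Let the person parse its own element (the 3 key attributes, or the full definition)
        parser.delegateParsingTo(p);
        return p;
    }

}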

You can see that, here, the factory doesn't bother with the element tag or URI. It also doesn't have to check whether we are currently parsing a reference or an object definition. It just has to know about the attributes that are used to identify the objects and ensure that only one Person instance is created for each 'key'. You can also see that the 'key' attributes don't have to be attributes of the Person object. They are here because we chose a business key, but the factory doesn't have to know that. (Of course, if your graph contains more than one type of object, you have to check the tag to create the correct object type. But it stays very simple to implement.)
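
For the multi-type case mentioned above, the following sketch shows one way to dispatch on the tag. It is a simplified stand-in written in plain Java, not an XMLObjectFactory implementation: the createObject(String, String) signature, the GraphCompany type and the "company" and "employer" tags are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical node types for a graph with two kinds of objects.
class GraphPerson { }
class GraphCompany { }

// Simplified stand-in for a factory: receives the element tag and the already-built key.
class MultiTypeFactorySketch {
    private final Map<String, Object> instances = new HashMap<>();

    Object createObject(String localName, String key) {
        // Definitions ("company") and references ("employer") map to the same type
        boolean company = localName.equals("company") || localName.equals("employer");
        // Prefix the key with the type so that a person and a company may share a key value
        String typedKey = (company ? "company" : "person") + "#" + key;
        Object node = instances.get(typedKey);
        if (node == null) {
            node = company ? new GraphCompany() : new GraphPerson();
            instances.put(typedKey, node);
        }
        return node;
    }
}
```

The only addition compared with a single-type factory is the mapping from tag to node type; the one-instance-per-key logic is unchanged.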

Whether a person definition or a reference is encountered first doesn't change anything. The API doesn't require person objects to be defined before they are referenced in the XML.
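
The order independence can be sketched without any XML at all. In the plain Java model below (the class OrderIndependenceSketch and its methods are assumptions, not X-O lite API), a reference and a definition both go through the same get-or-create lookup, so whichever comes first creates the instance and the other one reuses it.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of get-or-create resolution, independent of the order of events.
class OrderIndependenceSketch {

    static class Node {
        final String key;
        boolean defined; // becomes true once the full definition has been seen

        Node(String key) {
            this.key = key;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    // Called when a reference element is met: creates a placeholder if needed.
    Node reference(String key) {
        return nodes.computeIfAbsent(key, Node::new);
    }

    // Called when a definition element is met: fills in the same instance.
    Node define(String key) {
        Node node = reference(key);
        node.defined = true;
        return node;
    }
}
```

A node that is still not marked as defined at the end of parsing corresponds to a person that was referenced but never defined in the XML.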

The parser.delegateParsingTo(p) call is factored into the factory (rather than repeated after each call to the factory). The parsing methods of the Person object will naturally parse only the 3 attributes in the case of a reference, or the full object in the case of a person definition. Calling this method for references as well is not mandatory, but it ensures that if, by mistake, a person is referenced but not defined in your XML, the resulting object is not totally empty (it has at least the 3 key attributes parsed).

Putting it all together

With that factory set on the XMLEventParser, implementing the parsing code in the Person object is business as usual:

    public void startElement(String uri, String localName, XMLEventParser parser) throws XMLParseException {
        if (parser.isFirstEvent()) {
            firstName = Attributes.getMandatoryString(FIRST_NAME_ATTRIBUTE, parser);
            lastName = Attributes.getMandatoryString(LAST_NAME_ATTRIBUTE, parser);
            birthDate = Attributes.getMandatoryDate(BIRTH_DATE_ATTRIBUTE, DATE_PATTERN, parser);
        } else if (localName.equals(FATHER_TAG)) {
            setFather((Person) parser.getFactory().createObject(uri, PERSON_TAG, parser));
        } else if (localName.equals(MOTHER_TAG)) {
            setMother((Person) parser.getFactory().createObject(uri, PERSON_TAG, parser));
        } else if (localName.equals(CHILD_TAG)) {
            addChild((Person) parser.getFactory().createObject(uri, PERSON_TAG, parser));
        }
    }


    public void endElement(String uri, String localName, XMLEventParser parser) throws XMLParseException {
        if (localName.equals(NICK_NAME_TAG)) {
            nickName = ElementText.getString(parser);
        } else if (localName.equals(GENDER_TAG)) {
            gender = ElementText.getMandatoryEnum(Gender.class, parser);
        } else if (localName.equals(DECEASE_DATE_TAG)) {
            deceaseDate = ElementText.getDate(DATE_PATTERN, null, parser);
        }
    }

You can see that the code resolving the object references (when a FATHER_TAG, MOTHER_TAG or CHILD_TAG element is notified) has nothing special.

All the persons are held in a PersonSet object, which also uses the PersonObjectFactory to instantiate persons.

Conclusion

The API doesn't impose a particular XML structure for serialized graphs. References can occur in the XML before the object is actually defined. You can also have mixed structures, such as having the choice between embedding an object and writing a reference to it. The 'key' used for references can be anything (either a technically generated key or a set of business attributes of your objects).

The only restriction that you have to follow is that the data identifying a graph node must be serialized as attributes of the object definitions and references.

Serializing and deserializing graphs of objects to XML is done naturally (in one pass) with X-O lite. The approach scales well even if the graph structure is more complex than in the given example (several graph node types, bi-directional references, ...).
