This document explains how following Object-Oriented (OO) development practices leads to the X-O lite API.
OO is a paradigm applying to all phases of software development (analysis, design, programming, ...). I will list here only some points relevant to XML parsing:
1. OO is a way to model your software system. OO development starts by building an object model. Technical considerations (such as serialization to XML) come afterwards and should only enrich or specialize the architecture based on the business model.
2. The aim of OO design is to increase modularity. Objects don't expose their internal data or implementation details; instead, they expose interfaces to other objects.
3. OO works with inheritance and composition. These are the two ways to assemble objects into a model.
4. OO promotes object reuse. The reuse can be both inside the model (the same object is used at several places in the model) and outside the model (the whole model or parts of it can be reused by third-party software).
5. Objects have behaviors and manage their own data.
From those generic rules we can deduce requirements for our XML to Object mapping:
These are strong rules that are quite difficult to fulfill. In fact, to build an XML to object mapping that satisfies them, we have to drop another requirement common to many XML to object mapping solutions: automatic mapping. "Automatic" means either that the mapping is done automatically at runtime by the API or that a tool generates mapping code from XML Schemas or sample XML documents.
If we remove the need for automation, most of the requirements become simple: no need to interpret complex XML schemas, no need to access object internals or to handle object relations.
So X-O lite provides the infrastructure and organizes the parsing, but the real parsing must be coded "by hand".
The whole point of this API is a balance between its cost (the hand-coding) and its benefits (the OO behavior).
Smart organization, carefully designed interfaces and powerful helper objects are provided to keep the pain of coding to a minimum.
In some cases, coding the XML mapping will not be a big overhead: something like a few straightforward lines of code per mapped class.
Here are some advantages of the hand-coding way (in addition to OO behavior):
and some disadvantages (in addition to coding by hand):
See the applicability document for a more detailed discussion of the advantages and drawbacks of the X-O lite API.
One aim of X-O lite is to be as light as possible in terms of infrastructure, API and usage.
The starting point is that SAX (Simple API for XML) is almost what we need. SAX has a lot of good points: it has a default implementation shipped with the JRE (Xerces), it fully supports W3C schemas, and it is stream-oriented, so it performs well even on large XML documents. What it lacks is ease of use and object orientation. To learn more about SAX itself, go to http://en.wikipedia.org/wiki/Simple_API_for_XML or http://www.saxproject.org/.
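To make the "ease of use" point concrete, here is a minimal sketch of plain SAX using only classes shipped with the JDK. The class name PlainSaxTitles and the sample element names are made up for illustration. Note how the handler has to keep the parsing state itself (the accumulated text and an "inside title" flag) just to extract a few values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the text of every <title> element.
class PlainSaxTitles extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder(); // parsing state
    private boolean inTitle;                                 // parsing state

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            inTitle = true;
            text.setLength(0); // start accumulating a fresh value
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver the text of one element in several chunks.
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString().trim());
            inTitle = false;
        }
    }

    // Parses the given XML string and returns the collected titles.
    static List<String> parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            PlainSaxTitles handler = new PlainSaxTitles();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>SAX</title></book>"
                   + "<book><title>DOM</title></book></books>";
        System.out.println(parse(xml));
    }
}
```

The streaming model means the document is never held in memory as a whole, but all bookkeeping is left to the client code.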
Fortunately, a thin layer above SAX is enough to achieve our goal. As a result, X-O lite is composed mainly of one interface (XmlSerializable) that your objects must implement and a few support classes. It allows you to get the best of SAX without compromising object orientation.
Objects should themselves parse the XML fragment(s) that concern them. But this doesn't mean that the parsing state (such as the parser reference, the Locator, the defined prefix mappings, the accumulated text, the stack of open elements ...) should live in the objects. On the contrary, a well-designed object should not contain attributes that are relevant only during some operations.
For example, if an object holds, as a private field, a StringBuffer to accumulate the characters notified by an XMLReader, this field will not be used (and will hopefully be null) once parsing is finished. This denotes bad design: it is a sign that you are trying to fit two responsibilities with different life spans into the same object.
When parsing with SAX, the XMLReader pushes all the state to the client code through the ContentHandler interface, so a SAX XMLReader cannot be used directly for OO parsing. The X-O lite solution is to add a layer above the SAX XMLReader (called XMLEventReader) that maintains the parsing state and provides methods to query it. With that, the parsed object only has to react to simplified SAX events (all the SAX events used solely to push state are removed) and can focus on formatting, verifying and storing the parsed data.
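The idea can be illustrated with a small sketch built only on the JDK's SAX classes. This is not the actual X-O lite API: StateTrackingReader, ElementListener and Book are hypothetical names invented for this example, and the real XMLEventReader and XmlSerializable may look quite different. The sketch only shows the separation of responsibilities: the layer owns the parsing state and exposes query methods, while the domain object keeps nothing but its business data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical simplified event interface (not the real XmlSerializable).
interface ElementListener {
    void elementEnded(StateTrackingReader reader);
}

// Hypothetical layer above SAX: it holds the parsing state and lets
// listeners query it instead of pushing it to them.
class StateTrackingReader extends DefaultHandler {
    private final List<String> openElements = new ArrayList<>(); // element stack
    private final StringBuilder text = new StringBuilder();      // accumulated text
    private final ElementListener listener;

    StateTrackingReader(ElementListener listener) {
        this.listener = listener;
    }

    // Query methods: the only view the domain object has of the state.
    String getPath() { return "/" + String.join("/", openElements); }
    String getText() { return text.toString().trim(); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.add(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        listener.elementEnded(this); // path still includes the closing element
        openElements.remove(openElements.size() - 1);
        text.setLength(0);
    }

    void parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(this);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }
}

// Domain object: only business fields, no buffer, no element stack.
class Book implements ElementListener {
    private String title;
    private int year;

    @Override
    public void elementEnded(StateTrackingReader reader) {
        switch (reader.getPath()) {
            case "/book/title": title = reader.getText(); break;
            case "/book/year":  year = Integer.parseInt(reader.getText()); break;
            default: break; // other elements are not mapped
        }
    }

    String getTitle() { return title; }
    int getYear() { return year; }
}

class LayeredParsingDemo {
    public static void main(String[] args) {
        Book book = new Book();
        new StateTrackingReader(book)
            .parse("<book><title>Dune</title><year>1965</year></book>");
        System.out.println(book.getTitle() + " (" + book.getYear() + ")");
    }
}
```

Once parsing is over, the StateTrackingReader can be discarded together with its buffer and element stack, so the Book instance carries no leftover fields from the parsing phase.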