Binding XML to Java

If you have ever worked with XML in your Java applications, you have probably asked yourself whether to choose the SAX or the DOM API. We know that SAX is event-driven and fast, while DOM creates an in-memory data structure that you can then navigate at your leisure, but slows things down a bit.

SAX & DOM
The SAX API provides a callback mechanism that is more suited to procedure-oriented programming. You run with the parser and catch the events as they pass by. You devise a scheme to store what you need for later use. The object-oriented approach takes a back seat. You are left with data that is devoid of structure, held in artificial constructs as in a C program.
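
To make the callback style concrete, here is a minimal sketch that uses the standard JAXP SAX parser to pull leaf values out of a small document. It is not taken from any particular application: the class names SaxSketch and LeafCollector and the sample document are made up for illustration, and the code is written for a current JDK. Notice that the handler ends up with a flat map of names to strings; the nesting of the original document is gone.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class names; only the standard JAXP/SAX API is used.
class SaxSketch {

    // Collects the text of each leaf element into a flat map.
    static class LeafCollector extends DefaultHandler {
        final Map<String, String> values = new HashMap<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // a new element begins, forget earlier text
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String s = text.toString().trim();
            if (!s.isEmpty()) {
                values.put(qName, s); // remember the value under the element name
            }
            text.setLength(0);
        }
    }

    static Map<String, String> parse(String xml) throws Exception {
        LeafCollector handler = new LeafCollector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<collection><type>Vector</type><capacity>100</capacity>"
                   + "<increment>5</increment></collection>";
        System.out.println(parse(xml).get("increment"));
    }
}
```

Everything the application needs later has to be saved by the handler as the events go by; anything it does not save is lost once parsing ends.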

Data parsing with SAX therefore produces a lossy transformation. You lose the structural information associated with the data that was originally present in XML. DOM, in contrast, preserves the hierarchical tree structure that the data was originally intended to be in. However, there is a limitation to its use. When the XML file is huge, the application guzzles memory and performance suffers. Where document size matters, DOM may become impractical, and memory considerations end up dictating your choice of parser.

The DOM API is unwieldy and bothersome. You are given a tree that is purely a logical construct. You may not want to see your data wrapped up in a structure that is unsuitable to your application design. Having to deal with nodes, children and parents just to get at a piece of data is arduous and error prone. Of course, DOM presents an object-oriented approach, but that does not make it any less cumbersome. As a developer I would rather read typed data from an XML file, or serialize to it, than deal with elements, attributes and documents.
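
The following sketch shows how much node handling it takes to read a single value with DOM. It uses only the standard JAXP DOM API; the class name DomSketch, the helper childText and the sample document are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name; only the standard JAXP/DOM API is used.
class DomSketch {

    // Returns the trimmed text of the first child element with the given name, or null.
    static String childText(Element parent, String name) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            // Whitespace between tags shows up as text nodes, so filter on node type.
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<collection>\n<type>Vector</type>\n<capacity>100</capacity>\n</collection>";
        Element root = parseRoot(xml);
        // The value is still a string; converting it is the caller's job.
        int capacity = Integer.parseInt(childText(root, "capacity"));
        System.out.println(capacity);
    }
}
```

The loop, the node type check and the manual conversion are all plumbing; none of it says anything about what the data means to the application.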

Consider this piece of XML.

<?xml version="1.0"?>
<collection>
<type>Vector</type>
<capacity>100</capacity>
<increment>5</increment>
<size>50</size>
<element-type>String</element-type>
</collection>

If you use SAX, you store the values in local variables by constantly checking against the value of the current element. With DOM, you build a tree-like object that you inspect and navigate to locate the data values you need. Instead of using SAX or DOM, would you not prefer to have an API that would allow you to say:

Collection list = new Collection();
int capacity = list.getCapacity();
int size = list.getSize();
int increment = list.getIncrement();
String elementType = list.getElementType();

Beyond SAX & DOM
Through the Java Community Process, Sun has been developing an API set that makes it possible to do just that. It is the Java Architecture for XML Binding (JAXB), a new entrant in the JAX API family.

The upcoming release of Java version 1.5 is likely to change the way we use XML in Java code, if the XML binding technology gets included in it. A preview is available in an early release of JAXB, based on the XML to Java Binding specification. Does it mean that you no longer have to think in terms of using the SAX or DOM API directly in your code? No more running with the parser, or tearing your hair over nodes and the node hierarchy? We will now explore JAXB in more detail, and find answers to these questions and more.

The JAXB Approach
JAXB requires that you define a DTD for your XML document according to the XML 1.0 specification. You also have to define an XML binding schema according to the JAXB specification, currently at working draft version 0.21. JAXB provides a schema compiler that takes the DTD file and the binding schema file as input and generates classes that you can use in your application. From your application you can use these classes to instantiate Java objects or to generate new XML data files that validate against the DTD you defined earlier.

A DTD does not allow you to define data types, but the schema compiler needs them to generate the Java source files. The binding schema enables you to specify which element data must map to which Java data type. However, JAXB does not mandate that every DTD declaration has a binding instruction. In the absence of a specific binding, JAXB uses defaults as appropriate.

The JAXB API
The current version contains just two packages:

javax.xml.bind
javax.xml.marshal

As you might have guessed, the first package contains classes related to binding, or more precisely, mapping XML to Java. The second package contains classes to serialize Java data to valid XML and back again, processes called marshalling and unmarshalling respectively.
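
To pin down the two terms, here is a hand-written sketch of a round trip. It is not JAXB code and does not use the JAXB packages; it relies only on the standard JAXP DOM API, and the class RoundTripSketch with its two fields is an assumption made for illustration. Marshalling turns the object into XML text, and unmarshalling rebuilds an equivalent object from that text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hand-written stand-in for a bound class; this is not JAXB-generated code.
class RoundTripSketch {
    String type;
    String capacity;

    // Marshalling: Java object -> XML text.
    String marshal() {
        return "<collection><type>" + escape(type) + "</type><capacity>"
             + escape(capacity) + "</capacity></collection>";
    }

    // Unmarshalling: XML text -> Java object.
    static RoundTripSketch unmarshal(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();
        RoundTripSketch c = new RoundTripSketch();
        c.type = text(root, "type");
        c.capacity = text(root, "capacity");
        return c;
    }

    // Assumes the element is present; a missing element fails with a NullPointerException.
    private static String text(Element root, String name) {
        return root.getElementsByTagName(name).item(0).getTextContent().trim();
    }

    // Escapes the characters that are not allowed as-is in element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) throws Exception {
        RoundTripSketch c = new RoundTripSketch();
        c.type = "Vector";
        c.capacity = "100";
        String xml = c.marshal();
        RoundTripSketch back = unmarshal(xml);
        System.out.println(xml);
        System.out.println(back.type + " " + back.capacity);
    }
}
```

Writing this by hand for every document type is exactly the chore a binding compiler is meant to take over.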

The JAXB early access release is bundled as jaxb-1_0-ea-bin.zip for the Windows platform. When you extract it, you will find bin and lib directories in the install folder. The lib folder contains two jar files, jaxb-rt-1.0-ea.jar and jaxb-xjc-1.0-ea.jar, with the JAXB runtime classes and the JAXB compiler classes respectively. The bin folder contains a Unix/Linux shell script that you can use to invoke the JAXB schema compiler. There is no corresponding bat file for the Windows platform. (If you are stuck with a Windows platform like myself, take the approach I use in this article to run the JAXB compiler.)

A JAXB Example
Let's take our example XML file and see how JAXB maps the data nodes to Java data types. We will call the example XML file collection.xml and the corresponding DTD collection.dtd, which looks like this:

<!ELEMENT collection ( type, capacity, increment, size, element-type )>
<!ELEMENT type (#PCDATA)>
<!ELEMENT capacity (#PCDATA)>
<!ELEMENT increment (#PCDATA)>
<!ELEMENT size (#PCDATA)>
<!ELEMENT element-type (#PCDATA)>
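
Before handing the DTD to any tool, it can be worth confirming that the sample document really validates against it. The sketch below does that with the validating parser in the standard JAXP API, independently of JAXB. The class name DtdCheckSketch is made up, and the DTD is embedded as an internal subset only so that the example needs no files on disk.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical class name; only the standard JAXP API is used.
class DtdCheckSketch {

    // The DTD shown above, embedded as an internal subset.
    static final String DOCTYPE =
        "<!DOCTYPE collection [\n"
      + "<!ELEMENT collection ( type, capacity, increment, size, element-type )>\n"
      + "<!ELEMENT type (#PCDATA)>\n"
      + "<!ELEMENT capacity (#PCDATA)>\n"
      + "<!ELEMENT increment (#PCDATA)>\n"
      + "<!ELEMENT size (#PCDATA)>\n"
      + "<!ELEMENT element-type (#PCDATA)>\n"
      + "]>\n";

    // Returns true if the given document body is well-formed and valid against the DTD.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setValidating(true); // turn on DTD validation
            DocumentBuilder b = f.newDocumentBuilder();
            // Without a handler, validity errors are only reported, not thrown.
            b.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            b.parse(new ByteArrayInputStream((DOCTYPE + body).getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<collection><type>Vector</type><capacity>100</capacity>"
                    + "<increment>5</increment><size>50</size>"
                    + "<element-type>String</element-type></collection>";
        String bad = "<collection><type>Vector</type></collection>"; // required children missing
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

The second document is well-formed but leaves out required children, so the validating parser rejects it while the first one passes.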

Now we need to write a binding schema document as per the JAXB specification. It is also a text file; we name it collection.xjs and type the following content in our favorite text editor. The first line is a tag that identifies the file as a binding schema.

<xml-java-binding-schema version="1.0ea">
<element name="collection" type="class" root="true" />
</xml-java-binding-schema>

This is the minimal binding schema. The compiler assumes defaults for elements not included in the schema. The defaults depend on whether there is an attribute list, or the content is simple or complex. A simple content model has no attributes, like our example here.

The Compilation
Install JAXB as described in the release.html file. You will find it in the doc folder under the install directory.

To invoke the binding schema compiler on the collection.dtd and collection.xjs files, type this at the command prompt:

java -jar jaxb-xjc-1.0-ea.jar collection.dtd collection.xjs

This of course assumes that you have all the files in the same folder. The schema compiler generates just one Java source file, Collection.java, reproduced in Listing 1.

Listing 1

import java.io.IOException;
import java.io.InputStream;
import javax.xml.bind.ConversionException;
import javax.xml.bind.Dispatcher;
import javax.xml.bind.InvalidAttributeException;
import javax.xml.bind.LocalValidationException;
import javax.xml.bind.MarshallableRootElement;
import javax.xml.bind.Marshaller;
import javax.xml.bind.MissingContentException;
import javax.xml.bind.RootElement;
import javax.xml.bind.StructureValidationException;
import javax.xml.bind.UnmarshalException;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.Validator;
import javax.xml.marshal.XMLScanner;
import javax.xml.marshal.XMLWriter;

public class Collection
extends MarshallableRootElement
implements RootElement
{
private String _Type;
private String _Capacity;
private String _Increment;
private String _Size;
private String _ElementType;

public String getType() {
return _Type;
}

public void setType(String _Type) {
this._Type = _Type;
if (_Type == null) {
invalidate();
}
}

public String getCapacity() {
return _Capacity;
}

public void setCapacity(String _Capacity) {
this._Capacity = _Capacity;
if (_Capacity == null) {
invalidate();
}
}

public String getIncrement() {
return _Increment;
}

public void setIncrement(String _Increment) {
this._Increment = _Increment;
if (_Increment == null) {
invalidate();
}
}

public String getSize() {
return _Size;
}

public void setSize(String _Size) {
this._Size = _Size;
if (_Size == null) {
invalidate();
}
}

public String getElementType() {
return _ElementType;
}

public void setElementType(String _ElementType) {
this._ElementType = _ElementType;
if (_ElementType == null) {
invalidate();
}
}

public void validateThis()
throws LocalValidationException
{
if (_Type == null) {
throw new MissingContentException("type");
}
if (_Capacity == null) {
throw new MissingContentException("capacity");
}
if (_Increment == null) {
throw new MissingContentException("increment");
}
if (_Size == null) {
throw new MissingContentException("size");
}
if (_ElementType == null) {
throw new MissingContentException("element-type");
}
}

public void validate(Validator v)
throws StructureValidationException
{
}

public void marshal(Marshaller m)
throws IOException
{
XMLWriter w = m.writer();
w.start("collection");
w.leaf("type", _Type.toString());
w.leaf("capacity", _Capacity.toString());
w.leaf("increment", _Increment.toString());
w.leaf("size", _Size.toString());
w.leaf("element-type", _ElementType.toString());
w.end("collection");
}

public void unmarshal(Unmarshaller u)
throws UnmarshalException
{
XMLScanner xs = u.scanner();
Validator v = u.validator();
xs.takeStart("collection");
while (xs.atAttribute()) {
String an = xs.takeAttributeName();
throw new InvalidAttributeException(an);
}
if (xs.atStart("type")) {
xs.takeStart("type");
String s;
if (xs.atChars(XMLScanner.WS_COLLAPSE)) {
s = xs.takeChars(XMLScanner.WS_COLLAPSE);
} else {
s = "";
}
try {
_Type = String.valueOf(s);
} catch (Exception x) {
throw new ConversionException("type", x);
}
xs.takeEnd("type");
}
if (xs.atStart("capacity")) {
xs.takeStart("capacity");
String s;
if (xs.atChars(XMLScanner.WS_COLLAPSE)) {
s = xs.takeChars(XMLScanner.WS_COLLAPSE);
} else {
s = "";
}
try {
_Capacity = String.valueOf(s);
} catch (Exception x) {
throw new ConversionException("capacity", x);
}
xs.takeEnd("capacity");
}
if (xs.atStart("increment")) {
xs.takeStart("increment");
String s;
if (xs.atChars(XMLScanner.WS_COLLAPSE)) {
s = xs.takeChars(XMLScanner.WS_COLLAPSE);
} else {
s = "";
}
try {
_Increment = String.valueOf(s);
} catch (Exception x) {
throw new ConversionException("increment", x);
}
xs.takeEnd("increment");
}
if (xs.atStart("size")) {
xs.takeStart("size");
String s;
if (xs.atChars(XMLScanner.WS_COLLAPSE)) {
s = xs.takeChars(XMLScanner.WS_COLLAPSE);
} else {
s = "";
}
try {
_Size = String.valueOf(s);
} catch (Exception x) {
throw new ConversionException("size", x);
}
xs.takeEnd("size");
}
if (xs.atStart("element-type")) {
xs.takeStart("element-type");
String s;
if (xs.atChars(XMLScanner.WS_COLLAPSE)) {
s = xs.takeChars(XMLScanner.WS_COLLAPSE);
} else {
s = "";
}
try {
_ElementType = String.valueOf(s);
} catch (Exception x) {
throw new ConversionException("element-type", x);
}
xs.takeEnd("element-type");
}
xs.takeEnd("collection");
}

public static Collection unmarshal(InputStream in)
throws UnmarshalException
{
return unmarshal(XMLScanner.open(in));
}

public static Collection unmarshal(XMLScanner xs)
throws UnmarshalException
{
return unmarshal(xs, newDispatcher());
}

public static Collection unmarshal(XMLScanner xs, Dispatcher d)
throws UnmarshalException
{
return ((Collection) d.unmarshal(xs, (Collection.class)));
}

public boolean equals(Object ob) {
if (this == ob) {
return true;
}
if (!(ob instanceof Collection)) {
return false;
}
Collection tob = ((Collection) ob);
if (_Type!= null) {
if (tob._Type == null) {
return false;
}
if (!_Type.equals(tob._Type)) {
return false;
}
} else {
if (tob._Type!= null) {
return false;
}
}
if (_Capacity!= null) {
if (tob._Capacity == null) {
return false;
}
if (!_Capacity.equals(tob._Capacity)) {
return false;
}
} else {
if (tob._Capacity!= null) {
return false;
}
}
if (_Increment!= null) {
if (tob._Increment == null) {
return false;
}
if (!_Increment.equals(tob._Increment)) {
return false;
}
} else {
if (tob._Increment!= null) {
return false;
}
}
if (_Size!= null) {
if (tob._Size == null) {
return false;
}
if (!_Size.equals(tob._Size)) {
return false;
}
} else {
if (tob._Size!= null) {
return false;
}
}
if (_ElementType!= null) {
if (tob._ElementType == null) {
return false;
}
if (!_ElementType.equals(tob._ElementType)) {
return false;
}
} else {
if (tob._ElementType!= null) {
return false;
}
}
return true;
}

public int hashCode() {
int h = 0;
h = ((127 *h)+((_Type!= null)?_Type.hashCode(): 0));
h = ((127 *h)+((_Capacity!= null)?_Capacity.hashCode(): 0));
h = ((127 *h)+((_Increment!= null)?_Increment.hashCode(): 0));
h = ((127 *h)+((_Size!= null)?_Size.hashCode(): 0));
h = ((127 *h)+((_ElementType!= null)?_ElementType.hashCode(): 0));
return h;
}

public String toString() {
StringBuffer sb = new StringBuffer("<<collection");
if (_Type!= null) {
sb.append(" type=");
sb.append(_Type.toString());
}
if (_Capacity!= null) {
sb.append(" capacity=");
sb.append(_Capacity.toString());
}
if (_Increment!= null) {
sb.append(" increment=");
sb.append(_Increment.toString());
}
if (_Size!= null) {
sb.append(" size=");
sb.append(_Size.toString());
}
if (_ElementType!= null) {
sb.append(" element-type=");
sb.append(_ElementType.toString());
}
sb.append(">>");
return sb.toString();
}

public static Dispatcher newDispatcher() {
Dispatcher d = new Dispatcher();
d.register("collection", (Collection.class));
d.freezeElementNameMap();
return d;
}

}

It is a pretty long class for a small DTD such as the one we have provided. Notice how the data types have defaulted to String. In addition to the accessor and mutator methods, the schema compiler generates the marshalling and unmarshalling code we mentioned before. The unmarshalling code takes an XML file as input and outputs an instance of the compiled class. Use the marshalling code to serialize a Java object into XML data.
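
Because every accessor in the generated class returns a String, numeric values such as capacity or size still have to be converted by the caller when the minimal binding schema is used. A small helper along the following lines is one way to do that. The class TypedViewSketch and its toInt method are hypothetical and are not part of JAXB or of the generated code.

```java
// Hypothetical helper; not part of JAXB or of the generated Collection class.
class TypedViewSketch {

    // Converts a String-typed field value to int, naming the field if conversion fails.
    static int toInt(String field, String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(field + " is not an integer: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // In the article's setting the strings would come from getCapacity(), getSize() and so on.
        int capacity = toInt("capacity", "100");
        int size = toInt("size", "50");
        System.out.println(capacity - size);
    }
}
```

With the values from the sample document, the helper would be called as toInt("increment", list.getIncrement()) on an unmarshalled Collection instance.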

A Test Driver
Let us test the Collection class by reading the value of the increment element from the collection.xml file. The code for the test driver program is in Listing 2.

Listing 2

/* The test driver for Collection.java */

import java.io.*;

public class TestCollection {
    public static void main(String[] args) {
        try {
            FileInputStream fis = new FileInputStream("collection.xml");
            Collection list = Collection.unmarshal(fis);

            String increment = list.getIncrement();
            System.out.println("increment: " + increment);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Run: java -cp .;<install dir>\lib\jaxb-rt-1.0-ea.jar TestCollection
Output: increment: 5

Issues in JAXB
Cool. Just the way we wanted it. However, there are some points worth noting.

The JAXB API release is a welcome addition to the Java developer's repertoire. Check out http://developer.java.sun.com/developer/earlyAccess/xml/jaxb/ if you are an early adopter like myself.