Which Java XML Library Should You Use?

In the past I have always used either JDOM or Dom4J to interact with XML, but I’ve never been entirely happy with either library. JDOM is very simple to use and great for simple XML structures, but as soon as you want to do something a little more complex it becomes hard work. Dom4J is more powerful, but I find the API to be, shall we say, strange. I’ve never found any real problems with either project; I just thought it was time to question why I was using them.

A quick Google search revealed that I’m a dinosaur and that a lot of people seem to have moved over to XOM, which was written by Elliotte Rusty Harold, who had previously been involved with JDOM. Apparently it solves a lot of the awkwardness in JDOM and is an improvement overall. In fact, I briefly imported it into my current project, only to notice that it brings a bunch of its friends along with it, namely a particular version of Xerces and a couple of other libraries. I’m not one of those “not invented here” people; I love libraries of code. But I’ve been getting the feeling recently that many of my projects are suffering from bloat, and I’m fed up with trying to keep all the libraries up to date.

To combat the bloated feeling, I decided to get back to basics and take a proper look at what the JDK has to offer. I’ve known for ages that it has XML handling baked in, but right at the start I heard it was hard to work with, so I avoided it. A quick read of the old J2EE 1.4 tutorial* on JAXP (chapters 4 through 6) got me up to speed, and it was time to try it out.

Initial impressions were not good. The built-in Java libraries have a much steeper learning curve than any of the other libraries and expose you to every tricky bit right from the start. To add to this, there are essentially no helper methods. If, like me, you mainly want to work with data XML, then be prepared for a lot of boring typing. On the other hand, if you are working with document XML (e.g. XML with nodes containing mixed content), then you have all the power you need at your fingertips.
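To give a feel for the "boring typing", here is a minimal sketch of reading values out of a small piece of data XML using only the JDK's DOM parser. The class name, the method name and the `<people>`/`<name>` structure are made up for illustration; only the `javax.xml.parsers` and `org.w3c.dom` calls come from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class, not part of any library.
class JaxpReadSketch {

    /** Returns the text of every <name> element found in the given XML string. */
    static List<String> readNames(String xml) {
        try {
            // Factory, then builder, then parse: three steps before any data is touched.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String> names = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagName("name");
            // NodeList is not Iterable, so an index loop is needed.
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people><person><name>Ada</name></person>"
                + "<person><name>Linus</name></person></people>";
        System.out.println(readNames(xml)); // prints [Ada, Linus]
    }
}
```

Nothing here is difficult, but the factory setup, the checked exceptions and the index loop are all things the third-party libraries tend to hide.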

Having said that, it might come as a surprise to hear that I’m going to stick with JAXP for my XML handling. For data XML it’s more work than any of the helper libraries, but with a few helper methods (and I mean only half a dozen or so) you can remove a lot of the boring typing. A couple of helper methods are included in the tutorial I linked to above; another such helper is this:

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/**
 * Appends a new element containing the given content to the given parent.
 * <p>
 * A convenience method for adding a new node containing only text to an existing node.
 * This is a common task when outputting data XML.
 *
 * @param doc {@code Document} where the new {@code Element} will reside.
 * @param elementName name of the new {@code Element}.
 * @param content text content of the new element.
 * @param parent a DOM parent node.
 */
public static void appendChild(Document doc, String elementName, String content, Node parent) {
	Element element = doc.createElement(elementName);
	element.appendChild(doc.createTextNode(content));
	parent.appendChild(element);
}
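
As a rough usage sketch, the helper can be combined with the JDK's identity `Transformer` to build and print a small document. The class name, the `buildPersonXml` method and the `<person>` structure are invented for this example; the helper is repeated so the snippet compiles on its own.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical example class, not part of any library.
class AppendChildUsageSketch {

    // Same helper as in the post, repeated so the example is self-contained.
    static void appendChild(Document doc, String elementName, String content, Node parent) {
        Element element = doc.createElement(elementName);
        element.appendChild(doc.createTextNode(content));
        parent.appendChild(element);
    }

    /** Builds a small <person> document and returns it serialised as a string. */
    static String buildPersonXml(String name, String email) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element person = doc.createElement("person");
            doc.appendChild(person);

            // One line per data field instead of three.
            appendChild(doc, "name", name, person);
            appendChild(doc, "email", email, person);

            // A Transformer with no stylesheet copies the DOM tree to the output unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildPersonXml("Ada", "ada@example.com"));
    }
}
```

Wrapping the checked exceptions in an unchecked one is a choice made for this sketch only; real code may prefer to let them propagate.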

There are libraries out there that apparently take even more of the work out of using JAXP, but getting away from libraries was sort of the point of learning about the built-in tools, so I’ve not tried them out yet and I don’t think I will. As I use the built-in libraries more, I discover more features that I’ve always wanted to use but haven’t had access to. That more than makes up for a little extra typing and some crazy method names.

* With Java 5 the JAXP libraries were moved into the JDK rather than being part of J2EE. This meant that the JAXP chapters were dropped from the Java EE tutorial. As far as I am aware, they haven’t been included in the JDK tutorials yet. Correction: the JAXP tutorial is here.