Xerces is an XML library for several languages, but if a very common library in Java. 

I recently came across a problem with code intermittently throwing a NullPointerException inside the library:

[sourcecode lang=”text”]java.lang.NullPointerException
at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
at org.apache.xerces.dom.ParentNode.item(Unknown Source)
at com.example.xml.Element.getChildren(Element.java:377)
at com.example.xml.Element.newChildElementHelper(Element.java:229)
at com.example.xml.Element.newChildElement(Element.java:180)

[/sourcecode]You may also find the NullPointerException in ParentNode.nodeListGetLength() and other locations in ParentNode.

Debugging this was not helped by the fact that the xercesImpl.jar is stripped of line numbers, so I couldn’t find the exact issue. After some searching, it appeared that the issue was down to the fact that Xerces is not thread-safe. ParentNode caches iterations through the NodeList of children to speed up performance and stores them in the Node’s Document object. In multi-threaded applications, this can lead to race conditions and NullPointerExceptions.  And because it’s a threading issue, the problem is intermittent and hard to track down.

The solution is to synchronise your code on the DOM, and this means the Document object, everywhere you access the nodes. I’m not certain exactly which methods need to be protected, but I believe it needs to be at least any function that will iterate a NodeList. I would start by protecting every access and testing performance, and removing some if needed.

[sourcecode lang=”java”]/**
* Returns the concatenation of all the text in all child nodes
* of the current element.
*/
public String getText() {
StringBuilder result = new StringBuilder();

synchronized ( m_element.getOwnerDocument()) {
NodeList nl = m_element.getChildNodes();
for (int i = 0; i < nl.getLength(); i++) {
Node n = nl.item(i);

if (n != null && n.getNodeType() == org.w3c.dom.Node.TEXT_NODE) {
result.append(((CharacterData) n).getData());
}
}
}

return result.toString();
}[/sourcecode]Notice the “synchronized ( m_element.getOwnerDocument()) {}” block around the section that deals with the DOM. The NPE would normally be thrown on the nl.getLength() or nl.item() calls.

Since putting in the synchronized blocks, we’ve gone from having 78 NPEs between 2:30am and 3:00am, to having zero in the last 12 hours, so I think it’s safe to say, this has drastically reduced the problem. 

On off the biggest problems with developing servlets under a
container like Tomcat is the amount of time taken to build your code,
deploy it to the container and restart it to pick up any changes. Maven
and the Jetty plugin allow you to cut down on this cycle considerably.
The first step is to allow you to start your application in maven by
running:

mvn jetty:run

We do this by configuring the jetty plugin inside our
pom.xml:

<plugin>
   <groupId>org.mortbay.jetty</groupId>
   <artifactId>maven-jetty-plugin</artifactId>
   <version>6.1.10</version>
</plugin>

Now when you run mvn jetty:run your application will start
up. But we can improve on this. The Jetty plugin can be configured to
scan your project every so often and rebuild it and reload it if
anything changes. We do this by changing our pom.xml to read:

<plugin>
   <groupId>org.mortbay.jetty</groupId>
   <artifactId>maven-jetty-plugin</artifactId>
   <version>6.1.10</version>
   <configuration>
      <scanIntervalSeconds>10</scanIntervalSeconds>
   </configuration>
</plugin>

Now when you save a file in your IDE, by the time you’ve switched to
your web browser, Jetty is already running your updated code. Your
development cycle is almost up to the same speed as Perl or PHP.

You can find more information at the plugin page.