I’d like to announce the initial release of Eddie, a feed parser
library written in Java. It’s taken me over 100 hours, but it now correctly
parses 90% of the FeedParser unit tests, including all the RSS and Atom
tests. It’s GPLed, with an exception allowing you to use it in any open
sourced program. Get it at my website.
I still need to add documentation and character set and encoding support. I
also need to separate the testing infrastructure from the rest of the code.

This is the first time I’ve done any Java programming in anger, and I
have to say I’m surprised to discover I quite like it. In many ways it
seems a very quick language to program in. It seems almost like
programming in a scripting language, but more strongly typed. This is
probably due to not having to worry about memory management. Certainly I
don’t think I could have written this quite so quickly in C++.

Having said that, there are a couple of things that I don’t like about
Java. Everything is a pointer (strictly, every object variable is a
reference that can be null). This is useful at times, but it means
that every time you want to call a method on an object you have to test
whether it is null or you run the risk of getting the dreaded
NullPointerException. Java also doesn’t have keywords for
and, or and not. I know not everyone likes
these, but I keep finding myself trying to use them.

I’m sure there are other things I hated, but I can’t remember them
now. I think I’ll end up doing more Java programming in the future.

I’ve recently had cause to parse some date values in Java. As a
result I’ve produced a class which can manage to parse an awful lot of
date formats. I thought I’d better document it in case someone found it
useful. Certainly there doesn’t appear to be anything elsewhere which
shows you how to parse lots of formats. I have found the order of
date_formats to be very brittle, so I don’t recommend you
change it without an awful lot of test cases.

Anyway, without further ado, I present to you the Pathological
Date Parser for Java:

// Copyright 2006 David Pashley <david@davidpashley.com>
// Licensed under the GPL version 2
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

public class Date {
    private Calendar date;

    static String[] date_formats = {
            "yyyy-MM-dd'T'kk:mm:ss'Z'",        // ISO
            "yyyy-MM-dd'T'kk:mm:ssz",          // ISO
            "yyyy-MM-dd'T'kk:mm:ss",           // ISO
            "EEE, d MMM yy kk:mm:ss z",        // RFC822
            "EEE, d MMM yyyy kk:mm:ss z",      // RFC2822
            "EEE MMM  d kk:mm:ss zzz yyyy",    // ASC
            "EEE, dd MMMM yyyy kk:mm:ss",   //Disney Mon, 26 January 2004 16:31:00 ET
            "-yy-MM",
            "-yyMM",
            "yy-MM-dd",
            "yyyy-MM-dd",
            "yyyy-MM",
            "yyyy-D",
            "-yyMM",
            "yyyyMMdd",
            "yyMMdd",
            "yyyy",
            "yyD"

    };
    public Date(String d) {
        SimpleDateFormat formatter = new SimpleDateFormat();
        d = d.replaceAll("([-+]\\d\\d:\\d\\d)", "GMT$1"); // Correct W3C times
        d = d.replaceAll(" ([ACEMP])T$", " $1ST"); // Correct Disney timezones
        for (int i = 0; i < date_formats.length; i++) {
           try {
              formatter.applyPattern(date_formats[i]);
              formatter.parse(d);
              date = formatter.getCalendar();
              break;
           } catch(Exception e) {
              // Oh well. We tried
           }
        }

    }
}
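
To make the two clean-up steps at the top of the constructor easier to follow, here is a minimal sketch that isolates them. The class name NormalizeSketch and the method normalize are hypothetical, made up for this illustration; the two regular expressions are the ones from the constructor, with the backslashes doubled as a Java string literal requires.

```java
// Hypothetical helper, for illustration only: applies the same two
// substitutions as the constructor above and returns the result.
class NormalizeSketch {
    static String normalize(String d) {
        // "-05:00" becomes "GMT-05:00", which the 'z' pattern letter understands.
        d = d.replaceAll("([-+]\\d\\d:\\d\\d)", "GMT$1");
        // A trailing " ET", " PT" and so on becomes " EST", " PST" and so on.
        d = d.replaceAll(" ([ACEMP])T$", " $1ST");
        return d;
    }

    public static void main(String[] args) {
        System.out.println(normalize("2003-12-01T10:00:00-05:00"));
        System.out.println(normalize("Mon, 26 January 2004 16:31:00 ET"));
    }
}
```

The first call yields 2003-12-01T10:00:00GMT-05:00 and the second yields Mon, 26 January 2004 16:31:00 EST.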

The only date formats I can’t get it to parse are a four-digit year
followed by a hyphen and the day of the year, and a two-digit year
followed directly by the day of the year (e.g. 2003-335 and 03335 for
2003-12-01). If you can add support for those and other date formats
I’ll gladly take patches.
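
One possible approach to the day-of-year case, offered as a sketch rather than a tested patch to the class above: a lenient SimpleDateFormat will happily read 335 as a month, so an earlier pattern such as yyyy-MM can swallow the input first. Turning leniency off and insisting that the whole string is consumed lets the day-of-year pattern get its turn. The class name StrictDateSketch, the shortened pattern list and the choice of UTC are all assumptions made for this example.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch: strict parsing over a shortened pattern list.
class StrictDateSketch {
    static final String[] PATTERNS = { "yyyy-MM-dd", "yyyy-MM", "yyyy-DDD" };

    /** Returns the date in UTC, or null if no pattern matches the whole string. */
    static Calendar parse(String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        for (String pattern : PATTERNS) {
            SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ENGLISH);
            formatter.setTimeZone(utc);
            formatter.setLenient(false); // reject month 335 instead of rolling over
            ParsePosition pos = new ParsePosition(0);
            Date parsed = formatter.parse(text, pos);
            // Accept only if the pattern consumed the entire input.
            if (parsed != null && pos.getIndex() == text.length()) {
                Calendar result = Calendar.getInstance(utc);
                result.setTime(parsed);
                return result;
            }
        }
        return null;
    }
}
```

With this arrangement StrictDateSketch.parse("2003-335") falls through the first two patterns and is read by yyyy-DDD as 1 December 2003. Whether the same change is safe for the full pattern list above is untested here.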

Have you ever wanted to call a member function in your class, but not
known what it will be at compile time? I’m writing a SAX parser and
would like a function for every element name. I could write a massive
switch statement in the startElement function, but this will
quite quickly become unmanageable for a large schema. The alternative is
to look to see if a particular member function exists and call it.

To do this little bit of magic we need to use Java’s reflection API. The
first thing to do is to get a Class object for our class. We
can do that by calling:

Class<?> klass = this.getClass();

We can then look up the method we are looking for using
Class.getMethod, but this function requires an array of types
that the method we are looking for takes as parameters, so we get the
right version of an overloaded method. We can do this with:

Class<?>[] arguments = { int.class, String.class, URL.class };
Method method = klass.getMethod("foo", arguments);

Now that we have our method, we can call it using the
Method.invoke() call. This takes the target object as the first
parameter, for which we can pass this, and an array of
Objects for the parameters.

Object[] values = {bar, baz, quux};
method.invoke(this, values);
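
Putting the lookup and the call together, a self-contained sketch might look like the following. The class ReflectionSketch, its foo method and the argument values are invented for illustration; only getClass, getMethod and invoke are the real API.

```java
import java.lang.reflect.Method;

// Hypothetical class used only to demonstrate the lookup-then-invoke steps.
class ReflectionSketch {
    // The target method. It must be public for Class.getMethod to find it.
    public String foo(int count, String word) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(word);
        }
        return sb.toString();
    }

    String callFoo() {
        try {
            // Parameter types select the right overload of foo.
            Class<?>[] argTypes = { int.class, String.class };
            Method method = this.getClass().getMethod("foo", argTypes);
            // The int argument is boxed here and unboxed again by invoke.
            Object[] values = { 3, "ab" };
            return (String) method.invoke(this, values);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling new ReflectionSketch().callFoo() returns "ababab", the same as calling foo(3, "ab") directly.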

But what happens if our class has no member method called
foo()? Well, Class.getMethod() will throw a
NoSuchMethodException, so we can just wrap the code in a
try/catch block to deal with unhandled
functions. Note that Class.getMethod() only finds public methods, so
the methods you want to call this way must be public. It’s also worth
pointing out that Class.getMethod() can
throw SecurityException and Method.invoke() throws
IllegalAccessException, IllegalArgumentException and
InvocationTargetException, so you’ll want to catch
Exception too.
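
As an alternative to a blanket catch of Exception, the cases can be separated. The sketch below is one way to do it, under assumed names (DispatchSketch, handle_known, handle_broken, dispatch). It treats a missing method as the "unhandled" case and unwraps InvocationTargetException to get at whatever the called method itself threw.

```java
import java.lang.reflect.InvocationTargetException;

// Hypothetical dispatcher showing how the different failures can be told apart.
class DispatchSketch {
    public String handle_known() {
        return "handled";
    }

    public String handle_broken() {
        throw new IllegalStateException("boom");
    }

    String dispatch(String name) {
        try {
            return (String) getClass().getMethod("handle_" + name).invoke(this);
        } catch (NoSuchMethodException e) {
            // No method with that name: the expected "unhandled" case.
            return "unhandled: " + name;
        } catch (InvocationTargetException e) {
            // The method was found and called, but threw; the cause is its exception.
            return "failed: " + e.getCause().getMessage();
        } catch (IllegalAccessException e) {
            // The method exists but the caller is not allowed to invoke it.
            return "inaccessible: " + name;
        }
    }
}
```

Here dispatch("known") returns "handled", dispatch("missing") returns "unhandled: missing", and dispatch("broken") returns "failed: boom".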

We can chain some of these calls together and the result for my SAX
parser is:

public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
   try {
       Class[] argTypes = { String.class, String.class, String.class,
               Attributes.class };
       Object[] values = { uri, localName, qName, atts };
       this.getClass().getMethod("startElement_" + localName, argTypes)
               .invoke(this, values);
   } catch (NoSuchMethodException e) {
       log.debug("unhandled element " + localName);
   } catch (Exception e) {
       e.printStackTrace();
   }
}

With this arrangement, when I want to handle a new element in my code I
can just make a function like:

public void startElement_foo(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
   ...
}
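
For anyone wanting to try the idea end to end, here is a small runnable sketch built on the standard JAXP SAX parser. It is a variation on the code above, not the code from my parser: the class DispatchingHandler, the seen list and the run helper are assumptions for this example, and handler exceptions are re-thrown rather than printed. One thing to watch is that localName is only filled in when the parser is namespace aware, so the sketch switches that on.

```java
import java.io.StringReader;
import java.lang.reflect.InvocationTargetException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: dispatches each element to startElement_<name> if it exists.
class DispatchingHandler extends DefaultHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        try {
            getClass().getMethod("startElement_" + localName,
                    String.class, String.class, String.class, Attributes.class)
                    .invoke(this, uri, localName, qName, atts);
        } catch (NoSuchMethodException e) {
            seen.add("unhandled:" + localName);
        } catch (InvocationTargetException e) {
            // Pass on whatever the element handler threw.
            Throwable cause = e.getCause();
            if (cause instanceof SAXException) {
                throw (SAXException) cause;
            }
            throw new SAXException("handler for " + localName + " failed", e);
        } catch (IllegalAccessException e) {
            throw new SAXException(e);
        }
    }

    // Element handler for <foo>; must be public so getMethod can find it.
    public void startElement_foo(String uri, String localName, String qName, Attributes atts) {
        seen.add("foo:" + atts.getValue("id"));
    }

    static List<String> run(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // otherwise localName may be empty
            DispatchingHandler handler = new DispatchingHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Running DispatchingHandler.run on the document <root><foo id="7"/><bar/></root> records unhandled:root, foo:7 and unhandled:bar, in that order.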