Tue, 30 Jan 2007

Pension issues

In the continuing saga I like to call "Dave, what's wrong with your ex-employer today?", I'm still missing large chunks of my pension. I've finally had enough of my former employers not giving me any answers or claiming to be looking into it, as I've been asking since around last March. As per this guide, I've started making efforts to resolve the problem myself. The first step has been to formally request a copy of the pension scheme's dispute resolution procedure. I sent them an email last Wednesday and hand-delivered a letter last night, giving them until the 9th of February to supply me with the procedure. If they haven't by then, I'll contact The Pensions Advisory Service. I'm hoping it won't come to that, but given the lack of any progress in the past, I'm not holding out much hope.

[] | # Read Comments (3) |

Comments

Sun, 28 Jan 2007

Lazy Class Infrastructure

Do you ever feel you should implement equals(), hashCode() and toString(), but just can't be bothered to do it for every class? Well, if you aren't bothered by speed, you can use Jakarta Commons Lang to do it for you. Just add this to your class:

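import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.lang.builder.EqualsBuilder;
import org.apache.commons.lang.builder.HashCodeBuilder;

class Foo {
   // Hash the object's fields, discovered via reflection.
   @Override
   public int hashCode() {
      return HashCodeBuilder.reflectionHashCode(this);
   }
   // Compare this object with other, field by field, via reflection.
   @Override
   public boolean equals(Object other) {
      return EqualsBuilder.reflectionEquals(this, other);
   }
   // Build a string listing the fields and their values, via reflection.
   @Override
   public String toString() {
      return ToStringBuilder.reflectionToString(this);
   }
}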

And that's it. Your class will just do the right thing. As you can probably guess from the method names, it uses reflection, so it may be suboptimal. If you need performance, you can tell it to use particular members, but I think I'll leave that to a future article. I also recommend you don't use this technique if you are using something like Hibernate, which does things behind the scenes on member access; you may find it does undesirable things. :)
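To give a rough idea of what "uses reflection" means here, this is a minimal sketch of the technique using only java.lang.reflect. It is not Commons Lang's actual implementation; ReflectiveSketch and SketchPoint are hypothetical names made up for illustration, and the sketch only looks at fields declared directly on the class (no superclasses, no special handling of arrays).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical helper for illustration; not part of Commons Lang. */
final class ReflectiveSketch {
    private ReflectiveSketch() {}

    /** Skip static and compiler-generated fields; they are not object state. */
    private static boolean skip(Field f) {
        return Modifier.isStatic(f.getModifiers()) || f.isSynthetic();
    }

    /** True when both objects are of the same class and all declared fields are equal. */
    static boolean reflectionEquals(Object a, Object b) {
        if (a == b) return true;
        if (a == null || b == null || a.getClass() != b.getClass()) return false;
        try {
            for (Field f : a.getClass().getDeclaredFields()) {
                if (skip(f)) continue;
                f.setAccessible(true); // allow reading private fields
                Object x = f.get(a);
                Object y = f.get(b);
                if (x == null ? y != null : !x.equals(y)) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Combines the hash codes of all declared fields. */
    static int reflectionHashCode(Object o) {
        int result = 17;
        try {
            for (Field f : o.getClass().getDeclaredFields()) {
                if (skip(f)) continue;
                f.setAccessible(true);
                Object v = f.get(o);
                result = 37 * result + (v == null ? 0 : v.hashCode());
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }
}

/** Example class that delegates to the sketch above. */
final class SketchPoint {
    private final int x;
    private final String label;

    SketchPoint(int x, String label) {
        this.x = x;
        this.label = label;
    }

    @Override
    public boolean equals(Object other) {
        return ReflectiveSketch.reflectionEquals(this, other);
    }

    @Override
    public int hashCode() {
        return ReflectiveSketch.reflectionHashCode(this);
    }
}
```

Every call walks the field list and reads each value through a Field object, which is where the extra cost comes from compared with comparing the members by hand.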

[] | # Read Comments (0) |


Fri, 26 Jan 2007

Eddie 0.2 RSS and Atom Parser

I noticed today that Mark Pilgrim linked to Eddie, my liberal RSS and Atom parsing library for Java, so I figured I should make a new release. It's been a few months since I did any serious work on the parser, but in the last few days I've reduced the number of test case failures to fewer than 100 out of the 3502 test cases that come with Mark's Feedparser parser for Python. The majority of the failures are in the date parsing routines, or are due to bugs in the Jython library which cause literal dictionaries not to match classes inherited from PyDictionary.

Improvements in this version include:

  • Massively improved support for different character encodings. With Java 6, it also has support for UTF-32 feeds.
  • CDF Support.
  • Optional support of TagSoup for sanitizing of HTML in entries.
  • Improved support for different input sources including String, InputStream and byte[].
  • Numerous bug fixes, with 97% of test cases passing, up from 90%.

If you use Eddie, drop me an email. I'd like to thank Mark Pilgrim again for providing the community with a fantastic and comprehensive suite of test cases, extensive documentation and a first class Python library.

[] | # Read Comments (0) |


Thu, 25 Jan 2007

Speedy Java 6

I was quietly minding my own business, fixing some encoding bugs in Eddie, my liberal RSS and Atom parser, when I noticed that Java 6 includes support for UTF-32, which is one of the encodings whose tests were failing. I downloaded and installed the Ubuntu packages, and decided to run a quick benchmark using my unit tests.

First up was the Sun Java 5 JVM. I'd been running the unit tests all night, but this time I timed the run and got these results:

Ran 3502 tests
Passed 3322 tests
Failed 180 tests

real    1m10.293s
user    0m40.375s
sys     0m3.632s

Next I tried the Sun Java 6 JVM, using the same jar files, and got:

Ran 3502 tests
Passed 3326 tests
Failed 176 tests

real    0m56.059s
user    0m39.198s
sys     0m4.212s

One thing to note was that it spent a couple of seconds noticing new jars to read, so I decided to run it again and got:

Ran 3502 tests
Passed 3326 tests
Failed 176 tests

real    0m45.317s
user    0m34.770s
sys     0m3.516s

Wow, I'd gone from 70 seconds to 45 seconds using the new runtime and, interestingly enough, passed 4 more tests in the process. I'm assuming they are the UTF-32 tests, although I haven't checked yet. The other thing for me to try is recompiling the code to see if that has any additional benefits.

Update: I got around to checking what Java 6 fixed, and it turned out to be the additional support for the koi-u and cspc862latinhebrew encodings. After I fixed the UTF-32 support in Eddie, it passed an additional 16 tests, which brings the failures down to just 160 out of 3502. I just wish they would add support for some of the stranger encodings. Maybe this will happen when it's open source.
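If you want to check what your own JVM supports, a small sketch along these lines should do it. It only uses java.nio.charset.Charset from the standard library; CharsetProbe is a hypothetical name for illustration and has nothing to do with Eddie's own code. The encoding names in main are the ones mentioned above, and I make no promise about what any particular runtime reports for them.

```java
import java.nio.charset.Charset;

/** Hypothetical helper for illustration; not part of Eddie. */
final class CharsetProbe {
    private CharsetProbe() {}

    /** True if this JVM can handle the named encoding; false for unknown or illegal names. */
    static boolean supported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal charset names.
            return false;
        }
    }

    /** Encodes text with the named charset and decodes it again. */
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        byte[] encoded = text.getBytes(cs);
        return new String(encoded, cs);
    }

    public static void main(String[] args) {
        String[] names = {"UTF-32", "koi-u", "cspc862latinhebrew"};
        for (String name : names) {
            System.out.println(name + " supported: " + supported(name));
        }
    }
}
```

Running it under two different runtimes and comparing the output is a quick way to see which encoding tests a JVM upgrade might fix.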

[] | # Read Comments (1) |


Ogg Player Recommendations and Upcoming Gigs.

Dear Lazyweb,

I'm after recommendations for a portable music player that is small and can play Ogg Vorbis files. I'm not sure I need something with a large capacity; 2GB should be fine. I basically want something I can use in the gym, so I don't have to listen to the god-awful dance music they keep playing on MTV Dance. Reasonable audio quality would be a bonus. Last.fm support would be amazing. I would be tempted by something that could run Rockbox, but I suspect those players are going to be at the top end of the size scale.

Spent some time checking upcoming gigs and I've settled on And You Will Know Us By The Trail Of The Dead, The Killers, 65daysofstatic, and The Barfly's Great Escape mini-festival, 3 days of gigs over 20 venues. Trying to decide if I want to go to see Inspiral Carpets and Electric Six in the world's dirtiest club.

[] | # Read Comments (7) |


Tue, 09 Jan 2007

Reason 82973 why MySQL is a toy

MySQL cleverly maps

CREATE INDEX foo_bar ON Foo(Bar);

to

LOCK TABLE Foo WRITE;
CREATE TEMPORARY TABLE A-Foo ( .... INDEX foo_bar (Bar));
INSERT INTO A-Foo SELECT * FROM Foo;
ALTER TABLE Foo RENAME TO B-Foo;
ALTER TABLE A-Foo RENAME TO Foo;
DROP TABLE B-Foo;

If you have a very large table, expect this operation to take a) a lot of disk space, b) a very very long time and c) block any writes to the table in the process. I don't recommend adding indexes or altering any very large tables that are in production on MySQL, because you won't be in production for quite some time.

Update: Tom Haddon asked me if this applied to recent versions of MySQL or to PostgreSQL. Looking at the docs, it appears to still apply to 5.1:

  • http://dev.mysql.com/doc/refman/5.1/en/create-index.html
  • http://dev.mysql.com/doc/refman/5.1/en/alter-table.html

    In some cases, no temporary table is necessary:

    • If you use ALTER TABLE tbl_name RENAME TO new_tbl_name without any other options, MySQL simply renames any files that correspond to the table tbl_name. (You can also use the RENAME TABLE statement to rename tables. See Section 13.1.16, “RENAME TABLE Syntax”.)

    • ALTER TABLE ... ADD PARTITION creates no temporary table except for MySQL Cluster. ADD or DROP operations for RANGE or LIST partitions are immediate operations or nearly so. ADD or COALESCE operations for HASH or KEY partitions copy data between changed partitions; unless LINEAR HASH/KEY was used, this is much the same as creating a new table (although the operation is done partition by partition). REORGANIZE operations copy only changed partitions and do not touch unchanged ones.

    In other cases, MySQL creates a temporary table, even if the data wouldn't strictly need to be copied (such as when you change the name of a column).

  • http://dev.mysql.com/doc/refman/5.1/en/alter-table-problems.html

As far as PostgreSQL is concerned, the documentation doesn't mention anything about copying the table, but it does say that building an index requires a full sequential scan of the table, during which writes are blocked. You can use the CONCURRENTLY keyword to allow writes to continue. It does two scans and takes longer, but you can still use your database in the meantime.

http://www.postgresql.org/docs/8.2/interactive/sql-createindex.html
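For what it's worth, here is a minimal sketch of issuing the concurrent variant from Java over plain JDBC. IndexSketch and its methods are hypothetical names for illustration, the identifiers are assumed to be trusted (they are concatenated straight into the SQL), and the one PostgreSQL detail it relies on is that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, hence the auto-commit handling.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for illustration; identifiers must come from trusted code. */
final class IndexSketch {
    private IndexSketch() {}

    /** Builds the CREATE INDEX statement, optionally with CONCURRENTLY. */
    static String createIndexSql(String index, String table, String column, boolean concurrently) {
        return "CREATE INDEX " + (concurrently ? "CONCURRENTLY " : "")
                + index + " ON " + table + "(" + column + ")";
    }

    /** Runs the concurrent index build on an open PostgreSQL connection. */
    static void createIndexConcurrently(Connection conn, String index, String table, String column)
            throws SQLException {
        boolean previous = conn.getAutoCommit();
        // CONCURRENTLY is rejected inside a transaction block, so run in auto-commit mode.
        // Note: switching auto-commit on commits any transaction already in progress.
        conn.setAutoCommit(true);
        Statement st = conn.createStatement();
        try {
            st.execute(createIndexSql(index, table, column, true));
        } finally {
            st.close();
            conn.setAutoCommit(previous);
        }
    }
}
```

With the example from the top of this post, createIndexSql("foo_bar", "Foo", "Bar", true) produces CREATE INDEX CONCURRENTLY foo_bar ON Foo(Bar).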

[database,MySQL,gotchas] | # Read Comments (4) |


Wed, 03 Jan 2007

UK Software Patent Petition

There is currently a petition on the Prime Minister's website calling for a clear ban on software patents. I was hesitant to sign it, not because I want software patents, but because of the language of the petition.

    Software patents are used by convicted monopolists to threaten customers who consider using rival software. As a result, patents stifle innovation.

    Patents are supposed to increase the rate of innovation by publicising how inventions work. Reading a software patent gives no useful information for creating or improving software. All patents are writen in a sufficiently cryptic language to prevent them from being of any use. Once decoded, the patents turn out to be for something so obvious that programmers find them laughable.

    It is not funny because the cost of defending against nuicance lawsuites is huge.

    The UK patent office grants software patents against the letter and the spirit of the law. They do this by pretending that there is a difference between software and 'computer implemented inventions'.

    Some companies waste money on 'defensive patents'. These have no value against pure litigation companies and do not counter threats made directly to customers.

The aggressive, ad hominem language doesn't do anything to help the cause. It looks unprofessional, it will lead the authorities to dismiss the petition as a fanatical, incoherent rant, and it will put people off signing. I'd be interested to know how many people didn't sign because of the text.

[patents,software patents,politics] | # Read Comments (2) |
