Thu, 24 Jan 2008

Child-friendly pasting in vim

If you've got various indenting and text-wrapping options turned on in vim, pasting text into the editor mangles it. You can get around this by turning paste mode on with :set paste and off again with :set nopaste. To make things a little easier, you can use the following snippet in your .vimrc to toggle paste mode with a single keypress:

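" Normal mode: toggle paste and show the new state
nnoremap <F4> :set invpaste paste?<CR>
" Insert mode: toggle paste without leaving insert mode
inoremap <F4> <C-O>:set invpaste<CR>
" Mappings are disabled while paste is on, so this is what turns it off again
set pastetoggle=<F4>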

(Warning: my vim settings have grown organically over the last 10 years, so they may not be the best or most modern way of achieving an effect.)

[vim,gotchas,tips] | # Read Comments (4) |

Wed, 25 Jul 2007

SIP/Desktop Integration

Dear Lazyweb,

I'm possibly asking for the moon on a stick here, but in the office we have VoIP phones, which talk to our Asterisk server. Unfortunately, the ringtone on them is incredibly quiet, and as I tend to listen to music I don't notice either the ring or the small green flashing light when a call comes in.

The question, then: does anyone know of a program which will talk SIP to the Asterisk server, notice when a call comes in, turn my music down and display a notification?

[sip,lazyweb] | # Read Comments (8) |

Thu, 15 Feb 2007

Old Style Firefox Tabs

Firefox 2 is an improvement on previous versions, but one thing that annoys me is the new tab style. I don't like having a close button on each tab and I don't like it hiding tabs once you have a certain number open. Fortunately you can fix this. Go to about:config in the URL bar, then set browser.tabs.closeButtons to 3 and browser.tabs.tabMinWidth to 0. You should now have a single close button on the right and all tabs displayed.

[firefox,tabs,gotchas] | # Read Comments (4) |

Tue, 06 Feb 2007

This is a multi-part message in MIME format.

Content-Type: multipart/alternative;
        boundary="----=25532899_4522_4927_1140_664401643181"

This is a multi-part message in MIME format.
------=25532899_4522_4927_1140_664401643181
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

This message is in MIME format. Since your mail reader does not
understand =
this format, some or all of this message may not be legible.
------=25532899_4522_4927_1140_664401643181
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

No, my mail reader understands it perfectly. It's your crappy mail client that sends out multipart/alternative mails which don't contain alternatives. Fuckers.

[MIME,wtf] | # Read Comments (0) |

Wed, 03 Jan 2007

UK Software Patent Petition

There is currently a petition on the Prime Minister's website calling for a clear ban on software patents. I was hesitant to sign it, not because I want software patents, but because of the language of the petition:

Software patents are used by convicted monopolists to threaten customers who consider using rival software. As a result, patents stifle innovation.

Patents are supposed to increase the rate of innovation by publicising how inventions work. Reading a software patent gives no useful information for creating or improving software. All patents are written in a sufficiently cryptic language to prevent them from being of any use. Once decoded, the patents turn out to be for something so obvious that programmers find them laughable.

It is not funny because the cost of defending against nuisance lawsuits is huge.

The UK patent office grants software patents against the letter and the spirit of the law. They do this by pretending that there is a difference between software and 'computer implemented inventions'.

Some companies waste money on 'defensive patents'. These have no value against pure litigation companies and do not counter threats made directly to customers.

The aggressive and ad-hominem language doesn't do anything to help the cause. It looks unprofessional, will result in the authorities ignoring it as a fanatical, incoherent rant and will put people off signing the petition. I'd be interested to know how many people didn't sign because of the text.

[patents,software patents,politics] | # Read Comments (2) |

Fri, 29 Dec 2006

Network Troubleshooting Article

It's been a while since I last wrote an article, but I've just published a new one on troubleshooting networking. It's based on a couple of emails I sent to a mailing list helping someone with their networking problem. I decided to clean it up a bit and publish it. It is mostly for finding where the problem lies, rather than fixing the issue, but I would be grateful for any comments you have, particularly if you think I've missed any obvious steps.

[article,network troubleshooting,linux] | # Read Comments (2) |

Thu, 23 Nov 2006

Yum Bitching

It can be quite discouraging to type "yum update" and have yum simply go off forever. Among other things, one must wait a great long time to distinguish this behavior from yum's normal mode of operation. Other times, it comes back very quickly with a message saying, for all practical purposes, "RPM crashed, you lose, sorry."

via LWN.net (subscription required)

Jonathan Corbet normally manages to amuse me on a weekly basis, but this time he's outdone himself. Consider my LWN.net subscription renewed for another year.

Update: I've been convinced to include a subscriber link to the article for those without LWN.net subscriptions. http://lwn.net/SubscriberLink/210373/8badbe4f9c463fb8/

[lwn.net,RPM,yum,wtf,Jonathan Corbet] | # Read Comments (4) |

Wed, 26 Jul 2006

Use Atom

Andreas, use Atom rather than RSS. It has an <updated/> element for the last time the entry was updated and a <published/> element for the date the entry was published.

If you want to stick with RSS, you can use the <dcterms:issued/> element for the initial published date and one of <pubDate/>, <dc:date/> or <dcterms:modified/> for the date of the last update. Don't forget to declare the XML namespaces for dc and dcterms.
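As a rough illustration of the two Atom dates, here is a minimal Python sketch that builds an entry with the standard library. The function name, id, title and timestamps are made up for the example; only the element names and the Atom namespace come from the format itself.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_entry(entry_id, title, published, updated):
    """Return a minimal Atom <entry> carrying both date elements.

    All values are supplied by the caller; dates are RFC 3339 strings.
    """
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for tag, text in (
        ("id", entry_id),
        ("title", title),
        ("published", published),  # when the entry first appeared
        ("updated", updated),      # when it was last changed
    ):
        ET.SubElement(entry, f"{{{ATOM_NS}}}{tag}").text = text
    return entry


# Hypothetical values, purely for illustration.
entry = build_entry(
    "tag:example.com,2006:entry-1",
    "Use Atom",
    "2006-07-26T09:00:00Z",
    "2006-07-27T18:30:00Z",
)
print(ET.tostring(entry, encoding="unicode"))
```

An aggregator can then sort on <published/> and still notice edits through <updated/>.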

[Atom,RSS] | # Read Comments (0) |

Mon, 24 Jul 2006

Strict feed parsers are useless

Erich, I'm not entirely sure what you did to break Planet, but using a strict feed parser will just result in you missing a significant number of entries. Sadly, people don't produce valid feeds, and they will blame your software rather than their feeds. It doesn't help that a number of validators aren't entirely strict and that RSS doesn't have a very comprehensive spec. RSS is a lot worse than Atom in this respect, in part thanks to the Atom validator and Atom's very well thought out spec. It's for this reason that I ended up writing Eddie rather than using ROME: ROME was a DOM-based parser and simply failed to get any information out of a feed that wasn't well-formed. Eddie, on the other hand, is a SAX-based parser. In a recent comparison, an Eddie-based aggregator managed to correctly parse several more entries than a ROME-based aggregator on one particular day.

You also have major aggregators being liberal. Sam Ruby discussed this recently, with Bloglines becoming the de facto validator: if Bloglines parses it, then it's valid. We had the same problem with HTML, with people making sure their pages worked in a browser rather than met the spec.

I suspect the problem you had with Planet is that you failed to close a tag, causing the rest of the page to be in bold or be a link etc. This is fairly easily solvable and in fact has been with FeedParser, which is the feed parsing library Planet uses. It has support for using HTMLTidy and similar libraries for fixing unbalanced elements. Eddie uses TagSoup to do a similar thing. As a result I've not noticed any particular entry leaking markup and breaking the page. Perhaps Planet Debian just needs to install one of the markup cleaning libraries.
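To make the difference concrete, here is a toy Python sketch. It is not how FeedParser, HTMLTidy or TagSoup work internally; it is a simplified stand-in showing that a strict XML parser rejects an unclosed tag outright, while a forgiving pass can close the dangling elements so the markup doesn't leak into the rest of the page. The class and function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagBalancer(HTMLParser):
    """Re-emit markup, closing any elements that were left open."""

    VOID = {"br", "hr", "img"}  # elements that never take a closing tag

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in self.VOID:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no matching start: drop it
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def balanced(self):
        closers = [f"</{tag}>" for tag in reversed(self.open_tags)]
        return "".join(self.out + closers)


def balance(fragment):
    parser = TagBalancer()
    parser.feed(fragment)
    parser.close()
    return parser.balanced()


broken = "<p>An entry that forgets to close <b>bold"

try:
    ET.fromstring(broken)
    strict_ok = True
except ET.ParseError:
    strict_ok = False  # the strict parser gives up on the whole entry

print(strict_ok)
print(balance(broken))
```

The strict parse loses the entry entirely, whereas the balanced output can be embedded in a page without turning everything after it bold.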

I agree that people should use XML tools where possible. Unfortunately, most blogging tools use text-based templating systems, which make it too easy to produce XML that isn't well-formed. To deal with this I pass all my output through an XSLT filter, which means that everything is either well-formed or isn't output at all. Unfortunately I don't think everyone would be capable of using XSLT, or willing to.

[RSS,Atom,feed parser,Eddie,Rome,SAX,XSLT] | # Read Comments (3) |

Wed, 19 Jul 2006

I feel dirty

I've just installed a bunch of RPM packages that were built on CentOS, targeting Red Hat-like Linux distributions, onto a Solaris server. Even scarier, it worked.

I feel dirty.

[Solaris,RPM] | # Read Comments (0) |

Killall sshd considered stupid

I was using our Fedora 3 server and decided to restart sshd running in a chroot:

[root@winky /]# /etc/init.d/sshd restart
Stopping sshd:/etc/init.d/sshd: line 212: kill: (1483) - No such process
Connection to winky closed by remote host.
Connection to winky closed.
mimsy david% ssh winky -l root
ssh: connect to host winky port 22: Connection refused

Thank you very much Fedora.

[sshd,ssh,Fedora,wtf] | # Read Comments (0) |

Tue, 27 Jun 2006

How Not To Implement Spam Filtering (and web forms)

A "friend" of mine has recently been sending me forwarded jokes and other assorted crap we all grew out of sending about 5 minutes after we learnt how to send emails. This I can cope with, but recently, for every email she sends, I've been receiving an automated email from a server somewhere telling me that it's blocked an attachment, once for each attachment in the original email. That's crime number one. Looking at it further, it appears that it's not the sender's mail servers doing it, but one of the recipients' mail servers. When it blocks an attachment in an email, it emails everyone in the To: header to tell them, irrespective of whether they are local users or not. That's crime number two.

I thought I'd email postmaster@capita.co.uk to tell them of this problem, but unsurprisingly it bounced. That's crime number three. Looking on their website, I couldn't find any technical contacts, which wasn't really surprising. I did, however, find a "general enquiries" form, so I filled that in. Unfortunately, they used the following HTML for the message box:

<textarea name="Feedback1:fldEnquiry" rows="6" cols="1"
   id="Feedback1_fldEnquiry" class="enquiryTable"></textarea>

The result is that you get a text box 6 rows high and one column across, which is basically unusable. Interestingly they appear to add style="width: 350px;" in IE, which makes it work. I'll make that crimes 4 and 5, cos doing different things for different browsers is a crime in itself.

I await a phone call or email from them.

[rfc2821,wtf] | # Read Comments (1) |

Wed, 31 May 2006

Init(1) Causing zombies

Had an interesting problem today with one of our servers at work. The first thing I noticed, yesterday, was that an upgrade of Apache2 didn't complete properly because /etc/init.d/apache2 stop didn't return. Killing it and starting apache allowed the upgrade to finish. I noticed there was a zombie process but didn't think too much of it.

Then this morning I got an email from the MD saying that various internal service websites were down (webmail, wiki etc). My manager noticed that it was due to logrotate hanging, again because restarting apache had hung. Looking at the server I noticed a few more zombie processes. One thing I'd noticed was that all these processes had been reparented under init, and a quick web search later confirmed that init(1) should be reaping these processes. I thought maybe restarting init would clear the zombies. I tried running telinit q to reread the config file, but that returned an error message about timing out on the /dev/initctl named pipe. I checked that the file existed and everything looked fine. The next thing I checked was the other end of the named pipe by running lsof -p 1. This showed that init had /dev/console rather than /dev/initctl as fd 0. I tried running kill -SIGHUP 1, but that didn't do anything. Then I tried kill -SIGUSR1 1, but that didn't do anything either. I checked the console, but there wasn't enough scrollback to see the system booting, so I decided to schedule a reboot for this evening.

Rebooting the server presented me with an interesting challenge. Normally the shutdown command uses /dev/initctl to tell init to change to runlevel 0 or 6 to shut down or reboot. Of course init wasn't listening on this file, so that was out. Sending it a SIGINT signal (the same signal init gets on ctrl-alt-delete) got no response. Obviously telinit 0 wasn't going to work either. I decided to start shutting services down manually with the help of Brett Parker. The idea was to stop all non-essential services, unexport the NFS exports, remount the disks read-only and then use either sysrq or a hardware reset. Unfortunately someone accidentally ran /etc/init.d/halt stop, hanging the server, but he is suffering from a bad cold today so I forgive him. The server restarted without a hitch (thank god for ext3) and running lsof -p 1 showed init having /dev/initctl open. I don't know what happened to init at the last reboot on Monday, but a reboot seemed to fix it. Odd bug, but thankfully it was a nice simple fix. I could have spent the evening debugging init. :)
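For anyone unfamiliar with what "reaping" means here, the following Python sketch shows the mechanism on a small scale. It assumes Linux (it reads /proc) and stands in for init with an ordinary parent process: the child exits, stays around as a zombie, and only disappears once its parent collects the exit status. The helper names are invented for the example.

```python
import os
import time


def child_state(pid):
    """Return the one-letter process state from /proc (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        # The state is the first field after the parenthesised command name.
        return stat_file.read().rsplit(")", 1)[1].split()[0]


def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit straight away

    # Wait (up to five seconds) for the child to exit and become a zombie.
    deadline = time.monotonic() + 5
    while child_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)

    state_before = child_state(pid)      # "Z": exited but not yet reaped
    reaped_pid, _status = os.waitpid(pid, 0)  # the parent reaps the child
    return state_before, reaped_pid == pid


state_before, reaped = demo()
print(state_before, reaped)
```

When a parent dies without doing this, its children are handed to init, which is expected to make the same wait call on their behalf. An init that has stopped doing so leaves the zombies in place, as seen above.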

[gotchas,init] | # Read Comments (2) |

Laptop Harddisk Failure

I'm currently mourning the loss of my laptop's hard disk. I don't think there were massive amounts of data on there that I needed, but it's still upsetting. Looks like I'll have to buy a new 2.5" drive. At least it gives me a reason to reinstall Debian.

Update: The bad news is that it is actually a 1.8" drive, which means I need to spend 93GBP including tax and postage for a new drive. Should get it by Friday. The good news is that having left the laptop off during the day, I managed to get the laptop booting and am now rsyncing the data off as fast as I can. Shame about the money, but at least I haven't lost much in the way of data.

[hardware failure] | # Read Comments (0) |

Fri, 26 May 2006

Makes you look big *and* clever

To the anonymous Texan that thought it would be a good idea to point out a Perl module in response to my date parsing class in Java:

I'm well aware of date parsing modules in other languages. In fact I've used them for similar tasks. But the post wasn't asking "How do I parse dates in Perl?" It was giving some code that some people might find useful.

Commenting with "man Date::Parse" doesn't make you look clever; it just makes you look like a twat and the kind of person people are reluctant to invite to parties. You may be a geek and you may know stuff, but that doesn't mean you have to try to look clever, because you will invariably fail.

</rant>

Oh and Erich, I've got some more date formats I want to add support for, so I'll add German dates and post the updated code. :)

[] | # Read Comments (0) |

Wed, 10 May 2006

Oracle User Sessions

Ever wanted to know who was logged into your Oracle server and where from? This SQL will show you the connected username, the machine and program they are connecting from, and the time they connected.

-- One row per user session; background sessions are filtered out
SELECT s.username, s.machine, s.program, s.logon_time
   FROM v$session s, v$process p, sys.v_$sess_io si
   WHERE s.paddr = p.addr(+) AND si.sid(+) = s.sid
   AND s.type='USER';

[Oracle] | # Read Comments (0) |

Thu, 04 May 2006

Gnome Terminal and Character Encodings

Since upgrading to GNOME 2.14, I have been revisited by an annoying problem with gnome-terminal. By default, gnome-terminal uses the character encoding of your locale, which unfortunately was being detected as ANSI_X3.4-1968 even though I had $LANG set to en_GB.UTF-8 in my ~/.bash_profile. The setting wasn't being picked up because nothing between logging in and starting gnome-terminal reads that file, so gnome-terminal thought the locale was C.

The result was a corrupt display when programs attempted to display Unicode characters. I could fix it by changing the character encoding using the menu, but I'd have to do this for every tab, which quickly becomes annoying. Time to find a fix.

Turns out that you need to tell gdm to set the right locale, which you can do by configuring ~/.dmrc. Mine now looks like:

[Desktop]
Session=gnome
Language=en_GB.UTF-8

Obviously, the important part is the Language line. You need to set it to a locale that exists on your system, which you can find using locale -a. Once you've set that and logged in again, everything should be working correctly.

[gnome-terminal,encodings,gotchas] | # Read Comments (1) |

Fri, 28 Apr 2006

Dealing with blog spam

Julien, the majority of comment spam can be dealt with very simply by including a Turing test. On my blog, when I first started getting comment spam, I added a check box asking if the poster was a human. For a human, it's not a massive inconvenience to tick a box, but for an automated tool, it's a major problem. It was implemented in 3 lines of HTML and one line of Python. Since I added it, I haven't received a single piece of spam.

I don't believe it's had a major effect on people commenting, although I currently can't tell. I could change it to hide posts that claim to be non-human until I've checked them. If spam tools work out this simple problem, I could change the nature of the test to randomly change between "I am a human" and "I am not a human". After that I could include a simple sum or some other simple question. It also has the advantage over CAPTCHAs that it is accessible.

It is a simple change which massively reduces spam by increasing the cost of spamming and I'm surprised that most people don't do something similar.
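The check itself can be very small. The blog's real code isn't shown in the post, so the following is only a guess at its shape in Python: the field name "human" and the function name are hypothetical, and the form body is assumed to arrive URL-encoded.

```python
from urllib.parse import parse_qs

HUMAN_FIELD = "human"  # hypothetical checkbox name, not the blog's real one


def looks_human(form_body):
    """Return True if the submitted form has the "I am a human" box ticked.

    Browsers only send a checkbox field when it is ticked, so a missing
    field means the box was left alone.
    """
    fields = parse_qs(form_body)
    return HUMAN_FIELD in fields


print(looks_human("name=Ann&comment=Nice+post&human=on"))
print(looks_human("name=bot&comment=Buy+pills"))
```

Switching to the randomised "I am not a human" variant would mean storing which wording was shown and inverting the check to match.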

[blog,spam,turing test,comments] | # Read Comments (7) |

Sun, 23 Apr 2006

Installing Oracle XE on Debian

I've spent the weekend playing around with the new Oracle XE Debian packages in preparation for having to use them at work in the near future. I've written up my experiences of setting up the server and connecting remote clients in my latest article.

Talking of work, we have a position for a junior support role open. If you live in or around Brighton, England, know a little bit about Linux, Debian, Tomcat, Java, PostgreSQL and Oracle and are willing to learn more, have a look at the job description and get in contact.

[Oracle,Oracle XE,Debian] | # Read Comments (0) |

Sat, 22 Apr 2006

Empty elements in XSLT

I recently wanted to deal with DocBook <ulink> elements that didn't have any contents by displaying the URL as the link text. I wanted to convert:

<ulink url="http://www.example.com">Example.com</ulink>
<ulink url="http://www.example.com"/>

to

<a href="http://www.example.com">Example.com</a>
<a href="http://www.example.com">http://www.example.com</a>

I originally had:

<xsl:template match="ulink">
   <xsl:element name="a">
      <xsl:attribute name="href"><xsl:value-of select="@url"/></xsl:attribute>
      <xsl:apply-templates/>
   </xsl:element>
</xsl:template>

This successfully dealt with the first form of <ulink>, which has content, but not with the second, empty form, which produced a link with no text. The solution is to use an <xsl:choose> element with a test to see if the current node has any child nodes. Using child::node() we can get any child nodes. We can then test if the node has any children using the count() function. The resulting XSLT is:

<xsl:template match="ulink">
   <xsl:element name="a">
      <xsl:attribute name="href"><xsl:value-of select="@url"/></xsl:attribute>
      <xsl:choose>
         <xsl:when test="count(child::node())">
            <xsl:apply-templates/>
         </xsl:when>
         <xsl:otherwise>
            <xsl:value-of select="@url"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:element>
</xsl:template>
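For comparison, the same "has children or not" decision can be sketched in Python with ElementTree. This is only an illustration of the logic, not a replacement for the stylesheet; the function name is made up, and ElementTree's text and child-element checks stand in for count(child::node()).

```python
import xml.etree.ElementTree as ET


def ulink_to_anchor(ulink):
    """Convert a DocBook <ulink> element into an HTML <a> element."""
    url = ulink.get("url")
    anchor = ET.Element("a", href=url)
    has_children = bool(ulink.text) or len(ulink) > 0
    if has_children:
        # Keep the existing link text and any nested elements.
        anchor.text = ulink.text
        anchor.extend(list(ulink))
    else:
        # Empty element: fall back to the URL as the link text.
        anchor.text = url
    return anchor


for source in (
    '<ulink url="http://www.example.com">Example.com</ulink>',
    '<ulink url="http://www.example.com"/>',
):
    anchor = ulink_to_anchor(ET.fromstring(source))
    print(ET.tostring(anchor, encoding="unicode"))
```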

[XSLT,DocBook] | # Read Comments (0) |

Mon, 10 Apr 2006

Creating a Certificate Authority

Sometimes you need to generate several SSL certificates, but don't want to pay money to a Trusted Root, and self-signed certificates just won't cut it. If you've ever had this dilemma, just for you, here's an article describing how to set up your own trusted root certificate and how to import it into several common applications. If you want me to add your favourite application, feel free to email me with instructions and screenshots if appropriate.

[article,SSL,certificate authority,root ca,OpenSSL] | # Read Comments (2) |

Sun, 19 Mar 2006

PostgreSQL User Administration

Over the weekend, I managed to revive my main desktop machine, which has spent the last 15 months turned off under a desk because it was showing some odd behaviour and I didn't have time to fix it. I've upgraded it to the latest sid and given it to my girlfriend to use instead of the Windows machine she had been using. She appears to have fallen in love with tuxpaint. :)

In the process of setting it up I discovered printconf, which automatically sets up parallel and USB printers under CUPS. Plugged in my printer, went to print in Firefox and there was the printer. These things just get easier and easier. Gone are the days when you spent hours writing a printcap entry for your printer. One thing I would like is D-Bus support in CUPS so I know when the print job has finished.

Just finished writing an article on PostgreSQL user administration. Go read it.

[article,PostgreSQL,user administration,database,tuxpaint,printconf] | # Read Comments (2) |

Tue, 14 Mar 2006

(In)sane

Opening XSane results in:

[sane,user-friendly,UI,HCI] | # Read Comments (3) |

Mon, 13 Mar 2006

LDAP Basics

Been stuck at home ill all day, so I took the opportunity to type up an article on LDAP basics, which is hopefully an easy to understand introduction to LDAP. Given the complicated subject matter, I probably failed in a couple of places. If you find something you don't understand, I'd love to know so I can rewrite that section to make it clearer.

I also updated my robust shell scripting article to include a small section on (almost) race-free locking in bash, using IO redirection and bash's noclobber option. Thanks to Ralf Wildenhues for the suggestion.
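The article itself covers the bash version. As a rough parallel only, the same idea in Python is to ask the kernel to create the lock file exclusively, so that "check it is absent" and "create it" happen as one step. The function names and the lock path below are invented for the example.

```python
import os
import tempfile


def try_lock(path):
    """Try to create the lock file; return True only if we created it.

    O_CREAT | O_EXCL makes the existence check and the creation a single
    atomic step, much as noclobber does for a shell redirection.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock:
        lock.write(f"{os.getpid()}\n")  # record the owner for debugging
    return True


def unlock(path):
    os.remove(path)


lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")  # hypothetical path
first = try_lock(lock_path)    # nobody holds the lock yet
second = try_lock(lock_path)   # already held, so this attempt fails
unlock(lock_path)
third = try_lock(lock_path)    # free again after unlocking
unlock(lock_path)
print(first, second, third)
```

The "almost" still applies: a process that dies without unlocking leaves a stale lock file behind.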

[article,LDAP,bash] | # Read Comments (0) |

Thu, 09 Mar 2006

Setting your terminal title in bash

I didn't really want to write three articles about bash in a row, but after my last article about Bash prompts Ralph Aichinger emailed me asking about a feature he had in zsh, where his xterms show him the current process and whether it was possible to do that in Bash. Never one to refuse a challenge, I had a go and my latest article is the result.

[article,bash,shell,terminal,prompt,xterm title] | # Read Comments (3) |

Wed, 08 Mar 2006

Bash Prompt

I managed to add two more sections to my article on writing robust shell scripts, covering using trap and making changes more atomic. I had some useful feedback, including the point that a few small tweaks would make it apply to more than just bash. Following on from that, I've added an article about changing your bash prompt and how mine has been built up over the years into something useful to me. Hopefully it'll give other people some ideas.

[article,shell,bash,prompts,programming] | # Read Comments (4) |

Privoxy Crack

When you develop an anonymising, ad-removing, popup-blocking proxy server, don't do what Privoxy appears to do, which is s/open\((.*)\)/privoxyWindowOpen(\1)/g, because then you'll rewrite every instance of "open()" in the page text as well, confusing people.

Update: Bug reported, but nothing has happened to it since 2004-06-25.
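To see why that substitution is a problem, here is the same pattern applied in Python to two made-up snippets. I don't know how Privoxy's filter is really implemented, so this only reproduces the regular expression quoted above.

```python
import re

# The substitution quoted above, written in Python's regex syntax.
NAIVE = re.compile(r"open\((.*)\)")


def rewrite(page):
    return NAIVE.sub(r"privoxyWindowOpen(\1)", page)


# Hypothetical page fragments: one script call, one piece of plain prose.
script = "<script>window.open('ad.html')</script>"
prose = "<p>Always check the return value of open() in Perl.</p>"

print(rewrite(script))
print(rewrite(prose))
```

The script call is rewritten as intended, but so is the prose, which now tells readers about a function that doesn't exist.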

[privoxy,wtf] | # Read Comments (2) |

Perl open()

Dear perl programmers,

When using open(), please don't use:

open FILE, "$file" or die "couldn't open file\n";

It really helps if you tell us what file you're trying to open and what went wrong. The correct error message is:

open FILE, "$file" or die "could not open $file: $!\n";

Thank you.

[perl,wtf] | # Read Comments (9) |

Fri, 03 Mar 2006

Writing Robust Shell Scripts Article

I've just finished writing an article on tips for making your shell scripts more robust. Comments welcome.

[article,bash,shell] | # Read Comments (6) |

Wed, 01 Mar 2006

IO::File->open() is broken

So you read the documentation for IO::File and see:

open( FILENAME [,MODE [,PERMS]] )
open( FILENAME, IOLAYERS )

so you write:

my $rules = new IO::File('debian/rules','w', 0755);

and wonder why it hasn't changed the permissions from 0666. Stracing confirms it is opened 0666:

open("./debian/rules", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4

A bit of further reading of the documentation and you discover:

If IO::File::open receives a Perl mode string (">", "+<", etc.) or an ANSI C fopen() mode string ("w", "r+", etc.), it uses the basic Perl open operator (but protects any special characters).

If IO::File::open is given a numeric mode, it passes that mode and the optional permissions value to the Perl sysopen operator. The permissions default to 0666.

If IO::File::open is given a mode that includes the : character, it passes all the three arguments to the three-argument open operator.

For convenience, IO::File exports the O_XXX constants from the Fcntl module, if this module is available.

and the correct way to write this is

my $rules = new IO::File('debian/rules',O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0755);

Thank you very much perl for ignoring the permission parameter when you feel like it.
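For comparison, Python draws a similar line between its two open calls, although it doesn't silently accept an argument it then ignores. This sketch, which is not part of the original investigation, pins the umask and uses throwaway file names so the resulting modes are predictable on a POSIX system.

```python
import os
import stat
import tempfile

workdir = tempfile.mkdtemp()
old_umask = os.umask(0o022)  # pin the umask so the result is predictable

# open() with a string mode has no permissions argument at all.
plain_path = os.path.join(workdir, "plain")
with open(plain_path, "w"):
    pass

# os.open() takes numeric flags plus a permissions argument, like sysopen.
explicit_path = os.path.join(workdir, "explicit")
fd = os.open(explicit_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o755)
os.close(fd)

os.umask(old_umask)  # restore the previous umask

plain_mode = stat.S_IMODE(os.stat(plain_path).st_mode)
explicit_mode = stat.S_IMODE(os.stat(explicit_path).st_mode)
print(oct(plain_mode), oct(explicit_mode))
```

In both languages the permissions only take effect on the numeric-flags path, and they are still filtered through the umask.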

Update: Steinar, yes sorry, I did have 0755 rather than "0755" originally, but changed it just to check that didn't make a difference and copied the wrong version. I've changed the post to have the right thing.

% strace -eopen perl -MIO::File \
   -e 'my $rules = new IO::File("foo","w", 0755);' 2>&1 | grep foo
open("./foo", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
% strace -eopen perl -MIO::File \
   -e 'my $rules = new IO::File("foo","w", "0755");' 2>&1 | grep foo
open("./foo", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3

Incidentally, 0755 and "0755" are different:

% perl -e 'printf("%d %d\n", 0755, "0755");'
493 755
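The same distinction between an octal literal and a string of digits shows up in most languages. As a small Python illustration (Python spells the literal 0o755):

```python
octal_literal = 0o755          # Python's spelling of Perl's 0755 literal
from_string = int("0755")      # a decimal parse ignores the leading zero
parsed_octal = int("0755", 8)  # ask for base 8 explicitly

print(octal_literal, from_string, parsed_octal)
```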

[perl,IO::File,gotchas] | # Read Comments (1) |

Tue, 07 Feb 2006

Perl Style and readability

Which of these two fragments is more readable?

$self->{catalina_base} = $ENV{'CATALINA_BASE'};
if (!defined $self->{catalina_base}) {
    $self->{catalina_base} = $self->getTomcatHome() ;
}
if (!defined $self->{catalina_base}) {
    CCM::Util::error ("CATALINA_BASE unset and TOMCAT_HOME undefined", 3);
}

or

$self->{catalina_base} = $ENV{'CATALINA_BASE'} || $self->getTomcatHome()
   || CCM::Util::error ("CATALINA_BASE unset and TOMCAT_HOME undefined", 3);

Update: or

$self->{catalina_base} = ( 
   $ENV{'CATALINA_BASE'} 
   or $self->getTomcatHome()
   or CCM::Util::error ("CATALINA_BASE unset and TOMCAT_HOME undefined", 3) 
);

[perl,style] | # Read Comments (7) |

OMFG Google evil

MJ Ray, I think you're distorting what is going on. Google are, rightly, protective of their search results and work actively against people that try to manipulate the search results. This is what BMW.de had done by giving a spam page to Google and using JavaScript to redirect users to the right page. As an aside, this would have broken for text browsers or anyone without JavaScript. It's not the pinnacle of accessibility, is it? Google were protecting their index, not using it as a vendetta against people it doesn't like as you were suggesting. Stop being so paranoid and stop distorting the story, or you're no better than the devil you're trying to paint Google as.

[google,mjray] | # Read Comments (1) |

Sun, 05 Feb 2006

SQL::Translator

Today, I discovered SQL::Translator, which seems to have some very interesting use cases. Basically, it is a Perl module that reads a database schema in one of a number of formats and converts it into another format. Parsers include:

  • Live querying of DB2, MySQL, DBI-PostgreSQL, SQLite and Sybase databases
  • Access
  • Excel
  • SQL for DB2, MySQL, Oracle, PostgreSQL, SQLite and Sybase
  • Storable
  • XML
  • YAML

Output formats include:

  • Class::DBI
  • SQL for MySQL, Oracle, PostgreSQL, SQLServer, SQLite and Sybase
  • Storable, XML and YAML
  • POD, Diagram, GraphViz and HTML

Several things spring to mind with this:

  1. Defining your schema in XML and using SQL::Translator to convert it into SQL for several databases and a set of classes for Class::DBI, which would make your application immediately target any of the supported databases.
  2. Documenting an existing database for which you've lost the documentation by pointing it at a running database instance and outputting HTML pages and, thanks to the Diagram output module, a visual representation of the structure.
  3. Converting a database from one product to another. Point it at a MySQL database and generate SQL for PostgreSQL. If you generated some Class::DBI stuff you could possibly quickly write a script to copy data too.
  4. Using the sqlt-diff script to compare your current SQL to what is running on the database and generate a SQL script to upgrade the database structure using ALTER TABLE etc. Presumably you'd need to convert any data yourself, but it is still a time saver for large databases.

I'm sure other people could think of some interesting uses for this. Having looked at the Class::DBI stuff, I think it could do with some improvements. I can't see a way to set the class names, although I haven't spent that much time looking and it insists on having all the classes in one file. Also the XML and YAML formats generated are rather verbose and I haven't looked to see how much I could cut them down to use as the source definition. I suspect that I can make it a lot shorter and rely on sensible defaults.

My initial reason for wanting to use SQL::Translator is that Class::DBI::Pg has a large startup time and isn't really suitable for CGI use if you have a complex database. This might be mitigated by using mod_perl, but in the meantime I was hoping I could speed up startup by telling Class::DBI my column names, rather than having it query the database. SQL::Translator should save me from duplicating the database structure, whilst allowing me to support multiple backend databases. If I get this working, I'll write up a short HOWTO.

[perl,SQL,SQL::Translator,databases,MySQL,Oracle,PostgreSQL] | # Read Comments (0) |

Thu, 26 Jan 2006

Bad Shell: Lazyweb replies

Yesterday I posted about some bad shell code I had found and posted an improved version. Part of the reason for posting it was that I was hoping someone could point out any errors in the version I posted. Fortunately Neil Moore emailed me some improvements.

  1. If the script outputs more than one line, the newlines will be lost when the unquoted $(...) expansion is split into words. The solution is to surround it in double quotes.
  2. The next problem Neil pointed out was that $@ should be surrounded by quotes in pretty much every case, otherwise parameters containing spaces will get split into separate parameters.
  3. The final problem is that if the script includes a return statement, it will stop the innermost function or sourced script, but not when run through eval. The solution is to enclose it in a function:
    
    dummy() { eval "$(perl "$CONF_DIR/project.pl" "$@")"; } dummy "$@"
    

Since making the post, I discovered that Solaris' /bin/sh doesn't like $(...), so it's probably better to use backticks instead if you want to be portable. As I know the output from the script I'm not worried about return statements, so I've ended up with:

eval "`perl "$CONF_DIR/project.pl" "$@"`"

[shell,bash] | # Read Comments (0) |

Wed, 25 Jan 2006

Bad Shell Scripts

I'm always amazed at the number of bad shell scripts I keep coming across. Take this snippet for example:

TEMP1=`mktemp /tmp/tempenv.XXXXXX` || exit 1
perl $CONF_DIR/project.pl $@ > $TEMP1
if [ $? != 0 ]; then
  return
fi

. $TEMP1
rm -f $TEMP1

There are several things wrong with this. Firstly, it uses a temporary file. Secondly, it uses more processes than are required, and thirdly, it doesn't clean up after itself properly: if perl fails, the temp file is still created but never deleted. The last problem can be solved with a suitable use of trap:

TEMP1=`mktemp /tmp/tempenv.XXXXXX` || exit 1
trap "rm -f $TEMP1" INT TERM EXIT
perl $CONF_DIR/project.pl $@ > $TEMP1
if [ $? != 0 ]; then
  return
fi

. $TEMP1
rm -f $TEMP1
trap - INT TERM EXIT

Of course this can all be replaced with a single line:

eval $(perl $CONF_DIR/project.pl $@)
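
Where a temporary file really is unavoidable, the trap idea can be checked with a small sketch. It assumes a POSIX-style shell with mktemp available; false is a stand-in for the failing perl command, and the subshell is there only so that the EXIT trap fires before the check is made:

```shell
#!/bin/sh
# Run the work in a subshell so its EXIT trap fires as soon as it finishes.
status=0
tmpfile=$(
    tmp=$(mktemp /tmp/tempenv.XXXXXX) || exit 1
    # Single quotes delay expansion of $tmp until the trap actually runs.
    trap 'rm -f "$tmp"' EXIT
    echo "$tmp"      # report the name so the caller can check it afterwards
    false            # stand-in for a failing: perl "$CONF_DIR/project.pl"
) || status=$?

# The file should be gone even though the command failed.
if [ -n "$tmpfile" ] && [ ! -e "$tmpfile" ]; then
    cleaned_up=yes
else
    cleaned_up=no
fi
echo "exit status: $status, temp file removed: $cleaned_up"
```

The trap does the cleanup on every exit path, so the failure branch no longer needs its own rm.
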
[shell,bash,wtf] | # Read Comments (1) |

Comments

Thu, 19 Jan 2006

Mozilla causing XSS in Livejournal

As mentioned in my last entry, LiveJournal have changed their URL scheme to get around a problem in Mozilla. The trouble is down to the -moz-binding CSS property. Here is a copy of the short-lived post made to the lj_dev community earlier.

Discussing the URL Change

As we recently announced in news, we have changed the canonical URL of most journal, community, and syndicated content. While we did offer subdomains to Paid users in the past, this is now the canonical URL that we will link and redirect to for all journals. However, all communities are located at http://community.livejournal.com/username and syndicated accounts at http://syndicated.livejournal.com/username. Due to the way certain proxy servers are configured, and the fact that the HTTP RFC prohibits it, the canonical URL for journals with a username starting with an underscore will be http://users.livejournal.com/username. We are however offering a free rename if you fall into this group.

So now the technical side of all of this and why it was a required change; bear with me as it's 3am my time and I'm sitting in John F. Kennedy Airport after a five hour flight. Late last week we became aware that it was possible to use the "-moz-binding" CSS attribute within Mozilla and Mozilla Firefox to execute arbitrary offsite JavaScript. As this attribute is designed to allow attaching an XBL transform and JavaScript to any node within the DOM, it is quite easy to use in a malicious fashion. We immediately altered our cleaner to strip this attribute from entries and comments, though also realized that wasn't even half the battle.

As we allow custom CSS in many of our styles, as well as the ability to link to an external stylesheet in a variety of fashions, it was quite possible to take advantage of this exploit and hijack the session cookie of any user who views your journal. As we, along with many other sites, used one cookie to authenticate a user, this cookie was quite powerful if stolen. If the user had not chosen to bind their cookie to their IP address, a malicious user could steal it, login as that user, deface the account and SPAM with it, as well as modify that user's style to include the exploit thus causing this problem to spread much like a virus.

Borrowing the idea from another development team within Six Apart, we decided we needed to break our cookies into three categories. One cookie would be our application management cookie, this cookie would only be accessible on www.livejournal.com where we will not display untrusted content. A second cookie will be accessible on all subdomains of livejournal.com, though it only will say if you are logged in or not; it is solely for optimization. We then will issue one cookie for each journal you visit. This cookie will be only accessible on username.livejournal.com or community.livejournal.com/username as it is limited to a single journal. This cookie will only grant you the permission to read protected entries and post in the particular journal. This means that if the journal owner steals your cookie, they will be able to do nothing more than view their journal and comment upon it as if they are you. In the end you will have n+2 cookies, with n being the number of journals you visit.

Due to the fact that we cannot clean every external CSS stylesheet linked to every time we generate a journal page, this change is required. While it does not fully protect us from some new cross site scripting vulnerability that can be exploited via entries or comments, they are much easier to block, patch, and recover from quickly. With Mozilla deciding to allow the execution of arbitrary JavaScript via CSS, there is no other viable solution than the one we have undertaken. We developed a plan to phase all of this in over the next week, URL change first and then followed by the cookie change, though this morning we were made aware that this was being actively exploited. As such we took our week time line and shortened it to about twelve hours. While URLs have been changed at this time, our cookie handling change has not yet happened. This should however be expected to take place within the next day or two as well as various other cleanups and fixing bugs we've already encountered.

Please feel free to let us know if you have any questions and open a support request if you find a bug or encounter a problem. Sorry all of this came with just about zero warning, but in the end we could not wait longer to fix this problem.

Thanks to Daniel Silverstone for providing me with a cached copy of the text.

[livejournal,xss,mozilla,security] | # Read Comments (1) |

Comments

Unintended Consequences of Technology

I've been meaning to post this for a couple of days. I discovered a rather unintended and quite unexpected consequence of using a piece of software. At work I use Workrave to enforce breaks from the computer. For those that have never used it, Workrave is a little GNOME applet that monitors your keyboard and mouse usage and, after about an hour of work, locks your screen and gives you a 10 minute break with a few exercises to reduce RSI. It also gives you microbreaks of 20 seconds every 3 minutes of work, so you can look away from the screen, stretch, whatever. I should point out that it isn't 20 seconds every 3 wallclock minutes, but every 3 minutes of using the mouse/keyboard, so it isn't quite as intrusive as it sounds.

I get to rest my eyes, prevent RSI and take breaks. This sounds like exactly the thing Workrave was written for. So what is unintended? It turns out that the 20 second micro-break is just the right amount of time to have a drink of water or get up and fill my glass if it's empty. Before, I would forget to fill my glass up, but now I have time where I can't do anything on the computer. In the 30 minutes since I got in to the office and started writing this entry I've had two micro-breaks and drunk a pint of water. Workrave has managed to increase my water intake during the day and, as a result, the number of toilet breaks. :)

I notice LiveJournal have enforced the http://<username>.livejournal.com/ URL scheme now to prevent cross-site scripting attacks where people were stealing session cookies and gaining access to accounts. Not sure what the exact problem was, but I know of several attacks over the last week by one group of script kiddies.

Update: Appears the problem was in Firefox. Explanation here.

Update: Nope, they pulled it. Basically Mozilla allows you to execute JavaScript via CSS stylesheets. I'll update with any further URLs as they become available.

[Workrave,water,toilet breaks,livejournal,security] | # Read Comments (0) |

Comments

Fri, 02 Dec 2005

OOP Strikes Back

Steinar, you missed my point. I explicitly said not to get bogged down in my example. The point is that changing the internal details of your class shouldn't force your users to change their code. As I said, exposing your implementation tightly couples your code to your users. As Matthew pointed out, you shouldn't have sets and gets for every item in your class. You should have a clearly thought out API that isn't tied to the internal detail. Exposing internals, either by making data public or by blindly creating gets and sets for private data members, is just bad OO design.

An addition to my example that makes things a little clearer would be a subclass that validates the words you can use. With a public variable I couldn't enforce that, but I could override my setter to add validation.

[OOP] | # Read Comments (0) |

Comments

Thu, 01 Dec 2005

Return of Accessors vs. public member variables

Steinar, imagine a Word class:

#include <cstdio>

class WordA {
   public:
      const char * word;
};
class WordB {
   private:
      const char * word;
   public:
      void setWord(const char * w) { word = w; }
      const char * getWord() { return word; }
};

int main() {
   WordA worda;
   worda.word = "WordA";

   WordB wordb;
   wordb.setWord("WordB");

   printf("%s %s", worda.word, wordb.getWord());
   return 0;
}

Fine, this works well. Now imagine that you want to change word from a C-style string pointer to a std::string to stop you dealing with pointers. For the WordB class you only need to change the getWord function; you don't need to change any of your class's users. For the WordA class, you have a problem because a std::string doesn't convert automatically to a C-style string, so all your users have to change from object.word to object.word.c_str().

#include <cstdio>
#include <string>

class WordA {
   public:
      std::string word;
};
class WordB {
   private:
      std::string word;
   public:
      void setWord(const char * w) { word = w; }
      // c_str() returns a const pointer, so the return type must be const too
      const char * getWord() { return word.c_str(); }
};

int main() {
   WordA worda;
   worda.word = "WordA";

   WordB wordb;
   wordb.setWord("WordB");

   printf("%s %s", worda.word.c_str(), wordb.getWord());
   return 0;
}

This is the punishment you get for exposing the internal details of your class to your users. Please ignore any specific mistakes in my examples; the principles work the same with different types.

[C++,OOP] | # Read Comments (0) |

Comments

Wed, 30 Nov 2005

Accessors vs. public member variables

Steinar, you use accessors rather than public data members because you may need to change the behaviour of the class to do something when you set a member variable. If all your data is public, you can't do this. If you force people to go via a function, you can make changes to the class without affecting its users. You have a similar issue with inherited classes.

[OOP] | # Read Comments (0) |

Comments