It’s been a while since I last wrote an article, but I’ve just
published a new one on troubleshooting
networking
. It’s based on a
couple of emails I sent to a mailing list helping someone with their
networking problem. I decided to clean it up a bit and publish it. It’s
mostly about finding where the problem lies, rather than fixing it,
but I would be grateful for any comments you have, particularly if you
think I’ve missed any obvious steps.

It can be quite discouraging to type “yum update” and have
yum simply go off forever. Among other things, one must wait a great
long time to distinguish this behavior from yum’s normal mode of
operation. Other times, it comes back very quickly with a message
saying, for all practical purposes, “RPM crashed, you lose,
sorry.”

via LWN.net (subscription
required)

Jonathan Corbet normally manages to amuse me on a weekly basis, but
this time he’s outdone himself. Consider my LWN.net subscription renewed
for another year.

Update: I’ve been convinced to include a subscriber
link to the article for those without LWN.net subscriptions. http://lwn.net/SubscriberLink/210373/8badbe4f9c463fb8/

Andreas,
use atom
rather than RSS. It has a <updated/> element for the last time the
entry was updated and a <published/> element for the date the
entry was published.

If you want to stick with RSS, you can use the
<dcterms:issued/> element for the initial published date and one
of <pubDate/>, <dc:date/> or <dcterms:modified/> for the updated date.
Don’t forget to include the XML namespaces for dc and dcterms.
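For instance, an Atom entry carrying both dates might look like this (the title, URL and dates are made up for illustration):

```xml
<entry>
  <title>Example post</title>
  <id>http://example.org/2006/example-post</id>
  <published>2006-11-01T09:00:00Z</published>
  <updated>2006-11-03T17:30:00Z</updated>
</entry>
```

An aggregator can then sort entries on <published/> and only flag one as changed when <updated/> moves.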

Erich,
I’m not entirely sure what you did to break Planet, but using a strict
feed parser will just result in you missing a significant number of
entries. People sadly don’t produce valid feeds and will blame your
software rather than their feeds. It doesn’t help that a number of
validators aren’t entirely strict and that RSS doesn’t have a very
comprehensive spec. The RSS situation is a lot worse than Atom’s, in part
thanks to Atom’s validator and very well-thought-out spec. It’s for this
reason that I ended up writing Eddie rather
than using ROME, which is a DOM-based
parser and simply fails to get any information out of a non-well-formed
feed. Eddie, on the other hand, is a SAX-based parser. In a recent
comparison, an Eddie-based aggregator managed to correctly parse several
more entries than a ROME-based aggregator on one particular day.
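The difference is easy to demonstrate with Python’s stdlib parsers (the toy feed below is mine, not from either aggregator): a SAX handler keeps the entries it saw before the first well-formedness error, while a DOM parse gives up entirely.

```python
import xml.sax
from xml.dom import minidom

# A feed whose third entry contains a bare ampersand, so it is not
# well-formed XML -- a common real-world mistake.
feed = """<?xml version="1.0"?>
<rss><channel>
<item><title>first entry</title></item>
<item><title>second entry</title></item>
<item><title>broken & entry</title></item>
</channel></rss>"""

class TitleHandler(xml.sax.ContentHandler):
    """Collect the text of every <title> element seen so far."""
    def __init__(self):
        self.titles = []
        self._in_title = False
        self._buf = ""
    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = ""
    def characters(self, content):
        if self._in_title:
            self._buf += content
    def endElement(self, name):
        if name == "title":
            self.titles.append(self._buf)
            self._in_title = False

handler = TitleHandler()
try:
    xml.sax.parseString(feed.encode(), handler)
except xml.sax.SAXParseException:
    pass  # the parse dies at the bare "&", but earlier entries survive

print(handler.titles)  # the first two titles were recovered

# A DOM parser refuses the same input outright and yields nothing:
try:
    minidom.parseString(feed)
    dom_ok = True
except Exception:
    dom_ok = False
print(dom_ok)
```

That is the whole argument for stream-based parsing of real-world feeds in one screenful.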

You also have major
aggregators being liberal. Sam Ruby discussed
this recently
with Bloglines becoming the de facto validator: if
Bloglines parses it, then it’s valid. We had the same problem with HTML
with people making sure their pages worked in a browser rather than met
the spec.

I suspect the problem you had with Planet is that you failed to close
a tag, causing the rest of the page to be in bold or be a link, etc.
This is fairly easily solvable, and in fact has been solved in FeedParser,
the feed-parsing library Planet uses: it has support for using
HTMLTidy and similar libraries to fix unbalanced elements. Eddie
uses TagSoup to do a similar thing. As a result I’ve not noticed any
particular entry leaking markup and breaking the page. Perhaps Planet
Debian just needs to install one of the markup-cleaning libraries.

I agree that people should use XML tools where possible.
Unfortunately, most blogging tools use text based templating systems,
which makes producing non-wellformed XML too easy. To deal with this I
pass all my output through an XSLT filter, which means that everything
is either well formed or doesn’t output at all. Unfortunately I don’t
think everyone would be capable or willing to use XSLT.
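The same “well formed or nothing” gate doesn’t have to be XSLT, though; a minimal sketch in Python, using the stdlib parser to refuse non-well-formed output:

```python
import xml.etree.ElementTree as ET

def is_well_formed(doc: str) -> bool:
    """Return True only if doc parses as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p>fine</p>"))     # a balanced fragment passes
print(is_well_formed("<p><b>oops</p>"))  # an unclosed <b> is rejected
```

Run over a templating system’s output before publishing, a check like this catches the unclosed tag before Planet ever sees it.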

I was using our Fedora 3 server and decided to restart sshd running in a
chroot:

[root@winky /]# /etc/init.d/sshd restart
Stopping sshd:/etc/init.d/sshd: line 212: kill: (1483) - No such process
Connection to winky closed by remote host.
Connection to winky closed.
mimsy david% ssh winky -l root
ssh: connect to host winky port 22: Connection refused

Thank you very much Fedora.

A “friend” of mine has recently been sending me forwarded jokes and
other assorted crap we all grew out of sending about 5 minutes after
we learnt how to send emails. This I can cope with, but recently, for
every email she sends, I’ve been receiving an automated email from a
server somewhere telling me that it’s blocked an attachment, once for
each attachment
in the original email. That’s crime number one. Looking
at it further, it appears that it’s not the sender’s mail server doing
it, but one of the recipients’ mail servers. When it gets an email
with an attachment it’s blocked, it emails everyone in the To: header
to tell them, irrespective of whether they are local users or
not
. That’s crime number two.

I thought I’d email postmaster@capita.co.uk to tell them of
this problem, but unsurprisingly it bounced. That’s crime number
three. Looking on their website, I couldn’t find any technical
contacts, which wasn’t really surprising. I did, however, find a
“general enquiries” form, so I filled that in. Unfortunately, they
used the following html for the message box:

<textarea name="Feedback1:fldEnquiry" rows="6" cols="1"
   id="Feedback1_fldEnquiry" class="enquiryTable"></textarea>

The result is that you get a text box six rows high and one column
across, which is basically unusable. Interestingly, they appear to
add style="width: 350px;" in IE, which makes it work. I’ll
make that crimes four and five, ’cos doing different things for different
browsers is a crime in itself.

I await a phone call or email from them.

Had an interesting problem today with one of our servers at work.
The first thing I noticed, yesterday, was that an upgrade of Apache2 didn’t
complete properly because /etc/init.d/apache2 stop didn’t
return. Killing it and starting Apache allowed the upgrade to
finish. I noticed there was a zombie process but didn’t think too much
of it.

Then this morning I got an email from the MD saying that various
internal service websites were down (webmail, wiki, etc.). My manager
noticed that it was due to logrotate hanging, again because restarting
apache had hung. Looking at the server I noticed a few more zombie
processes. One thing I’d noticed was that all these processes had
reparented themselves under init and a quick web search later confirmed
that init(1) should be reaping these processes. I thought maybe
restarting init would clear the zombies. I tried running
telinit q to reread the config file, but that returned an error
message about timing out on the /dev/initctl named pipe. I checked that the
file existed and everything looked fine. The next thing I checked was the
other end of the named pipe by running lsof -p 1. This
showed that init had /dev/console rather than
/dev/initctl as fd 0. I tried running kill -SIGHUP
1
, but that didn’t do anything. Then I tried kill
-SIGUSR1 1
, but that didn’t do anything either. I checked the
console, but there wasn’t enough scrollback to see the system booting,
so I decided to schedule a reboot for this evening.
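lsof gets that file-descriptor table from /proc; a small Python sketch of the same check (the function name is mine):

```python
import os

def fds_of(pid):
    """Map each open file descriptor of a process to its target path,
    roughly what `lsof -p PID` shows (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    out = {}
    for fd in os.listdir(fd_dir):
        try:
            out[int(fd)] = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            pass  # fd closed between listdir and readlink
    return out

# On the broken server, fds_of(1) would have shown /dev/console where
# /dev/initctl should have been. (Reading /proc/1/fd needs root.)
```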

Rebooting the server presented me with an interesting challenge.
Normally the shutdown command signals to init to change
to runlevel 0 or 6 to shutdown or reboot using /dev/initctl. Of
course init wasn’t listening on this file, so that was out. Sending it
a SIGINT signal (the same signal init gets on Ctrl-Alt-Delete) produced no
response. Obviously telinit 0 wasn’t going to work
either. I decided to start shutting services down manually with the help
of Brett Parker. The idea
was to stop all non-essential services, unexport the NFS shares,
remount the disks read-only and then either use sysrq or a hardware
reset. Unfortunately someone accidentally ran /etc/init.d/halt
stop
, hanging the server, but he is suffering from a bad cold today so I forgive
him. The server restarted without a hitch (thank god for ext3) and
running lsof -p 1 showed init having
/dev/initctl open. I don’t know what happened to init during the last
reboot on Monday, but another reboot seemed to fix it. Odd bug, but
thankfully it was a nice simple fix; I could have spent the evening
debugging init.
🙂
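For the curious: telinit talks to init by writing a fixed-size struct init_request to /dev/initctl. Here’s a Python sketch of packing such a request, with the layout taken from my reading of sysvinit’s initreq.h (treat the constants as assumptions; this only builds the record, it doesn’t write it anywhere):

```python
import struct

INIT_MAGIC = 0x03091969   # magic number from sysvinit's initreq.h
INIT_CMD_RUNLVL = 1       # "change runlevel" command

def runlevel_request(level: str) -> bytes:
    """Pack a struct init_request asking init to switch runlevel.

    Assumed layout: four ints -- magic, cmd, runlevel (as an ASCII
    character), sleeptime -- padded out to a fixed 384-byte record.
    """
    return struct.pack("iiii368s", INIT_MAGIC, INIT_CMD_RUNLVL,
                       ord(level), 0, b"")

req = runlevel_request("6")  # what `telinit 6` would send
print(len(req))              # the record is 384 bytes
# telinit writes this to /dev/initctl -- which is exactly the pipe
# a wedged init never reads, hence the timeout I saw.
```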

I’m currently mourning the loss of my laptop’s hard disk. I don’t think
there were massive amounts of data on there that I needed, but it’s still
upsetting. Looks like I’ll have to buy a new 2.5″ drive. At least it
gives me a reason to reinstall Debian.

Update: The bad news is that it is actually a 1.8″
drive, which means I need to spend 93GBP including tax and postage for a
new drive. It should arrive by Friday. The good news is that, having left
the laptop off during the day, I managed to get it booting and
am now rsyncing the data off as fast as I can. Shame about the money,
but at least I haven’t lost much in the way of data.

To the anonymous Texan that thought it would be a good idea to point
out a Perl module in response to my date parsing class in Java:

I’m well aware of date parsing modules in other languages. In fact
I’ve used them for similar tasks. But the post wasn’t asking “How do I
parse dates in Perl?” It was giving some code that some people might
find useful.

Commenting with “man Date::Parse” doesn’t make you look clever; it
just makes you look like a twat and the kind of person people are
reluctant to invite to parties. You may be a geek and you may know stuff,
but that doesn’t mean you have to try to look clever, because you will
invariably fail.

</rant>

Oh and Erich, I’ve got some more date formats I want to add support
for, so I’ll add German dates and post the updated code. 🙂
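For anyone curious what the approach looks like, here’s a minimal Python sketch of the try-each-format idea; the format list is illustrative, not the actual set the Java class supports:

```python
from datetime import datetime

# Candidate formats, tried in order. Add more as feeds invent them.
FORMATS = [
    "%Y-%m-%dT%H:%M:%S",      # ISO 8601-ish, as in Atom (sans zone)
    "%a, %d %b %Y %H:%M:%S",  # RFC 822-style, as in RSS (sans zone)
    "%d.%m.%Y",               # common German style, e.g. 31.12.2006
]

def parse_date(text: str) -> datetime:
    """Try each known format until one fits."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unparseable date: {text!r}")

print(parse_date("31.12.2006"))  # parses as 31 December 2006
```

The real thing also has to cope with time zones and locale-specific month names, which is where most of the work is.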