Ever wanted to find out how much diskspace each table was taking in a
database? Here’s how:

database=# SELECT
   pg_size_pretty(pg_relation_size(tablename)) AS table_size,
   pg_size_pretty(pg_total_relation_size(tablename)) AS total_table_size
   schemaname = 'public';
 tablename  | table_size | total_table_size
 deferrals  | 205 MB     | 486 MB
 errors     | 58 MB      | 137 MB
 deliveries | 2646 MB    | 10096 MB
 queue      | 7464 kB    | 22 MB
 unknown    | 797 MB     | 2644 MB
 messages   | 1933 MB    | 6100 MB
 rejects    | 25 GB      | 75 GB
(7 rows)

Table size is the size for the current data.
Total table size includes indexes and data that is too large to fix in
the main table store (things like large BLOB fields). You can find more
information in the PostgreSQL

Edit: changed to use pg_size_pretty(), which I thought existed, but couldn’t find in the docs. Brett Parker reminded me it did exist after all and I wasn’t just imagining it.

After my entry yesterday about MySQL truncating data, several people
have pointed out that MySQL 4.1 or later gives you a warning. Yes, this is true. You
can even see it in the example I gave:

Query OK, 1 row affected, 1 warning (0.00 sec)

I ignored mentioning this, but perhaps should have said something
about it. I reason I didn’t mention it was because I didn’t feel that a
warning really helped anyone. Developers have enough problems
remembering to check for errors, let along remembering to check in case
there was a warning as well. Plus, they’d then have to work out if the
warning was something serious or something they could ignore. There’s
also the question of how well the language bindings present this
information. Take for example, PHP. The mysqli extension gained support
for checking for warnings in PHP5 and gives the following
as an
example of getting warnings:

mysqli_query($link, $query);

if (mysqli_warning_count($link)) {
   if ($result = mysqli_query($link, "SHOW WARNINGS")) {
      $row = mysqli_fetch_row($result);
      printf("%s (%d): %sn", $row[0], $row[1], $row[2]);

Hardly concise code. As of 5.1.0, there is also mysqli_get_warnings(),
but is undocumented beyond noting its existence. The MySQL extension
does not support getting warning information. The PDO wrapper doesn’t
provide any way to get this information.

In perl, DBD::mysql has a mysql_warning_count()
function, but presumably would have to call "SHOW WARNINGS"
like in the PHP example. Seems Python’s MYSQLdb module will raise an
exception on warnings in certain cases. Mostly using the Cursor

In java, you can set the jdbcCompliantTruncation connection
parameter to make the driver throw java.sql.DataTruncation
exceptions, as per the JDBC spec, which makes you wonder why this isn’t
set by default. Unfortunately this setting is usually outside the
programmer’s control. There is also the
java.sql.Statement.getWarnings(), but once again, you need to
check this after every statement. Not sure if ORM tools like Hibernate
check this or not.

So, yes MySQL does give you a warning, but in practice is useless.

MySQL in its standard configuration has this wonderful “feature” of
truncating your data if it can’t fit in the field.

mysql> create table foo (bar varchar(4));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into foo (bar) values ("12345");
Query OK, 1 row affected, 1 warning (0.00 sec)

In comparison, PostgeSQL does:

psql=> create table foo (bar varchar(4));
psql=> insert into foo (bar) values ('12345');
ERROR:  value too long for type character varying(4)

You can make MySQL do the right thing by setting the SQL
option to
include STRICT_TRANS_TABLES or STRICT_ALL_TABLES. The difference is that the
former will only enable it for transactional data storage engines. As much as
I’m loathed to say it, I don’t recommend using STRICT_ALL_TABLES, as an error
during updating a non-transational table will result in a partial
update, which is probably worse than a truncated field. Setting the mode
to TRADITIONAL includes both these and a couple of related issues
mode using:

  • On the command line:

  • In /etc/mysql/my.cnf:

  • At runtime:


Just say no to databases that happily throw away your data

Dear Lazyweb,

We have a web application that has quite a large database and
reasonable usage. Back in the dim and distant past, we scaled the
application by the age-old method of using several read-only slave
databases to prevent reads on the master swamping writes. This worked
well for several years, and then we introduced memcached into the mix to
improve performance by reducing the number of reads from the database.
This improved our database capacity even further.

Now the question has
arisen about reducing or even removing the code to read from the slaves.
I’m trying to come up with some compelling reasons to keep the
application reading from the slaves. The pros and cons I currently have
for removing the code are:

  • Reduces code complexity
  • Removes consistency problems due to latency in the replication. This is less of a
    problem than it used to be after we solved a problem with our
  • Reduces our existing capacity
  • Cache flushes would cause huge spikes on our master server until the
    cache filled up again
  • Caches wouldn’t help queries with unique critera

I would appreciate any additional reasons, pro or cons.
We already have an existing non-live slave for backups and slow
queries by developers. We would retain a slave for redundancy in the
case of master failure. I’m only looking for issues that would affect
the application.

I should have mentioned that my previous
blog posting
was using MySQL
4.0(4.0.23_Debian-3ubuntu2-log). It seems that in 5.0.2 they changed the
precedence of the NOT operator to be lower than LIKE.
From the manual:

The precedence shown for NOT is as of MySQL 5.0.2. For
earlier versions, or from 5.0.2 on if the HIGH_NOT_PRECEDENCE SQL mode
is enabled, the precedence of NOT is the same as that of the !

Using 5.0 (5.0.22-Debian_0ubuntu6.06.8-log), and a slightly smaller
dataset, I get:

mysql> select count(*) from Table where blobid is null or not blobid like '%-%';
| count(*) |
|   199057 |
1 row in set (3.26 sec)

mysql> select count(*) from Table where blobid is null or blobid not like '%-%';
| count(*) |
|   199057 |
1 row in set (0.96 sec)

Jim Kingdon
experimented with other databases and was unable to reproduce this
problem. My test with PostgreSQL 8.3:

quux=> create table foo (blobid varchar(255));
quux=> insert into foo (blobid) values
quux=> select count(*) from foo
   where blobid is null or blobid not like '%-%';
(1 row)

quux=> select count(*) from foo
   where blobid is null or not blobid like '%-%';
(1 row)

quux=> select not blobid from foo limit 10;
ERROR:  argument of NOT must be type boolean, not type character varying

This appears to have been the case since at least 7.4

Problems like this is going to make the transition from MySQL 4.0 to 5.x all the more fun when we get around to doing it.

Dear lazyweb,

I’m possibly being stupid, but can someone explain the differences
between these two queries?

mysql> select count(*) from Table
   where blobid is null or not blobid like '%-%';
| count(*) |
| 15262487 |
1 row in set (25 min 4.18 sec)

mysql> select count(*) from Table
   where blobid is null or blobid not like '%-%';
| count(*) |
| 20044216 |
1 row in set (24 min 54.06 sec)

For reference:

mysql> select count(*) from Table where blobid is null;
| count(*) |
| 15262127 |
1 row in set (24 min 7.15 sec)

Update: It turns out that the former was doing (not blobid) like '%-%' which turns out to not do anything sensible:

mysql> select not blobid from Table limit 10;
| not blobid |
|          0 |
|       NULL |
|          1 |
|          0 |
|          0 |
|          0 |
|          1 |
|       NULL |
|          1 |
|          0 |
10 rows in set (0.02 sec)

mysql> select blobid from Table limit 10;
| blobid                                                            |
| 5cd1237469cc4b52ca094e215156c582-9ef460ac4134c600a4d2382c4b0acee7 |
| NULL                                                              |
| d20cb4037f8f9ab1de5de264660f005c-2c34209dcfb39251cf7c16bb6754bbd2 |
| 845a8d06719d8bad521455a8dd47745c-095d9a0831433c92cd269e14e717b3a9 |
| 9580ed23f34dd68d35da82f7b2a293d6-bf39df7509d977a1de767340536ebe80 |
| 06c9521472cdac02a2d4b2a18f8bec0f-0a8a28d3b63df54860055f1d1de92969 |
| ed3cd0dd9b55f76db7544eeb64f3cfa0-80a6a3eb6d73c0a58f88b7c332866d5c |
| NULL                                                              |
| b339f6545651fbfa49fa500b7845c4ce-6defb5ffc188b8f72f1aa10bbd5c6bec |
| 642075963d6f69bb11c35a110dd07c2c-8db54ac2d2accae7fa4a22db1d6caae9 |
10 rows in set (0.00 sec)

The documentation
says Logical NOT. Evaluates to 1 if the operand is 0, to 0 if the
operand is non-zero, and NOT NULL returns NULL. but doesn’t
describe the behaviour of NOT 'string'. It would appear that a
string starting with a number returns 0 and a string starting with a
letter returns 1. Either way, neither has a hyphen in.

A while ago I published an article on PostgreSQL
user administration
. Typically, things have changed since I wrote
that article. I thought I’d detail a couple of the differences since
I wrote that guide.

The major difference is that you now have roles rather than users and
you use the CREATE ROLE command to create them instead of
CREATE USER, although the latter command still works. The
command line options for the createuser command have changed as
a result too. Before superuser and the ability to create new users were
the same thing. Now you can give a role permissions to create new roles
without giving them superuser powers. The options are now -s for
superuser and -S for not superuser, -d to allow them to create
databases and -D to disallow database creation and -r to allow the new
role to create other roles and -R to prevent them. for a standard user
you probably want somethig like:

createuser -S -D -R -P user

The -P makes createuser ask you for a password for
the new role.

You can find out more information about the new role system in
PostgreSQL in the user
sections of the manual.

Regular viewers will know that I don’t think favourably of MySQL.
Here is yet another reason. Let’s create an InnoDB table:

mysql> CREATE TABLE `User_` (
mysql> ...
mysql> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Query OK, 0 rows affected, 1 warning (0.04 sec) 

One warning, but we’re running this as part of an import, so we’ll
fail to spot this and even if we did, we wouldn’t be able to get it back
out of mysql because SHOW WARNINGS only shows the last command.
So let’s look at the table we just created:

mysql> show create table User_G
*************************** 1. row ***************************
       Table: User_
Create Table: CREATE TABLE `User_` (
1 row in set (0.00 sec)

Eh? what’s going on? We asked for InnoDB, but have got a MyISAM
table. Lets look at the engines available.

mysql> show engines;
| Engine     | Support  | Comment                                                        |
| MyISAM     | DEFAULT  | Default engine as of MySQL 3.23 with great performance         |
| MEMORY     | YES      | Hash based, stored in memory, useful for temporary tables      |
| InnoDB     | DISABLED | Supports transactions, row-level locking, and foreign keys     |
| BerkeleyDB | NO       | Supports transactions and page-level locking                   |
| BLACKHOLE  | NO       | /dev/null storage engine (anything you write to it disappears) |
| EXAMPLE    | NO       | Example storage engine                                         |
| ARCHIVE    | YES      | Archive storage engine                                         |
| CSV        | YES      | CSV storage engine                                             |
| ndbcluster | DISABLED | Clustered, fault-tolerant, memory-based tables                 |
| FEDERATED  | YES      | Federated MySQL storage engine                                 |
| MRG_MYISAM | YES      | Collection of identical MyISAM tables                          |
| ISAM       | NO       | Obsolete storage engine                                        |
12 rows in set (0.00 sec)

Oh, so innodb has been disabled. We can fix that easily by removing
skip-innodb from my.cnf.

root@cmsdb01:/var/log# grep skip-innodb /etc/mysql/my.cnf

But hang on a second, that’s not in the config file. What’s going on?
It turns out that the reason InnoDB is disabled is because of the
innodb_log_file_size setting not matching the files on disk.

root@cmsdb01:/var/log# grep innodb_log_file_size /etc/mysql/my.cnf
innodb_log_file_size            = 512M
root@cmsdb01:/var/log# ls -lh /var/lib/mysql/ib_logfile*
-rw-rw---- 1 mysql mysql 5.0M 2006-12-19 18:39 /var/lib/mysql/ib_logfile0
-rw-rw---- 1 mysql mysql 5.0M 2006-12-19 18:39 /var/lib/mysql/ib_logfile1

Rumour has it that you can just stop MySQL, delete these log files
and start MySQL again. I’m yet to try this as the server in question is
in production use. The alternative is to change the
innodb_log_file_size setting to match the file.

So in summary the problems with MySQL are:

  • Not logging warnings anywhere useful.
  • Converting engine types with a warning rather than throwing an
    error. This can be fixed by setting sql_mode to include
  • Starting up and disabling InnoDB when there is a problem rather than
    failing to start, giving a false impression that everything is

MySQL has not impressed me this week.

If you’re trying to import a dump file created using
mysqldump and you get an error like:

ERROR 1005 (HY000): Can't create table './Database/Table.frm' (errno: 150)

Then you’ve just been bitten by mysqldump being far too stupid. The
problem occurs because mysqldump includes foreign key constraints in the
initial CREATE TABLE command, so if a table refers to a table
that doesn’t currently exist, it throws an error. mysqldump does
correctly disable the contraints when inserting data into the tables. The correct way for
this would be for mysqldump to create all the tables without the
constraints, use ALTER TABLE to add the constraints to the
tables, and then importing the data into the tables.

The workaround for this problem is to use:

source dump.sql

Update: Someone has pointed out that it appears that
mysql 5 has fixed
this problem
by including the above statements in the dump.