After my entry yesterday about MySQL truncating data, several people
have pointed out that MySQL 4.1 or later gives you a warning. Yes, this is true. You
can even see it in the example I gave:

Query OK, 1 row affected, 1 warning (0.00 sec)

I ignored mentioning this, but perhaps should have said something
about it. I reason I didn’t mention it was because I didn’t feel that a
warning really helped anyone. Developers have enough problems
remembering to check for errors, let along remembering to check in case
there was a warning as well. Plus, they’d then have to work out if the
warning was something serious or something they could ignore. There’s
also the question of how well the language bindings present this
information. Take for example, PHP. The mysqli extension gained support
for checking for warnings in PHP5 and gives the following
as an
example of getting warnings:

mysqli_query($link, $query);

if (mysqli_warning_count($link)) {
   if ($result = mysqli_query($link, "SHOW WARNINGS")) {
      $row = mysqli_fetch_row($result);
      printf("%s (%d): %sn", $row[0], $row[1], $row[2]);

Hardly concise code. As of 5.1.0, there is also mysqli_get_warnings(),
but is undocumented beyond noting its existence. The MySQL extension
does not support getting warning information. The PDO wrapper doesn’t
provide any way to get this information.

In perl, DBD::mysql has a mysql_warning_count()
function, but presumably would have to call "SHOW WARNINGS"
like in the PHP example. Seems Python’s MYSQLdb module will raise an
exception on warnings in certain cases. Mostly using the Cursor

In java, you can set the jdbcCompliantTruncation connection
parameter to make the driver throw java.sql.DataTruncation
exceptions, as per the JDBC spec, which makes you wonder why this isn’t
set by default. Unfortunately this setting is usually outside the
programmer’s control. There is also the
java.sql.Statement.getWarnings(), but once again, you need to
check this after every statement. Not sure if ORM tools like Hibernate
check this or not.

So, yes MySQL does give you a warning, but in practice is useless.

MySQL in its standard configuration has this wonderful “feature” of
truncating your data if it can’t fit in the field.

mysql> create table foo (bar varchar(4));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into foo (bar) values ("12345");
Query OK, 1 row affected, 1 warning (0.00 sec)

In comparison, PostgeSQL does:

psql=> create table foo (bar varchar(4));
psql=> insert into foo (bar) values ('12345');
ERROR:  value too long for type character varying(4)

You can make MySQL do the right thing by setting the SQL
option to
include STRICT_TRANS_TABLES or STRICT_ALL_TABLES. The difference is that the
former will only enable it for transactional data storage engines. As much as
I’m loathed to say it, I don’t recommend using STRICT_ALL_TABLES, as an error
during updating a non-transational table will result in a partial
update, which is probably worse than a truncated field. Setting the mode
to TRADITIONAL includes both these and a couple of related issues
mode using:

  • On the command line:

  • In /etc/mysql/my.cnf:

  • At runtime:


Just say no to databases that happily throw away your data

I should have mentioned that my previous
blog posting
was using MySQL
4.0(4.0.23_Debian-3ubuntu2-log). It seems that in 5.0.2 they changed the
precedence of the NOT operator to be lower than LIKE.
From the manual:

The precedence shown for NOT is as of MySQL 5.0.2. For
earlier versions, or from 5.0.2 on if the HIGH_NOT_PRECEDENCE SQL mode
is enabled, the precedence of NOT is the same as that of the !

Using 5.0 (5.0.22-Debian_0ubuntu6.06.8-log), and a slightly smaller
dataset, I get:

mysql> select count(*) from Table where blobid is null or not blobid like '%-%';
| count(*) |
|   199057 |
1 row in set (3.26 sec)

mysql> select count(*) from Table where blobid is null or blobid not like '%-%';
| count(*) |
|   199057 |
1 row in set (0.96 sec)

Jim Kingdon
experimented with other databases and was unable to reproduce this
problem. My test with PostgreSQL 8.3:

quux=> create table foo (blobid varchar(255));
quux=> insert into foo (blobid) values
quux=> select count(*) from foo
   where blobid is null or blobid not like '%-%';
(1 row)

quux=> select count(*) from foo
   where blobid is null or not blobid like '%-%';
(1 row)

quux=> select not blobid from foo limit 10;
ERROR:  argument of NOT must be type boolean, not type character varying

This appears to have been the case since at least 7.4

Problems like this is going to make the transition from MySQL 4.0 to 5.x all the more fun when we get around to doing it.

Dear lazyweb,

I’m possibly being stupid, but can someone explain the differences
between these two queries?

mysql> select count(*) from Table
   where blobid is null or not blobid like '%-%';
| count(*) |
| 15262487 |
1 row in set (25 min 4.18 sec)

mysql> select count(*) from Table
   where blobid is null or blobid not like '%-%';
| count(*) |
| 20044216 |
1 row in set (24 min 54.06 sec)

For reference:

mysql> select count(*) from Table where blobid is null;
| count(*) |
| 15262127 |
1 row in set (24 min 7.15 sec)

Update: It turns out that the former was doing (not blobid) like '%-%' which turns out to not do anything sensible:

mysql> select not blobid from Table limit 10;
| not blobid |
|          0 |
|       NULL |
|          1 |
|          0 |
|          0 |
|          0 |
|          1 |
|       NULL |
|          1 |
|          0 |
10 rows in set (0.02 sec)

mysql> select blobid from Table limit 10;
| blobid                                                            |
| 5cd1237469cc4b52ca094e215156c582-9ef460ac4134c600a4d2382c4b0acee7 |
| NULL                                                              |
| d20cb4037f8f9ab1de5de264660f005c-2c34209dcfb39251cf7c16bb6754bbd2 |
| 845a8d06719d8bad521455a8dd47745c-095d9a0831433c92cd269e14e717b3a9 |
| 9580ed23f34dd68d35da82f7b2a293d6-bf39df7509d977a1de767340536ebe80 |
| 06c9521472cdac02a2d4b2a18f8bec0f-0a8a28d3b63df54860055f1d1de92969 |
| ed3cd0dd9b55f76db7544eeb64f3cfa0-80a6a3eb6d73c0a58f88b7c332866d5c |
| NULL                                                              |
| b339f6545651fbfa49fa500b7845c4ce-6defb5ffc188b8f72f1aa10bbd5c6bec |
| 642075963d6f69bb11c35a110dd07c2c-8db54ac2d2accae7fa4a22db1d6caae9 |
10 rows in set (0.00 sec)

The documentation
says Logical NOT. Evaluates to 1 if the operand is 0, to 0 if the
operand is non-zero, and NOT NULL returns NULL. but doesn’t
describe the behaviour of NOT 'string'. It would appear that a
string starting with a number returns 0 and a string starting with a
letter returns 1. Either way, neither has a hyphen in.

Regular viewers will know that I don’t think favourably of MySQL.
Here is yet another reason. Let’s create an InnoDB table:

mysql> CREATE TABLE `User_` (
mysql> ...
mysql> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Query OK, 0 rows affected, 1 warning (0.04 sec) 

One warning, but we’re running this as part of an import, so we’ll
fail to spot this and even if we did, we wouldn’t be able to get it back
out of mysql because SHOW WARNINGS only shows the last command.
So let’s look at the table we just created:

mysql> show create table User_G
*************************** 1. row ***************************
       Table: User_
Create Table: CREATE TABLE `User_` (
1 row in set (0.00 sec)

Eh? what’s going on? We asked for InnoDB, but have got a MyISAM
table. Lets look at the engines available.

mysql> show engines;
| Engine     | Support  | Comment                                                        |
| MyISAM     | DEFAULT  | Default engine as of MySQL 3.23 with great performance         |
| MEMORY     | YES      | Hash based, stored in memory, useful for temporary tables      |
| InnoDB     | DISABLED | Supports transactions, row-level locking, and foreign keys     |
| BerkeleyDB | NO       | Supports transactions and page-level locking                   |
| BLACKHOLE  | NO       | /dev/null storage engine (anything you write to it disappears) |
| EXAMPLE    | NO       | Example storage engine                                         |
| ARCHIVE    | YES      | Archive storage engine                                         |
| CSV        | YES      | CSV storage engine                                             |
| ndbcluster | DISABLED | Clustered, fault-tolerant, memory-based tables                 |
| FEDERATED  | YES      | Federated MySQL storage engine                                 |
| MRG_MYISAM | YES      | Collection of identical MyISAM tables                          |
| ISAM       | NO       | Obsolete storage engine                                        |
12 rows in set (0.00 sec)

Oh, so innodb has been disabled. We can fix that easily by removing
skip-innodb from my.cnf.

root@cmsdb01:/var/log# grep skip-innodb /etc/mysql/my.cnf

But hang on a second, that’s not in the config file. What’s going on?
It turns out that the reason InnoDB is disabled is because of the
innodb_log_file_size setting not matching the files on disk.

root@cmsdb01:/var/log# grep innodb_log_file_size /etc/mysql/my.cnf
innodb_log_file_size            = 512M
root@cmsdb01:/var/log# ls -lh /var/lib/mysql/ib_logfile*
-rw-rw---- 1 mysql mysql 5.0M 2006-12-19 18:39 /var/lib/mysql/ib_logfile0
-rw-rw---- 1 mysql mysql 5.0M 2006-12-19 18:39 /var/lib/mysql/ib_logfile1

Rumour has it that you can just stop MySQL, delete these log files
and start MySQL again. I’m yet to try this as the server in question is
in production use. The alternative is to change the
innodb_log_file_size setting to match the file.

So in summary the problems with MySQL are:

  • Not logging warnings anywhere useful.
  • Converting engine types with a warning rather than throwing an
    error. This can be fixed by setting sql_mode to include
  • Starting up and disabling InnoDB when there is a problem rather than
    failing to start, giving a false impression that everything is

MySQL has not impressed me this week.

If you’re trying to import a dump file created using
mysqldump and you get an error like:

ERROR 1005 (HY000): Can't create table './Database/Table.frm' (errno: 150)

Then you’ve just been bitten by mysqldump being far too stupid. The
problem occurs because mysqldump includes foreign key constraints in the
initial CREATE TABLE command, so if a table refers to a table
that doesn’t currently exist, it throws an error. mysqldump does
correctly disable the contraints when inserting data into the tables. The correct way for
this would be for mysqldump to create all the tables without the
constraints, use ALTER TABLE to add the constraints to the
tables, and then importing the data into the tables.

The workaround for this problem is to use:

source dump.sql

Update: Someone has pointed out that it appears that
mysql 5 has fixed
this problem
by including the above statements in the dump.

MySQL cleverly maps

CREATE INDEX foo_bar ON Foo(Bar);


CREATE TEMPORARY TABLE A-Foo ( .... INDEX foo_bar (Bar));

If you have a very large table, expect this operation to take a) a
lot of disk space, b) a very very long time and c) block any writes to the table
in the process. I don’t recommend adding indexes or altering any very
large tables that are in production on MySQL, because you won’t be in
production for quite some time.

Update: Tom Haddon asked me if this applied to
recent versions of MySQL or to PostgreSQL. Looking at the docs, it
appears to still apply to 5.1:


    In some cases, no temporary table is necessary:

    • If you use ALTER TABLE tbl_name RENAME TO new_tbl_name without any other
      options, MySQL simply renames any files that correspond to the table
      tbl_name. (You can also use the RENAME TABLE statement to rename tables. See
      Section 13.1.16, “RENAME TABLE Syntax”.)

    • ALTER TABLE … ADD PARTITION creates no temporary table except for MySQL
      Cluster. ADD or DROP operations for RANGE or LIST partitions are immediate
      operations or nearly so. ADD or COALESCE operations for HASH or KEY partitions
      copy data between changed partitions; unless LINEAR HASH/KEY was used, this is
      much the same as creating a new table (although the operation is done partition
      by partition). REORGANIZE operations copy only changed partitions and do not
      touch unchanged ones.

    If other cases, MySQL creates a temporary table, even if the data wouldn’t
    strictly need to be copied (such as when you change the name of a column).


As far as PostgreSQL is concerned, it doesn’t mention anything about
doing the same thing, but does mention that it does a full sequential
scan of the table. During this time writes are blocked. You can use the
CONCURRENTLY keyword to allow writes to happen, but it does two scans
and will take longer, but you can still use your database.