July 1, 2012, 9:48 a.m.
IT

Leap second causing MySQL / Java headaches

The leap second inserted at midnight GMT on 30 June 2012 caused a significant spike in CPU utilisation on some of my servers. Specifically, a MySQL server and the JBoss Java servers were pretty much unusable.

I tried restarting MySQL: it did restart, but immediately went back to 92% CPU usage. I then tried killing JBoss and restarting it, but it would not shut down cleanly and would not even begin to start up again. So I bounced the box, and it seems fine now.

The workaround seems to be to set the date like this:

/etc/init.d/ntpd stop; date -s "`date`"

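As per this site. The explanation? A race condition in a system call: the system gets stuck repeating a call over and over because the time "jumped back 1 second". It seemed strange at first that restarting the service did not fix it, but that fits if the fault sits in the kernel's timekeeping rather than in the service itself, which would also explain why setting the date or rebooting clears it.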
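For next time, the one-liner above can be wrapped in a small script that shows what it would do before touching the clock. This is only a sketch under a few assumptions: a SysV-style /etc/init.d/ntpd script as in the command above, a `date` that accepts its own default output as input to `-s`, and that you want ntpd started again afterwards (the original command leaves it stopped). The names `run`, `reset_clock` and `DRY_RUN` are made up for illustration.

```shell
#!/bin/sh
# Sketch of the leap-second workaround with a dry-run guard.
# DRY_RUN defaults to 1, so nothing is changed unless you ask for it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_clock() {
    # Stop ntpd so it does not adjust the clock while we set it.
    run /etc/init.d/ntpd stop
    # Setting the date to the current date is the actual workaround.
    run date -s "$(date)"
    # Assumption: restart ntpd afterwards; not part of the original command.
    run /etc/init.d/ntpd start
}
```

Calling `reset_clock` as-is only prints the three commands. To really apply it, run as root with `DRY_RUN=0` set before the call.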

Here is the CPU graph of one server. Note the spike starting at 2 am SAST (midnight GMT):
[Graph: CPU usage - by day]
And the correlated spike in system interrupts:
[Graph: individual interrupts - by day]

This clearly points to Local timer interrupts being at the root of the issue. The simple date set fixed it, but next time I will be better prepared.
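Part of being better prepared is being able to check the timer interrupt rate without waiting for a graph. The sketch below reads the LOC ("Local timer interrupts") line from /proc/interrupts and sums the per-CPU counters. It assumes a Linux box whose /proc/interrupts has a LOC line (typical on x86); the function names `loc_total` and `loc_rate` are made up for illustration, and what counts as an abnormal rate depends on the machine.

```shell
#!/bin/sh
# Sum the per-CPU counters on the LOC line of /proc/interrupts.
# An alternative file can be passed as the first argument.
loc_total() {
    awk '$1 == "LOC:" {
        s = 0
        # Only the numeric columns are counters; the rest is the description.
        for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i
        printf "%.0f\n", s
    }' "${1:-/proc/interrupts}"
}

# Approximate local timer interrupts per second over an interval
# (whole seconds, default 5), summed across all CPUs.
loc_rate() {
    interval="${1:-5}"
    before=$(loc_total)
    sleep "$interval"
    after=$(loc_total)
    echo $(( (after - before) / interval ))
}
```

Running `loc_rate 10` on a healthy day gives a baseline to compare against when a box starts misbehaving.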