Java leap second bug – 30 June / 1 July 2012 – fix

If your Java applications suddenly started using 100% CPU, you’ve hit the Java leap second bug (it’s actually a bug in the Linux kernel; Java programs just seem to be severely affected).

First, check whether your dmesg contains the following line:


[10703552.860274] Clock: inserting leap second 23:59:60 UTC
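
If you'd rather script that check across several machines, here is a minimal sketch. The function name has_leap_second_message is made up for illustration; it simply greps whatever kernel log text you pipe into it for the message shown above.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# Reads kernel log text on stdin and succeeds if it contains the
# "inserting leap second" message quoted above.
has_leap_second_message() {
    grep -q 'Clock: inserting leap second'
}
```

You would call it as: dmesg | has_leap_second_message && echo "leap second was inserted". On a busy machine the line may already have scrolled out of the kernel ring buffer, so a missing message does not prove the server is unaffected.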

Fortunately the fix is straightforward:

/etc/init.d/ntp stop
date -s "$(date)"

(The other way suggested around the web is date `date +"%m%d%H%M%C%y.%S"`, which has the same effect.)

You don’t have to restart your Java applications; running the above commands should be enough. Tomcat, Solr, Wowza and other Java software were affected, but so were non-Java programs such as Ruby, Redmine and collectd.

You may want to wait until the next day before re-enabling ntp (there were some reports that enabling it makes Java misbehave again on some installations).
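
The two commands can be wrapped in a small script if you have many servers to fix. This is only a sketch under a couple of assumptions: the function names are invented for the example, and it assumes the init script is called either ntp or ntpd (one commenter below reports having /etc/init.d/ntpd rather than /etc/init.d/ntp). It has to run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first NTP init script found
# in the given directory (default /etc/init.d). Tries "ntp", then "ntpd".
find_ntp_init_script() {
    dir="${1:-/etc/init.d}"
    for name in ntp ntpd; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: stop NTP if an init script was found, then set the
# clock to its own current value, exactly as in the commands above.
# Needs root privileges.
apply_leap_second_fix() {
    script=$(find_ntp_init_script) && "$script" stop
    date -s "$(date)"
}
```

Calling apply_leap_second_fix as root then does the same thing as typing the two commands by hand; if no init script is found it just resets the date.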

Apparently this issue has knocked down lots of Linux servers running Debian, Ubuntu, CentOS and the like around the globe!


39 Comments

  1. matvey says:

    Thanx!

    And why this problem occurred?

    Cheers,
    Matt

  2. Gideon says:

    Thank you

    Your post has saved me from having to reboot all my Java servers (since restarting the VMs on some of my platforms doesn’t seem to help).

  3. Dave says:

    This caused all of my Java processes running on all of my Ubuntu Lucid 10.04 servers to 100% peg the CPU at exactly 12:00:00 Midnight Saturday June 30, 2012 in UTC. Your “fix” works.

  4. Tobi says:

    Thank you so much! Life saver! Re-post http://bit.ly/QNC6Cc

  5. walendo says:

    Thank you for finding the problem! I had just started debugging a previously well-behaved server that had spontaneously started pegging the CPU this morning, when I found your answer. It worked for me, as well.

  6. Babblo says:

    Thaanks ^ 100000000!!!

  7. ds10 says:

    Anyone running VMWare vCenter Server Appliance should check to see if their vCenter appliance is using massive amounts of CPU. Apparently much of the vCenter Server Appliance is implemented using Java, and it had the leap second bug. The appliance doesn’t stop working, but does chew up tons of resources. Rebooting the appliance virtual machine resolved the issue quickly enough. It is possible that Windows-hosted vCenter also had issues, but we only use the appliance.

  8. Martin says:

    Thank you very much!

  9. arctus says:

    Oh god, we had the same problem: OpenSuSE + ntp + java application servers -> load from 2 to up to 25; your fix worked. How did you figure out what to do?

  10. [...] Turns out there’s a leap second bug in current java releases on Linux servers. I found this post where a temporary workaround/fix is [...]

  11. Tanel says:

    In our case jrockit_R28.1.0-4.0.1-x64 caused huge system cpu usage (20% user 80% system), but jrockit_160_14_R27.6.5-32 ran without problems in the same machine (redhat enterprise linux). The 80/20 ratio in vmstat output was a clear hint that the problem was somehow related to the environment.

  12. [...] Here is the solution we found here; [...]

  13. Tom says:

    This is not a java bug, it’s a Linux kernel bug which Java and other heavily multithreaded systems tend to trigger. See https://lkml.org/lkml/2012/6/30/122

  14. Russell says:

    My man you are a genius!! Thanks to people like you, we can all work together!! Stupid question, but hey… can I restart my NTP? Some of the services on my webserver are time-dependent.

    Thanks,
    Russell

  15. wincus says:

    Thanks! works great!

  16. Alberto says:

    Happened to my mission critical servers and urged me to jump off the bed to reboot them.
    The only one that survived didn’t have NTP. What a story

  17. ski says:

    thanks, did the job!

  18. [...] leap second bug – 30 June / 1 July 2012 – fix java leap second bug – 30 June / 1 July 2012 – fix July 1, 2012, 6:28 [...]

  19. Felipe says:

    I can confirm it’s solved the problem with CentOS 6, x86_64 and kernel 2.6.32-220.17.1.el6.x86_64

    Thanks a lot

  20. Greg says:

    Great fix! Kudos.

  21. Georges Djonga says:

    You saved my life! This problem was driving me nuts.
    I was ready to reboot all my servers ;-)

  22. Gonzo says:

    Thank you, I started to freak out yesterday morning because of this.

  23. Ron says:

    Thank you! Java was eating up CPU on my server running Solr (tomcat). Note that I did not have an /etc/init.d/ntp file, but did have an /etc/init.d/ntpd file, and your solution worked perfectly with that minor change. Took server load down from 1.50 to 0.01. :)

  24. Logan says:

    Thank you!.. Was thinking about all the servers for this. ;)

  25. Logan says:

    Thank you!.. Was thinking about restarting all the servers for this. ;)

  26. a says:

    Better: date -s "`date`"

  27. [...] (via wpkg.org) Author: Vucomir Ianculov on July 3, 2012 Category: Linux / Unix Tags: date, java, kernel, linux, ntpd, Zimbra Older: HPUX Dead Gateway Detection [...]

  28. mick says:

    Great, thanks, saved me a travel to the DC…

  29. Vime says:

    Same Here: arch linux, kernel 2.6.39, tomcat 6, ntp
    Thanks for your post!

  30. nathan says:

    > Better: date -s "`date`"

    Even better is simply
    date -s now
    , which causes a single “date” process to set the time to the current time, rather than spawning a second “date” process to look up the current time first…

    (This assumes a GNU coreutils “date”, but hopefully that’s a safe assumption given that affected systems are by definition running Linux….)

  31. Dmitry says:

    Thank you, thank you, man!

  32. IDANTECH says:

    Thank you a lot, saved me time.

    Great post!

  33. Moshe says:

    Works like a charm.
    Thank you for this great information.

  34. Mike says:

    Holy crap! This was killing me. So glad I stumbled on it. Thanks!!!!

  35. [...] few minutes of research suggested the fix for puppet would be the same as for other affected systems (seems java had a horrible time [...]

  36. [...] ffffffff) = -1 ETIMEDOUT (Connection timed out) I searched Google a bit and came across an interesting post that deals with exactly this. It is a problem in the kernel. The machine in question has a somewhat older [...]

  37. Got a problem where all cores are at 100% system usage, that seems to be related to the running of a java program.

    However, since we know for a fact that no leap seconds have been introduced, we wonder if the same bug can be triggered by something else? Or are there related/similar bugs that give the same behaviour under other conditions?
