java leap second bug – 30 June / 1 July 2012 – fix
If your Java applications suddenly started using 100% CPU, you’ve hit the Java leap second bug (it’s actually a bug in the Linux kernel; Java programs just seem to be severely affected).
First, check whether your dmesg output contains a line like this:
[10703552.860274] Clock: inserting leap second 23:59:60 UTC
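One way to look for that message is sketched below. The helper name leap_second_logged is made up for illustration; it only wraps grep, and reading dmesg may require root on some systems.

```shell
# Sketch: print (and succeed on) kernel log lines from stdin that mention
# a leap second. leap_second_logged is a hypothetical helper name.
leap_second_logged() {
  grep -i 'leap second'
}

# Usage:
# dmesg | leap_second_logged
```

No output (and a non-zero exit status) means the kernel log no longer holds the message, which can happen if the log buffer has wrapped since the leap second.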
Fortunately the fix is straightforward:
/etc/init.d/ntp stop
date -s "$(date)"
(Another way suggested around the web is date `date +"%m%d%H%M%C%y.%S"`, which has the same effect.)
You don’t have to restart your Java applications (Tomcat, Solr, Wowza, or anything else running on Java; non-Java software such as Ruby, Redmine, collectd etc. was affected too); running the above commands should be enough.
You may want to wait until the next day before re-enabling ntp (there were some reports that enabling it makes Java misbehave on some installations).
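The two commands can be wrapped in a small script. This is only a sketch: the function names ntp_init_script and apply_leap_fix are invented here, and the check for both ntp and ntpd assumes the init script lives in /etc/init.d under one of those two names (one reader below reports having ntpd instead of ntp).

```shell
# Sketch of the workaround as shell functions; run apply_leap_fix as root.
# ntp_init_script and apply_leap_fix are hypothetical helper names.

# Print the path of the NTP init script found in the given directory
# (default /etc/init.d). Some systems call it "ntp", others "ntpd".
ntp_init_script() {
  dir="${1:-/etc/init.d}"
  for name in ntp ntpd; do
    if [ -x "$dir/$name" ]; then
      echo "$dir/$name"
      return 0
    fi
  done
  return 1
}

apply_leap_fix() {
  # Stop NTP first, if an init script was found.
  script=$(ntp_init_script) && "$script" stop
  # Setting the clock to its current value is the actual workaround.
  date -s "$(date)"
}

# Usage (as root):
# apply_leap_fix
```

Nothing runs until apply_leap_fix is called, so the file can be sourced and inspected first.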
Apparently this issue has knocked down lots of Linux servers running Debian, Ubuntu, CentOS and the like around the globe!
Perpetual Traveler Blog
Thanx!
And why did this problem occur?
Cheers,
Matt
Thank you
Your post has saved me from having to reboot all my Java servers (since restarting the VMs on some of my platforms doesn’t seem to help).
This caused all of my Java processes on all of my Ubuntu Lucid 10.04 servers to peg the CPU at 100% at exactly 12:00:00 midnight UTC on Saturday, June 30, 2012. Your “fix” works.
Thank you so much! Life saver! Re-post http://bit.ly/QNC6Cc
Thank you for finding the problem! I had just started debugging a previously well-behaved server that had spontaneously started pegging the CPU this morning, when I found your answer. It worked for me, as well.
Thaanks ^ 100000000!!!
Anyone running VMware vCenter Server Appliance should check whether their vCenter appliance is using massive amounts of CPU. Apparently much of the vCenter Server Appliance is implemented in Java, and it had the leap second bug. The appliance doesn’t stop working, but it does chew up tons of resources. Rebooting the appliance virtual machine resolved the issue quickly enough. It is possible that Windows-hosted vCenter also had issues, but we only use the appliance.
Thank you very much!
Oh god, we had the same problem: OpenSuSE + ntp + java application servers -> load from 2 to up to 25; your fix worked. How did you figure out what to do?
[...] Turns out there’s a leap second bug in current java releases on Linux servers. I found this post where a temporary workaround/fix is [...]
Thanks, been looking for a solution. Re-posted!
http://www.e-rave.nl/java-leap-second-bug-30-june-1-july-2012-fix
In our case jrockit_R28.1.0-4.0.1-x64 caused huge system cpu usage (20% user 80% system), but jrockit_160_14_R27.6.5-32 ran without problems in the same machine (redhat enterprise linux). The 80/20 ratio in vmstat output was a clear hint that the problem was somehow related to the environment.
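The user/system split mentioned above can be read off vmstat output. The sketch below assumes the usual vmstat layout with a header row containing "us" and "sy" columns; cpu_split is a made-up helper name, and it finds the columns by header name rather than by position.

```shell
# Sketch: report user vs. system CPU share from vmstat-style output on stdin.
# cpu_split is a hypothetical helper name.
cpu_split() {
  awk '
    # The header row names the columns; remember where "us" and "sy" are.
    $1 == "r" {
      for (i = 1; i <= NF; i++) {
        if ($i == "us") u = i
        if ($i == "sy") s = i
      }
      next
    }
    # Data rows start with a number; keep the last one seen.
    u && s && $1 ~ /^[0-9]+$/ { us = $u; sy = $s; seen = 1 }
    END {
      if (!seen) exit 1
      printf "user=%s system=%s\n", us, sy
    }
  '
}

# Usage: the first vmstat sample is an average since boot, so take a later one.
# vmstat 1 5 | cpu_split
```

A system share far above the user share, as in the 80/20 case described, points at the kernel rather than at the application code.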
[...] Here is the solution we found here; [...]
This is not a java bug, it’s a Linux kernel bug which Java and other heavily multithreaded systems tend to trigger. See https://lkml.org/lkml/2012/6/30/122
My man you are a genius!! Thanks to people like you, we can all work together!! Stupid question, but hey… can I restart my NTP? Some of the services on my webserver are time-dependent.
Thanks,
Russell
Thanks! works great!
Happened to my mission-critical servers and made me jump out of bed to reboot them.
The only one that survived didn’t have NTP. What a story
thanks, did the job!
[...] leap second bug – 30 June / 1 July 2012 – fix java leap second bug – 30 June / 1 July 2012 – fix July 1, 2012, 6:28 [...]
I can confirm it’s solved the problem with CentOS 6, x86_64 and kernel 2.6.32-220.17.1.el6.x86_64
Thanks a lot
Great fix! Kudos.
You saved my life! This problem was driving me nuts.
I was ready to reboot all my servers
Thank you, I started to freak out yesterday morning because of this.
Thank you! Java was eating up CPU on my server running Solr (tomcat). Note that I did not have an /etc/init.d/ntp file, but did have an /etc/init.d/ntpd file, and your solution worked perfectly with that minor change. Took server load down from 1.50 to 0.01.
Thank you!.. Was thinking about restarting all the servers for this.
Better: date -s "`date`"
[...] http://blog.wpkg.org/2012/07/01/java-leap-second-bug-30-june-1-july-2012-fix/ [...]
[...] (via wpkg.org) Author: Vucomir Ianculov on July 3, 2012 Category: Linux / Unix Tags: date, java, kernel, linux, ntpd, Zimbra Older: HPUX Dead Gateway Detection [...]
Great, thanks, saved me a travel to the DC…
Same Here: arch linux, kernel 2.6.39, tomcat 6, ntp
Thanks for your post!
> Better: date -s "`date`"
Even better is simply
date -s now
, which causes a single “date” process to set the time to the current time, rather than spawning a second “date” process to look up the current time first…
(This assumes a GNU coreutils “date”, but hopefully that’s a safe assumption given that affected systems are by definition running Linux….)
Thank you, thank you, man!
Thank you a lot, saved me time.
Great post!
Works like a charm.
Thank you for this great information.
Holy crap! This was killing me. So glad I stumbled on it. Thanks!!!!
[...] few minutes of research suggested the fix for puppet would be the same as for other affected systems (seems java had a horrible time [...]
[...] ffffffff) = -1 ETIMEDOUT (Connection timed out) I searched on Google for a bit and came across an interesting post that addresses exactly this. It is a problem in the kernel. The machine in question has a somewhat older [...]
Got a problem where all cores are at 100% system usage, which seems to be related to running a Java program.
However, since we know for a fact that no leap second has been introduced, we wonder whether the same bug can be triggered by something else? Or are there related/similar bugs that give the same behaviour under other conditions?