2011-10-19

Java 7 Update 1 released - Does it fix the Lucene index corru(m)ption and SIGSEGV bugs?

After the serious issues with the Java 7 GA release, Oracle released Update 1 of Java 7 yesterday. The first thing to do, of course, was to check the release notes, download the package, and finally run our Lucene test cases. When inspecting the download page and release notes on the Oracle web site, I got confused: the new release is Java 7u1, whereas the developer preview released one week ago was named Java 7u2 Developer Preview. Both builds have the same build number (b08). Stranger still, the release notes of the developer preview differ from the release notes of the Update 1 release:
  • The official Update 1 release notes list only one of the three bugs originally reported to Oracle: #7068051 (SIGSEGV in PhaseIdealLoop::build_loop_late_post on T5440). The others are not listed, so we cannot be sure that they are really fixed.
  • The famous SIGSEGV bug in Porter Stemmer (#7070134) is not listed at all, and it also disappeared from the Oracle issue tracker. It seems to be hidden from the public (maybe they declared it confidential because it is security related?).

No matter what the release notes say, I had to download the official release package! I took some free time at the Lucene Eurocon conference in Barcelona, downloaded the packages, and installed them side by side on my 64-bit Windows ThinkPad. First I ran the Porter Stemmer test with 100 iterations on Java 7 GA and verified that it crashed. Robert Muir joined and we tested the new U1 release: the test passes! This means that HotSpot issue #7070134 was fixed, but Oracle failed to put it into the release notes.
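
To illustrate why a test like this is repeated many times: JIT bugs usually only show up once HotSpot has compiled the hot loops, so a single run may pass. The following is a minimal sketch of such a repetition harness. The class RepeatCheck and its toy suffix-stripping workload are hypothetical stand-ins, not Lucene's Porter stemmer or its test. Note that a SIGSEGV kills the JVM outright, so a harness like this can only report silently wrong results, not crashes.

```java
final class RepeatCheck {
    /** Hypothetical stand-in workload: strips a few English suffixes. */
    static String stripSuffix(String word) {
        String[] suffixes = {"ing", "ed", "s"};
        for (String s : suffixes) {
            if (word.length() > s.length() + 2 && word.endsWith(s)) {
                return word.substring(0, word.length() - s.length());
            }
        }
        return word;
    }

    /** Runs the stand-in workload in a hot loop and folds the results into one hash. */
    static int workload() {
        String[] words = {"running", "tested", "crashes", "index", "merging"};
        int hash = 0;
        for (int i = 0; i < 100000; i++) {
            hash = 31 * hash + stripSuffix(words[i % words.length]).hashCode();
        }
        return hash;
    }

    /**
     * Repeats the workload and returns the index of the first iteration whose
     * result differs from the first run, or -1 if all iterations agree.
     */
    static int firstMismatch(int iterations) {
        int expected = workload();
        for (int i = 1; i < iterations; i++) {
            if (workload() != expected) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int bad = firstMismatch(100);
        System.out.println(bad < 0 ? "all iterations agree" : "mismatch at iteration " + bad);
    }
}
```

Using the first run as the baseline is a simplification: it assumes the early, mostly interpreted run is the correct one.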

The second part of the investigation was more difficult: the index corru(m)ption bugs are harder to reproduce, as the virtual machine does not crash but silently produces corrupt indexes after merging segments in one of the faceting tests. We checked out the Lucene trunk revision from the time when the bug was first discovered (issue LUCENE-3346) and used the random seed mentioned on the issue. We were able to verify the bug with Java 7 GA (the indexes are corrupt 90% of the time), and luckily, after 20 iterations of the same test and random seed on Java 7u1, we saw no corrupt index! It seemed to us that Oracle may have fixed Java issues #7044738 and #7068051 but failed to put both of them into the release notes. Of course, without an additional statement from Oracle, we cannot be sure that the issues are really fixed!
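
Reusing the random seed from the issue is what makes a randomized test repeatable: the same seed produces the same sequence of "random" choices, so two JVMs can be compared on identical input. The sketch below shows the idea with java.util.Random only. The class SeededRun, the method randomDocs, and the default seed 42 are hypothetical and do not reflect Lucene's test framework or the seed recorded in LUCENE-3346.

```java
import java.util.Arrays;
import java.util.Random;

final class SeededRun {
    /** Builds a pseudo-random sequence of values; the same seed always yields the same sequence. */
    static int[] randomDocs(long seed, int count) {
        Random random = new Random(seed);
        int[] docs = new int[count];
        for (int i = 0; i < count; i++) {
            docs[i] = random.nextInt(1000);
        }
        return docs;
    }

    public static void main(String[] args) {
        // Hypothetical default seed; pass the seed to replay as the first argument.
        long seed = args.length > 0 ? Long.parseLong(args[0]) : 42L;
        int[] first = randomDocs(seed, 20);
        int[] second = randomDocs(seed, 20);
        System.out.println("seed=" + seed + " reproducible=" + Arrays.equals(first, second));
    }
}
```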

Oracle also released Java 6 Update 29 yesterday. The release notes for that version do not mention any relation to the Lucene bugs, so we were not sure whether this version is completely different from the Java 6u29 developer preview released one week ago, which listed those bugs (unfortunately, that package is no longer available on the net), or whether Oracle also failed to mention them. A quick check, as done for Java 7, showed that Porter Stemmer no longer crashes with -XX:+AggressiveOpts, so the bug seems to be fixed here, too. We did not observe any index corru(m)ption.

In summary, our tests suggest that the bugs are fixed in both versions, but without an official statement from Oracle (in their release notes), we cannot recommend using Java 7u1 (or Java 6u29 with aggressive opts) with Lucene and Solr.

Once I am back in Germany, I will try to get an updated FreeBSD package of Java 7 and install it on our Apache Jenkins server.

Update (2011-10-26)

Last night, Oracle updated the release notes of Java 7u1 and Java 6u29, stating that they fixed the three Lucene-relevant bugs (plus another one related to that). Based on this confirmation, it's now safe to use Java 7 Update 1 (and later) with Apache Lucene and Apache Solr.
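
An application that wants to guard against the affected release at startup could inspect the java.version system property. The following is only a sketch: the class Java7Check and its methods are hypothetical, and it assumes the usual version string layout of that era, where Java 7 GA reports "1.7.0" and Update 1 reports "1.7.0_01".

```java
final class Java7Check {
    /** Extracts the update number after '_' (for example 1 from "1.7.0_01"); 0 if there is none. */
    static int updateNumber(String version) {
        int idx = version.indexOf('_');
        if (idx < 0) {
            return 0;
        }
        int end = idx + 1;
        while (end < version.length() && Character.isDigit(version.charAt(end))) {
            end++;
        }
        return end == idx + 1 ? 0 : Integer.parseInt(version.substring(idx + 1, end));
    }

    /** True for Java 7 builds older than Update 1, assuming the "1.7.0_NN" format. */
    static boolean isJava7BeforeUpdate1(String version) {
        return version.startsWith("1.7.0") && updateNumber(version) < 1;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        if (isJava7BeforeUpdate1(version)) {
            System.err.println("WARNING: Java " + version + " predates 7u1; please update the JVM.");
        } else {
            System.out.println("Java " + version + " is not Java 7 GA.");
        }
    }
}
```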

Of course, there is still the recommendation not to use -XX:+AggressiveOpts on any JVM in production!
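
To check whether a running JVM was started with this flag, the arguments reported by the standard RuntimeMXBean can be inspected. This is a minimal sketch with a hypothetical class name, AggressiveOptsCheck. It assumes that the last occurrence of the flag takes effect when it is given more than once, and it only sees arguments that the runtime bean reports.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class AggressiveOptsCheck {
    /** Returns true if the argument list enables AggressiveOpts (assumption: the last occurrence wins). */
    static boolean hasAggressiveOpts(List<String> jvmArgs) {
        boolean enabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+AggressiveOpts")) {
                enabled = true;
            } else if (arg.equals("-XX:-AggressiveOpts")) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // JVM arguments as reported by the runtime management bean.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (hasAggressiveOpts(jvmArgs)) {
            System.err.println("WARNING: -XX:+AggressiveOpts is enabled; not recommended in production.");
        } else {
            System.out.println("AggressiveOpts is not enabled via JVM arguments.");
        }
    }
}
```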

We are still waiting for updated OpenJDK packages to install this release on our build servers.

5 comments:

  1. So I was tired and irritable this morning and so I apologise for my tone in advance, you caught me on a bad day :( http://java.dzone.com/news/java-7-and-lucene-bug-saga#comment-57801 - Can't the Lucene folks work with the Oracle engineers on this via the appropriate mailing lists? It really does seem like there's 'point scoring' going on from both sides at times, it doesn't help anyone.

  2. Martijn, it's not clear Oracle wants to work with us. Perhaps they prefer to hide bugs (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 <-- why is this hidden now!) rather than fix them.

  3. Hi, my name is Donald Smith and I work in the Java SE PM team. I'm based in Ottawa, Canada.

    Based on feedback, we've updated the release notes for 7u1 [1] to confirm:
    = = = =
    JIT and Loop Bugs

    Three bugs reported by various parties, including Apache Lucene developers, have been fixed in JDK 7 Update 1, in addition to a fourth related bug found by Oracle (7070134, 7068051, 7044738, 7077439).
    = = = =

    [1] - http://www.oracle.com/technetwork/java/javase/7u1-relnotes-507962.html

  4. ...and 6u29, of course.

    http://www-content.oracle.com/technetwork/java/javase/6u29-relnotes-507960.html

  5. Thank you, Donald! I'll update the blog post and add an "UPDATE" note at the end.
