Wednesday, March 09, 2011

It took me 9 years to go from JGroups 2.0.0 to 2.12.0

Yes, you heard right: I released JGroups 2.0.0, new, shiny and refactored, in Feb 2002.

I just released JGroups 2.12.0.Final, which will be the last minor release on the 2.x branch. (There won't be a 2.13; bug fixes will go into 2.12.x).

Time difference: 9 years and change...:-)

I'm still investigating why it took me so long !

Anyway, 2.12.0.Final is here and it is an important release, as it will be shipped in Infinispan 4.2.1 and JBoss 6.


Below are the major features and bug fixes.

On to 3.0 !
Cheers,




Release Notes JGroups 2.12


JGroups 2.12 is API-backwards compatible with previous versions (down to 2.2.7).



New features



RELAY: connecting local (autonomous) clusters into a large virtual cluster


[https://issues.jboss.org/browse/JGRP-747]

A new protocol to connect 2 geographically separate sites into 1 large virtual cluster. The local clusters are
completely autonomous, but RELAY makes them appear as if they were one.

This can, for example, be used to implement geographic failover.

Blog: http://belaban.blogspot.com/2010/11/clustering-between-different-sites.html



LockService: a new distributed locking service

[https://issues.jboss.org/browse/JGRP-1249]
[https://issues.jboss.org/browse/JGRP-1298]
[https://issues.jboss.org/browse/JGRP-1278]

New distributed lock service, offering a java.util.concurrent.locks.Lock implementation (including conditions)
that provides cluster-wide locks.
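
Because the service hands out a standard java.util.concurrent.locks.Lock, calling code can follow the usual try/finally idiom. The following is only a minimal sketch of that idiom: the class and method names are made up for illustration, and a local ReentrantLock stands in for the cluster-wide lock (how to obtain the real one is covered in the blog post linked below).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper, not part of JGroups: it only relies on the Lock interface.
final class ClusterLockUsageSketch {

    // Runs the critical section only if the lock is obtained within the timeout.
    // Returns false if the lock could not be obtained or the thread was interrupted.
    static boolean runExclusively(Lock lock, long timeout, TimeUnit unit, Runnable criticalSection) {
        try {
            if (!lock.tryLock(timeout, unit))
                return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        try {
            criticalSection.run();
            return true;
        } finally {
            lock.unlock(); // always release, even if the critical section throws
        }
    }

    public static void main(String[] args) {
        // Assumption: a local lock stands in for the lock handed out by the lock service.
        Lock lock = new ReentrantLock();
        boolean ran = runExclusively(lock, 2, TimeUnit.SECONDS,
                () -> System.out.println("in critical section"));
        System.out.println("ran = " + ran);
    }
}
```

Using tryLock() with a timeout rather than lock() is a design choice for this sketch: with a lock held somewhere else in the cluster, an unbounded wait is usually harder to recover from.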

Blog: http://belaban.blogspot.com/2011/01/new-distributed-locking-service-in.html



Distributed ExecutorService

[https://issues.jboss.org/browse/JGRP-1300]

New implementation of java.util.concurrent.ExecutorService over JGroups (contributed by William Burns).
Read the documentation at www.jgroups.org for details.
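
Since the new service implements java.util.concurrent.ExecutorService, tasks are submitted the same way as to a local pool. The sketch below assumes that a task has to be serializable in order to be shipped to another node (check the documentation for the exact requirements); it mimics the trip over the network with a plain Java serialization round trip and runs the task on a local single-threaded pool. All class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class DistributedTaskSketch {

    // A task that carries only serializable state, so it could be sent to another node.
    static final class SumTask implements Callable<Integer>, Serializable {
        private static final long serialVersionUID = 1L;
        private final int from, to;

        SumTask(int from, int to) { this.from = from; this.to = to; }

        @Override public Integer call() {
            int sum = 0;
            for (int i = from; i <= to; i++)
                sum += i;
            return sum;
        }
    }

    // Serializes and deserializes the task, standing in for the trip over the network.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Submits the task and waits (bounded) for its result.
    static int runOn(ExecutorService executor, Callable<Integer> task) {
        try {
            return executor.submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: a local pool stands in for the JGroups-backed ExecutorService.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            System.out.println(runOn(executor, roundTrip(new SumTask(1, 100))));
        } finally {
            executor.shutdown();
        }
    }
}
```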



BPING (Broadcast Ping): new discovery protocol based on broadcasting

[https://issues.jboss.org/browse/JGRP-1269]

This is mainly used for discovery of JGroups on Android-based phones. Apparently, IP multicasting is not correctly implemented / supported on Android (2.1), and so we have to resort to UDP broadcasting.
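
To illustrate what UDP broadcasting means at the socket level, here is a minimal sketch using only java.net. It is not the BPING implementation: the payload (just a cluster name), the port in main() and all names are assumptions made for the example.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;

final class BroadcastDiscoverySketch {

    // Builds a datagram addressed to the limited broadcast address 255.255.255.255.
    static DatagramPacket discoveryRequest(String clusterName, int port) {
        byte[] payload = clusterName.getBytes(StandardCharsets.UTF_8);
        try {
            InetAddress broadcast = InetAddress.getByAddress(
                    new byte[] {(byte) 255, (byte) 255, (byte) 255, (byte) 255});
            return new DatagramPacket(payload, payload.length, broadcast, port);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // cannot happen for a 4-byte address
        }
    }

    // Sends the datagram; the socket has to be explicitly enabled for broadcasting.
    static void send(DatagramPacket request) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setBroadcast(true);
            socket.send(request);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical cluster name and port, chosen for the example only.
        send(discoveryRequest("demo-cluster", 8555));
    }
}
```

Receivers would bind a DatagramSocket to the same port and answer with their own address; that part is omitted here.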

Blog: http://belaban.blogspot.com/2011/01/jgroups-on-android-phones.html



JDBC_PING: new discovery protocol using a shared database


[https://issues.jboss.org/browse/JGRP-1231]

All nodes use a shared DB (e.g. RDS on EC2) to place their location information into, and to read information from.
Thanks to Sanne for coming up with the idea and for implementing this !
Additional information is on the wiki: community.jboss.org/wiki/JDBCPING


FD_SOCK: ability to pick the bind address and port for the client socket

[https://issues.jboss.org/browse/JGRP-1262]



Pluggable address generation


[https://issues.jboss.org/browse/JGRP-1297]

Address generation is now pluggable; JChannel.setAddressGenerator(AddressGenerator) allows for generation of specific implementations of Address. This can for example be used to pass additional information along with every address. Currently used by RELAY to pass the name of the sub cluster around with a UUID.
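
The idea of carrying extra information with an address can be sketched without JGroups on the classpath. In the sketch below, SiteAddress and Generator are hypothetical stand-ins for the real Address and AddressGenerator types (only JChannel.setAddressGenerator(AddressGenerator) is taken from the text above); the generator tags every new UUID with the name of a sub cluster, similar to what is described for RELAY.

```java
import java.util.UUID;

final class AddressGeneratorSketch {

    // Hypothetical stand-in for an address: a UUID plus additional information.
    static final class SiteAddress {
        final UUID id;
        final String subCluster;

        SiteAddress(UUID id, String subCluster) {
            this.id = id;
            this.subCluster = subCluster;
        }

        @Override public String toString() {
            return subCluster + ":" + id;
        }
    }

    // Hypothetical stand-in for the pluggable generator hook.
    interface Generator {
        SiteAddress generate();
    }

    // Returns a generator whose addresses all carry the given sub cluster name.
    static Generator forSubCluster(String subCluster) {
        return () -> new SiteAddress(UUID.randomUUID(), subCluster);
    }

    public static void main(String[] args) {
        Generator generator = forSubCluster("site-a"); // hypothetical sub cluster name
        System.out.println(generator.generate());
    }
}
```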





Optimizations



NAKACK: retransmitted messages don't need to be wrapped


[https://issues.jboss.org/browse/JGRP-1266]

Not serializing retransmitted messages at the retransmitter and deserializing them at the requester saves
1 serialization and 1 deserialization per retransmitted message.


Faster NakReceiverWindow

[https://issues.jboss.org/browse/JGRP-1133]

Various optimizations to reduce locking in NakReceiverWindow:
  • Use of RetransmitTable (array-based matrix) rather than HashMap (reduced memory need, reduced locking, compaction)
  • Removal of double locking
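
To give an idea of what an array-based matrix keyed by sequence number looks like, here is a much simplified sketch. It is not the RetransmitTable code: the class name, the lazy row allocation and the purge strategy are assumptions made for the example, and the real class does more (e.g. its own locking and resizing policy).

```java
import java.util.Arrays;

// Simplified, single-threaded sketch of a seqno-indexed matrix.
final class SeqnoMatrixSketch<T> {
    private final int rowSize;   // number of elements per row
    private long offset;         // seqno stored at rows[0][0]
    private Object[][] rows;     // rows are allocated lazily

    SeqnoMatrixSketch(int initialRows, int rowSize, long offset) {
        this.rowSize = rowSize;
        this.offset = offset;
        this.rows = new Object[initialRows][];
    }

    // Stores the element at the slot computed from its seqno; no hashing involved.
    void put(long seqno, T element) {
        long index = seqno - offset;
        if (index < 0)
            return; // below the purged range, ignore
        int row = (int) (index / rowSize), col = (int) (index % rowSize);
        if (row >= rows.length) // grow the row directory if needed
            rows = Arrays.copyOf(rows, Math.max(row + 1, rows.length * 2));
        if (rows[row] == null)
            rows[row] = new Object[rowSize];
        rows[row][col] = element;
    }

    @SuppressWarnings("unchecked")
    T get(long seqno) {
        long index = seqno - offset;
        if (index < 0)
            return null;
        int row = (int) (index / rowSize), col = (int) (index % rowSize);
        if (row >= rows.length || rows[row] == null)
            return null;
        return (T) rows[row][col];
    }

    // Compaction: drops whole rows that only hold seqnos below the given seqno
    // and advances the offset accordingly.
    void purge(long seqno) {
        long index = seqno - offset;
        if (index <= 0)
            return;
        int fullRows = (int) Math.min(index / rowSize, rows.length);
        if (fullRows == 0)
            return;
        Object[][] compacted = new Object[rows.length][];
        System.arraycopy(rows, fullRows, compacted, 0, rows.length - fullRows);
        rows = compacted;
        offset += (long) fullRows * rowSize;
    }
}
```

Compared to a HashMap keyed by seqno, no entry objects or boxed keys are created per message, which is where the reduced memory need comes from.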






Bug fixes



NAKACK: incorrect digest on merge and state transfer

[https://issues.jboss.org/browse/JGRP-1251]

When calling JChannel.getState() on a merge, the fetched state would overwrite the digest incorrectly.


AUTH: merge can bypass authorization

[https://issues.jboss.org/browse/JGRP-1255]

AUTH would not check the credentials of other members in case of a merge. This allowed an unauthorized node to join a cluster by triggering a merge.


Custom SocketFactory ignored

[https://issues.jboss.org/browse/JGRP-1276]

A custom SocketFactory was ignored even when one had been set.


UFC: crash of depleted member could hang node

[https://issues.jboss.org/browse/JGRP-1274]

The crash of a depleted member could hang a node, causing it to wait forever for credits from the crashed member.


Flow control: crash of member doesn't unblock sender


[https://issues.jboss.org/browse/JGRP-1283]
[https://issues.jboss.org/browse/JGRP-1287]
[https://issues.jboss.org/browse/JGRP-1274]

When a sender blocks waiting for credits from P, and P crashes before being able to send credits,
the sender blocks indefinitely.
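
The gist of the fix can be shown with a small credit table. This is a sketch under assumptions, not the JGroups flow control code: members are plain strings, credits are byte counts, and removeMember() is assumed to be called when a new view excludes a crashed member. The important bit is that removing a member wakes up senders waiting for its credits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class CreditTableSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final Map<String, Long> credits = new HashMap<>();

    void addMember(String member, long initialCredits) {
        lock.lock();
        try { credits.put(member, initialCredits); changed.signalAll(); }
        finally { lock.unlock(); }
    }

    // Called when the member sends new credits.
    void replenish(String member, long bytes) {
        lock.lock();
        try {
            if (credits.computeIfPresent(member, (m, c) -> c + bytes) != null)
                changed.signalAll();
        } finally { lock.unlock(); }
    }

    // Called when the member left or crashed: wakes up all blocked senders.
    void removeMember(String member) {
        lock.lock();
        try { credits.remove(member); changed.signalAll(); }
        finally { lock.unlock(); }
    }

    // Blocks until the member has enough credits. Returns true if the credits were
    // consumed, false on timeout, on interruption, or if the member is gone.
    boolean acquire(String member, long bytes, long timeout, TimeUnit unit) {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            while (true) {
                Long available = credits.get(member);
                if (available == null)
                    return false; // member crashed or left: stop waiting for its credits
                if (available >= bytes) {
                    credits.put(member, available - bytes);
                    return true;
                }
                if (nanos <= 0)
                    return false;
                nanos = changed.awaitNanos(nanos);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally { lock.unlock(); }
    }
}
```

A sender blocked in acquire("P", ...) returns as soon as another thread calls removeMember("P"), instead of waiting for credits that will never arrive.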


UNICAST2: incorrect delivery order under stress

[https://issues.jboss.org/browse/JGRP-1267]

UNICAST2 could (in rare cases) deliver messages in incorrect order. Fixed by using the same (proven)
algorithm as NAKACK.


Incorrect conversion of TimeUnit if MILLISECONDS were not used

[https://issues.jboss.org/browse/JGRP-1277]


Check if bind_addr is correct

[https://issues.jboss.org/browse/JGRP-1280]

JGroups now verifies that the bind address is indeed a valid IP address: it has to be either the wildcard
address (0.0.0.0) or an address of a network interface that is up.
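
A check along these lines can be written with java.net alone. The sketch below is not the JGroups code; the class and method names are made up, and it only implements the rule as stated above (wildcard address, or an address of a network interface that is up).

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

final class BindAddressCheckSketch {

    // True if the address is the wildcard address or belongs to an interface that is up.
    static boolean isUsableBindAddress(InetAddress addr) {
        if (addr.isAnyLocalAddress())
            return true; // wildcard, e.g. 0.0.0.0
        try {
            NetworkInterface nic = NetworkInterface.getByInetAddress(addr);
            return nic != null && nic.isUp();
        } catch (SocketException e) {
            return false; // interface lookup failed, treat the address as unusable
        }
    }

    // Convenience overload for textual addresses; unparseable input counts as unusable.
    static boolean isUsableBindAddress(String addr) {
        try {
            return isUsableBindAddress(InetAddress.getByName(addr));
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("0.0.0.0   -> " + isUsableBindAddress("0.0.0.0"));
        System.out.println("127.0.0.1 -> " + isUsableBindAddress("127.0.0.1"));
        // 192.0.2.1 is from a documentation-only range and should not be on any interface
        System.out.println("192.0.2.1 -> " + isUsableBindAddress("192.0.2.1"));
    }
}
```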


ENCRYPT: sym_provider ignored

[https://issues.jboss.org/browse/JGRP-1279]

The sym_provider property was ignored.



Manual


The manual is online at http://www.jgroups.org/manual/html/index.html



The complete list of features and bug fixes can be found at http://jira.jboss.com/jira/browse/JGRP.

Download the new release at https://sourceforge.net/projects/javagroups/files/JGroups/2.12.0.Final.

Bela Ban, Kreuzlingen, Switzerland
Vladimir Blagojevic, Toronto, Canada
Richard Achmatowicz, Toronto, Canada
Sanne Grinovero, Newcastle, Great Britain

March 2011

8 comments:

  1. Congratulations Bela! Onwards to 3.0!

  2. Thanks Vladimir !
    I was reading the archives of the JGroups mailing lists, and your name is up there right from the start (2001) !
    Cheers,

  3. Congratulations !! Good to see the project going forward :)
    Looking forward to next release, I will try to merge the new commits into my android port fork some day !

  4. Excellent ! You should just be able to do a git pull on your fork, as 2.12.0 didn't change any APIs

  5. Congrats guys!!!

    Can't wait for jgroups 3.0 in 2020 :-)

  6. Thanks Dimitris !

    I promise it won't be that long this time around ! :-)

  7. Thanks for fixing JGRP-1289!

    JGroups uses a *blocking* TCP implementation, and if something goes wrong in the algorithm (some dead node was not excluded etc.), the send queue becomes full sooner or later and all threads are blocked. During the last year I saw many such problems :(
    Java 7 with NIO.2 is almost ready and I am dreaming about a *non-blocking* TCP implementation as it was with TCP_NIO :)

    In our practice, we had servers with blocking IO for some time - sooner or later some threads were blocked on a socket write, for example (Java does NOT allow specifying a 'write timeout' and not all operating systems provide this option). After we switched our code to NIO we are happy, nothing hangs ;)

    Victor N

  8. I have plans to rewrite the transport layer altogether: I want to use an NIO2 based approach where we have a selector in TP and then individual transport implementations can register with it. This will include both UDP and TCP.

    However, because this requires JDK 7 as baseline, it won't be in 3.0. Maybe in 3.1 or 3.2...

    [1] https://issues.jboss.org/browse/JGRP-809
