Sunday, October 06, 2013

JGroups 3.4.0.Final released

I'm happy to announce that JGroups 3.4.0.Final has been released!

The major features, bug fixes and optimizations are:
  • View creation and coordinator selection is now pluggable
  • Cross site replication can have more than 1 site master
  • Fork channels: light-weight channels
  • Reduction of memory and wire size for views, merge views, digests and various headers
  • Various optimizations for large clusters: the largest JGroups cluster is now at 1'538 nodes!
  • The license is now Apache License 2.0
3.4.0.Final can be downloaded from SourceForge [1] or Maven (central). The complete list of issues is at [2]. Below is a summary of the changes.
Enjoy!


Note that the license was changed from LGPL 2.1 to AL 2.0.

New features

Pluggable policy for picking coordinator

View and merge-view creation is now pluggable; this means that an application can determine which member is
the coordinator.
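As a sketch of the idea (illustrative interface and class names, not the actual JGroups hook, which lives in the GMS protocol): the application supplies a policy that orders the membership, and the first member in that order becomes the coordinator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a pluggable coordinator policy. The application
// decides the order of members; by convention, the first member in the
// returned list acts as coordinator.
interface CoordinatorPolicy {
    /** Returns the membership in coordinator-first order. */
    List<String> order(List<String> members);
}

class LexicographicPolicy implements CoordinatorPolicy {
    @Override
    public List<String> order(List<String> members) {
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted); // smallest logical name becomes coordinator
        return sorted;
    }
}
```

A policy like this could just as well pick the longest-lived member or one on a preferred host; the point of the change is that this decision is no longer hard-wired.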

RELAY2: allow for more than one site master

If there is a lot of traffic between sites, having more than one site master increases performance and
reduces the load on a single site master.
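A configuration sketch of what this might look like (the `max_site_masters` attribute is my assumption here; check the RELAY2 documentation for the exact attribute name and defaults):

```xml
<!-- RELAY2 with two site masters instead of one (sketch, not a full config) -->
<relay.RELAY2 site="lon"
              config="relay2.xml"
              max_site_masters="2" />
```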

Fork channels: private light-weight channels

This allows multiple light-weight channels to be created over the same (base) channel. Fork channels are
private to the application that creates them, and the application can also add protocols on top of the default
stack; these protocols are private to the application as well.
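The idea can be sketched in plain Java (this is conceptual only, not the JGroups ForkChannel API): several private fork channels share one base channel, and a fork id carried with each message routes it to the right owner.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Conceptual sketch of fork-channel demultiplexing. In real JGroups the
// fork id travels in a message header; here it is passed explicitly.
class BaseChannel {
    private final Map<String, Consumer<String>> forks = new ConcurrentHashMap<>();

    // Each fork channel registers under a unique id.
    void registerFork(String forkId, Consumer<String> receiver) {
        forks.put(forkId, receiver);
    }

    // Only the fork that owns the id sees the message; other forks (and
    // plain users of the base channel) are unaffected.
    void deliver(String forkId, String msg) {
        Consumer<String> r = forks.get(forkId);
        if (r != null)
            r.accept(msg);
    }
}
```

This is why fork channels are cheap: they reuse the base channel's transport and cluster membership, and only add a routing layer on top.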


Kerberos based authentication

New AUTH plugin contributed by Martin Swales. Experimental; it still needs more work.

Probe now works with TCP too

If multicasting is not enabled, probe can be started as follows: -addr <address> -port 12345,
where <address>:12345 is the physical address:port of a node.
Probe will ask that node for the addresses of all other members and then send the request to all members.


UNICAST3: ack messages sooner

A message used to be acked only after delivery, not on reception. This was changed so that long-running
application code no longer delays the ack, which could lead to unneeded retransmissions by the sender.
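The change can be sketched as follows (illustrative names, not JGroups classes): the ack is recorded as soon as the message is received, before the application callback runs, so a slow callback cannot hold up the ack.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of ack-on-reception vs. the old ack-on-delivery behavior.
class Receiver {
    final List<Integer> acked = new ArrayList<>();
    final List<Integer> delivered = new ArrayList<>();

    void onMessage(int seqno, Runnable appCallback) {
        acked.add(seqno);     // new behavior: ack immediately on reception
        appCallback.run();    // possibly long-running application code
        delivered.add(seqno); // delivery completes afterwards
    }
}
```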

Compress Digest and MutableDigest

- In some cases, when a digest and a view are the same, the members
  field of the digest points to the members field of the view,
  resulting in reduced memory use.
- When a view and digest are the same, we marshal the members only
- We don't send the digest with a VIEW to *existing members*; the full
  view and digest is only sent to the joiner. This means that new
  views are smaller, which is useful in large clusters.
- View and MergeView now use arrays rather than lists to store
  membership and subgroups
- Make sure the digest matches the view when returning a JOIN-RSP or
  installing a MergeView
- More efficient marshalling of GMS$GmsHeader: when view and digest
  are present, we only marshal the members once
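The first optimization above can be sketched like this (illustrative classes, not JGroups code): when a digest covers exactly the members of a view, the digest aliases the view's member array instead of copying it.

```java
// Sketch of sharing the members array between a view and its digest.
class View {
    final String[] members;
    View(String... members) { this.members = members; }
}

class Digest {
    final String[] members; // may alias a View's array, saving memory
    final long[] seqnos;

    Digest(String[] members, long[] seqnos) {
        this.members = members; // no defensive copy: shared with the view
        this.seqnos = seqnos;
    }
}
```

The same sharing pays off on the wire: if view and digest have identical membership, the members need to be marshalled only once.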

Large clusters:
- STABLE uses a bitset rather than a list for STABLE msgs, reducing
  memory consumption
- don't print the full list of members
- suppression of fake merge-views
- move contents of GMS headers into message body (otherwise packet at
  transport gets too big)
- ditto for VIEW-RSP in MERGE3
- move large data in headers to message body
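The STABLE change in the list above can be sketched with `java.util.BitSet` (illustrative class, not JGroups code): members that have responded are tracked as bits indexed by member position rather than as a list, so for a 1'500-node cluster the bookkeeping shrinks to a few hundred bytes.

```java
import java.util.BitSet;

// Sketch of tracking STABLE responses with a bitset instead of a member list.
class StableVotes {
    private final BitSet votes;
    private final int size;

    StableVotes(int clusterSize) {
        this.size = clusterSize;
        this.votes = new BitSet(clusterSize);
    }

    void vote(int memberIndex) { votes.set(memberIndex); }

    boolean allVotesReceived() { return votes.cardinality() == size; }
}
```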

Bug fixes

FRAG/FRAG2: incorrect ordering with message batches

Reassembled messages would get reinserted into the batch at the end instead of at their original position
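The fix can be sketched as follows (illustrative code, not the JGroups implementation): the reassembled message must replace its fragment in place, not be appended to the batch, or it would be delivered out of order relative to later messages.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: reinsert a reassembled message at its original batch position.
class Batch {
    static List<String> reassembleInPlace(List<String> batch, int fragIndex, String assembled) {
        List<String> fixed = new ArrayList<>(batch);
        fixed.set(fragIndex, assembled); // keep the original position, preserving order
        return fixed;
    }
}
```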

RSVP: incorrect ordering with message batches

RSVP-tagged messages in a batch would get delivered immediately, instead of waiting for their turn.

Memory leak in STABLE

Occurred when send_stable_msg_to_coord_only was enabled.

NAKACK2/UNICAST3: problems with flow control

If processing an incoming message sent out other messages before returning, it could block forever because
new credits would not be processed. Caused by a regression (removal of ignore_sync_thread) in FlowControl.

AUTH: nodes without AUTH protocol can join cluster

If a member didn't include an AuthHeader, its message would get passed up instead of being rejected.

LockService issues

Bug fix for concurrent tryLock() blocks and various optimizations.

Logical name cache is cleared, affecting other channels

If there are multiple channels in the same JVM, a new view in one channel triggers removal of entries
in the logical name caches of all other channels.


The manual is at