Friday, March 11, 2011

A quick update on performance of JGroups 2.12.0.Final

I forgot to add performance data to the release announcement of 2.1.0.Final, so here it is.

Caveat: this is a quick check to see if we have a performance regression, which I run routinely before a release, and my no means a comprehensive performance test !

I ran this both on my home cluster and our internal lab.


org.jgroups.tests.perf.Test

This test is described in detail in [1]. It forms a cluster of 4 nodes, and every node sends 1 million messages of varying size (1K, 5K, 20K). We measure how long it takes for every node to receive the 4 million messages, and compute the message rate and throughput, per second, per node.

This is my home cluster and consists of 4 HP ProLiant DL380G5 quad core servers (ca 3700 bogomips), connected to a GB switch, and running Linux 2.6. The JDK is 1.6 and the heap size is 600M. I ran 1 process on every box. The configuration used was udp.xml (using IP multicasting) shipped with JGroups.

Results
  •   1K message size: 140 MBytes / sec / node
  •   5K message size: 153 MBytes / sec / node
  • 20K message size: 154 MBytes / sec / node
 This shows that GB ethernet is saturated. The reason that every node receives more than the limit of GB ethernet (~ 125 MBytes/sec) is that every node loops back its own traffic, and therefore doesn't have to share it with other incoming packets. In theory, the max throughput should therefore be 4/3 * 125 ~= 166 MBytes/sec. We see that the numbers above are not too far away from this.


org.jgroups.tests.UnicastTestRpcDist

This test mimicks the way Infinispan's DIST mode works.

Again, we form a cluster of between 1 and 9 nodes. Every node is on a separate machine. The test then has every node invoke 2 unicast RPCs in randomly selected nodes. With a chance of 80% the RPCs are reads, and with a chance of 20% they're writes. The writes carry a payload of 1K, and the reads return a payload of 1K. Every node makes 20'000 RPCs.

The hardware is a bit more powerful than my home cluster; every machine has 5300 bogomips, and all machines are connected with GB ethernet.

Results
  • 1 node:   50'000 requests / sec /node
  • 2 nodes: 23'000 requests / sec / node
  • 3 nodes: 20'000 requests / sec / node
  • 4 nodes: 20'000 requests / sec / node
  • 5 nodes: 20'000 requests / sec / node
  • 6 nodes: 20'000 requests / sec / node
  • 7 nodes: 20'000 requests / sec / node
  • 8 nodes: 20'000 requests / sec / node
  • 9 nodes: 20'000 requests / sec / node
As can be seen, the number of requests per node is the same after 2-3 nodes. The 1 node scenario is somewhat contrived as there is no network communication involved.

This is actually good news, as it shows that performance grows linearly. As a matter of fact, with increasing cluster size, the chances of more than 2 nodes picking the same target decreases, therefore performance degradation due to (write) access conflicts are likely to decrease.

Caveat: I haven't tested this on a larger cluster yet, but the current performance is already very promising.

[1] http://community.jboss.org/docs/DOC-11594

Wednesday, March 09, 2011

It took me 9 years to go from JGroups 2.0.0 to 2.12.0

Yes, you heard right: I released JGroups 2.0.0, new, shiny and refactored, in Feb 2002.

I just released JGroups 2.12.0.Final, which will be the last minor release on the 2.x branch. (There won't be a 2.13; bug fixes will go into 2.12.x).

Time difference: 9 years and change...:-)

I'm still investigating why it took me so long !

Anyway, 2.12.0.Final is here and it is an important release, as it will be shipped in Infinispan 4.2.1 and JBoss 6.


Below are the major features and bug fixes.

On to 3.0 !
Cheers,




Release Notes JGroups 2.12


JGroups 2.12 is API-backwards compatible with previous versions (down to 2.2.7).



New features



RELAY: connecting local (autonomous) clusters into a large virtual cluster


[https://issues.jboss.org/browse/JGRP-747]

A new protocol to connect 2 geographically separate sites into 1 large virtual cluster. The local clusters are
completely autonomous, but RELAY makes them appear as if they were one.

This can for example be used to implement geographic failover

Blog: http://belaban.blogspot.com/2010/11/clustering-between-different-sites.html



LockService: a new distributed locking service

[https://issues.jboss.org/browse/JGRP-1249]
[https://issues.jboss.org/browse/JGRP-1298]
[https://issues.jboss.org/browse/JGRP-1278]

New distributed lock service, offering a java.util.concurrent.lock.Lock implementation (including conditions)
providing cluster wide locks.

Blog: http://belaban.blogspot.com/2011/01/new-distributed-locking-service-in.html



Distributed ExecutorService

[https://issues.jboss.org/browse/JGRP-1300]

New implementation of java.util.concurrent.ExecutorService over JGroups (contributed by William Burns).
Read the documentation at www.jgroups.org for details.



BPING (Broadcast Ping): new discovery protocol based on broadcasting

[https://issues.jboss.org/browse/JGRP-1269]

This is mainly used for discovery of JGroups on Android based phones. Apparently, IP multicasting is not correctly implemented / supported on Android (2.1), and so we have to resort to UPD broadcasting.

Blog: http://belaban.blogspot.com/2011/01/jgroups-on-android-phones.html



JDBC_PING: new discovery protocol using a shared database


[https://issues.jboss.org/browse/JGRP-1231]

All nodes use a shared DB (e.g. RDS on EC2) to place their location information into, and to read information from.
Thanks to Sanne for coming up with the idea and for implementing this !
Additional infos are on the wiki: community.jboss.org/wiki/JDBCPING


FD_SOCK: ability to pick the bind address and port for the client socket

[https://issues.jboss.org/browse/JGRP-1262]



Pluggable address generation


[https://issues.jboss.org/browse/JGRP-1297]

Address generation is now pluggable; JChannel.setAddressGenerator(AddressGenerator) allows for generation of specific implementations of Address. This can for example be used to pass additional information along with every address. Currently used by RELAY to pass the name of the sub cluster around with a UUID.





Optimizations



NAKACK: retransmitted messages don't need to be wrapped


[https://issues.jboss.org/browse/JGRP-1266]

Not serializing retransmitted messages at the retransmitter and deserializing them at the requester saves
1 serialization and 1 deserialization per retransmitted message.


Faster NakReceiverWindow

[https://issues.jboss.org/browse/JGRP-1133]

Various optimizations to reduce locking in NakReceiverWindow:
  • Use of RetransmitTable (array-based matrix) rather than HashMap (reduced memory need, reduced locking, compaction)
  • Removal of double locking






Bug fixes



NAKACK: incorrect digest on merge and state transfer

[https://issues.jboss.org/browse/JGRP-1251]

When calling JChannel.getState() on a merge, the fetched state would overwrite the digest incorrectly.


AUTH: merge can bypass authorization

[https://issues.jboss.org/browse/JGRP-1255]

AUTH would not check creds of other members in case of a merge. This allowed an unauthorized node to join a cluster by triggering a merge.


Custom SocketFactory ignored

[https://issues.jboss.org/browse/JGRP-1276]

Despite setting a custom SocketFactory, it was ignored.


UFC: crash of depleted member could hang node

[https://issues.jboss.org/browse/JGRP-1274]

Causing it to wait forever for credits from the crashed member.


Flow control: crash of member doesn't unblock sender


[https://issues.jboss.org/browse/JGRP-1283]
[https://issues.jboss.org/browse/JGRP-1287]
[https://issues.jboss.org/browse/JGRP-1274]

When a sender block on P sending credits, and P crashes before being able to send credits,
the sender blocks indefinitely.


UNICAST2: incorrect delivery order under stress

[https://issues.jboss.org/browse/JGRP-1267]

UNICAST2 could (in rare cases) deliver messages in incorrect order. Fixed by using the same (proven)
algorithm as NAKACK.


Incorrect conversion of TimeUnit if MILLISECONDS were not used

[https://issues.jboss.org/browse/JGRP-1277]


Check if bind_addr is correct

[https://issues.jboss.org/browse/JGRP-1280]

JGroups now verifies that the bind address is indeed a valid IP address: it has to be either the wildcard
address (0.0.0.0) or an address of a network interface that is up.


ENCRYPT: sym_provider ignored

[https://issues.jboss.org/browse/JGRP-1279]

Property sym_provider is ignored



Manual


The manual is online at http://www.jgroups.org/manual/html/index.html



The complete list of features and bug fixes can be found at http://jira.jboss.com/jira/browse/JGRP.

Download the new release at https://sourceforge.net/projects/javagroups/files/JGroups/2.12.0.Final.

Bela Ban, Kreuzlingen, Switzerland
Vladimir Blagojevic, Toronto, Canada
Richard Achmatowicz, Toronto, Canada
Sanne Grinovero, Newcastle, Great Britain

March 2011