- BARRIER: moved the lock acquired by every up-message out of the critical path
- IPv6: just running a JGroups channel without any system props (e.g. java.net.preferIPv4Stack=true) now works, as IPv4 addresses are mapped to IPv4-mapped IPv6 addresses under IPv6
- NAKACK and UNICAST: streamlined marshalling of headers, drastically reducing the number of bytes streamed when marshalling headers
- TCPGOSSIP: Vladimir fixed a bug in RouterStub which caused GossipRouters to return incorrect membership lists, resulting in JOIN failures
- Provided a new bundler implementation, which is faster than the default one (the new one *is* actually the default in 2.10)
- Sending of message lists (bundling): we don't ship the dest and src address for each message, but only ship them *once* for the entire list
- AckReceiverWindow (used by UNICAST): I made this almost lock-free, so concurrent messages from the same sender don't compete for the same lock. Should be a nice speedup for multiple concurrent unicasts between the same two members (e.g. OOB messages)
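To illustrate the message-bundling change above, here's a simplified sketch of shipping the addresses only once per list. The class and method names are made up for illustration; this is not the actual JGroups marshalling code, and the 16-byte address size is just an assumption:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

// Sketch of bundled marshalling: the destination and source addresses are
// written once for the whole list, then each message contributes only its
// own payload. Names are illustrative, not JGroups APIs.
public class BundleSketch {
    public static byte[] marshalList(byte[] dest, byte[] src, List<byte[]> payloads) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(baos);
        // addresses shipped *once* for the entire list
        out.writeInt(dest.length); out.write(dest);
        out.writeInt(src.length);  out.write(src);
        out.writeInt(payloads.size());
        for (byte[] payload : payloads) { // per-message data only
            out.writeInt(payload.length);
            out.write(payload);
        }
        out.flush();
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] dest = new byte[16], src = new byte[16]; // assumed address size
        List<byte[]> msgs = List.of(new byte[1000], new byte[1000], new byte[1000]);
        byte[] bundle = marshalList(dest, src, msgs);
        // a naive scheme would repeat both addresses for every message:
        int naive = msgs.size() * 2 * (4 + 16) + 4
                  + msgs.stream().mapToInt(m -> 4 + m.length).sum();
        System.out.println("bundled=" + bundle.length + " naive=" + naive);
    }
}
```

With 3 bundled messages the addresses are written once instead of three times; the savings grow with the list length.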
In 2.10.0.Alpha2 (that's actually the current CVS trunk), I replaced strings as header names with IDs. This means that for each header, instead of marshalling "UNICAST" as a moniker for the UnicastHeader, we marshal a short.
The string (assuming a single-byte charset) uses up 9 bytes, whereas the short uses 2 bytes. We usually have 3-5 headers per message, so that's an average of 20-30 bytes saved per message. If we send 10 million messages, those savings accumulate!
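The 9-vs-2-byte figure is easy to verify with java.io.DataOutputStream: writeUTF prefixes the 7 characters of "UNICAST" with a 2-byte length, while a short ID is always 2 bytes (the ID value 42 below is made up):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Compares the marshalled size of a string header key vs a short header ID.
public class HeaderKeySize {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream asString = new ByteArrayOutputStream();
        new DataOutputStream(asString).writeUTF("UNICAST"); // 2-byte length + 7 chars

        ByteArrayOutputStream asId = new ByteArrayOutputStream();
        new DataOutputStream(asId).writeShort(42); // hypothetical protocol ID

        System.out.println("string key: " + asString.size() + " bytes"); // 9
        System.out.println("short key:  " + asId.size() + " bytes");     // 2
    }
}
```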
Not only does this change make the marshalled message smaller, it also means that a message kept in memory has a smaller footprint: as messages are kept in memory until they're garbage collected by STABLE (or ack'ed by UNICAST), the savings are really nice...
The downside? It's an API change for protocol implementers: the methods getHeader(), putHeader() and putHeaderIfAbsent() in Message now take a short instead of a string. Plus, if you implement headers, you have to register them in jg-magic-map.xml / jg-protocol-ids.xml and implement Streamable...
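Under the new scheme, a custom header might look roughly like this. Streamable and MyHeader below are self-contained stand-ins, not the real JGroups types, and the ID value is made up; the real ID must additionally be registered in jg-protocol-ids.xml / jg-magic-map.xml:

```java
import java.io.*;

// Local stand-in for the Streamable contract: a header serializes its own
// fields; the 2-byte ID written before it replaces the old string key.
interface Streamable {
    void writeTo(DataOutput out) throws IOException;
    void readFrom(DataInput in) throws IOException;
}

class MyHeader implements Streamable {
    static final short ID = 1025; // hypothetical, must be unique across protocols
    long seqno;

    public void writeTo(DataOutput out) throws IOException { out.writeLong(seqno); }
    public void readFrom(DataInput in) throws IOException { seqno = in.readLong(); }
}

public class HeaderSketch {
    public static void main(String[] args) throws IOException {
        MyHeader hdr = new MyHeader();
        hdr.seqno = 322649;

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(baos);
        out.writeShort(MyHeader.ID); // 2 bytes instead of a string key
        hdr.writeTo(out);            // 2 + 8 = 10 bytes total

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(baos.toByteArray()));
        short id = in.readShort();   // in JGroups, the header class is looked up via this ID
        MyHeader copy = new MyHeader();
        copy.readFrom(in);
        System.out.println("id=" + id + " seqno=" + copy.seqno);
    }
}
```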
Now for some performance numbers. This is a quick and dirty benchmark, without many data points...
perf.Test (see  for details) has N senders each send M messages of size S to all cluster nodes. This exercises the NAKACK code.
On my home cluster (4 blades with 4 cores each), 1GB ethernet, sending 1000-byte messages:
- 4 senders, JGroups 2.9.0.GA: 128'000 messages / sec / member
- 4 senders, JGroups 2.10.0.Alpha2: 137'000 messages / sec / member
- 6 senders, JGroups 2.10.0.Alpha2: 100'000 messages / sec / member
- 8 senders, JGroups 2.10.0.Alpha2: 78'000 messages / sec / member
There is also a stress test for unicasts, UnicastTestRpcDist. It mimics Infinispan's DIST mode and has every member invoke 20'000 requests on 2 members; 80% of those requests are GETs (simple RPCs) and 20% are PUTs (2 RPCs in parallel). All RPCs are synchronous, so the caller always waits for the result and thus blocks for the round-trip time. Every member has 25 threads invoking the RPCs concurrently.
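The shape of that workload can be sketched as follows: 25 threads draining a shared budget of 20'000 synchronous requests, choosing GET or PUT at an 80/20 ratio. The RPCs themselves are stubbed out here; the real test invokes them on 2 cluster members over JGroups, so this is only an illustration of the load pattern, not of the test's actual code:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the UnicastTestRpcDist load pattern: 25 threads, 20'000 requests,
// 80% GETs (one blocking RPC) and 20% PUTs (two RPCs issued in parallel).
public class WorkloadSketch {
    static final int NUM_REQUESTS = 20_000, NUM_THREADS = 25;

    static void syncRpc() { /* stub: the caller would block for the round-trip */ }

    static int[] run() throws InterruptedException {
        AtomicInteger remaining = new AtomicInteger(NUM_REQUESTS);
        AtomicInteger gets = new AtomicInteger(), puts = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(NUM_THREADS);
        for (int i = 0; i < NUM_THREADS; i++) {
            pool.execute(() -> {
                while (remaining.getAndDecrement() > 0) {
                    if (ThreadLocalRandom.current().nextInt(100) < 80) {
                        syncRpc();                       // GET: a single sync RPC
                        gets.incrementAndGet();
                    } else {
                        CompletableFuture.allOf(         // PUT: 2 RPCs in parallel
                            CompletableFuture.runAsync(WorkloadSketch::syncRpc),
                            CompletableFuture.runAsync(WorkloadSketch::syncRpc)).join();
                        puts.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return new int[] { gets.get(), puts.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counts = run();
        System.out.println("GETs=" + counts[0] + " PUTs=" + counts[1]);
    }
}
```

The AtomicInteger budget guarantees exactly 20'000 requests in total regardless of how the threads interleave.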
On my home network, I got the following numbers:
- 4 members, JGroups 2.9.0.GA: 4'500 requests / sec / member
- 4 members, JGroups 2.10.0.Alpha2: 5'700 requests / sec / member
- 6 members, JGroups 2.9.0.GA: 4'000 requests / sec / member
- 6 members, JGroups 2.10.0.Alpha2: 5'000 requests / sec / member
- 8 members, JGroups 2.9.0.GA: 3'800 requests / sec / member
- 8 members, JGroups 2.10.0.Alpha2: 4'300 requests / sec / member
In our Atlanta lab (faster boxes), I got (unfortunately only for 2.10.0.Alpha2):
- 4 members, JGroups 2.10.0.Alpha2: 10'900 requests / sec / member
- 6 members, JGroups 2.10.0.Alpha2: 10'900 requests / sec / member
- 8 members, JGroups 2.10.0.Alpha2: 10'900 requests / sec / member