Monday, December 21, 2009

JGroups 2.8.0.GA released

I'm happy to announce that JGroups 2.8.0 is finally GA !

It has taken us almost a year since the last major release (2.7 was released in January), but in our defense, 2.8.0.GA contains a lot of new features and I think they are worth the wait. We also released a number of 2.6.x versions in 2009, which are used in the JBoss Enterprise Application Platform (EAP).

Before I get into a summary of some of the new features (a detailed list can be found at [1]), I'd like to thank all the developers, users and contributors of JGroups. Without this healthy community, producing code, bug reports, patches, documentation and user stories, JGroups wouldn't be anywhere close to where it is today !

So a big thanks to everyone involved, Happy Holidays and a great start into 2010 !

Here's a short list of features that made it into 2.8.0.GA (here are the release notes):
  • Logical addresses: decouples physical addresses (which can change) from logical ones. Eliminates reincarnation issues. This alone is worth 2.8, as it eliminates a big source of problems !
  • Logical names: allow for meaningful channel names, logical names stay with a channel for its lifetime, even after reconnecting it
  • Improved merging / no more shunning: shunning was replaced by merging. Now we have a much simpler model: JOIN - LEAVE - MERGE. The merging algorithm was improved to take 'weird' (e.g. asymmetric) merges into account
  • Better IPv6 support
  • Better support for defaults for addresses: based on the type of the stack (IPv4, IPv6), we perform sanity checks and set default addresses of the correct type
  • FILE_PING / S3_PING: new discovery protocols, file-based and Amazon S3 based. The latter protocol can be used as a replacement for GossipRouter on EC2
  • Speaking of which: major overhaul of GossipRouter
  • Ability to have multiple protocols of the same class in the same stack
  • Ability to override message bundling on a per-message basis
  • Much improved and faster UNICAST
  • XSD schema for protocol configurations
  • STREAMING_STATE_TRANSFER now doesn't need to use TCP, but can also use the configured transport, e.g. UDP
  • RpcDispatcher: additional methods returning a Future rather than blocking
  • Probe.sh: ability to invoke methods cluster-wide. E.g. run message stability on all nodes: probe.sh invoke=STABLE.runMessageGarbageCollection
  • Logging
    • Removal of commons-logging.jar: JGroups now has ZERO dependencies !
    • Configure logging level at runtime, e.g. through JMX (jconsole) or probe.sh, or programmatically. Use case: set logging for NAKACK from "warn" to "trace" for a unit test, then reset it back to "warn"
    • Ability to set custom log provider. This allows for support of new logging frameworks (JGroups ships with support for log4j and JDK logging)
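To illustrate the idea behind logical addresses, here is a minimal sketch, not the JGroups implementation: the class `LogicalAddressCache` and its methods are hypothetical names, and the assumption is that a member's identity is a UUID while its physical address is just a replaceable cache entry. A member that restarts on the same host and port gets a fresh identity, which is why a previous incarnation cannot be confused with the new one.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not JGroups code: members are identified by a UUID,
// and the physical address is only a cached, replaceable attribute.
final class LogicalAddressCache {
    private final Map<UUID, InetSocketAddress> physicalAddresses = new ConcurrentHashMap<>();
    private final Map<UUID, String> logicalNames = new ConcurrentHashMap<>();

    /** Registers a new member incarnation and returns its fresh logical address. */
    UUID register(String logicalName, InetSocketAddress physicalAddress) {
        UUID logicalAddress = UUID.randomUUID();
        logicalNames.put(logicalAddress, logicalName);
        physicalAddresses.put(logicalAddress, physicalAddress);
        return logicalAddress;
    }

    /** The physical address may change; the logical address stays the same. */
    void updatePhysicalAddress(UUID logicalAddress, InetSocketAddress newPhysicalAddress) {
        physicalAddresses.put(logicalAddress, newPhysicalAddress);
    }

    /** Returns the physical address currently cached for a logical address, or null. */
    InetSocketAddress resolve(UUID logicalAddress) {
        return physicalAddresses.get(logicalAddress);
    }

    /** Returns the human-readable name registered for a logical address, or null. */
    String nameOf(UUID logicalAddress) {
        return logicalNames.get(logicalAddress);
    }

    public static void main(String[] args) {
        LogicalAddressCache cache = new LogicalAddressCache();
        InetSocketAddress hostPort = InetSocketAddress.createUnresolved("192.168.1.5", 7800);

        UUID firstIncarnation = cache.register("A", hostPort);
        // The member crashes and restarts on the same host and port:
        UUID secondIncarnation = cache.register("A", hostPort);

        // Same name and same physical address, but two distinct identities.
        System.out.println(cache.nameOf(firstIncarnation) + " / " + cache.nameOf(secondIncarnation));
        System.out.println("same identity: " + firstIncarnation.equals(secondIncarnation));
    }
}
```

The design choice is that every lookup goes through the UUID, so a changed IP address or port only requires a call to `updatePhysicalAddress`.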
Enjoy !
Bela, Vladimir and Richard

[1] http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/ReleaseNotes-2.8.txt?revision=1.10&view=markup&pathrev=Branch_JGroups_2_8

[2] http://community.jboss.org/wiki/Support

Thursday, November 05, 2009

IPv6 addresses in JGroups

I finished code to support scoped IPv6 link local addresses [1]. A link local address is an address that's not guaranteed to be unique on a given host (although in most cases it will be), so it can be assigned to different interfaces of the same host.

To differentiate between interfaces, a scope-id can be added, e.g. fe80::216:cbff:fea9:c3b5%en0 or fe80::216:cbff:fea9:c3b5%3, where the %X suffix denotes the interface.

Note that this is only relevant for TCP sockets; multicast and datagram sockets are not affected.
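As a small illustration of what a scope-id looks like from Java, the following sketch builds a link-local address with a numeric scope-id and inspects it. The class name `ScopedAddressDemo` and the helper `withScope` are made up for this example; the numeric scope-id 3 is arbitrary and is not checked against the interfaces of the local host.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative sketch: shows how a numeric scope-id is carried by Inet6Address.
final class ScopedAddressDemo {
    /** Attaches a numeric scope-id to an IPv6 literal (no interface lookup is done). */
    static Inet6Address withScope(String ipv6Literal, int scopeId) throws UnknownHostException {
        byte[] raw = InetAddress.getByName(ipv6Literal).getAddress();
        // Throws UnknownHostException if raw is not a 16-byte IPv6 address
        return Inet6Address.getByAddress(null, raw, scopeId);
    }

    public static void main(String[] args) throws UnknownHostException {
        Inet6Address addr = withScope("fe80::216:cbff:fea9:c3b5", 3);
        System.out.println("link-local: " + addr.isLinkLocalAddress());
        System.out.println("scope-id:   " + addr.getScopeId());
        System.out.println("textual:    " + addr.getHostAddress()); // ends with the %3 suffix
    }
}
```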

Now, on the server side, we can bind to a scoped or unscoped link-local socket, e.g.

ServerSocket srv_sock=new ServerSocket(7500, 50, InetAddress.getByName("fe80::216:cbff:fea9:c3b5"));

binds to an unscoped link-local address, and

ServerSocket srv_sock=new ServerSocket(7500, 50, InetAddress.getByName("fe80::216:cbff:fea9:c3b5%en0"));

binds to the scoped equivalent.

This is all fine, but on the client side, we cannot use scoped link-local addresses, e.g.

Socket sock=new Socket(InetAddress.getByName("fe80::216:cbff:fea9:c3b5%en0"), 7500);

fails !

The reason is that a scope-id "en0" does not mean anything on a client, which might run on a different host.

The correct code is

Socket sock=new Socket(InetAddress.getByName("fe80::216:cbff:fea9:c3b5"), 7500);

with the scope-id removed.

JGroups runs into this problem, too: whenever we have a bind_addr which is a scoped link-local IPv6 address, certain discovery protocols (e.g. MPING, TCPGOSSIP) will return the scoped addresses, and the joiners will then try to connect to the existing members using the scoped addresses.

To fix this, all Socket.connect() calls in JGroups have been replaced with Util.connect(Socket, SocketAddress, port). This method checks for scoped link-local IPv6 addresses and simply removes the scope-id from the destination address, so the connect() call will work.
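The scope-id removal could look roughly like the sketch below. This is not the actual Util.connect() code from JGroups; the class `ScopeStripper`, its method names and the timeout parameter are assumptions made for illustration. It only relies on standard java.net calls: an address rebuilt from its raw 16 bytes carries no scope-id.

```java
import java.io.IOException;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical sketch of the idea, not the JGroups implementation.
final class ScopeStripper {
    /** Returns addr without its scope-id if it is a scoped link-local IPv6 address. */
    static InetAddress stripScope(InetAddress addr) {
        if (addr instanceof Inet6Address && addr.isLinkLocalAddress()) {
            Inet6Address v6 = (Inet6Address) addr;
            if (v6.getScopeId() != 0 || v6.getScopedInterface() != null) {
                try {
                    // Rebuilding from the raw bytes drops the scope-id
                    return InetAddress.getByAddress(v6.getAddress());
                } catch (UnknownHostException e) {
                    // Cannot happen for a 16-byte address taken from an Inet6Address
                    throw new IllegalStateException(e);
                }
            }
        }
        return addr; // IPv4, global IPv6 and unscoped addresses are left untouched
    }

    /** Connects sock to dest, removing a link-local scope-id from dest first. */
    static void connect(Socket sock, InetSocketAddress dest, int timeoutMillis) throws IOException {
        InetAddress target = dest.getAddress(); // null if dest is unresolved
        if (target == null) {
            sock.connect(dest, timeoutMillis);
            return;
        }
        sock.connect(new InetSocketAddress(stripScope(target), dest.getPort()), timeoutMillis);
    }

    public static void main(String[] args) throws UnknownHostException {
        byte[] raw = InetAddress.getByName("fe80::216:cbff:fea9:c3b5").getAddress();
        InetAddress scoped = Inet6Address.getByAddress(null, raw, 3);
        System.out.println(scoped.getHostAddress() + " -> " + stripScope(scoped).getHostAddress());
    }
}
```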

Note that this problem doesn't occur with global IPv6 addresses.

I need to test whether this solution works on other operating systems, too, e.g. on Windows, Solaris and MacOS.

OK, I'm off to http://www.davidoffswissindoors.ch, hope to see some good tennis !

[1] http://www.jboss.org/community/wiki/IPv6

Wednesday, October 28, 2009

JGroups 2.8.0.CR3 released

Unfortunately, a little later than estimated, but better late than never ! The reason is that I got sidetracked by EAP 5 performance testing and also by the good feedback from the community (you !) on CR2, and the associated bug reports.

This version contains bug fixes, and mostly work around IPv6 versus IPv4 addresses. We now try to be smart and attempt to find out the type of stack used, and then default undefined IP addresses to addresses of the correct type. Note that IPv6 support is not yet 100% done, I'm continuing to work on this for either CR4 or GA. More on this topic in a later post...

CR3 also added a new feature, which is marshaller pools in the transport. When we send messages, they're either bundled and sent as a batch of messages, or not. In either case, the marshalling of a message or message list is done in an output buffer for which we have to acquire a lock. When we have heavy message sending, e.g. through multiple sender threads, that lock is heavily contended.

This is not a big issue, because the sender side is almost never the culprit in slow performance (the receiver side is !), but I've introduced a marshaller pool anyway, which provides N output streams (default=2) rather than 1. The property marshaller_pool_size defines how many output streams we want in the pool, and marshaller_pool_initial_size defines the initial size of each output stream (in bytes).

Note that, for UDP, each output stream can grow up to 63535 bytes, so take that into account when allocating a large number of streams.

In my perf tests, I haven't found that increasing the pool size makes a difference to performance, but if you use many threads which send messages concurrently, this does make a difference.
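The pooling idea can be sketched as follows. This is a simplified illustration and not the transport code in JGroups: the class `MarshallerPool`, the use of tryLock() to find a free stream and the random fallback are all assumptions. The two constructor arguments play the role of marshaller_pool_size and marshaller_pool_initial_size.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: N output buffers, each guarded by its own lock,
// so that concurrent senders do not all contend for a single buffer.
final class MarshallerPool {
    private final ByteArrayOutputStream[] streams;
    private final ReentrantLock[] locks;

    MarshallerPool(int poolSize, int initialSize) {
        streams = new ByteArrayOutputStream[poolSize];
        locks = new ReentrantLock[poolSize];
        for (int i = 0; i < poolSize; i++) {
            streams[i] = new ByteArrayOutputStream(initialSize);
            locks[i] = new ReentrantLock();
        }
    }

    /** Marshals the payload into a free stream and returns a copy of the bytes. */
    byte[] marshal(byte[] payload) {
        int index = acquire();
        try {
            ByteArrayOutputStream out = streams[index];
            out.reset(); // reuse the buffer from the previous message
            out.write(payload, 0, payload.length);
            return out.toByteArray();
        } finally {
            locks[index].unlock();
        }
    }

    /** Returns the index of a locked stream: a free one if possible, else waits for a random one. */
    private int acquire() {
        for (int i = 0; i < locks.length; i++) {
            if (locks[i].tryLock()) {
                return i;
            }
        }
        int fallback = ThreadLocalRandom.current().nextInt(locks.length);
        locks[fallback].lock();
        return fallback;
    }

    public static void main(String[] args) {
        MarshallerPool pool = new MarshallerPool(2, 1024);
        byte[] marshalled = pool.marshal("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(marshalled, StandardCharsets.UTF_8));
    }
}
```

With a single sender thread the first stream is always free, so a larger pool only helps when several threads marshal at the same time.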

2.8.0.CR3 can be downloaded from http://sourceforge.net/projects/javagroups/files/JGroups/2.8.0.CR3.
Enjoy !

Friday, September 18, 2009

JGroups 2.6.13.CR2 released

OK, going from CR1 to CR2 doesn't seem like a big deal, and certainly not worth posting as a blog entry ?

You might wonder if I have nothing better to do (like biking in the French Alps) :-)

But actually, there have been significant changes since CR1, so please read on !

CR2 only contains 3 JIRA issues:
  1. Backport of NAKACK from head
  2. Backport of UNICAST from head and
  3. Removal of UNICAST contention issues
#1 is a partial backport of NAKACK from head (2.8) to the 2.6 branch. This version doesn't acquire locks for incoming messages anymore, but uses a CAS (compare-and-swap) operation to decide whether to process a message, or not.

What used to happen when a message from P is received is that we grabbed the receiver window for P and added the message. Then we grabbed the lock associated with P's window and - once acquired - removed as many messages as possible and passed them up to the application sequentially. Sequential order is always respected unless a message is tagged as OOB (out-of-band).

So here's what happened: say we received 10 multicast messages from B and 3 from A. Both A's and B's messages would be delivered in parallel with respect to each other, but sequentially for a given sender. So A's message #34 would always get delivered before #35 before #36 and so on...

However, say we have to process 10 messages from B: 1 2 3 4 5 6 7 8 9 10:
  • Every message would get into NAKACK on a separate thread
  • All the 10 messages would get added into B's receiver window
  • The thread with message #3 would grab the lock
  • All other threads would block, trying to acquire the lock
  • The thread with the lock would remove #1 and pass it up the stack, then #2, then #3 and so on, until it passed #10 up the stack to the application
  • Now it releases the lock
  • All other 9 threads now compete for the lock, but every single thread will return because there are no more messages in the receiver window
This is a terrible waste: we've wasted 9 threads; for the duration of removing and passing up 10 messages, these threads could have been put to better use, e.g. processing other messages !

For example, if our total thread pool only had 10 threads, and 1 of them was processing messages and 9 were blocked on lock acquisition, if a message from a different sender came in (which could be delivered in parallel to B's messages), then no thread would be available !

So the simple but effective change was to replace the lock on the receiver window with a CAS: when a thread tries to remove messages, it simply tries to set the CAS from false to true. If it succeeds, it goes into the removal loop and sets the CAS back to false when done. Else, the thread simply returns because it knows that someone else will be processing the message it just added.

Result: we've returned 9 threads to the thread pool, ready to serve other messages, without even locking !
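Here is a minimal sketch of the CAS approach, under stated assumptions: `CasReceiverWindow` is a made-up class, not NAKACK's receiver window, and messages are plain strings keyed by sequence number. One detail goes beyond the description above: after resetting the flag, the removing thread re-checks the window, so a message added just as the removal loop ended is not left behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a per-sender receiver window that uses a CAS instead of a lock.
final class CasReceiverWindow {
    private final Map<Long, String> window = new ConcurrentHashMap<>();
    private final AtomicBoolean processing = new AtomicBoolean(false);
    private final List<String> delivered = new CopyOnWriteArrayList<>();
    private volatile long nextToDeliver = 1; // only written by the thread that won the CAS

    /** Adds a message; at most one thread at a time removes and delivers messages. */
    void receive(long seqno, String payload) {
        window.put(seqno, payload);
        do {
            if (!processing.compareAndSet(false, true)) {
                return; // someone else is delivering and will pick up our message
            }
            try {
                String msg;
                while ((msg = window.remove(nextToDeliver)) != null) {
                    delivered.add(msg); // stands in for passing the message up the stack
                    nextToDeliver++;
                }
            } finally {
                processing.set(false);
            }
            // Re-check: a message may have been added right after the loop ended
        } while (window.containsKey(nextToDeliver));
    }

    /** Returns a snapshot of the messages delivered so far, in delivery order. */
    List<String> delivered() {
        return new ArrayList<>(delivered);
    }

    public static void main(String[] args) throws InterruptedException {
        CasReceiverWindow window = new CasReceiverWindow();
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            final long seqno = i + 1;
            threads[i] = new Thread(() -> window.receive(seqno, "msg-" + seqno));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(window.delivered()); // delivered in seqno order
    }
}
```

Threads that lose the CAS return immediately instead of blocking, which is what frees them up for messages from other senders.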

The net effect is faster performance and smaller thread pools. As a rule of thumb, a thread pool's max threads can now be around the number of cluster nodes: if every node sends messages, we only need 1 thread per sender to process all of the sender's messages...


#2 has 2 changes: same as above (locks replaced by CAS) and the changes outlined in the design document. The latter changes simplify UNICAST a lot and also handle the cases of asymmetrical connection closings. This was also back-ported from head (2.8).


#3 UNICAST contention issues
We used to have 2 big fat locks in UNICAST, which severely impacted performance on high unicast message volumes. The bottleneck was detected as part of our EAP testing for JBoss.

This has been fixed and is getting forward-ported to CVS head.

I guess these 3 changes make 2.6.13.CR2 worth trying out; in some cases this should make a real difference in performance !

Enjoy,

Monday, August 24, 2009

2.8.0.CR1 released

I just released 2.8.0.CR1; it can be downloaded from SourceForge (binary and source).

This version is pretty stable, and I expect a GA soon. The only open issues are currently a few IPv6 related issues and an issue with spurious merges.

The release notes are here.

Enjoy,

Monday, August 17, 2009

2.6.12.GA released

Just uploaded to SourceForge, the JIRA issues are at https://jira.jboss.org/jira/secure/IssueNavigator.jspa?reset=true&pid=10053&fixfor=12313820.

In a nutshell, 2.6.12 contains only 4 issues:
  • GossipRouter consumed 40% CPU without doing anything: fixed
  • S3_PING is a new file-based discovery protocol for running JGroups on EC2 / S3
  • There was a memory leak in the GMS protocol on high member churn (high rate of joins and leaves)
  • FLUSH could lock up the entire cluster when the initial flush phase ran into a timeout. Thanks to Rado and Brian for discovering this bug by adding all kinds of weird combinations of failure scenarios to their merciless tests... :-) And kudos to Vladimir for investigating the (60MB !) logs, finding the offending code and fixing it, all within 60 minutes !
2.6.12.GA can be downloaded from SourceForge in binary and source versions.

Enjoy,

Thursday, July 16, 2009

Bike tour Nice

Executive summary: very nice tour with 600km, 8 mountain passes and ca 17'000m of climbing, but unfortunately cut short by the weather, so the total is only 8 instead of 10 passes.

Those are the stats; if you want to know more, read on...

By the way: a 'bike' is a bicycle, *not* a motorbike ! :-)



Day 1: FRI July 10th 2009


I took the 9am flight from Zurich and arrived in Nice at 10:00am. My biggest concern was to assemble the bike and get out of the airport as soon as possible, because it would be busy at this time of the season (in France, vacation time started July 3rd).

However, I was pleasantly surprised when I found that NCE even had a bike assembly station inside the airport, with a stand and tools. Who would have thought ?

As you can see, I only had 2-3 kilos of baggage with me, attached to the saddle (no back pack).

Alors, I assembled my bike, passed customs and took off. First, the ride was along the shore (at 0 meters elevation), then through Nice, with a bit too much traffic and off I went into the mountains.

Once I found D19 towards Tourrette-Levens, traffic eased and the long but steady climb began. The ride was mostly through wooded areas, with lots of ups and downs and curves. Almost no traffic anymore, only other bikers.

I had already booked a hotel at La Bolline, after the Col de St. Martin (1500m), and spent the night there. Very small place, but the good thing is it's quite high up so no air conditioning was needed. As a matter of fact, I spent all nights at altitudes over 1000m, so it was not too hot and not too cold. Just perfect !

The only thing that wasn't perfect was that the French (at least in the South) start their repas (dinner) at 7:30pm, so when I arrived (usually very hungry), I still had to wait for a few hours !

I 'fixed' this by having a late and long lunch (called dejeuner in France), so I would survive until 7:30pm...


Day 2: SAT July 11th 2009

Big day: today I wanted to ride 2 passes, the Col De Bonnette and the Col de Vars.
But first things first. In the morning, there was a nice downhill from La Bolline to the junction with D2205.



From there on, the climb to the Bonnette started. In the picture, coming down from Valdeblore, I took a right and started my ascent to Bonnette. This point is ca. at 500m, and Bonnette at 2802m, so a long climb of 2300m !

None of the passes I did are very steep, but the climbs are very long, at maybe 5-8%, and that wears you out, too ! I prefer steep climbs and long downhill rides :-)

The Col De Bonnette is the highest pass in Europe, but only because of a trick: some resourceful people (probably from the tourist office) added a loop around the top of Bonnette in the 60's, which added a few tens of meters, so Bonnette would surpass the Col de L'Iseran !

In the picture, one can see the loop starting and going around the top clock-wise.

In Jausiers, I unfortunately had a big ham and cheese toast, with a few cokes and beers and - as the experienced athletes among you will know - this was somewhat detrimental to my effort to climb the Col de Vars ! Only 800m to climb, but I had to walk my bike and push because of (a) my stomach and (b) cramps.

Note to self: tonight have loads of salt to avoid the cramps (binds the water in the body) and next time have spaghetti or something with carbs (and salt) for lunch !

Anyway, I made it to the top (even biking the last 1.5kms) and after a nice downhill spent the night at Vars (a mountain resort).


Day 3: SUN July 12th 2009

With 3 passes under my belt and 7 to go, I had a nice downhill ride (that's the advantage of spending the night halfway up the mountain, you always have a downhill the next day!) to Guillestre. From here, the most beautiful pass, the Col de L'Izoard, started. In the picture below, you can see Brunissard, looking back.


The road climbed nicely through a dense forest and later passed the famous Casse Desert (Broken Desert), which looks like a piece of the moon right before the summit of the Izoard pass.

At the top of the Izoard, there was a concession stand, offering drinks and souvenirs, and there were many motorbikes and bikes (and cars, too). However, the downhill was fantastic: roads in great shape and winding curves, excellent to cruise down.

The next pass (#5) was the Montgenevre, starting from Briancon. At only 600 meters of climbing, it would have been a nice pass, but the traffic was overwhelming. Maybe because it was Sunday, everybody (and their grandmothers) was on their motorbikes. And sometimes they just love to accelerate when passing bikers (the real ones :-))...

At least the hotel at the top was nice (Le Chalet Blanc), had a nice TV and so I could watch the Tour De France for that day.


Day 4: MON July 13th 2009


The day started with a nice downhill to Cesana Torinese, and then on to Susa, Italy. But that was it in terms of niceness: the next climb up Mont Cenis was hard, because it started from 500m and went all the way up to 2100m, for a climb of 1600m.

In addition, when I could finally see what looked like the top (a bunch of hotels) through the fog and got closer, I found out that there was a dam behind them, another 200m of climbing. Then I found out that the road didn't go alongside the lake, but climbed another hill, adding 200m again !

At the top, it was very cold and so I just put on a long wind breaker and rode the downhill into Lanslevillard where I had lunch (no ham/cheese toast though !).

I decided to ride a little further to Bessans and took a hotel there.


Day 5: TUE July 14th 2009

Pass #7 was the Col de L'Iseran (2770m), which is the second highest mountain pass in Europe, surpassed only by the Bonnette (albeit through a trick, as mentioned above). However, I started at 1677m, so the actual climbing was only 1100m, much better than the 2300m for the Bonnette.
As one can see, I took the obligatory picture of the sign at the top of the pass, to prove that I was there :-)
Well, actually, I'm not in the picture, so... hmmm :-)

The downhill ride was very pleasant, through Val d'Isere, which is a famous winter sports resort and down to Montvalezan.

The Col de L'Iseran was pass # 7.

From here, I took a shortcut to the Petit St. Bernard, although it had a steep climb of up to 16%. The ride up St Bernard would have been nice because it features a mild climb of mostly 5% or less, but the wind at the top made it hard to reach the old monastery, which is at the top.


Here's the picture of me at the top.

The weather forecast had rain and storms for the evening, so I didn't stay too long at the top and made my way down to La Thuile, Italy (the Pt. St. Bernard is the border between France and Italy) and took a hotel.

Petit St. Bernard was pass #8.


Day 6: WED July 15th 2009


Unfortunately, it rained the whole night and the forecast for where I wanted to go (Gr. St. Bernard and Furka (Valais) in Switzerland) was very bad (rain and thunderstorms), so I decided to finish this tour by heading west.

When the rain stopped, I rode down to Pre-Saint-Didier (Italy) and up the hill to Courmayeur (Italy). There I took a bus through the Mont Blanc tunnel into Chamonix (France again).

From Chamonix, I more or less followed the A40 back to Geneva (Switzerland), where I took the train back home. The A40 is partially a 2 lane highway, but due to the lack of other roads (very narrow valley), they allow bikes to use it. Hmm, not very pleasant to be passed by heavy trucks, with you going at 65km/h and the trucks going at 100km/h...

Anyway, this was a great tour, with friendly people everywhere; I practiced my French and toured beautiful scenery. I might just do it again some day ! Next time though, maybe with an iPhone (or even better, an Android HTC phone !) and lots of plan B's: in some places they had no public transportation (besides the bi-weekly bus :-)) and I would have been stuck in that place had it started to rain...

If someone is interested in the exact tour, I have it on bikemap.net, let me know.
Cheers,
Bela