<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-19835054</id><updated>2012-01-12T10:08:17.638+01:00</updated><category term='ReplCache JGroups raid distribution replication'/><category term='Testing JGroups protocols'/><category term='pub-sub'/><category term='publish-subscribe'/><category term='stock feed'/><category term='multicast'/><category term='bike tour nice france'/><title type='text'>Belas Blog</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>45</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-19835054.post-8008652598408124894</id><published>2011-12-06T16:43:00.001+01:00</published><updated>2011-12-06T17:14:52.859+01:00</updated><title type='text'>Repondez s'il vous plait !</title><content type='html'>No, this isn't a post in French 
(my school French would be too rusty for this !); this is about a new protocol in JGroups, called RSVP :-)&lt;br /&gt;&lt;br /&gt;As the name possibly suggests, this feature allows for messages to get ack'ed by receivers before a message send returns. In other words, when A broadcasts a message M to {A,B,C,D}, then JChannel.send() will only return once A itself, B, C and D have acknowledged that they delivered M to the application.&lt;br /&gt;&lt;br /&gt;This differs from the default behavior of JGroups, which always sends messages asynchronously, and guarantees that all non-faulty members will eventually receive the message. If we tag a message as RSVP, then we basically have 2 properties:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The message send will only return when we've received all acks from the current members. Members leaving or crashing during the wait are treated as if they sent an ack. The send() method can also throw a (runtime) TimeoutException if a timeout was defined (in RSVP) and encountered.&lt;/li&gt;&lt;li&gt;If A sent (asynchronous) messages #1-10, and tagged #10 as RSVP, then - when send() returns successfully - A is guaranteed that all members received A's message #10 and all messages prior to #10, that's #1-9.&lt;/li&gt;&lt;/ol&gt;This can be used for example when completing a unit of work, and needing to know that all current cluster members received all of the messages sent up to now by a given cluster member. &lt;br /&gt;&lt;br /&gt;This is similar to FLUSH, but less strict in that it is a &lt;i&gt;per-sender&lt;/i&gt; flush, there is no reconciliation phase, and it doesn't stop the world.&lt;br /&gt;&lt;br /&gt;An alternative is to use a blocking RPC. 
However, I wanted to add the capability of synchronous messages directly into the base channel.&lt;br /&gt;&lt;br /&gt;Note that this also solves another problem: if A sends messages #1-5, but some members drop #5, and A doesn't send more messages for some time, then A#5 won't get delivered at some members for quite a while (until stability (STABLE) kicks in).&lt;br /&gt;&lt;br /&gt;RSVP will be available in JGroups 3.1. If you want to try it out, get the code from master [2]. The documentation is at [1], section 3.8.8.2.&lt;br /&gt;&lt;br /&gt;For questions, I suggest one of the mailing lists.&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://www.jgroups.org/manual-3.x/html/user-channel.html#SendingMessages"&gt;http://www.jgroups.org/manual-3.x/html/user-channel.html#SendingMessages&lt;/a&gt; &lt;br /&gt;&lt;br /&gt;[2] &lt;a href="https://github.com/belaban/JGroups"&gt;https://github.com/belaban/JGroups&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8008652598408124894?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8008652598408124894/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/12/repondez-sil-vous-plait.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8008652598408124894'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8008652598408124894'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/12/repondez-sil-vous-plait.html' title='Repondez s&apos;il vous plait !'/><author><name>Bela 
Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-5557380094582772629</id><published>2011-11-17T12:33:00.001+01:00</published><updated>2011-11-17T13:24:56.612+01:00</updated><title type='text'>JGroups 3.0.0.Final released</title><content type='html'>I'm happy to announce that JGroups 3.0.0.Final is here !&lt;br /&gt;&lt;br /&gt;While originally intended to make only API changes (some of them queued for years), there are also several optimizations, most of them related to running JGroups in larger clusters.&lt;br /&gt;&lt;br /&gt;For instance, the size of several messages has been reduced, and some protocol rounds have been eliminated, making JGroups more memory efficient and less chatty.&lt;br /&gt;&lt;br /&gt;For the last couple of weeks, I've been working on making merging of 100-300 cluster nodes faster and making sure a merge never blocks. To this end, I've written a &lt;a href="https://github.com/belaban/JGroups/blob/master/tests/junit-functional/org/jgroups/tests/LargeMergeTest.java"&gt;unit test&lt;/a&gt;, which creates N singleton nodes (= nodes which only see themselves in the cluster), then makes them see each other and waits until a cluster of N has formed.&lt;br /&gt;&lt;br /&gt;The test itself was a real challenge because I was hitting the max heap size pretty soon. For example, with 300 members, I had to increase the heap size to at least 900 MB, to make the test complete. This indicates that a JGroups member needs roughly 3 MB of heap at most. 
Of course, I had to use shared thread pools, timers and do a fair amount of (memory) tuning on some of the protocols, to accommodate 300 members all running in the same JVM.&lt;br /&gt;&lt;br /&gt;Running in such a memory constrained environment led to some more optimizations, which will benefit users, even if they're not running 300 members inside the same JVM ! :-)&lt;br /&gt;&lt;br /&gt;One of them is that UNICAST / UNICAST2 maintain a structure for every member they talk to. So if member A sends a unicast to each and every member of a cluster of 300, it'll have 300 connections open.&lt;br /&gt;&lt;br /&gt;The change is to close connections that have been idle for a given (configurable) time, and re-establish them when needed.&lt;br /&gt;&lt;br /&gt;Further optimizations will be made in 3.1.&lt;br /&gt;&lt;br /&gt;The release notes for 3.0.0.Final are here: &lt;a href="https://github.com/belaban/JGroups/blob/master/doc/ReleaseNotes-3.0.0.txt"&gt;https://github.com/belaban/JGroups/blob/master/doc/ReleaseNotes-3.0.0.txt&lt;/a&gt;.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;JGroups 3.0.0.Final can be downloaded here: &lt;a href="https://sourceforge.net/projects/javagroups/files/JGroups/3.0.0.Final"&gt;https://sourceforge.net/projects/javagroups/files/JGroups/3.0.0.Final&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As usual, if you have questions, use one of the mailing lists.&lt;br /&gt;&lt;br /&gt;Enjoy !&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-5557380094582772629?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/5557380094582772629/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/11/jgroups-300final-released.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5557380094582772629'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5557380094582772629'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/11/jgroups-300final-released.html' title='JGroups 3.0.0.Final released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-865210181213163421</id><published>2011-09-12T11:46:00.000+02:00</published><updated>2011-09-12T11:46:12.461+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multicast'/><category scheme='http://www.blogger.com/atom/ns#' term='stock feed'/><category scheme='http://www.blogger.com/atom/ns#' term='publish-subscribe'/><category scheme='http://www.blogger.com/atom/ns#' term='pub-sub'/><title type='text'>Publish-subscribe with JGroups</title><content type='html'>I've added a new demo program (&lt;a href="https://github.com/belaban/JGroups/blob/master/src/org/jgroups/demos/PubSub.java"&gt;org.jgroups.demos.PubSub&lt;/a&gt;), which shows how to use JGroups channels to do publish-subscribe.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Publish/subscribe"&gt;Pub-sub&lt;/a&gt; is a pattern where instances subscribe to topics and receive only messages posted to those topics. For example, in a stock feed application, an instance could subscribe to topics "rht", "aapl" and "msft". 
Stock quote publishers could post to these topics to update a quote, and subscribers would get notified of the updates.&lt;br /&gt;&lt;br /&gt;The simplest way to do this in JGroups is for each instance to join a cluster; publishers send topic posts as multicasts, and subscribers discard messages for topics to which they haven't subscribed.&lt;br /&gt;&lt;br /&gt;The problem with this is that a lot of multicasts will make it all the way up to the application, only to be discarded there if the topic doesn't match. This means that a message is received by the transport protocols (by all instances in the cluster), passed up through all the protocols, and then handed over to the application. If the application discards the message, then all the work of fragmenting, retransmitting, ordering, flow-controlling, de-fragmenting, uncompressing and so on is unnecessary, resulting in wasted CPU cycles, lock acquisitions, cache and memory accesses, context switching and bandwidth.&lt;br /&gt;&lt;br /&gt;A solution to this could be to do topic filtering at the &lt;i&gt;publisher's&lt;/i&gt; side: a publisher maintains a hashmap of subscribers and topics they've subscribed to and sends updates only to instances which have a current subscription.&lt;br /&gt;&lt;br /&gt;This has two drawbacks though: first the publishers have additional work maintaining those subscriptions, and the subscribers need to multicast subscribe or unsubscribe requests. In addition, new publishers need to somehow get the current subscriptions from an existing cluster member (via state transfer).&lt;br /&gt;&lt;br /&gt;Secondly, to send updates only to instances with a subscription, we'd have to resort to unicasts: if 10 instances of a 100 instance cluster are subscribed to "rht", an update message to "rht" would entail sending 10 unicast messages rather than 1 multicast message. 
This generates more traffic than needed, especially when the cluster size increases.&lt;br /&gt;&lt;br /&gt;Another solution, and that's the one chosen by PubSub, is to send all updates as multicast messages, but discard them as soon as possible at the receivers when there isn't a match. &lt;b&gt;Instead of having to traverse the entire JGroups stack, a message that doesn't match is discarded directly by the transport, which is the first protocol that receives a message.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This is done by using a &lt;a href="http://www.jgroups.org/manual-3.x/html/user-advanced.html#SharedTransport"&gt;shared transport&lt;/a&gt; and creating a separate channel for each subscription: whenever a new topic is subscribed to, PubSub creates a new channel and joins a cluster whose name is the &lt;i&gt;topic name&lt;/i&gt;. This is not overly costly, as the transport protocol - which contains almost all the resources of a stack, such as the thread pools, timers and sockets -&amp;nbsp; is only created once.&lt;br /&gt;&lt;br /&gt;The first channel to join a cluster will create the shared transport. Subsequent channels will only link to the existing shared transport, but won't initialize it. Using reference counting, the last channel to leave the cluster will de-allocate the resources used by the shared transport and destroy it.&lt;br /&gt;&lt;br /&gt;Every channel on top of the same shared transport will join a different cluster, named after the topic. PubSub maintains a hashmap of topic names as keys and channels as values. A "subscribe rht" operation simply creates a new channel (if there isn't one for topic "rht" yet), adds a listener, joins cluster "rht" and adds the topic/channel pair to the hashmap. 
An "unsubscribe rht" grabs the channel for "rht", closes it and removes it from the hashmap.&lt;br /&gt;&lt;br /&gt;When a publisher posts an update for "rht", it essentially sends a multicast to the "rht" cluster.&lt;br /&gt; &lt;br /&gt;The important point is that, when an update for "rht" is received by a shared transport, JGroups tries to find the channel which joined cluster "rht" and passes the message up to that channel (through its protocol stack), or discards it if there isn't a channel which joined cluster "rht".&lt;br /&gt;&lt;br /&gt;For example, if we have 3 channels A, B and C over the same shared transport TP, and A joined cluster "rht", B joined "aapl" and C joined "msft", then when a message for "ibm" arrives, it will be discarded by TP as there is no cluster "ibm" present. When a message for "rht" arrives, it will be passed up the stack for "rht" to channel A. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;As a non-matching message will be discarded at the transport level, and not the application level, we save the costs of passing the message up the stack, through all the protocols and delivering it to the application.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Note that PubSub uses the properties of IP multicasting, so the stack used by it should have UDP as shared transport. 
If TCP is used, then there are no benefits to the approach outlined above.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-865210181213163421?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/865210181213163421/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/09/publish-subscribe-with-jgroups.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/865210181213163421'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/865210181213163421'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/09/publish-subscribe-with-jgroups.html' title='Publish-subscribe with JGroups'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-2600046605325196417</id><published>2011-09-07T11:22:00.001+02:00</published><updated>2011-09-07T11:22:34.610+02:00</updated><title type='text'>Speaking at the OpenBlend conference on Sept 15</title><content type='html'>FYI,&lt;br /&gt;&lt;br /&gt;I'll be speaking at the &lt;a href="http://www.openblend.org/en/home"&gt;OpenBlend&lt;/a&gt; conference in Ljubljana on Sept 15.&lt;br /&gt;&lt;br /&gt;My talk will be about how to persist data without using a disk, by spreading it over a grid with a customizable degree of redundancy. 
Kind of the NoSQL stuff everybody and their grandmothers are talking about these days...&lt;br /&gt;&lt;br /&gt;I'm excited to visit Ljubljana, as I've never been there before and I like seeing new towns.&lt;br /&gt;&lt;br /&gt;The other reason, of course, is to beat &lt;a href="http://relation.to/Bloggers/Ales"&gt;Ales Justin&lt;/a&gt;'s a**s in tennis :-)&lt;br /&gt;&lt;br /&gt;If you happen to be in town, come and join us ! I mean not for tennis, but for the conference, or for a beer in the evening !&lt;br /&gt;&lt;br /&gt;Cheers,&lt;br /&gt;Bela&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-2600046605325196417?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/2600046605325196417/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/09/speaking-at-openblend-conference-on.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2600046605325196417'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2600046605325196417'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/09/speaking-at-openblend-conference-on.html' title='Speaking at the OpenBlend conference on Sept 15'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6748258354694704680</id><published>2011-09-01T17:46:00.002+02:00</published><updated>2011-09-01T17:46:56.813+02:00</updated><title type='text'>Optimizations for large clusters</title><content type='html'>I've been working on making JGroups more efficient on large clusters. 'Large' is between 100 and 2000 nodes.&lt;br /&gt;&lt;br /&gt;My focus has been on making the memory footprint smaller, and to reduce the wire size of certain types of messages.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here are some of the optimizations that I implemented.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Discovery&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;Discovery is needed by a new member to find the coordinator when joining. It broadcasts a discovery request, and everybody in the cluster replies with a discovery response.&lt;br /&gt;&lt;br /&gt;There were 2 problems with this: first, a cluster of 1000 nodes meant that a new joiner received 1000 messages at the same time, possibly clogging up network queues and causing messages to get dropped.&lt;br /&gt;&lt;br /&gt;This was solved by staggering the sending of responses (stagger_timeout).&lt;br /&gt;&lt;br /&gt;The second problem was that every discovery response included the current view. In a cluster of 1000, this meant that 1000 responses each contained a view of 1000 members !&lt;br /&gt;&lt;br /&gt;The solution to this was that we only send back the address of the coordinator; as this is all that's needed to send a JOIN request to it. So instead of sending back (with every discovery response) 1000 addresses, we now only send back 1 address.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Digest&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;A digest used to contain the lowest, highest delivered and highest received sequence numbers (seqnos) for every member. 
They are sent back to a new joiner in a JOIN response, and they are also broadcast periodically by STABLE to purge messages delivered by everyone.&lt;br /&gt;&lt;br /&gt;The wire size would be 2 longs for every address (UUID), and 3 longs for the 3 seqnos. That's roughly 1000 * 5 * 8 = 40000 bytes for a cluster of 1000 members. Bear in mind that that's the size of &lt;b&gt;one&lt;/b&gt; digest; in a cluster of 1000, everyone broadcasts such a digest periodically (STABLE) !&lt;br /&gt;&lt;br /&gt;The first optimization was to remove the 'low' seqno; I had to change some code in the retransmitters to allow for that, but - hey - who wouldn't do that to save &lt;b&gt;8 bytes&lt;/b&gt; / STABLE message ? :-)&lt;br /&gt;&lt;br /&gt;This reduced the wire (and memory !) size of a 1000-member digest by 8'000 bytes, down to 32'000 (from 40'000).&lt;br /&gt;&lt;br /&gt;Having only highest delivered (HD) and highest received (HR) seqnos allowed for another optimization: HR is always &amp;gt;= HD, and the difference between HR and HD is usually small.&lt;br /&gt;&lt;br /&gt;So the next optimization was to send HR as a delta to HD. So instead of sending 322649 | 322650, we'd send 322649 | 1.&lt;br /&gt;&lt;br /&gt;The central optimization underlying that was that seqnos seldom need 8 bytes: a seqno starts at 1 and increases monotonically. If a member sends 5 million messages, the seqno can still be encoded in 4 bytes (saving 4 bytes per seqno). If a member is restarted, the seqno starts again at 1 and can thus be encoded in 1 byte.&lt;br /&gt;&lt;br /&gt;So now I could encode an HD/HR pair by sending a byte containing the number of bytes needed for the HD part in the higher 4 bits and the number of bytes needed for the delta in the lower 4 bits. The HD and the delta would then follow. 
Example: to encode HD=2000000 | HR=2000500, we'd generate the bytes:&lt;br /&gt;&lt;br /&gt;| 50 | -128 | -124 | 30 | -12 | 1 |&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;50 encodes a length of 3 for HD and 2 for HR-HD (500)&lt;/li&gt;&lt;li&gt;-128, -124 and 30 encode 2'000'000 in 3 bytes&lt;/li&gt;&lt;li&gt; -12 and 1 encode the delta (500)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;So instead of using 16 bytes for the above sequence, we use only 6 bytes !&lt;br /&gt;&lt;br /&gt;If we assume that we can encode 2 seqnos on average in 6 bytes, the wire size of a digest is now 1000 * (16 (UUID) + 6) = 22'000, that's down from 40'000 in a 1000 member cluster. In other words, we're saving almost 50% of the wire size of a digest !&lt;br /&gt;&lt;br /&gt;Of course, we can not only encode seqno sequences, but also other longs, which is exactly what we did for another optimization. Examples of where this makes sense are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Seqnos in NakackHeaders: every multicast message has such a header, so the savings here are significant&lt;/li&gt;&lt;li&gt;Range: this is used for retransmission requests, and is also a seqno sequence&lt;/li&gt;&lt;li&gt;RequestCorrelator IDs: used for every RPC&lt;/li&gt;&lt;li&gt;Fragmentation IDs (FRAG and FRAG2)&lt;/li&gt;&lt;li&gt;UNICAST and UNICAST2: seqnos and ranges&lt;/li&gt;&lt;li&gt;ViewId&lt;/li&gt;&lt;/ul&gt;An example of where this doesn't make sense is UUIDs: they are generated such that the bits are spread out over the entire 8 bytes, so encoding them would make 9 bytes out of 8 and that doesn't help. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;JoinRsp&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;A JoinRsp used to contain a list of members twice: once in the view and once in the digest. This duplication was eliminated, and now we're sending the member list only once. 
This also cut the wire size of a JoinRsp in half.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Further optimizations planned for 3.1 include delta views and better compressed STABLE messages:&lt;br /&gt;&lt;u&gt;&lt;br /&gt;&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Delta views&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;If we have a view of 1000 members, we always send the full address list with every view change. This is not necessary, as everybody has access to the previous view.&lt;br /&gt;&lt;br /&gt;So, for example, when we have P, Q and R joining, and X and Y leaving in V22, then we can simply send a delta view; a view V22={V21+P+Q+R-X-Y}. This means, take the current view V21, remove members X and Y, and add members P, Q and R to the tail of the list, in order to generate a new view V22.&lt;br /&gt;&lt;br /&gt;So, instead of sending a list of 1000 members, we simply send 5 members, and everybody creates the new view locally, based on the current view and the delta information.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Compressed STABLE messages&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;A STABLE message contains a digest with a list of all members and then the digest seqnos for HD and HR. Since STABLE messages are exchanged between members of the same cluster, they all have the same view, or else they would drop a STABLE message.&lt;br /&gt;&lt;br /&gt;Hence, we can drop the View and instead send the ViewId, which is 1 address and a long. Everyone knows that the digest seqnos will be in order of the current view, e.g. seqno pair 1 belongs to the first member of the current view, seqno pair 2 to the second member and so on.&lt;br /&gt;&lt;br /&gt;So instead of sending a list of 1000 members for a STABLE message, we only send 1 address.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;This will reduce the wire size of a 1000-member digest sent by STABLE from roughly 40'000 bytes to ca. 
6'000 bytes !&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Download 3.0.0.CR1&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;The optimizations (excluding delta views and compressed STABLE messages) are available in JGroups 3.0.0.CR1, which can be downloaded from [1].&lt;br /&gt;&lt;br /&gt;Enjoy (and feedback appreciated, on the mailing lists...) ! &lt;br /&gt;&lt;br /&gt;[1] &lt;a href="https://sourceforge.net/projects/javagroups/files/JGroups/3.0.0.CR1/"&gt;https://sourceforge.net/projects/javagroups/files/JGroups/3.0.0.CR1&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6748258354694704680?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6748258354694704680/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/09/optimizations-for-large-clusters.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6748258354694704680'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6748258354694704680'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/09/optimizations-for-large-clusters.html' title='Optimizations for large clusters'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-2248324447366954620</id><published>2011-07-26T15:19:00.000+02:00</published><updated>2011-07-26T15:19:12.904+02:00</updated><title 
type='text'>It's time for a change: JGroups 3.0</title><content type='html'>I'm happy to announce that I just released a first beta of JGroups 3.0 !&lt;br /&gt;&lt;br /&gt;It's been a long time since I released version 2.0 (Feb 2002); over &lt;b&gt;11 years&lt;/b&gt; and &lt;b&gt;77&lt;/b&gt; 2.x releases !&lt;br /&gt;&lt;br /&gt;We've deferred a lot of API changes to 3.x, in order to provide more features, bug fixes and optimizations in 2.x releases, which were always (API) backwards compatible to previous 2.x releases.&lt;br /&gt;&lt;br /&gt;However, now it was time to take that step and make all the changes we've accumulated over the years.&lt;br /&gt;&lt;br /&gt;The bad thing is that 3.x will require code changes if you port your 2.x app to it... however I anticipate that those changes will be trivial. Please ask questions regarding porting on the JGroups mailing list (or forums), and also post suggestions for improvements !&lt;br /&gt;&lt;br /&gt;The good thing is that I was able to remove a lot of code (ca. 
25'000 lines compared to 2.12.1) and simplify JGroups significantly.&lt;br /&gt;&lt;br /&gt;Just one example: the getState(OutputStream) callback in 2.x didn't have an exception in its signature, so an implementation would typically look like this:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;public void getState(OutputStream output) {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; try {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; marshalStateToStream(output);&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; catch(Exception ex) {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp; log.error(ex); &lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;}&lt;/div&gt;&lt;br /&gt;In 3.x, getState() is allowed to throw an exception, so the code looks like this now:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;public void getState(OutputStream output) throws Exception {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; marshalStateToStream(output);&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;}&lt;/div&gt;&lt;br /&gt;First of all, we don't need to catch (and swallow 
!) the exception. Secondly, a possible exception will now actually be passed to the state requester, so that we know *why* a state transfer failed when we call JChannel.getState().&lt;br /&gt;&lt;br /&gt;There are many small (or bigger) changes like this, which I hope will make using JGroups simpler. A list of all API changes can be found at [2].&lt;br /&gt;&lt;br /&gt;The stability of 3 beta1 is about the same as 2.12.1 (very high), because there were mainly API changes, and only a few bug fixes or optimizations.&lt;br /&gt;&lt;br /&gt;I've also created a new 3.x specific set of documentation (manual, tutorial, javadocs), for example see the 3.x manual at [3].&lt;br /&gt;&lt;br /&gt;JGroups 3 beta1 can be downloaded from [1]. Please try it out and send me your feedback (mailing lists preferred) !&lt;br /&gt;&lt;br /&gt;Enjoy !&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="https://sourceforge.net/projects/javagroups/files/JGroups/3.0.0.Beta1"&gt;https://sourceforge.net/projects/javagroups/files/JGroups/3.0.0.Beta1&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[2] &lt;a href="https://github.com/belaban/JGroups/blob/master/doc/API_Changes.txt"&gt;https://github.com/belaban/JGroups/blob/master/doc/API_Changes.txt&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[3] &lt;a href="http://www.jgroups.org/manual-3.x/html/index.html"&gt;http://www.jgroups.org/manual-3.x/html/index.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-2248324447366954620?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/2248324447366954620/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/07/its-time-for-change-jgroups-30.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/19835054/posts/default/2248324447366954620'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2248324447366954620'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/07/its-time-for-change-jgroups-30.html' title='It&apos;s time for a change: JGroups 3.0'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-5355060771057298148</id><published>2011-04-29T09:29:00.000+02:00</published><updated>2011-04-29T09:29:45.331+02:00</updated><title type='text'>Largest JGroups cluster ever: 536 nodes !</title><content type='html'>I just returned from a trip to a customer who's working on creating a large scale JGroups cluster. The largest cluster I've ever created is 32 nodes, due to the fact that I don't have access to a larger lab...&lt;br /&gt;&lt;br /&gt;I've heard of a customer who's running a 420 node cluster, but I haven't seen it with my own eyes.&lt;br /&gt;&lt;br /&gt;However, this record was surpassed on Thursday April 28 2011: &lt;b&gt;we managed to run a 536 node cluster !&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The setup was 130 celeron based blades with 1GB of memory, each running 4 JVMs with 96MB of heap, plus 4 embedded devices with 4 JVMs running on each. Each blade had 2 1GB NICs setup with IP Bonding. 
Note that the 4 processes are competing for CPU time and network IO, so with more blades or more physical memory available, I'm convinced we could go to 1000+ nodes !&lt;br /&gt;&lt;br /&gt;The configuration used was udp-largecluster.xml (with some modifications), recently created and shipped with JGroups 2.12.&lt;br /&gt;&lt;br /&gt;We started the processes in batches of 130, then waited for 20 seconds, then launched the second batch and so on. The reason we staggered the startup was to reduce the number of merges, which would have increased the startup time.&lt;br /&gt;&lt;br /&gt;Running this a couple of times (plus 50+ times over night), the cluster always formed fine, and most of the time we didn't have any merges at all.&lt;br /&gt;&lt;br /&gt;It took around 150-200 seconds (including the 5 sleeps of 20 seconds each) to start the cluster; in the picture at the bottom we see a run that took 176 seconds.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;span style="font-size: small;"&gt;Changes to JGroups&lt;/span&gt;&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;This large scale setup revealed that certain protocols need slight modifications to optimally support large clusters. A few of these changes are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Discovery: the current view is sent back with every discovery response. This is not normally an issue, but if you have a view of 500+ members, then the size of a discovery response becomes huge. We'll fix this by returning only the coordinator's address and not the view. For discovery requests triggered by MERGE2, we'll return the ViewId instead of the entire view.&lt;/li&gt;&lt;li&gt;We're thinking about canonicalizing UUIDs with IDs, so nodes will be assigned unique (short) IDs instead of UUIDs. This would reduce the in-memory size of an address from 17 bytes (UUID) to 2 bytes (short).&lt;/li&gt;&lt;li&gt;STABLE messages: here, we return an array of members plus a digest (containing 3 longs) for *each* member. 
This also generates large messages (11K for 260 nodes).&lt;/li&gt;&lt;li&gt;The fix in general for these problems is to reduce the data sent, e.g. by compressing the view, or not sending it at all, if possible. For digests, we can also reduce the data sent by sending only diffs, by sending only 1 long and using shorts for diffs, by using bitsets representing offsets to a previously sent value, and so on.&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;Ideas are abundant, we now need to see which one is the most efficient.&lt;br /&gt;&lt;br /&gt;For now, 536 nodes is an excellent number and - remember - we got to this number *without* the changes discussed above ! I'm convinced we can easily go higher, e.g. to 1000 nodes, without any changes. However, to reach 2000 nodes, the above changes will probably be required.&lt;br /&gt;&lt;br /&gt;Anyway, I'm very happy to see this new record !&lt;br /&gt;&lt;br /&gt;If anyone has created an even larger cluster, I'd be very interested in hearing about it !&lt;br /&gt;Cheers, and happy clustering,&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-F2Zeve6pKwc/TbpfdjpnH7I/AAAAAAAAADE/IUGClF-Um6s/s1600/collector.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="640" src="http://4.bp.blogspot.com/-F2Zeve6pKwc/TbpfdjpnH7I/AAAAAAAAADE/IUGClF-Um6s/s640/collector.jpg" width="408" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-5355060771057298148?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/5355060771057298148/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://belaban.blogspot.com/2011/04/largest-jgroups-cluster-ever-536-nodes.html#comment-form' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5355060771057298148'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5355060771057298148'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/04/largest-jgroups-cluster-ever-536-nodes.html' title='Largest JGroups cluster ever: 536 nodes !'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-F2Zeve6pKwc/TbpfdjpnH7I/AAAAAAAAADE/IUGClF-Um6s/s72-c/collector.jpg' height='72' width='72'/><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-1109086974365275842</id><published>2011-04-01T09:05:00.000+02:00</published><updated>2011-04-01T09:05:32.405+02:00</updated><title type='text'>JBossWorld 2011 around the corner</title><content type='html'>Wanted to let you know that I've got 2 talks at JBW (Boston, May 3-6).&lt;br /&gt;&lt;br /&gt;The first talk [1] is about geographic failover of JBoss clusters. I'll show 2 clusters, one in NYC, the other one in ZRH. Both are completely independent and don't know about each other. 
However, they're bridged with a JGroups RELAY and therefore appear as if they were one big virtual cluster.&lt;br /&gt;&lt;br /&gt;This can be used for geographic failover, but it could also be used for example to extend a private cloud with an external, public cloud without having to use a hardware VPN device.&lt;br /&gt;&lt;br /&gt;As always with my talks, this will be demo'ed, so you know this isn't just vapor ware !&lt;br /&gt;&lt;br /&gt;The second talk [2] discusses 5 different ways of running a JBoss cluster on EC2. I'll show 2 demos, one of which works only on EC2, the other works on all clouds.&lt;br /&gt;&lt;br /&gt;This will be a fun week, followed by a week of biking in the Bay Area ! YEAH !!&lt;br /&gt;&lt;br /&gt;Hope to see and meet many of you in Boston !&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://www.redhat.com/summit/sessions/best-of.html#66"&gt;http://www.redhat.com/summit/sessions/best-of.html#66&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[2] &lt;a href="http://www.redhat.com/summit/sessions/jboss.html#43"&gt;http://www.redhat.com/summit/sessions/jboss.html#43&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-1109086974365275842?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/1109086974365275842/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/04/jbossworld-2011-around-corner.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/1109086974365275842'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/1109086974365275842'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/04/jbossworld-2011-around-corner.html' 
title='JBossWorld 2011 around the corner'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-231880159117200515</id><published>2011-03-11T13:33:00.000+01:00</published><updated>2011-03-11T13:33:19.368+01:00</updated><title type='text'>A quick update on performance of JGroups 2.12.0.Final</title><content type='html'>I forgot to add performance data to the release announcement of 2.12.0.Final, so here it is.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Caveat: this is a quick check to see if we have a performance regression, which I run routinely before a release, and by no means a comprehensive performance test !&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I ran this both on my home cluster and our internal lab.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;org.jgroups.tests.perf.Test&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This test is described in detail in [1]. It forms a cluster of 4 nodes, and every node sends 1 million messages of varying size (1K, 5K, 20K). We measure how long it takes for every node to receive the 4 million messages, and compute the message rate and throughput, per second, per node.&lt;br /&gt;&lt;br /&gt;This is my home cluster and consists of 4 HP ProLiant DL380G5 quad core servers (ca 3700 bogomips), connected to a GB switch, and running Linux 2.6. The JDK is 1.6 and the heap size is 600M. I ran 1 process on every box. 
The configuration used was udp.xml (using IP multicasting) shipped with JGroups.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Results&lt;/u&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&amp;nbsp; 1K message size: 140 MBytes / sec / node&lt;/li&gt;&lt;li&gt;&amp;nbsp; 5K message size: 153 MBytes / sec / node&lt;/li&gt;&lt;li&gt;20K message size: 154 MBytes / sec / node&lt;/li&gt;&lt;/ul&gt;&amp;nbsp;This shows that GB ethernet is saturated. The reason that every node receives more than the limit of GB ethernet (~ 125 MBytes/sec) is that every node loops back its own traffic, and therefore doesn't have to share it with other incoming packets. In theory, the max throughput should therefore be 4/3 * 125 ~= 166 MBytes/sec. We see that the numbers above are not too far away from this.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;org.jgroups.tests.UnicastTestRpcDist&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This test mimics the way Infinispan's DIST mode works.&lt;br /&gt;&lt;br /&gt;Again, we form a cluster of between 1 and 9 nodes. Every node is on a separate machine. The test then has every node invoke 2 unicast RPCs in randomly selected nodes. With a chance of 80% the RPCs are reads, and with a chance of 20% they're writes. The writes carry a payload of 1K, and the reads return a payload of 1K. 
Every node makes 20'000 RPCs.&lt;br /&gt;&lt;br /&gt;The hardware is a bit more powerful than my home cluster; every machine has 5300 bogomips, and all machines are connected with GB ethernet.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Results&lt;/u&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;1 node:&amp;nbsp;&amp;nbsp; 50'000 requests / sec / node&lt;/li&gt;&lt;li&gt;2 nodes: 23'000 requests / sec / node&lt;/li&gt;&lt;li&gt;3 nodes: 20'000 requests / sec / node&lt;/li&gt;&lt;li&gt;4 nodes: 20'000 requests / sec / node&lt;/li&gt;&lt;li&gt;5 nodes: 20'000 requests / sec / node&lt;/li&gt;&lt;li&gt;6 nodes: 20'000 requests / sec / node&lt;/li&gt;&lt;li&gt;7 nodes: 20'000 requests / sec / node&lt;/li&gt;&lt;li&gt;8 nodes: 20'000 requests / sec / node&lt;/li&gt;&lt;li&gt;9 nodes: 20'000 requests / sec / node &lt;/li&gt;&lt;/ul&gt;As can be seen, the number of requests per node is the same after 2-3 nodes. The 1 node scenario is somewhat contrived as there is no network communication involved.&lt;br /&gt;&lt;br /&gt;This is actually good news: since the rate per node stays constant, the overall performance of the cluster grows linearly with the number of nodes. 
As a matter of fact, with increasing cluster size, the chance of more than 2 nodes picking the same target decreases, so performance degradation due to (write) access conflicts is likely to decrease.&lt;br /&gt;&lt;br /&gt;Caveat: I haven't tested this on a larger cluster yet, but the current performance is already very promising.&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://community.jboss.org/docs/DOC-11594"&gt;http://community.jboss.org/docs/DOC-11594&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-231880159117200515?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/231880159117200515/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/03/quick-update-on-performance-of-jgroups.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/231880159117200515'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/231880159117200515'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/03/quick-update-on-performance-of-jgroups.html' title='A quick update on performance of JGroups 2.12.0.Final'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-1501294311192346790</id><published>2011-03-09T17:55:00.001+01:00</published><updated>2011-03-09T17:58:12.118+01:00</updated><title type='text'>It took me 9 years to go from JGroups 
2.0.0 to 2.12.0</title><content type='html'>Yes, you heard right: I released JGroups 2.0.0, new, shiny and refactored, in Feb 2002.&lt;br /&gt;&lt;br /&gt;I just released JGroups 2.12.0.Final, which will be the last minor release on the 2.x branch. (There won't be a 2.13; bug fixes will go into 2.12.x).&lt;br /&gt;&lt;br /&gt;Time difference: 9 years and change...:-)&lt;br /&gt;&lt;br /&gt;I'm still investigating why it took me so long !&lt;br /&gt;&lt;br /&gt;Anyway, 2.12.0.Final is here and it is an important release, as it will be shipped in Infinispan 4.2.1 and JBoss 6.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Below are the major features and bug fixes.&lt;br /&gt;&lt;br /&gt;On to 3.0 !&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;b&gt;Release Notes JGroups 2.12&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;JGroups 2.12 is API-backwards compatible with previous versions (down to 2.2.7).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: small;"&gt;New features&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;br /&gt;RELAY: connecting local (autonomous) clusters into a large virtual cluster&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-747"&gt;https://issues.jboss.org/browse/JGRP-747&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;A new protocol to connect 2 geographically separate sites into 1 large virtual cluster. 
The local clusters are&lt;br /&gt;completely autonomous, but RELAY makes them appear as if they were one.&lt;br /&gt;&lt;br /&gt;This can for example be used to implement geographic failover.&lt;br /&gt;&lt;br /&gt;Blog: &lt;a href="http://belaban.blogspot.com/2010/11/clustering-between-different-sites.html"&gt;http://belaban.blogspot.com/2010/11/clustering-between-different-sites.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;LockService: a new distributed locking service&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1249"&gt;https://issues.jboss.org/browse/JGRP-1249&lt;/a&gt;]&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1298"&gt;https://issues.jboss.org/browse/JGRP-1298&lt;/a&gt;]&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1278"&gt;https://issues.jboss.org/browse/JGRP-1278&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;New distributed lock service, offering a java.util.concurrent.locks.Lock implementation (including conditions)&lt;br /&gt;providing cluster wide locks.&lt;br /&gt;&lt;br /&gt;Blog: &lt;a href="http://belaban.blogspot.com/2011/01/new-distributed-locking-service-in.html"&gt;http://belaban.blogspot.com/2011/01/new-distributed-locking-service-in.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Distributed ExecutorService&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1300"&gt;https://issues.jboss.org/browse/JGRP-1300&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;New implementation of java.util.concurrent.ExecutorService over JGroups (contributed by William Burns).&lt;br /&gt;Read the documentation at www.jgroups.org for details.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;BPING (Broadcast Ping): new discovery protocol based on broadcasting&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1269"&gt;https://issues.jboss.org/browse/JGRP-1269&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;This is 
mainly used for discovery of JGroups on Android based phones. Apparently, IP multicasting is not correctly implemented / supported on Android (2.1), and so we have to resort to UDP broadcasting.&lt;br /&gt;&lt;br /&gt;Blog: &lt;a href="http://belaban.blogspot.com/2011/01/jgroups-on-android-phones.html"&gt;http://belaban.blogspot.com/2011/01/jgroups-on-android-phones.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;br /&gt;JDBC_PING: new discovery protocol using a shared database&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1231"&gt;https://issues.jboss.org/browse/JGRP-1231&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;All nodes use a shared DB (e.g. RDS on EC2) to place their location information into, and to read information from.&lt;br /&gt;Thanks to Sanne for coming up with the idea and for implementing this !&lt;br /&gt;Additional information is on the wiki: community.jboss.org/wiki/JDBCPING&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;FD_SOCK: ability to pick the bind address and port for the client socket&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1262"&gt;https://issues.jboss.org/browse/JGRP-1262&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;br /&gt;Pluggable address generation&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1297"&gt;https://issues.jboss.org/browse/JGRP-1297&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Address generation is now pluggable; JChannel.setAddressGenerator(AddressGenerator) allows for generation of specific implementations of Address. This can for example be used to pass additional information along with every address. 
Currently used by RELAY to pass the name of the sub cluster around with a UUID.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Optimizations&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;br /&gt;NAKACK: retransmitted messages don't need to be wrapped&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1266"&gt;https://issues.jboss.org/browse/JGRP-1266&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Not serializing retransmitted messages at the retransmitter and deserializing them at the requester saves&lt;br /&gt;1 serialization and 1 deserialization per retransmitted message.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Faster NakReceiverWindow&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1133"&gt;https://issues.jboss.org/browse/JGRP-1133&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Various optimizations to reduce locking in NakReceiverWindow:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Use of RetransmitTable (array-based matrix) rather than HashMap (reduced memory need, reduced locking, compaction)&lt;/li&gt;&lt;li&gt;Removal of double locking&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Bug fixes&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;NAKACK: incorrect digest on merge and state transfer&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1251"&gt;https://issues.jboss.org/browse/JGRP-1251&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;When calling JChannel.getState() on a merge, the fetched state would overwrite the digest incorrectly.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;AUTH: merge can bypass authorization&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1255"&gt;https://issues.jboss.org/browse/JGRP-1255&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;AUTH would not check creds of other members in case of a merge. 
This allowed an unauthorized node to join a cluster by triggering a merge.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Custom SocketFactory ignored&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1276"&gt;https://issues.jboss.org/browse/JGRP-1276&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;A custom SocketFactory was ignored even though it had been set.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;UFC: crash of depleted member could hang node&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1274"&gt;https://issues.jboss.org/browse/JGRP-1274&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;The crash caused the node to wait forever for credits from the crashed member.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;br /&gt;Flow control: crash of member doesn't unblock sender&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1283"&gt;https://issues.jboss.org/browse/JGRP-1283&lt;/a&gt;]&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1287"&gt;https://issues.jboss.org/browse/JGRP-1287&lt;/a&gt;]&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1274"&gt;https://issues.jboss.org/browse/JGRP-1274&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;When a sender blocks waiting for credits from P, and P crashes before being able to send them,&lt;br /&gt;the sender blocks indefinitely.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;UNICAST2: incorrect delivery order under stress&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1267"&gt;https://issues.jboss.org/browse/JGRP-1267&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;UNICAST2 could (in rare cases) deliver messages in incorrect order. 
Fixed by using the same (proven)&lt;br /&gt;algorithm as NAKACK.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Incorrect conversion of TimeUnit if MILLISECONDS were not used&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1277"&gt;https://issues.jboss.org/browse/JGRP-1277&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Check if bind_addr is correct&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1280"&gt;https://issues.jboss.org/browse/JGRP-1280&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;JGroups now verifies that the bind address is indeed a valid IP address: it has to be either the wildcard&lt;br /&gt;address (0.0.0.0) or an address of a network interface that is up.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;ENCRYPT: sym_provider ignored&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="https://issues.jboss.org/browse/JGRP-1279"&gt;https://issues.jboss.org/browse/JGRP-1279&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Property sym_provider is ignored&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Manual&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The manual is online at &lt;a href="http://www.jgroups.org/manual/html/index.html"&gt;http://www.jgroups.org/manual/html/index.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The complete list of features and bug fixes can be found at &lt;a href="http://jira.jboss.com/jira/browse/JGRP"&gt;http://jira.jboss.com/jira/browse/JGRP&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Download the new release at &lt;a href="https://sourceforge.net/projects/javagroups/files/JGroups/2.12.0.Final"&gt;https://sourceforge.net/projects/javagroups/files/JGroups/2.12.0.Final&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Bela Ban, Kreuzlingen, Switzerland&lt;br /&gt;Vladimir Blagojevic, Toronto, Canada&lt;br /&gt;Richard Achmatowicz, Toronto, Canada&lt;br /&gt;Sanne Grinovero, Newcastle, Great Britain&lt;br /&gt;&lt;br /&gt;March 2011&lt;div class="blogger-post-footer"&gt;&lt;img width='1' 
height='1' src='https://blogger.googleusercontent.com/tracker/19835054-1501294311192346790?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/1501294311192346790/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/03/it-took-me-9-years-to-go-from-jgroups.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/1501294311192346790'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/1501294311192346790'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/03/it-took-me-9-years-to-go-from-jgroups.html' title='It took me 9 years to go from JGroups 2.0.0 to 2.12.0'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-7009385939285434884</id><published>2011-01-22T09:46:00.000+01:00</published><updated>2011-01-22T09:46:06.896+01:00</updated><title type='text'>JGroups on Android phones</title><content type='html'>Yann Sionneau recently completed a port of JGroups to Android (2.1+). He took the 2.11 version of JGroups and removed classes which weren't available on Android, and changed some code to make JGroups run on Android.&lt;br /&gt;&lt;br /&gt;The QR code for a demo app (based on Draw) is available at [1]. Point a QR code scanner to it, download the app and run it on your Android based phone (I ran it on my HTC Desire). Then start Draw on your local computer, connected to the same wifi network as the phone. 
The instances, whether run on the phone or computers, should find each other and form a cluster.&lt;br /&gt;&lt;br /&gt;It was cool to draw some lines on my HTC and see them getting drawn on all cluster instances as well !&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://sionneau.net/index.php?option=com_content&amp;amp;view=article&amp;amp;id=12%3Atouchsurface-android-app-now-pc-compatible-&amp;amp;catid=3%3Adivers&amp;amp;Itemid=2&amp;amp;lang=en"&gt;http://sionneau.net/index.php?option=com_content&amp;amp;view=article&amp;amp;id=12%3Atouchsurface-android-app-now-pc-compatible-&amp;amp;catid=3%3Adivers&amp;amp;Itemid=2&amp;amp;lang=en&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-7009385939285434884?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/7009385939285434884/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/01/jgroups-on-android-phones.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7009385939285434884'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7009385939285434884'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/01/jgroups-on-android-phones.html' title='JGroups on Android phones'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8079579309142659969</id><published>2011-01-21T11:32:00.000+01:00</published><updated>2011-01-21T11:32:15.760+01:00</updated><title type='text'>New distributed locking service in JGroups</title><content type='html'>I just uploaded JGroups 2.12.0.Beta1, which contains a first version of the new distributed locking service (LockService), which replaces DistributedLockManager.&lt;br /&gt;&lt;br /&gt;LockService provides a distributed implementation of java.util.concurrent.locks.Lock. A lock is named and locking granularity is per thread. Here's an example of how to use it:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;// lock.xml has to have a locking protocol in it&lt;/span&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;JChannel ch=new JChannel("/home/bela/lock.xml");&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;LockService lock_service=new LockService(ch);&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;Lock lock=lock_service.getLock("mylock");&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;if(lock.tryLock(2000, TimeUnit.MILLISECONDS)) {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; try {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // access the resource protected by "mylock" &lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier 
New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; finally {&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp; &amp;nbsp; lock.unlock(); &lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/div&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If "mylock" is held by a different thread (whether in the same JVM, on the same box, or somewhere else in the same cluster) and is not released within 2 seconds, tryLock() will return false; otherwise it'll return true.&lt;br /&gt;&lt;br /&gt;Lock.newCondition() is currently not implemented - if there's a need for this, let us know on one of the JGroups mailing lists and we'll tackle this. If you have a chance to play with LockService, we're also grateful for feedback.&lt;br /&gt;&lt;br /&gt;The new locking service is part of 2.12.0.Beta1, which can be downloaded at [1]. 
Documentation is at [2].&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] http://sourceforge.net/projects/javagroups/files/JGroups/2.12.0.Beta1&lt;br /&gt;[2] http://www.jgroups.org/manual/html/index.html, section 4.6&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8079579309142659969?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8079579309142659969/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2011/01/new-distributed-locking-service-in.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8079579309142659969'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8079579309142659969'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2011/01/new-distributed-locking-service-in.html' title='New distributed locking service in JGroups'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6134969839128119142</id><published>2010-11-30T14:00:00.000+01:00</published><updated>2010-11-30T14:00:06.696+01:00</updated><title type='text'>Clustering between different sites / geopgraphic failover</title><content type='html'>I just completed a new feature in JGroups which allows for transparent bridging of separate clusters, e.g. 
at different sites.&lt;br /&gt;&lt;br /&gt;Let's say we have a (local) cluster in New York (NYC) and another  cluster in San Francisco (SFO). They're completely autonomous, and can even have completely different configurations.&lt;br /&gt;&lt;br /&gt;RELAY [1] essentially has the coordinators of the local clusters relay local traffic to the remote cluster, and vice versa. The relaying (or bridging) is done via a separate cluster, usually based on TCP, as IP multicasting is typically not allowed between sites.&lt;br /&gt;&lt;br /&gt;SFO could be a backup of NYC, or both could be active, or we could think of a follow-the-sun model where each cluster is active during working hours at its site.&lt;br /&gt;&lt;br /&gt;If we have nodes {A,B,C} in NYC and {D,E,F} in SFO, then there would be a global view, e.g. {D,E,F,A,B,C}, which is the same across all the nodes of both clusters.&lt;br /&gt;&lt;br /&gt;One use of RELAY could be to provide geographic failover in case of site failures. Because all of the data in NYC is also available in SFO, clients can simply fail over from NYC to SFO if the entire NYC site goes down, and continue to work.&lt;br /&gt;&lt;br /&gt;Another use case is to have SFO act as a read-only copy of NYC, and run data analysis functions on SFO, without disturbing NYC, and with access to almost real-time data.&lt;br /&gt;&lt;br /&gt;As you can guess, this feature is going to be used by &lt;a href="http://www.infinispan.org/"&gt;Infinispan&lt;/a&gt;, and since Infinispan serves as the data replication / distribution layer in JBoss, we hope to be able to provide replication / distribution between sites in JBoss as well...&lt;br /&gt;&lt;br /&gt;Exciting times ... stay tuned for more interesting news from the Infinispan team ! 
&lt;br /&gt;&lt;br /&gt;Read more on RELAY at [1] and provide feedback !&lt;br /&gt;Cheers,&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] http://www.jgroups.org/manual/html/user-advanced.html#RelayAdvanced&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6134969839128119142?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6134969839128119142/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/11/clustering-between-different-sites.html#comment-form' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6134969839128119142'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6134969839128119142'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/11/clustering-between-different-sites.html' title='Clustering between different sites / geopgraphic failover'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-7331864158343070147</id><published>2010-11-23T13:28:00.000+01:00</published><updated>2010-11-23T13:28:45.540+01:00</updated><title type='text'>JGroups finally has a logo</title><content type='html'>After conducting a vote on the logos designed by James Cobb, the vast majority voted for logo #1. 
So I'm happy to say that, after 12 years, JGroups finally has a logo !&lt;br /&gt;&lt;br /&gt;I added the logo and favicon to &lt;a href="http://jgroups.org/"&gt;jgroups.org&lt;/a&gt;. Let me know what you think !&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;There's also swag available on &lt;a href="http://www.cafepress.com/jbossorg/7489623"&gt;cafepress&lt;/a&gt;, check it out !&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-7331864158343070147?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/7331864158343070147/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/11/jgroups-finally-has-logo.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7331864158343070147'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7331864158343070147'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/11/jgroups-finally-has-logo.html' title='JGroups finally has a logo'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-2591150914530319170</id><published>2010-10-29T14:49:00.001+02:00</published><updated>2010-10-29T14:51:12.661+02:00</updated><title type='text'>JGroups 2.11 final released</title><content type='html'>FYI,&lt;br /&gt;&lt;br /&gt;2.11.0.final can be downloaded &lt;a 
href="http://sourceforge.net/projects/javagroups/files/"&gt;here&lt;/a&gt;. Its main features, optimizations and bug fixes are listed below.&lt;br /&gt;&lt;br /&gt;I hope that 2.12 will be the &lt;span style="font-weight: bold;"&gt;last&lt;/span&gt; release before &lt;span style="font-weight: bold;"&gt;finally&lt;/span&gt; going to 3.0 !&lt;br /&gt;&lt;br /&gt;2.12 should be very small; currently it contains only 8 issues (mainly optimizations).&lt;br /&gt;&lt;br /&gt;However, I also moved &lt;a href="https://jira.jboss.org/browse/JGRP-747"&gt;RELAY&lt;/a&gt; from 3.x to 2.12.&lt;br /&gt;&lt;br /&gt;RELAY allows for connecting geographically separate clusters into a large virtual cluster. This will be interesting to apps which need to provide geographic failover. More on this in the next couple of weeks...&lt;br /&gt;&lt;br /&gt;Meanwhile ... enjoy 2.11 !&lt;br /&gt;&lt;br /&gt;Bela, Vladimir &amp;amp; Richard&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Release Notes JGroups 2.11&lt;br /&gt;==========================&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Version: $Id: ReleaseNotes-2.11.txt,v 1.2 2010/10/29 11:45:35 belaban Exp $&lt;br /&gt;Author: Bela Ban&lt;br /&gt;&lt;br /&gt;JGroups 2.11 is API-backwards compatible with previous versions (down to 2.2.7).&lt;br /&gt;&lt;br /&gt;Below is a summary (with links to the detailed description) of the major new features.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;New features&lt;br /&gt;============&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;AUTH: pattern matching to prevent unauthorized joiners&lt;br /&gt;------------------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-996"&gt;https://jira.jboss.org/browse/JGRP-996&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;New plugin for AUTH which can use pattern matching against regular expressions to prevent unauthorized IP addresses from joining a cluster.&lt;br /&gt;&lt;br /&gt;Blog: &lt;a 
href="http://belaban.blogspot.com/2010/09/cluster-authentication-with-pattern.html"&gt;http://belaban.blogspot.com/2010/09/cluster-authentication-with-pattern.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;DAISYCHAIN: implementation of daisy chaining&lt;br /&gt;--------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1021"&gt;https://jira.jboss.org/browse/JGRP-1021&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Daisy chaining sends messages around in a ring, improving throughput for non IP multicast networks.&lt;br /&gt;&lt;br /&gt;Blog: &lt;a href="http://belaban.blogspot.com/2010/08/daisychaining-in-clouds.html"&gt;http://belaban.blogspot.com/2010/08/daisychaining-in-clouds.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;New flow control protocols for unicast (UFC) and multicast (MFC) messages&lt;br /&gt;-------------------------------------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1154"&gt;https://jira.jboss.org/browse/JGRP-1154&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;MFC and UFC replace FC. 
They can be used independently, and they perform better than FC alone.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;API for programmatic creation of channel&lt;br /&gt;----------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1245"&gt;https://jira.jboss.org/browse/JGRP-1245&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Allows for programmatic creation of a JChannel; no XML config file is needed.&lt;br /&gt;&lt;br /&gt;Blog: &lt;a href="http://belaban.blogspot.com/2010/10/programmatic-creation-of-channel.html"&gt;http://belaban.blogspot.com/2010/10/programmatic-creation-of-channel.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;S3: new features&lt;br /&gt;----------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1234"&gt;https://jira.jboss.org/browse/JGRP-1234&lt;/a&gt;] Allow use of public buckets (no credentials need to be sent)&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1235"&gt;https://jira.jboss.org/browse/JGRP-1235&lt;/a&gt;] Pre-signed URLs&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;STOMP: new protocol to allow STOMP clients to talk to a JGroups node&lt;br /&gt;--------------------------------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1248"&gt;https://jira.jboss.org/browse/JGRP-1248&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Blog: &lt;a href="http://belaban.blogspot.com/2010/10/stomp-for-jgroups.html"&gt;http://belaban.blogspot.com/2010/10/stomp-for-jgroups.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Optimizations&lt;br /&gt;=============&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;NAKACK: simplify and optimize handling of OOB messages&lt;br /&gt;------------------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1104"&gt;https://jira.jboss.org/browse/JGRP-1104&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discovery: reduce number of discovery 
responses sent in a large cluster&lt;br /&gt;-----------------------------------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1181"&gt;https://jira.jboss.org/browse/JGRP-1181&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;A new property (max_rank) determines who will and who won't send discovery responses.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;New timer implementations&lt;br /&gt;-------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1051"&gt;https://jira.jboss.org/browse/JGRP-1051&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Much more efficient implementations of the timer (TimeScheduler).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Bug fixes&lt;br /&gt;=========&lt;br /&gt;&lt;br /&gt;ENCRYPT: encrypt entire message when length=0&lt;br /&gt;---------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1242"&gt;https://jira.jboss.org/browse/JGRP-1242&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;ENCRYPT would not encrypt messages whose length = 0&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;FD_ALL: reduce number of messages sent on suspicion&lt;br /&gt;---------------------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1241"&gt;https://jira.jboss.org/browse/JGRP-1241&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;FILE_PING: empty files stop discovery&lt;br /&gt;-------------------------------------&lt;br /&gt;[&lt;a href="https://jira.jboss.org/browse/JGRP-1246"&gt;https://jira.jboss.org/browse/JGRP-1246&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Manual&lt;br /&gt;======&lt;br /&gt;&lt;br /&gt;The manual is online at http://www.jgroups.org/manual/html/index.html&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The complete list of features and bug fixes can be found at &lt;a href="http://jira.jboss.com/jira/browse/JGRP"&gt;http://jira.jboss.com/jira/browse/JGRP&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Bela 
Ban, Kreuzlingen, Switzerland&lt;br /&gt;Vladimir Blagojevic, Toronto, Canada&lt;br /&gt;Richard Achmatowicz, Toronto, Canada&lt;br /&gt;&lt;br /&gt;Nov 2010&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-2591150914530319170?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/2591150914530319170/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/10/fyi-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2591150914530319170'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2591150914530319170'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/10/fyi-2.html' title='JGroups 2.11 final released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-525496207368093635</id><published>2010-10-27T18:04:00.000+02:00</published><updated>2010-10-27T18:04:54.033+02:00</updated><title type='text'>STOMP for JGroups</title><content type='html'>FYI, &lt;br /&gt;&lt;br /&gt;I've written a new JGroups protocol STOMP, which implements the &lt;a href="http://stomp.codehaus.org/"&gt;STOMP protocol&lt;/a&gt;. 
This allows STOMP clients to connect to any JGroups server node (which has the JGroups STOMP protocol in its configuration).&lt;br /&gt;&lt;br /&gt;The benefits of this are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Clients can be written in any language. For example, I've used &lt;a href="http://code.google.com/p/stomppy"&gt;stomppy&lt;/a&gt;, a Python client, to connect to JGroups server nodes, subscribe to destinations, and send and receive messages.&lt;/li&gt;&lt;li&gt;Sometimes, clients don't want to be peers, i.e. they don't want to join a cluster and become full members. These (light-weight) clients could also be in a different geographic location, and not be able to use IP multicasting.&lt;/li&gt;&lt;li&gt;Clients are started and stopped frequently, and there might be many of them. Frequently starting and stopping a full-blown JGroups server node has a cost, and is not recommended. Besides, a high churn rate might move the cluster coordinator around quite a lot, preventing it from doing real work.&lt;/li&gt;&lt;li&gt;We can easily scale to a large number of clients. Although every client requires 1 thread on the server side, we can easily support hundreds of clients. Note though that I wouldn't use the current JGroups STOMP protocol to connect thousands of clients...&lt;/li&gt;&lt;/ul&gt;Let's take a quick look: I started an instance of JGroups with STOMP at the top of the protocol stack (on 192.168.1.5). 
Then I connected to it with the JGroups client:&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;img border="0" height="349" src="http://4.bp.blogspot.com/_NUNlUHL8KjI/TMhIt2Pd90I/AAAAAAAAACw/4-zMCQQagiw/s640/JavaClient.png" style="margin-left: auto; margin-right: auto;" width="640" /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;JGroups STOMP client&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_NUNlUHL8KjI/TMhIt2Pd90I/AAAAAAAAACw/4-zMCQQagiw/s1600/JavaClient.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;As can be seen, the first response the client received was an INFO with information about the available endpoints (STOMP instances) in the cluster. This is actually used by the StompConnection client to failover to a different server node should the currently connected to server fail.&lt;br /&gt;Next, we subscribe to destination /a using the simplified syntax of the JGroups STOMP client. 
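&lt;br /&gt;&lt;br /&gt;For readers who want to see what such a subscription looks like on the wire, here is a minimal sketch in plain Java of how a STOMP frame is laid out according to the STOMP spec: a verb line, optional "name:value" header lines, an empty line, an optional body and a terminating 0 byte. The class and method names (StompFrameSketch, build) are made up for illustration; this is not the JGroups StompConnection API.&lt;br /&gt;

```java
class StompFrameSketch {

    // Lays out one STOMP frame: verb, header lines, empty line, optional body, 0 byte.
    // Hypothetical helper for illustration only, not part of JGroups.
    static String build(String verb, String body, String... headers) {
        StringBuilder sb = new StringBuilder();
        sb.append(verb).append('\n');          // first line is the verb, e.g. SUBSCRIBE
        for (String header : headers) {
            sb.append(header).append('\n');    // each header has the form "name:value"
        }
        sb.append('\n');                       // an empty line ends the header section
        if (body != null) {
            sb.append(body);                   // SUBSCRIBE has no body, SEND usually does
        }
        sb.append('\0');                       // every frame is terminated by a 0 byte
        return sb.toString();
    }

    public static void main(String[] args) {
        String subscribe = build("SUBSCRIBE", null, "destination:/a");
        String send = build("SEND", "hello world", "destination:/a");
        // Show the 0 byte as ^@ so that it is visible on a terminal
        System.out.println(subscribe.replace("\0", "^@"));
        System.out.println(send.replace("\0", "^@"));
    }
}
```

&lt;br /&gt;The sketch only builds the text of a frame; sending it over a socket to a node running the STOMP protocol is left out.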
&lt;br /&gt;&lt;br /&gt;Then, a telnet session to 192.168.1.5:8787 was started:&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_NUNlUHL8KjI/TMhJ4DwCZRI/AAAAAAAAAC4/uZQpqYTguJQ/s1600/TelnetClient.png" style="margin-left: auto; margin-right: auto;" /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Telnet STOMP client&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_NUNlUHL8KjI/TMhJ4DwCZRI/AAAAAAAAAC4/uZQpqYTguJQ/s1600/TelnetClient.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;We get the INFO response with the list of endpoints too here. Then we subscribe to the /a destination. Note that the syntax used here is compliant with the STOMP protocol spec: first is the verb (SUBSCRIBE), then an optional bunch of headers (here just one, defining the destination to subscribe to), a newline and finally the body, terminated with a 0 byte. (SUBSCRIBE does not have a body).&lt;br /&gt;&lt;br /&gt;Next, we send a message to all clients subscribed to /a. This is the telnet session itself, as evidenced by the reception of MESSAGE. 
If you look at the JGroups STOMP client, the message is also received there.&lt;br /&gt;&lt;br /&gt;Next the JGroups client also sends a message to destination /a, which is received by itself and the telnet client.&lt;br /&gt;&lt;br /&gt;JGroups 2.11.0.Beta2 also ships with a 'stompified' Draw demo, org.jgroups.demos.StompDraw, which is a stripped down version of Draw, using the STOMP protocol to send updates to the cluster.&lt;br /&gt;&lt;br /&gt;Let me know what you think of this; feature requests, feedback etc appreciated (preferably on one of the JGroups mailing lists) !&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The new protocol is part of JGroups 2.11.0.Beta2, which can be downloaded &lt;a href="http://sourceforge.net/projects/javagroups/files/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Documentation is &lt;a href="http://www.jgroups.org/manual/html/protlist.html#STOMP"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Enjoy !&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-525496207368093635?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/525496207368093635/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/10/stomp-for-jgroups.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/525496207368093635'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/525496207368093635'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/10/stomp-for-jgroups.html' title='STOMP for JGroups'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_NUNlUHL8KjI/TMhIt2Pd90I/AAAAAAAAACw/4-zMCQQagiw/s72-c/JavaClient.png' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6492724544359762164</id><published>2010-10-20T17:03:00.000+02:00</published><updated>2010-10-20T17:03:36.711+02:00</updated><title type='text'>Programmatic creation of a channel</title><content type='html'>I've committed code which provides programmatic creation of channels. This is a way of creating a channel without XML config files. So instead of writing&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;JChannel ch=new JChannel("udp.xml");&lt;/div&gt;&lt;br /&gt;I can construct the channel programmatically:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre class="screen"&gt;JChannel ch=new JChannel(false);         // 1&lt;br /&gt;ProtocolStack stack=new ProtocolStack(); // 2&lt;br /&gt;ch.setProtocolStack(stack);              // 3&lt;br /&gt;stack.addProtocol(new UDP().setValue("ip_ttl", 8))&lt;br /&gt;     .addProtocol(new PING())&lt;br /&gt;     .addProtocol(new MERGE2())&lt;br /&gt;     .addProtocol(new FD_SOCK())&lt;br /&gt;     .addProtocol(new FD_ALL().setValue("timeout", 12000))&lt;br /&gt;     .addProtocol(new VERIFY_SUSPECT())&lt;br /&gt;     .addProtocol(new BARRIER())&lt;br /&gt;     .addProtocol(new NAKACK())&lt;br /&gt;     .addProtocol(new UNICAST2())&lt;br /&gt;     .addProtocol(new STABLE())&lt;br /&gt;     .addProtocol(new GMS())&lt;br /&gt;     .addProtocol(new UFC())&lt;br /&gt;     .addProtocol(new MFC())&lt;br /&gt;     .addProtocol(new FRAG2());          // 4&lt;br /&gt;stack.init();                            // 5&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;First, a JChannel is created (1). 
The 'false' argument means that the channel must not create its own protocol stack, because we create it (2) and stick it into the channel (3).&lt;br /&gt;&lt;br /&gt;Next, all protocols are created and added to the stack (4). This needs to happen in the order in which the protocols should appear in the stack, bottom first, so the first protocol added is the transport protocol (UDP in the example).&lt;br /&gt;&lt;br /&gt;Note that we can use Protocol.setValue(String attr_name, Object attr_value) to configure each protocol instance. We can also use regular setters if available.&lt;br /&gt;&lt;br /&gt;Finally, we call init() (5), which connects the protocol list correctly and calls init() on every instance. This also handles shared transports correctly. For an example of how to create a shared transport with 2 channels on top, see ProgrammaticApiTest.&lt;br /&gt;&lt;br /&gt;I see mainly 3 use cases where programmatic creation of a channel is preferred over declarative creation:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Someone hates XML (I'm not one of them) :-)&lt;/li&gt;&lt;li&gt;Unit tests&lt;/li&gt;&lt;li&gt;Projects consuming JGroups might have their own configuration mechanism (e.g. GUI, properties file, a different XML configuration, etc.) and don't want to use the XML configuration mechanism shipped with JGroups.&lt;/li&gt;&lt;/ol&gt;Let me know what you think about this API ! I deliberately kept it simple and stupid, and maybe there are things people would like to see changed. 
I'm open to suggestions !&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Cheers,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6492724544359762164?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6492724544359762164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/10/programmatic-creation-of-channel.html#comment-form' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6492724544359762164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6492724544359762164'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/10/programmatic-creation-of-channel.html' title='Programmatic creation of a channel'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-291257105449233503</id><published>2010-10-01T16:08:00.000+02:00</published><updated>2010-10-01T16:08:55.235+02:00</updated><title type='text'>Confessions of a serial protocol designer</title><content type='html'>I have a confession to make.&lt;br /&gt;&lt;br /&gt;I'm utterly disgusted by my implementation of FD_ALL, and thanks to David Forget for pointing this out !&lt;br /&gt;&lt;br /&gt;What's bad about FD_ALL ? It will not scale at all ! After having written several dozen protocols, I thought an amateurish mistake like the one I'm about to show would certainly not happen to me anymore. 
Boy, was I wrong !&lt;br /&gt;&lt;br /&gt;FD_ALL is about detecting crashed nodes in a cluster, and the protocol then lets GMS know so that the crashed node(s) can be excluded from the view.&lt;br /&gt;&lt;br /&gt;Let's take a look at the design.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Every node periodically multicasts a HEARTBEAT&lt;/li&gt;&lt;li&gt;This message is received by everyone in the cluster and a hashmap of nodes and timestamps is updated; for a node P, P's timestamp is set to the current time&lt;/li&gt;&lt;li&gt;Another task, run at every node, periodically iterates through the timestamps and checks if any timestamps haven't been updated for a given time. If that's the case, the members with outdated timestamps are suspected&lt;/li&gt;&lt;li&gt;A suspicion of P results in a SUSPECT(P) multicast&lt;/li&gt;&lt;li&gt;On reception of SUSPECT(P), every node generates a SUSPECT(P) event and passes it up the stack&lt;/li&gt;&lt;li&gt;VERIFY_SUSPECT catches SUSPECT(P) and sends an ARE_YOU_DEAD message to P&lt;/li&gt;&lt;li&gt;If P is still alive, it'll respond with an I_AM_NOT_DEAD message&lt;/li&gt;&lt;li&gt;If the sender doesn't get this message for a certain time, it'll pass the SUSPECT(P) event further up the stack (otherwise it'll drop it), and GMS will exclude P from the view, but if and only if that given node is the coordinator (first in the view)&lt;/li&gt;&lt;/ul&gt;Can anyone see the flaw in this design ? 
Hint: it has to do with the number of messages generated...&lt;br /&gt;&lt;br /&gt;OK, so let's see what happens if we have a cluster of 100 nodes:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Say node P is temporarily slow; it doesn't send HEARTBEATs because a big garbage collection is going on, or the CPU is crunching at 90%&lt;/li&gt;&lt;li&gt;99 nodes multicast a SUSPECT(P) message&lt;/li&gt;&lt;li&gt;Every node Q therefore receives 99 SUSPECT(P) messages&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Q (via VERIFY_SUSPECT) sends an ARE_YOU_DEAD message to P for each SUSPECT(P) it receives&lt;/li&gt;&lt;li&gt;P (if it can) responds with an I_AM_NOT_DEAD back to Q&lt;/li&gt;&lt;li&gt;So the total number of messages generated by a single node is 99 * 2&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;This is done on &lt;i&gt;every node&lt;/i&gt;, so the total number of messages is &lt;b&gt;99 * 99 * 2 = 19'602 messages&lt;/b&gt; !&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Can you imagine what happens to P, which is a bit overloaded and cannot send out HEARTBEATs in time, when it receives 19'602 messages ?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;If it ain't dead yet, it will die !&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Isn't it ironic: by asking a node if it is still alive, we actually kill it !&lt;br /&gt;&lt;br /&gt;This is an example of where the effects of using IP multicasts were not taken into account: if everybody multicasts M, and everybody who receives M sends 2 messages, I neglected to see that the number of messages sent grows with the square of the cluster size !&lt;br /&gt;&lt;br /&gt;So what's the solution ? 
Simple, elegant and outlined in [1].&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Everybody sends a HEARTBEAT multicast periodically&lt;/li&gt;&lt;li&gt;Every member maintains a suspect list&amp;nbsp;&lt;/li&gt;&lt;li&gt;This list is adjusted on view changes&amp;nbsp;&lt;/li&gt;&lt;li&gt;Reception of a SUSPECT(P) message adds P to the list&amp;nbsp;&lt;/li&gt;&lt;li&gt;When we suspect P because we haven't received a HEARTBEAT (or traffic if enabled):&amp;nbsp;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;The set of eligible members is computed as: members - suspected members&amp;nbsp;&lt;/li&gt;&lt;li&gt;If we are the coordinator (first in the list):&amp;nbsp;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Pass a SUSPECT(P) event up the stack, this runs the VERIFY_SUSPECT protocol and eventually passes the SUSPECT(P) up to GMS, which will exclude P from the view &lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;br /&gt;The cost of running the suspicion protocol is (excluding the periodic heartbeat multicasts): &lt;br /&gt;&lt;ul&gt;&lt;li&gt; 1 ARE_YOU_DEAD unicast to P &lt;/li&gt;&lt;li&gt; A potential response (I_AM_NOT_DEAD) from P to the coordinator&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;TOTAL COST in a cluster of 100: 2 messages (this is always constant), compared to 19'602 messages before !&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This is way better than the previous implementation ! 
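&lt;br /&gt;&lt;br /&gt;To make the difference concrete, here is a small back-of-the-envelope sketch in Java. It is not JGroups code: the class and method names (SuspicionCost, oldDesign, newDesign) are made up for illustration, and it counts only the ARE_YOU_DEAD / I_AM_NOT_DEAD unicasts, the same way the numbers above do (heartbeats and SUSPECT multicasts are left out):&lt;br /&gt;
&lt;pre&gt;
```java
public class SuspicionCost {

    // Old design: each of the (clusterSize - 1) healthy nodes reacts to
    // (clusterSize - 1) SUSPECT(P) multicasts, and every reaction costs one
    // ARE_YOU_DEAD plus one I_AM_NOT_DEAD message.
    static long oldDesign(int clusterSize) {
        long others = clusterSize - 1;
        return others * others * 2;
    }

    // New design: only the coordinator verifies the suspicion, so the cost is
    // one ARE_YOU_DEAD plus one potential I_AM_NOT_DEAD, whatever the size.
    static long newDesign(int clusterSize) {
        return 2;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 100, 1000};
        for (int n : sizes) {
            System.out.println(n + " nodes: old=" + oldDesign(n) + ", new=" + newDesign(n));
        }
    }
}
```
&lt;/pre&gt;
Under these assumptions, the old design grows quadratically with the cluster size, while the new one stays at 2 messages no matter how many nodes there are.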
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="https://jira.jboss.org/browse/JGRP-1241"&gt;https://jira.jboss.org/browse/JGRP-1241&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-291257105449233503?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/291257105449233503/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/10/confessions-of-serial-protocol-designer.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/291257105449233503'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/291257105449233503'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/10/confessions-of-serial-protocol-designer.html' title='Confessions of a serial protocol designer'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8188234903719304747</id><published>2010-09-22T10:04:00.000+02:00</published><updated>2010-09-22T10:04:58.133+02:00</updated><title type='text'>JUDCon 2010 Berlin</title><content type='html'>I'll be giving a talk at JUDCon 2010 (Oct 7 and 8, Berlin) on how to configure JBoss clusters to run optimally in a cloud (EC2).&lt;br /&gt;&lt;br /&gt;It would be cool to see some of you, we can discuss JGroups and other topics over a beer !&lt;br /&gt;&lt;br /&gt;The agenda is &lt;a 
href="http://www.jboss.org/events/JUDCon/JUDCon2010Berlin/agenda.html"&gt;here.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Cheers,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8188234903719304747?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8188234903719304747/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/09/judcon-2010-berlin.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8188234903719304747'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8188234903719304747'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/09/judcon-2010-berlin.html' title='JUDCon 2010 Berlin'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8061176554338643256</id><published>2010-09-17T07:36:00.001+02:00</published><updated>2010-09-17T07:38:26.891+02:00</updated><title type='text'>Cluster authorization with pattern matching</title><content type='html'>I've added a new plugin to &lt;a href="https://jira.jboss.org/browse/JGRP-206"&gt;AUTH&lt;/a&gt; which allows for pattern matching to determine who can join a cluster.&lt;br /&gt;&lt;br /&gt;The idea is very simple: if a new node wants to join a cluster, we only admit the node into the cluster if it matches a certain pattern. 
For example, we could only admit nodes whose IP address starts with 192.168.* or 10.5.*. Or we could only admit nodes whose logical name is "groucho" or "marx".&lt;br /&gt;&lt;br /&gt;Currently, the 2 things I match against are IP address and logical name, but of course any attribute of a message could be used to match against.&lt;br /&gt;&lt;br /&gt;Let's take a look at an example.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&amp;lt;AUTH auth_class="org.jgroups.auth.RegexMembership"&lt;br /&gt;      match_string="groucho | marx"&lt;br /&gt;      match_ip_address="false"&lt;br /&gt;      match_logical_name="true" /&amp;gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This example uses the new plugin RegexMembership (derived from FixedMembership). Its match string (which takes any regular expression as value) says that any node whose logical name is "marx" or "groucho" will be able to join. Note that we set match_logical_name to true here.&lt;br /&gt;&lt;br /&gt;Note that AUTH has to be placed somewhere below GMS (Group MemberShip) in the configuration.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&amp;lt;AUTH auth_class="org.jgroups.auth.RegexMembership"&lt;br /&gt;      match_string=&lt;br /&gt;      "192.168.[0-9]{1,3}\.[0-9]{1,3}(:.[0-9]{1,5})?"&lt;br /&gt;      match_ip_address="true"&lt;br /&gt;      match_logical_name="false"  /&amp;gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This example is a bit more complex, but it essentially says that all nodes whose IP address starts with 192.168 are allowed to join the cluster. So 192.168.1.5 and 192.168.1.10:5546 would pass, while 10.1.4.5 would be rejected.&lt;br /&gt;&lt;br /&gt;I have to admit, I'm not really an expert in regular expression, so I guess the above expression could be simplified. 
For example, I gave up trying to define that hosts starting &lt;i&gt;either&lt;/i&gt; with 192.168 &lt;i&gt;or&lt;/i&gt; 10.5 could join.&lt;br /&gt;If you know how to do that, please send me the regular expression !&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8061176554338643256?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8061176554338643256/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/09/cluster-authentication-with-pattern.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8061176554338643256'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8061176554338643256'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/09/cluster-authentication-with-pattern.html' title='Cluster authorization with pattern matching'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-278658579023183052</id><published>2010-08-13T15:09:00.001+02:00</published><updated>2010-08-13T15:11:34.812+02:00</updated><title type='text'>Daisychaining in the clouds</title><content type='html'>I've been working on a new protocol DAISYCHAIN [1] which is based on research out of EPFL [2].&lt;br /&gt;&lt;br /&gt;The idea behind it is that it is inefficient to broadcast a message in clusters where IP multicasting is not available. 
For example, if we only have TCP available (as is the case in most clouds today), then we have to send a broadcast (or group) message N-1 times. If we want to broadcast M to a cluster of 10, we send the same message 9 times.&lt;br /&gt;&lt;br /&gt;Example: if we have {A,B,C,D,E,F}, and A broadcasts M, then it sends it to B, then to C, then to D etc.&lt;br /&gt;&lt;br /&gt;If we have a 1 GB switch, and M is 1GB, then sending a broadcast to 9 members takes 9 seconds, even if we parallelize the sending of M. This is due to the fact that the link to the switch only sustains 1GB / sec. (Note that I'm conveniently ignoring the fact that the switch will start dropping packets if it is overloaded, causing TCP to retransmit, slowing things down)...&lt;br /&gt;&lt;br /&gt;Let's introduce the concept of a &lt;i&gt;round&lt;/i&gt;. A round is the time it takes to send or receive a message. In the above example, a round takes 1 second if we send 1 GB messages.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In the existing N-1 approach, it takes X * (N-1) rounds to send X messages to a cluster of N nodes. So to broadcast 10 messages to a cluster of 10, it takes 90 rounds.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;Enter DAISYCHAIN.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The idea is that, instead of sending a message to N-1 members, we only send it to our neighbor, which forwards it to its neighbor, and so on. For example, in {A,B,C,D,E}, D would broadcast a message by forwarding it to E, E forwards it to A, A to B, B to C and C to D. We use a time-to-live field, which gets decremented on every forward, and a message gets discarded when the time-to-live is 0.&lt;br /&gt;&lt;br /&gt;The advantage is that, instead of taxing the link between a member and the switch to send N-1 messages, we distribute the traffic more evenly across the links between the nodes and the switch. 
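&lt;br /&gt;&lt;br /&gt;The forwarding rule can be sketched in a few lines of Java. This is only an illustration, not the DAISYCHAIN implementation: the class DaisyChainSketch and its hops() method are hypothetical, and the initial time-to-live of 5 is an assumption, picked so that the output reproduces the hop sequence for {A,B,C,D,E} described above:&lt;br /&gt;
&lt;pre&gt;
```java
public class DaisyChainSketch {

    // Lists the hops of one broadcast: every member forwards to its right-hand
    // neighbor (wrapping around at the end of the member list), the time-to-live
    // is decremented on every forward, and forwarding stops once it reaches 0.
    static String hops(String[] members, int origin, int ttl) {
        StringBuilder sb = new StringBuilder();
        int current = origin;
        int remaining = Math.max(ttl, 0); // treat a negative time-to-live as 0
        while (remaining != 0) {
            int next = (current + 1) % members.length;
            if (sb.length() != 0) {
                sb.append(", ");
            }
            sb.append(members[current]).append(" to ").append(members[next]);
            current = next;
            remaining--;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] members = {"A", "B", "C", "D", "E"};
        // D is at index 3; the time-to-live of 5 is an assumed value
        System.out.println(hops(members, 3, 5));
    }
}
```
&lt;/pre&gt;
Each member only ever sends to one neighbor here, which is why no single link to the switch has to carry all N-1 copies.&lt;br /&gt;&lt;br /&gt;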
Let's take a look at an example, where A broadcasts messages m1 and m2 in cluster {A,B,C,D}, '--&amp;gt;' means sending:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Traditional N-1 approach&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Round 1: A(m1) --&amp;gt; B&lt;br /&gt;Round 2: A(m1) --&amp;gt; C&lt;br /&gt;Round 3: A(m1) --&amp;gt; D&lt;br /&gt;Round 4: A(m2) --&amp;gt; B&lt;br /&gt;Round 5: A(m2) --&amp;gt; C&lt;br /&gt;Round 6: A(m2) --&amp;gt; D&lt;br /&gt;&lt;br /&gt;It takes 6 rounds to broadcast m1 and m2 to the cluster.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Daisychaining approach&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Round 1: A(m1) --&amp;gt; B&lt;br /&gt;Round 2: A(m2) --&amp;gt; B || B(m1) --&amp;gt; C&lt;br /&gt;Round 3: B(m2) --&amp;gt; C || C(m1) --&amp;gt; D&lt;br /&gt;Round 4: C(m2) --&amp;gt; D&lt;br /&gt;&lt;br /&gt;In round 1, A sends m1 to B.&lt;br /&gt;In round 2, A sends m2 to B, but B also forwards m1 (received in round 1) to C.&lt;br /&gt;In round 3, A is done. B forwards m2 to C and C forwards m1 to D (in parallel, denoted by '||').&lt;br /&gt;In round 4, C forwards m2 to D. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Switch usage&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Let's take a look at this in terms of switch usage: in the N-1 approach, A can only send 125MB/sec, no matter how many members there are in the cluster, so it is constrained by the link capacity to the switch. (Note that A can also &lt;i&gt;receive&lt;/i&gt; 125MB/sec in parallel with today's full duplex links).&lt;br /&gt;&lt;br /&gt;So the link between A and the switch gets hot.&lt;br /&gt;&lt;br /&gt;In the daisychaining approach, link usage is more even: if we look for example at round 2, A sending to B and B sending to C uses 2 different links, so there are no constraints regarding capacity of a link. 
The same goes for B sending to C and C sending to D.&lt;br /&gt;&lt;br /&gt;In terms of rounds, the daisy chaining approach uses X + (N-2) rounds, so for a cluster size of 10 and broadcasting 10 messages, it requires only &lt;b&gt;18 rounds, compared to 90 for the N-1 approach !&lt;/b&gt; &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Performance&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I ran a quick performance test this morning, with 4 nodes connected to a 1 GB switch; and every node sending 1 million 8K messages, for a total of 32GB received by every node. The config used was tcp.xml.&lt;br /&gt;&lt;br /&gt;The N-1 approach yielded a throughput of &lt;b&gt;73 MB/node/sec&lt;/b&gt;, and the daisy chaining approach &lt;b&gt;107MB/node/sec&lt;/b&gt; !&lt;br /&gt;&lt;br /&gt;The change to switch from N-1 to daisy chaining was to place DAISYCHAIN directly on top of TCP.&lt;br /&gt;&lt;br /&gt;DAISYCHAIN is still largely experimental, but the numbers above show that it has potential to improve performance in TCP based clusters.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="https://jira.jboss.org/browse/JGRP-1021"&gt;https://jira.jboss.org/browse/JGRP-1021&lt;/a&gt;&lt;br /&gt;[2] &lt;a href="http://infoscience.epfl.ch/record/149218/files/paper.pdf"&gt;infoscience.epfl.ch/record/149218/files/paper.pdf&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-278658579023183052?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/278658579023183052/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/08/daisychaining-in-clouds.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/19835054/posts/default/278658579023183052'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/278658579023183052'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/08/daisychaining-in-clouds.html' title='Daisychaining in the clouds'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6853547978270439477</id><published>2010-07-12T14:48:00.000+02:00</published><updated>2010-07-12T14:48:54.589+02:00</updated><title type='text'>JGroups 2.10 final released</title><content type='html'>I'm happy to announce that JGroups 2.10 final has been released. It can be downloaded from &lt;a href="http://sourceforge.net/projects/javagroups/files/"&gt;SourceForge&lt;/a&gt; and contains the following major new features (for a detailed list of the 80+ issues&amp;nbsp; check 2.10 in &lt;a href="http://jira.jboss.com/jira/browse/JGRP"&gt;JIRA&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;b&gt;SCOPE: concurrent delivery of messages from the same sender&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-822]&lt;br /&gt;&lt;br /&gt;By default, messages from a sender P are delivered in the (FIFO) order in which P sent them (ignoring OOB messages for now). However, sometimes it would be beneficial to deliver unrelated messages concurrently, e.g. 
modifications sent by P for different HTTP sessions.&lt;br /&gt;&lt;br /&gt;SCOPE is a new protocol, which allows a developer to define a scope for a message, and that scope is then used to deliver messages from P concurrently.&lt;br /&gt;&lt;br /&gt;See http://www.jgroups.org/manual/html/user-advanced.html#Scopes for details.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Use of factory to create sockets&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-278]&lt;br /&gt;&lt;br /&gt;There's now a method Protocol.setSocketFactory(SocketFactory) which allows setting a socket factory, used to create and close datagram and TCP (client and server) sockets. The default implementation keeps track of open sockets, so&lt;br /&gt;./probe.sh socks&lt;br /&gt;dumps a list of open sockets.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;UNICAST2: experimental version of UNICAST based on negative acks&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1140]&lt;br /&gt;&lt;br /&gt;By not sending acks for received messages, we can cut down on the number of acks. UNICAST2 is ca 20-30% faster than UNICAST as a result. It needs more testing though; UNICAST2 is currently experimental.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Certain IPv4 addresses should be allowed in an IPv6 stack&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1152]&lt;br /&gt;&lt;br /&gt;They will be converted into IPv6 mapped IPv4 addresses. This relaxes the (too restrictive) IP address conformance testing somewhat, and allows for more configurations to actually start the stack and not fail with an exception.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Multiple components using the same channel&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1177]&lt;br /&gt;&lt;br /&gt;This is a new light weight version of the (old and dreaded !) 
Multiplexer, which allows for sharing of channels between components, such as for example HAPartition and Infinispan.&lt;br /&gt;&lt;br /&gt;*** Only to be used by experts ! ***&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;MERGE2: fast merge&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1191]&lt;br /&gt;&lt;br /&gt;Fast merge in case where we receive messages from a member which is not part of our group, but has the same group name.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;RpcDispatcher / MessageDispatcher: add exclusion list&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1192]&lt;br /&gt;&lt;br /&gt;If an RPC needs to be sent to all nodes in a cluster except one node (e.g. the sender itself), then we can simply exclude the sender. This is done using&lt;br /&gt;RequestOptions.setExclusionList(Address ...&amp;nbsp; xcluded_mbrs).&lt;br /&gt;This is simpler than having to create the full list, and remove the sender.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Ability to use keywords instead of IP addresses&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1204]&lt;br /&gt;&lt;br /&gt;Whenever IP addresses (symbolic or dotted-decimal notation) are used, we can now use a keyword instead. Currently, the keywords are "GLOBAL" (public IP address), "SITE_LOCAL" (private IP address), "LINK_LOCAL" (link local), "LOOPBACK" (a loopback address) and "NON_LOOPBACK" (any but a loopback address).&lt;br /&gt;This is useful in cloud environments where IP addresses may not be known beforehand.&lt;br /&gt;Example: java -Djgroups.bind_addr=SITE_LOCAL&lt;br /&gt;Example: &amp;lt;UDP bind_addr="GLOBAL" ... /&amp;gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;GossipRouter: re-introduce pinging to detect crashed clients&lt;/b&gt;&lt;br /&gt;[https://jira.jboss.org/browse/JGRP-1213]&lt;br /&gt;&lt;br /&gt;When clients are terminated without closing of sockets (e.g. 
in virtualized environments), they'd cause their entries to not be removed in GossipRouter. This was changed by (re-)introducing pinging.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Feedback is appreciated via the usual channels (mailing list, IRC) !&lt;br /&gt;Enjoy !&lt;br /&gt;&lt;br /&gt;Bela Ban&lt;br /&gt;Vladimir Blagojevic&lt;br /&gt;Richard Achmatowicz&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6853547978270439477?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6853547978270439477/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/07/jgroups-210-final-released.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6853547978270439477'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6853547978270439477'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/07/jgroups-210-final-released.html' title='JGroups 2.10 final released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8595221037841448770</id><published>2010-07-09T17:32:00.000+02:00</published><updated>2010-07-09T17:32:13.365+02:00</updated><title type='text'>mod-cluster webinar: video available on vimeo</title><content type='html'>On July 7th, I did a webinar on &lt;a 
href="http://www.jboss.org/mod_cluster"&gt;mod-cluster&lt;/a&gt;, and it was a huge success: 1215 people signed up and 544 attended the webinar ! I'm told that this is the second highest turnout ever for Red Hat (the highest being an xvirt webinar a couple of years ago, with 600 attendees)...&lt;br /&gt;&lt;br /&gt;For those who missed the webex presentation, &lt;a href="http://www.vimeo.com/13180921"&gt;here&lt;/a&gt;'s the link to the recorded video. For those who only want to see the demo, it is &lt;a href="http://www.vimeo.com/13189666"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The demo is really cool: I set up a huge cluster in the cloud, spanning GoGrid, EC2 and Rackspace as clouds, and fronting a JBoss 6 based cluster with mod-cluster.&lt;br /&gt;&lt;br /&gt;I showed how cluster nodes dynamically register themselves with httpd, or de-register when shutting down, and how web applications get registered/de-registered.&lt;br /&gt;&lt;br /&gt;For those who know mod-jk: no more workers.properties or uriworkmap.properties are needed !&lt;br /&gt;&lt;br /&gt;The coolest part was where I ran a load test, simulating 80 clients, each creating and destroying a session every 30 seconds: initially I ran 2 cluster nodes on EC2, so every node had 40 sessions on average. 
Then I started another EC2 instance, a GoGrid instance and 2 Rackspace instances, and after a few minutes, there were 3 mod-cluster domains with 3, 1 and 2 servers respectively, and every server had ca 12 sessions on average !&lt;br /&gt;&lt;br /&gt;This can be compared to a bookshop, which spins up additional servers in the cloud around the holidays to serve increased traffic, and where the servers form a cluster for redundancy (don't want to lose your shopping cart !).&lt;br /&gt;&lt;br /&gt;Enjoy the demo, and give us feedback on mod-cluster on the &lt;a href="https://lists.jboss.org/mailman/listinfo/mod_cluster-dev"&gt;mailing list&lt;/a&gt; or &lt;a href="http://community.jboss.org/en/mod_cluster?view=discussions"&gt;forum&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Bela&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8595221037841448770?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8595221037841448770/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/07/mod-cluster-webinar-video-available-on.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8595221037841448770'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8595221037841448770'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/07/mod-cluster-webinar-video-available-on.html' title='mod-cluster webinar: video available on vimeo'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-7132304143891412085</id><published>2010-05-07T11:49:00.000+02:00</published><updated>2010-05-07T11:49:22.225+02:00</updated><title type='text'>JBossWorld in Boston and bike riding in California</title><content type='html'>I'll be &lt;a href="http://www.redhat.com/promo/summit/2010/sessions/jboss.html#553214"&gt;talking&lt;/a&gt; about &lt;a href="http://www.jboss.org/mod_cluster"&gt;mod-cluster&lt;/a&gt; at &lt;a href="http://www.jbossworld.com/"&gt;JBossWorld&lt;/a&gt; this June. It was a good talk last year, and I've spiced up the demo even more: I'm going to show 2 Apache httpd instances running in different clouds, and 3 domains of JBoss instances, also running in 3 different clouds (GoGrid, Amazon EC2 and Rackspace).&lt;br /&gt;&lt;br /&gt;This will be a fun talk, showing the practical aspects of clouds, and not focusing on the hype (I leave that to marketing :-)).&lt;br /&gt;&lt;br /&gt;This led to some changes in JGroups, which I'll talk about in my next blog post.&lt;br /&gt;&lt;br /&gt;It would be cool to see some of you at JBW !&lt;br /&gt;&lt;br /&gt;After that, I'll fly to the best place in the US: the Bay Area ! I'll be there June 25 until July 2nd and will rent a race bike, to ride my 5 favorite rides (from the time when I lived in San Jose). A friend will join me for some insane riding (he's preparing for the &lt;a href="http://www.deathride.com/"&gt;Death Ride&lt;/a&gt;), so this will definitely be fun !&lt;br /&gt;&lt;br /&gt;Now let's just hope that some unknown volcano in Iceland doesn't stop me from making the trip to the US ! 
:-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-7132304143891412085?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/7132304143891412085/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/05/jbossworld-in-boston-and-bike-riding-in.html#comment-form' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7132304143891412085'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7132304143891412085'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/05/jbossworld-in-boston-and-bike-riding-in.html' title='JBossWorld in Boston and bike riding in California'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6035243150778025607</id><published>2010-03-27T09:33:00.000+01:00</published><updated>2010-03-27T09:33:12.812+01:00</updated><title type='text'>Scopes: making message delivery in JGroups more concurrent</title><content type='html'>In JGroups, messages are delivered in the order in which they were sent by a given member. 
So when member X sends messages 1-3 to the cluster, then everyone will deliver them in the order X1 -&amp;gt; X2 -&amp;gt; X3 ('-&amp;gt;' means 'followed by').&lt;br /&gt;&lt;br /&gt;When a different member Y sends messages 4-6, then they will get delivered in parallel to X's messages ('||' means 'parallel to'):&lt;br /&gt;&lt;b&gt;X1 -&amp;gt; X2 -&amp;gt; X3 || Y4 -&amp;gt; Y5 -&amp;gt; Y6&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This is good, but what if X has 100 HTTP sessions and performs session replication ?&lt;br /&gt;&lt;br /&gt;All modifications to the sessions are sent to the cluster, and will get delivered in the order in which they were performed.&lt;br /&gt;&lt;br /&gt;The problem here is that even updates to &lt;i&gt;different&lt;/i&gt; sessions will be ordered, e.g. if X updates sessions A, B and C, then we could end up with the following delivery order (X is omitted for brevity):&lt;br /&gt;A1 -&amp;gt; A2 -&amp;gt; B1 -&amp;gt; A3 -&amp;gt; C1 -&amp;gt; C2 -&amp;gt; C3&lt;br /&gt;&lt;br /&gt;This means that update 1 to session C has to wait until updates A1-3 and B1 have been processed; in other words, an update has to wait until all updates ahead of it in the queue have been processed !&lt;br /&gt;&lt;br /&gt;This unnecessarily delays updates: since updates to A, B and C are &lt;i&gt;unrelated&lt;/i&gt;, we could deliver them in parallel, e.g.:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A1 -&amp;gt; A2 -&amp;gt; A3 || B1 || C1 -&amp;gt; C2 -&amp;gt; C3&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This means that all updates to A are delivered in order, but parallel to updates to B and updates to C.&lt;br /&gt;&lt;br /&gt;How is this done ? Enter the &lt;a href="http://www.jgroups.org/manual/html/user-advanced.html#Scopes"&gt;SCOPE&lt;/a&gt; protocol.&lt;br /&gt;&lt;br /&gt;SCOPE delivers messages&amp;nbsp; in the order in which they were sent within a given scope. 
Place it somewhere above NAKACK and UNICAST (or SEQUENCER).&lt;br /&gt;&lt;br /&gt;To give a message a scope, simply use Message.setScope(short). The argument should be as unique as possible, to prevent collisions.&lt;br /&gt;&lt;br /&gt;The use case described above is actually for real, and we anticipate using this feature in HTTP session replication / distribution in the JBoss application server !&lt;br /&gt;&lt;br /&gt;More detailed documentation of&amp;nbsp; scopes can be found at [1]. Configuration of the SCOPE protocol is described in [2].&lt;br /&gt;&lt;br /&gt;This is still an experimental feature, so feedback is appreciated !&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://www.jgroups.org/manual/html/user-advanced.html#Scopes"&gt;Scopes&lt;/a&gt;&lt;br /&gt;[2] The &lt;a href="http://www.jgroups.org/manual/html/protlist.html#SCOPE"&gt;SCOPE&lt;/a&gt; protocol&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6035243150778025607?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6035243150778025607/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/03/scopes-making-message-delivery-in.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6035243150778025607'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6035243150778025607'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/03/scopes-making-message-delivery-in.html' title='Scopes: making message delivery in JGroups more concurrent'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-795957057496384422</id><published>2010-03-05T15:32:00.000+01:00</published><updated>2010-03-05T15:32:41.078+01:00</updated><title type='text'>Status report: performance of JGroups 2.10.0.Alpha2</title><content type='html'>I've already improved (mainly unicast) performance in Alpha1, a short list is:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;BARRIER: moved lock acquired by every up-message out of the critical path&lt;/li&gt;&lt;li&gt;IPv6: just running a JGroups channel without any system props (e.g. java.net.preferIPv4Stack=true) now works, as IPv4 addresses are mapped to IPv4-mapped IPv6 addresses under IPv6&lt;/li&gt;&lt;li&gt;NAKACK and UNICAST: streamlined marshalling of headers, drastically reducing the number of bytes streamed when marshalling headers&lt;/li&gt;&lt;li&gt;TCPGOSSIP: Vladimir fixed a bug in RouterStub which caused GossipRouters to return incorrect membership lists, resulting in JOIN failures&lt;/li&gt;&lt;li&gt;TP.Bundler:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Provided a new bundler implementation, which is faster than the default one (the new one *is* actually the default in 2.10)&lt;/li&gt;&lt;li&gt;Sending of message lists (bundling): we don't ship the dest and src address for each message, but only ship them *once* for the entire list&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;AckReceiverWindow (used by UNICAST): I made this almost lock-free, so concurrent messages to the same recipient don't compete for the same lock. Should be a nice speedup for multiple unicasts to the same sender (e.g. OOB messages)&lt;/li&gt;&lt;/ul&gt;The complete list of features is at [1].&lt;br /&gt;&lt;br /&gt;In 2.10.0.Alpha2 (that's actually the current CVS trunk), I replaced strings as header names with IDs [2]. 
This means that for each header, instead of marshalling "UNICAST" as a moniker for the UnicastHeader, we marshal a short.&lt;br /&gt;&lt;br /&gt;The string (assuming a single-byte charset) uses up 9 bytes, whereas the short uses 2 bytes. We usually have 3-5 headers per message, so that's an average of 20-30 bytes saved per message. If we send 10 million messages, those savings accumulate !&lt;br /&gt;&lt;br /&gt;Not only does this change make the marshalled message smaller, it also means that a message kept in memory has a smaller footprint: as messages are kept in memory until they're garbage collected by STABLE (or ack'ed by UNICAST), the savings are really nice...&lt;br /&gt;&lt;br /&gt;The downside ? It's an API change for protocol implementers: methods getHeader(), putHeader() and putHeaderIfAbsent() in Message changed from taking a string to taking a short. Plus, if you implement headers, you have to register them in jg-magic-map.xml / jg-protocol-ids.xml and implement Streamable...&lt;br /&gt;&lt;br /&gt;Now for some performance numbers. This is a quick and dirty benchmark, without many data points...&lt;br /&gt;&lt;br /&gt;perf.Test (see [3] for details) has N senders send M messages of S size to all cluster nodes. This exercises the NAKACK code.&lt;br /&gt;&lt;br /&gt;On my home cluster (4 blades with 4 cores each), 1GB ethernet, sending 1000-byte messages:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;4 senders, JGroups 2.9.0.GA:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 128'000 messages / sec / member&lt;/li&gt;&lt;li&gt;4 senders, JGroups 2.10.0.Alpha2: 137'000 messages / sec / member&lt;/li&gt;&lt;li&gt;6 senders, JGroups 2.10.0.Alpha2: 100'000 messages / sec / member&lt;/li&gt;&lt;li&gt;8 senders, JGroups 2.10.0.Alpha2:&amp;nbsp; 78'000 messages / sec / member &lt;/li&gt;&lt;/ul&gt;2.10.0.Alpha2 is ca 7% faster for 4 members.&lt;br /&gt;&lt;br /&gt;There is also a stress test for unicasts, UnicastTestRpcDist. 
It mimics DIST mode of &lt;a href="http://www.infinispan.org/"&gt;Infinispan&lt;/a&gt; and has every member invoke 20'000 requests on 2 members; 80% of those requests are GETs (simple RPCs) and 20% are PUTs (2 RPCs in parallel). All RPCs are synchronous, so the caller always waits for the result and thus blocks for the round trip time. Every member has 25 threads invoking the RPCs concurrently.&lt;br /&gt;&lt;br /&gt;On my home network, I got the following numbers:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;4 members, JGroups 2.9.0.GA:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4'500 requests / sec / member&lt;/li&gt;&lt;li&gt;4 members, JGroups 2.10.0.Alpha2: 5'700 requests / sec / member&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;6 members, JGroups 2.9.0.GA:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4'000 requests / sec / member&lt;/li&gt;&lt;li&gt;6 members, JGroups 2.10.0.Alpha2: 5'000 requests / sec / member&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;8 members, JGroups 2.9.0.GA:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3'800 requests / sec / member&lt;/li&gt;&lt;li&gt;8 members, JGroups 2.10.0.Alpha2: 4'300 requests / sec / member&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;In our Atlanta lab (faster boxes), I got (unfortunately only for 2.10.0.Alpha2):&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;4 members, JGroups 2.10.0.Alpha2: 10'900 requests / sec / member&lt;br /&gt;&lt;/li&gt;&lt;li&gt;6 members, JGroups 2.10.0.Alpha2: 10'900 requests / sec / member&lt;/li&gt;&lt;li&gt;8 members, JGroups 2.10.0.Alpha2: 10'900 requests / sec / member&lt;/li&gt;&lt;/ul&gt;Since the focus of the first half of 2.10.0 was on improving unicast performance, the numbers above are already pretty good and show (at least for up to 8 members) linear scalability.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] &lt;a 
href="https://jira.jboss.org/jira/secure/IssueNavigator.jspa?reset=true&amp;amp;pid=10053&amp;amp;fixfor=12314411"&gt;https://jira.jboss.org/jira/secure/IssueNavigator.jspa?reset=true&amp;amp;pid=10053&amp;amp;fixfor=12314411&lt;/a&gt;&lt;br /&gt;[2] &lt;a href="https://jira.jboss.org/jira/browse/JGRP-932"&gt;https://jira.jboss.org/jira/browse/JGRP-932&lt;/a&gt;&lt;br /&gt;[3] &lt;a href="http://community.jboss.org/docs/DOC-11594"&gt;http://community.jboss.org/docs/DOC-11594&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-795957057496384422?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/795957057496384422/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2010/03/status-report-performance-of-jgroups.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/795957057496384422'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/795957057496384422'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2010/03/status-report-performance-of-jgroups.html' title='Status report: performance of JGroups 2.10.0.Alpha2'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-7139254680056343496</id><published>2009-12-21T16:56:00.000+01:00</published><updated>2009-12-21T16:56:18.356+01:00</updated><title type='text'>JGroups 2.8.0.GA 
released</title><content type='html'>I'm happy to announce that JGroups 2.8.0 is finally GA !&lt;br /&gt;&lt;br /&gt;It has taken us almost a year since the last major release (2.7 was released in January), but in our defense 2.8.0.GA contains a lot of new features and I think they are worth the wait. We also released a number of 2.6.x versions in 2009, which are used in the JBoss Enterprise Application Platform (EAP).&lt;br /&gt;&lt;br /&gt;Before I get into a summary of some of the new features (a detailed list can be found at [1]), I'd like to thank all the developers, users and contributors of JGroups. Without this healthy community, producing code, bug reports, patches, documentation and user stories, JGroups wouldn't be anywhere close to where it is today !&lt;br /&gt;&lt;br /&gt;So a big thanks to everyone involved, Happy Holidays and a great start into 2010 !&lt;br /&gt;&lt;br /&gt;Here's a short list of features that made it into 2.8.0.GA (&lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/ReleaseNotes-2.8.txt?revision=1.10&amp;amp;view=markup&amp;amp;pathrev=Branch_JGroups_2_8"&gt;here&lt;/a&gt; are the release notes):&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Logical addresses: decouples physical addresses (which can change) from logical ones. Eliminates reincarnation issues. This alone is worth 2.8, as it eliminates a big source of problems !&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Logical names: allow for meaningful channel names, logical names stay with a channel for its lifetime, even after reconnecting it&lt;/li&gt;&lt;li&gt;Improved merging / no more shunning: shunning was replaced by merging. Now we have a much simpler model: JOIN - LEAVE - MERGE. The merging algorithm was improved to take 'weird' (e.g. 
asymmetric) merges into account&lt;/li&gt;&lt;li&gt;Better IPv6 support&lt;/li&gt;&lt;li&gt;Better support for defaults for addresses: based on the type of the stack (IPv4, IPv6), we perform sanity checks and set default addresses of the correct type&lt;/li&gt;&lt;li&gt;FILE_PING / S3_PING: new discovery protocols, file-based and Amazon S3 based. The latter protocol can be used as a replacement for GossipRouter on EC2&lt;/li&gt;&lt;li&gt;Speaking of which: major overhaul of GossipRouter&lt;/li&gt;&lt;li&gt;Ability to have multiple protocols of the same class in the same stack&lt;/li&gt;&lt;li&gt;Ability to override message bundling on a per-message basis&lt;/li&gt;&lt;li&gt;Much improved and faster UNICAST&lt;/li&gt;&lt;li&gt;XSD schema for protocol configurations&lt;/li&gt;&lt;li&gt;STREAMING_STATE_TRANSFER now doesn't need to use TCP, but can also use the configured transport, e.g. UDP&lt;/li&gt;&lt;li&gt;RpcDispatcher: additional methods returning a Future rather than blocking&lt;/li&gt;&lt;li&gt;Probe.sh: ability to invoke methods cluster-wide. E.g. run message stability on all nodes: &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;probe.sh invoke=STABLE.runMessageGarbageCollection&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Logging&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Removal of commons-logging.jar: JGroups now has &lt;b&gt;ZERO&lt;/b&gt; dependencies !&lt;/li&gt;&lt;li&gt;Configure logging level at runtime, e.g. through JMX (jconsole) or probe.sh, or programmatically. Use case: set logging for NAKACK from "warn" to "trace" for a unit test, then reset it back to "warn"&lt;/li&gt;&lt;li&gt;Ability to set custom log provider. 
This allows for support of new logging frameworks (JGroups ships with support for log4j and JDK logging)&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;Enjoy !&lt;br /&gt;Bela, Vladimir and Richard&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/ReleaseNotes-2.8.txt?revision=1.10&amp;amp;view=markup&amp;amp;pathrev=Branch_JGroups_2_8"&gt;http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/ReleaseNotes-2.8.txt?revision=1.10&amp;amp;view=markup&amp;amp;pathrev=Branch_JGroups_2_8&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[2] &lt;a href="http://community.jboss.org/wiki/Support"&gt;http://community.jboss.org/wiki/Support &lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-7139254680056343496?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/7139254680056343496/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/12/jgroups-280ga-released.html#comment-form' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7139254680056343496'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/7139254680056343496'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/12/jgroups-280ga-released.html' title='JGroups 2.8.0.GA released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-2359564883097921168</id><published>2009-11-05T10:53:00.000+01:00</published><updated>2009-11-05T10:53:12.381+01:00</updated><title type='text'>IPv6 addresses in JGroups</title><content type='html'>I finished code to support scoped IPv6 link local addresses [1]. A link local address is an address that's not guaranteed to be unique on a given host (although in most cases it will be), so it can be assigned on different interfaces of the same host.&lt;br /&gt;&lt;br /&gt;To differentiate between interfaces, a scope-id can be added, e.g. fe80::216:cbff:fea9:c3b5&lt;b&gt;%en0&lt;/b&gt; or fe80::216:cbff:fea9:c3b5&lt;b&gt;%3&lt;/b&gt;, where the %X suffix denotes the interface.&lt;br /&gt;&lt;br /&gt;Note that this is only relevant for TCP sockets; multicast and datagram sockets are not affected.&lt;br /&gt;&lt;br /&gt;Now, on the server side, we can bind to a scoped or unscoped link-local socket, e.g.&lt;br /&gt;&lt;br /&gt;ServerSocket srv_sock=new ServerSocket(7500, 50, InetAddress.getByName("fe80::216:cbff:fea9:c3b5"))&lt;br /&gt;&lt;br /&gt;binds to an unscoped link-local address, and&lt;br /&gt;&lt;br /&gt;ServerSocket srv_sock=new ServerSocket(7500, 50, InetAddress.getByName("fe80::216:cbff:fea9:c3b5%en0"))&lt;br /&gt;&lt;br /&gt;binds to the scoped equivalent.&lt;br /&gt;&lt;br /&gt;This is all fine, but on the client side, we cannot use scoped link-local addresses, e.g.&lt;br /&gt;&lt;br /&gt;Socket sock=new Socket(InetAddress.getByName("fe80::216:cbff:fea9:c3b5%en0"), 7500)&lt;br /&gt;&lt;br /&gt;fails !&lt;br /&gt;&lt;br /&gt;The reason is that a scope-id "en0" does not mean anything on a client, which might run on a different host.&lt;br /&gt;&lt;br /&gt;The correct code is&lt;br /&gt;&lt;br /&gt;Socket sock=new Socket(InetAddress.getByName("fe80::216:cbff:fea9:c3b5"), 
7500),&lt;br /&gt;&lt;br /&gt;with the scope-id removed.&lt;br /&gt;&lt;br /&gt;JGroups runs into this problem, too: whenever we have a bind_addr which is a scoped link-local IPv6 address, certain discovery protocols (e.g. MPING, TCPGOSSIP) will return the scoped addresses, and the joiners will then try to connect to the existing members using the scoped addresses.&lt;br /&gt;&lt;br /&gt;To fix this, all Socket.connect() calls in JGroups have been replaced with Util.connect(Socket, SocketAddress, port). This method checks for scoped link-local IPv6 addresses and simply removes the scope-id from the destination address, so the connect() call will work.&lt;br /&gt;&lt;br /&gt;Note that this problem doesn't occur with global IPv6 addresses.&lt;br /&gt;&lt;br /&gt;I need to test whether this solution works on other operating systems, too, e.g. on Windows, Solaris and MacOS.&lt;br /&gt;&lt;br /&gt;OK, I'm off to &lt;a href="http://www.davidoffswissindoors.ch/"&gt;http://www.davidoffswissindoors.ch&lt;/a&gt;, hope to see some good tennis !&lt;br /&gt;&lt;br /&gt;[1] &lt;a href="http://www.jboss.org/community/wiki/IPv6"&gt;http://www.jboss.org/community/wiki/IPv6&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-2359564883097921168?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/2359564883097921168/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/11/ipv6-addresses-in-jgroups.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2359564883097921168'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2359564883097921168'/><link rel='alternate' type='text/html' 
href='http://belaban.blogspot.com/2009/11/ipv6-addresses-in-jgroups.html' title='IPv6 addresses in JGroups'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-4363656080594781454</id><published>2009-10-28T10:31:00.000+01:00</published><updated>2009-10-28T10:31:42.924+01:00</updated><title type='text'>JGroups 2.8.0.CR3 released</title><content type='html'>Unfortunately, a little later than estimated, but better late than never ! The reason is that I got side tracked by EAP 5 performance testing and also by the good feedback from the community (you !) on CR2, and the associated bug reports.&lt;br /&gt;&lt;br /&gt;This &lt;a href="https://jira.jboss.org/jira/secure/IssueNavigator.jspa?reset=true&amp;amp;pid=10053&amp;amp;fixfor=12312047"&gt;version&lt;/a&gt; contains bug fixes, and mostly work around IPv6 versus IPv4 addresses. We now try to be smart and attempt to find out the type of stack used, and then default undefined IP addresses to addresses of the correct type. Note that IPv6 support is not yet 100% done, I'm continuing to work on this for either CR4 or GA. More on this topic in a later post...&lt;br /&gt;&lt;br /&gt;CR3 also added a new feature, which is marshaller pools in the transport. When we send messages, they're either bundled and sent as a batch of messages, or not. In either case, the marshalling of a message or message list is done in an output buffer for which we have to acquire a lock. When we have heavy message sending, e.g. 
through multiple sender threads, that lock is heavily contended.&lt;br /&gt;&lt;br /&gt;Not that this is a big issue, because the sender side is almost never the culprit in slow performance (the receiver side is !), but I've introduced a marshaller pool, which provides N output streams (default=2) rather than 1. The property marshaller_pool_size defines how many output streams we want in the pool and marshaller_pool_initial_size the initial size of each output stream (in bytes).&lt;br /&gt;&lt;br /&gt;Note that, for UDP, each output stream can grow up to 65535 bytes, so take that into account when allocating a large number of streams. &lt;br /&gt;&lt;br /&gt;In my perf tests, I haven't found that increasing the pool size makes a difference to performance, but if you use many threads which send messages concurrently, this does make a difference.&lt;br /&gt;&lt;br /&gt;2.8.0.CR3 can be downloaded from &lt;a href="http://sourceforge.net/projects/javagroups/files/JGroups/2.8.0.CR3"&gt;http://sourceforge.net/projects/javagroups/files/JGroups/2.8.0.CR3&lt;/a&gt;.&lt;br /&gt;Enjoy !&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-4363656080594781454?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/4363656080594781454/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/10/jgroups-280cr3-released.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4363656080594781454'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4363656080594781454'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/10/jgroups-280cr3-released.html' title='JGroups 2.8.0.CR3 
released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-2341756750098693376</id><published>2009-09-18T13:08:00.002+02:00</published><updated>2009-09-18T13:37:51.967+02:00</updated><title type='text'>JGroups 2.6.13.CR2 released</title><content type='html'>OK, going from CR1 to CR2 doesn't seem like a big deal, and certainly not worth posting as a blog entry ?&lt;br /&gt;&lt;br /&gt;You might wonder if I have nothing better to do (like &lt;a href="http://belaban.blogspot.com/2009/07/bike-tour-nice.html"&gt;biking in the French Alps&lt;/a&gt;) :-)&lt;br /&gt;&lt;br /&gt;But actually, there have been significant changes since CR1, so please read on !&lt;br /&gt;&lt;br /&gt;CR2 only contains 3 JIRA issues:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;a href="https://jira.jboss.org/jira/browse/JGRP-1034"&gt;Backport of NAKACK from head&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://jira.jboss.org/jira/browse/JGRP-1033"&gt;Backport of UNICAST from head&lt;/a&gt; and&lt;/li&gt;&lt;li&gt;&lt;a href="https://jira.jboss.org/jira/browse/JGRP-1043"&gt;Removal of UNICAST contention&lt;/a&gt; issues&lt;/li&gt;&lt;/ol&gt;#1 is a partial backport of NAKACK from head (2.8) to the 2.6 branch. This version doesn't acquire locks for incoming messages anymore, but uses a CAS (compare-and-swap) operation to decide whether to process a message, or not.&lt;br /&gt;&lt;br /&gt;What used to happen when a message from P is received is that we grabbed the receiver window for P and added the message. 
Then we grabbed the lock associated with P's window and - once acquired - removed as many messages as possible and passed them up to the application &lt;span style="font-weight: bold;"&gt;sequentially&lt;/span&gt;. Sequential order is always respected unless a message is tagged as OOB (out-of-band).&lt;br /&gt;&lt;br /&gt;So here's what happened: say we received 10 multicast messages from B and 3 from A. Both A's and B's messages would be delivered in parallel with respect to each other, but sequentially for a given sender. So A's message #34 would always get delivered before #35 before #36 and so on...&lt;br /&gt;&lt;br /&gt;However, say we have to process 10 messages from B: 1 2 3 4 5 6 7 8 9 10:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Every message would get into NAKACK on a separate thread&lt;/li&gt;&lt;li&gt;All the 10 messages would get added into B's receiver window&lt;/li&gt;&lt;li&gt;The thread with message #3 would grab the lock&lt;/li&gt;&lt;li&gt;All other threads would block, trying to acquire the lock&lt;/li&gt;&lt;li&gt;The thread with the lock would remove #1 and pass it up the stack, then #2, then #3 and so on, until it passed #10 up the stack to the application&lt;/li&gt;&lt;li&gt;Now it releases the lock&lt;/li&gt;&lt;li&gt;All other 9 threads now compete for the lock, but every single thread will return because there are no more messages in the receiver window&lt;/li&gt;&lt;/ul&gt;This is a terrible waste: we've wasted 9 threads; for the duration of removing and passing up 10 messages, these threads could have been put to better use, e.g. 
processing other messages !&lt;br /&gt;&lt;br /&gt;For example, if our total thread pool only had 10 threads, and 1 of them was processing messages and 9 were blocked on lock acquisition, if a message from a different sender came in (which could be delivered in parallel to B's messages), then no thread would be available !&lt;br /&gt;&lt;br /&gt;So the simple but effective change was to replace the lock on the receive window with a CAS: when a thread tries to remove messages, it simply sets the CAS from false to true. If it succeeds, it goes into the removal loop and sets the CAS back to false when done. Else, the thread simply returns because it knows that someone else will be processing the message it just added.&lt;br /&gt;&lt;br /&gt;Result: we've returned 9 threads to the thread pool, ready to serve other messages, without even locking !&lt;br /&gt;&lt;br /&gt;The net effect is faster performance and smaller thread pools. As a rule of thumb, a thread pool's max threads can now be around the number of cluster nodes: if every node sends messages, we only need 1 thread per sender to process all of the sender's messages...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;#2 has 2 changes: same as above (locks replaced by CAS) and the changes outlined in the &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/UNICAST.new.txt?revision=1.6&amp;amp;view=markup"&gt;design document&lt;/a&gt;. The latter changes simplify UNICAST a lot and also handle the cases of asymmetrical connection closings. This was also back-ported from head (2.8).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;#3 UNICAST contention issues&lt;br /&gt;We used to have 2 big fat locks in UNICAST, which severely impacted performance on high unicast message volumes. 
The bottleneck was detected as part of our EAP testing for JBoss.&lt;br /&gt;&lt;br /&gt;This has been fixed and is getting forward-ported to CVS head.&lt;br /&gt;&lt;br /&gt;I guess the 3 changes are worth trying out 2.6.13.CR2; in some cases this should make a real difference in performance !&lt;br /&gt;&lt;br /&gt;Enjoy,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-2341756750098693376?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/2341756750098693376/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/09/jgroups-2613cr2-released.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2341756750098693376'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2341756750098693376'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/09/jgroups-2613cr2-released.html' title='JGroups 2.6.13.CR2 released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-3762167835250981038</id><published>2009-08-24T08:53:00.002+02:00</published><updated>2009-08-24T08:59:00.174+02:00</updated><title type='text'>2.8.0.CR1 released</title><content type='html'>I just released 2.8.0.CR1, it can be downloaded from SourceForge (&lt;a 
href="http://sourceforge.net/projects/javagroups/files/JGroups/2.8.0.CR1/JGroups-2.8.0.CR1.bin.zip/download"&gt;binary&lt;/a&gt; and &lt;a href="http://sourceforge.net/projects/javagroups/files/JGroups/2.8.0.CR1/JGroups-2.8.0.CR1.src.zip/download"&gt;source&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;This version is pretty stable, and I expect a GA soon. The only open issues are currently a few IPv6 related issues and an &lt;a href="https://jira.jboss.org/jira/browse/JGRP-1009"&gt;issue&lt;/a&gt; which fixes spurious merges.&lt;br /&gt;&lt;br /&gt;The release notes are &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/ReleaseNotes-2.8.txt?revision=1.4&amp;amp;view=markup"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Enjoy,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-3762167835250981038?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/3762167835250981038/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/08/280cr1-released.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/3762167835250981038'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/3762167835250981038'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/08/280cr1-released.html' title='2.8.0.CR1 released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-2745415567819151487</id><published>2009-08-17T21:08:00.003+02:00</published><updated>2009-08-17T23:29:46.602+02:00</updated><title type='text'>2.6.12.GA released</title><content type='html'>Just uploaded to SourceForge, the JIRA issues are at &lt;a href="https://jira.jboss.org/jira/secure/IssueNavigator.jspa?reset=true&amp;amp;pid=10053&amp;amp;fixfor=12313820"&gt;https://jira.jboss.org/jira/secure/IssueNavigator.jspa?reset=true&amp;amp;pid=10053&amp;amp;fixfor=12313820&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In a nutshell, 2.6.12 contains only 4 issues:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;GossipRouter consumed 40% CPU without doing anything: fixed&lt;/li&gt;&lt;li&gt;S3_PING is a new file-based discovery protocol for running JGroups on EC2 / S3&lt;/li&gt;&lt;li&gt;There was a memory leak in the GMS protocol on high member churn (high rate of joins and leaves)&lt;/li&gt;&lt;li&gt;FLUSH could lock up the entire cluster when the initial flush phase ran into a timeout. Thanks to Rado and Brian for discovering this bug by adding all kinds of weird combinations of failure scenarios to their merciless tests... :-) And kudos to Vladimir for investigating the (60MB !) 
logs, finding the offending code and fixing it, all within 60 minutes !&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;2.6.12.GA can be downloaded from SourceForge in &lt;a href="http://sourceforge.net/projects/javagroups/files/JGroups/2.6.12.GA/JGroups-2.6.12.GA.bin.zip/download"&gt;binary&lt;/a&gt; and &lt;a href="http://sourceforge.net/projects/javagroups/files/JGroups/2.6.12.GA/JGroups-2.6.12.GA.src.zip/download"&gt;source&lt;/a&gt; versions.&lt;br /&gt;&lt;br /&gt;Enjoy,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-2745415567819151487?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/2745415567819151487/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/08/2612ga-released.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2745415567819151487'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/2745415567819151487'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/08/2612ga-released.html' title='2.6.12.GA released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-5023287543359932217</id><published>2009-07-16T14:35:00.017+02:00</published><updated>2009-07-16T16:36:43.158+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='bike tour nice france'/><title type='text'>Bike tour Nice</title><content 
type='html'>Executive summary: very nice tour with 600km, 8 mountain passes and ca 17'000m of climbing, but unfortunately cut short by the weather, so the total is only 8 instead of 10 passes.&lt;br /&gt;&lt;br /&gt;That's the stats, if you want to know more, read on...&lt;br /&gt;&lt;br /&gt;By the way: &lt;span style="font-weight: bold;"&gt;a 'bike'&lt;/span&gt;&lt;span style="font-weight: bold;"&gt; is a bicy&lt;/span&gt;&lt;span style="font-weight: bold;"&gt;cle, *not* a motorbike ! :-)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;br /&gt;Day 1: FRI July 10th 2009&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I took the 9am flight from Zurich and arrived in Nice at 10:00am. My biggest concern was to assemble the bike and get out of the airport as soon as possible&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl85HfMeuCI/AAAAAAAAACg/Ss7CvUVa92U/s1600-h/BILD2728.JPG"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl85HfMeuCI/AAAAAAAAACg/Ss7CvUVa92U/s320/BILD2728.JPG" alt="" id="BLOGGER_PHOTO_ID_5359064882389891106" border="0" /&gt;&lt;/a&gt; because it must be busy this time of the season (in France, vacation time started July 3rd).&lt;br /&gt;&lt;br /&gt;However, I was pleasantly surprised when I found that NCE even had a bike assembly station inside the airport, with a stand and tools. Who would have thought ?&lt;br /&gt;&lt;br /&gt;As you can see, I only had 2-3 kilos of baggage with me, attached to the saddle (no back pack).&lt;br /&gt;&lt;br /&gt;Alors, I assembled my bike, passed customs and took off. First, the ride was along the shore (at 0 meters elevation), then through Nice, with a bit too much traffic and off I went into the mountains.&lt;br /&gt;&lt;br /&gt;Once I found D19 towards Tourrette-Levens, traffic eased and the long but steady climb began. 
The ride was mostly through wooded areas, with lots of ups and downs and curves. Almost no traffic anymore, only other bikers.&lt;br /&gt;&lt;br /&gt;I had already booked a hotel at &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=la+bolline,+france&amp;amp;sll=43.984046,7.216473&amp;amp;sspn=0.107088,0.226593&amp;amp;ie=UTF8&amp;amp;ll=44.071985,7.160769&amp;amp;spn=0.053465,0.080595&amp;amp;z=14"&gt;La Bolline&lt;/a&gt;, just after the Col de St. Martin (1500m), and spent the night there. Very small place, but the good thing is it's quite high up so no air conditioning was needed. As a matter of fact, I spent all nights at altitudes over 1000m, so it was not too hot and not too cold. Just perfect !&lt;br /&gt;&lt;br /&gt;The only thing that wasn't perfect was that the French (at least in the South) start their repas (dinner) at 7:30pm, so when I arrived (usually very hungry), I still had to wait for a few hours !&lt;br /&gt;&lt;br /&gt;I 'fixed' this by having a late and long lunch (called dejeuner in France), so I would survive until 7:30pm...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Day 2: SAT July 11th 2009&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Big day: today I wanted to ride 2 passes, the Col De Bonnette and the Col de Vars.&lt;br /&gt;But first things first. In the morning, there was a nice downhill from La Bolline to the junction with D2205.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_NUNlUHL8KjI/Sl8mPdpBi6I/AAAAAAAAABw/8DFU-BPEaA4/s1600-h/BILD2739.JPG"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://4.bp.blogspot.com/_NUNlUHL8KjI/Sl8mPdpBi6I/AAAAAAAAABw/8DFU-BPEaA4/s320/BILD2739.JPG" alt="" id="BLOGGER_PHOTO_ID_5359044128690768802" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;From there on, the climb to the Bonnette started. 
In the picture, coming down from Valdeblore, I took a right and started my ascent to Bonnette. This point is ca. at 500m, and Bonnette at 2802m, so a long climb of 2300m !&lt;br /&gt;&lt;br /&gt;All the passes I did are not very steep, but the climbs are very long, at maybe 5-8%, and that wears you out, too ! I prefer steep climbs and long downhill rides :-)&lt;br /&gt;&lt;br /&gt;The Col De Bonnette is the highest pass in Europe, but only because of a trick: some resourceful people (probably from the tourist office) added a loop around the top of Bonnette in the 60's, which added a few tens of meters, so Bonnette would surpass the Col de L'Iseran !&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl8o2LwrLdI/AAAAAAAAAB4/5we6w9S3p8k/s1600-h/BILD2750.JPG"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl8o2LwrLdI/AAAAAAAAAB4/5we6w9S3p8k/s320/BILD2750.JPG" alt="" id="BLOGGER_PHOTO_ID_5359046992929172946" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;In the picture, one can see the loop starting and going around the top clock-wise.&lt;br /&gt;&lt;br /&gt;In &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=jausiers,+france&amp;amp;sll=44.138117,7.055626&amp;amp;sspn=0.21362,0.453186&amp;amp;ie=UTF8&amp;amp;ll=44.37884,6.824226&amp;amp;spn=0.212747,0.32238&amp;amp;z=12"&gt;Jausiers&lt;/a&gt;, I unfortunately had a big ham and cheese toast, with a few cokes and beers and - as the experienced athletes among you will know - this was somewhat detrimental to my effort to climb the Col de Vars ! 
Only 800m to climb, but I had to walk my bike and push because of (a) my stomach and (b) cramps.&lt;br /&gt;&lt;br /&gt;Note to self: tonight have loads of salt to avoid the cramps (binds the water in the body) and next time have spaghetti or something with carbs (and salt) for lunch !&lt;br /&gt;&lt;br /&gt;Anyway, I made it to the top (even biking the last 1.5km) and after a nice downhill spent the night at Vars (a mountain resort).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Day 3: SUN July 12th 2009&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;With 3 passes under my belt and 7 to go, I had a nice downhill ride (that's the advantage of spending the night halfway up the mountain, you always have a downhill the next day!) to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=guillestre,+france&amp;amp;sll=44.725759,6.759338&amp;amp;sspn=0.422964,0.64476&amp;amp;ie=UTF8&amp;amp;z=12&amp;amp;iwloc=A"&gt;Guillestre&lt;/a&gt;. From here, the most beautiful pass, the Col de L'Izoard, started. 
In the picture below, you can see &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=brunissard,+france&amp;amp;sll=44.756242,6.745605&amp;amp;sspn=0.211371,0.32238&amp;amp;ie=UTF8&amp;amp;ll=44.762824,6.741486&amp;amp;spn=0.211347,0.32238&amp;amp;t=p&amp;amp;z=12&amp;amp;iwloc=A"&gt;Brunissard&lt;/a&gt;, looking back.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_NUNlUHL8KjI/Sl8ryUp1KrI/AAAAAAAAACA/S6zkRdK8UsE/s1600-h/BILD2773.JPG"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px; height: 240px;" src="http://4.bp.blogspot.com/_NUNlUHL8KjI/Sl8ryUp1KrI/AAAAAAAAACA/S6zkRdK8UsE/s320/BILD2773.JPG" alt="" id="BLOGGER_PHOTO_ID_5359050225131793074" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;The road climbed nicely through a dense forest and later passed the famous Casse Desert (Broken Desert), which looks like a piece of the moon right before the summit of the Izoard pass.&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;At the top of the Izoard, there was a concession stand, offering drinks and souvenirs, and there were many motorbikes and bikes (and cars, too). However, the downhill was fantastic: roads in great shape and winding curves, excellent to cruise down.&lt;br /&gt;&lt;br /&gt;The next pass (#5) was the &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=montgenevre,+france&amp;amp;sll=44.955809,6.800194&amp;amp;sspn=0.21064,0.32238&amp;amp;ie=UTF8&amp;amp;z=13"&gt;Montgenevre&lt;/a&gt;, starting from Briancon. At only 600 meters of climbing, it would have been a nice pass, but the traffic was overwhelming. 
Maybe because it was Sunday, everybody (and their grandmothers) was on their motorbikes. And sometimes they just love to accelerate when passing bikers (the real ones :-))...&lt;br /&gt;&lt;br /&gt;At least the hotel at the top was nice (Le Chalet Blanc), had a nice TV and so I could watch the Tour De France for that day.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Day 4: MON July 13th 2009&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The day started with a nice downhill to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=cesana+torinese,+italy&amp;amp;sll=45.13495,7.050991&amp;amp;sspn=0.104991,0.16119&amp;amp;ie=UTF8&amp;amp;z=12"&gt;Cesana Torinese&lt;/a&gt;, and then on to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=susa,+italy&amp;amp;sll=44.996125,6.871948&amp;amp;sspn=0.210492,0.32238&amp;amp;ie=UTF8&amp;amp;ll=45.13495,7.050991&amp;amp;spn=0.104991,0.16119&amp;amp;z=13"&gt;Susa, Italy&lt;/a&gt;. But that was it in terms of niceness: the next climb up Mont Cenis was hard, because it started from 500m and went all the way up to 2100m, for a climb of 1600m.&lt;br /&gt;&lt;br /&gt;In addition, once I could see the top (a bunch of hotels) through the fog, when I got closer I found out that there was a dam behind them, another 200m of climbing. 
Then I found out that the road didn't go alongside the lake, but climbed another hill, adding 200m again !&lt;br /&gt;&lt;br /&gt;At the top, it was very cold and so I just put on a long wind breaker and rode the downhill into &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;q=Lanslevillard,+Savoy,+Rh%C3%B4ne-Alpes,+France&amp;amp;sll=45.336702,7.02507&amp;amp;sspn=0.418473,0.64476&amp;amp;ie=UTF8&amp;amp;cd=1&amp;amp;geocode=FV4TswIdz3tpAA&amp;amp;split=0&amp;amp;ll=45.287207,6.912804&amp;amp;spn=0.20942,0.32238&amp;amp;z=12&amp;amp;iwloc=A"&gt;Lanslevillard&lt;/a&gt; where I had lunch (no ham/cheese toast though !).&lt;br /&gt;&lt;br /&gt;I decided to ride a little further to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=bessans,+France&amp;amp;sll=45.287207,6.912804&amp;amp;sspn=0.20942,0.32238&amp;amp;ie=UTF8&amp;amp;ll=45.29083,6.92173&amp;amp;spn=0.209406,0.32238&amp;amp;z=12&amp;amp;iwloc=A"&gt;Bessans&lt;/a&gt; and took a hotel there.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Day 5: TUE July 14th 2009&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Pass #7 was the Col de L'Iseran (2770m)&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_NUNlUHL8KjI/Sl8xHFgmkcI/AAAAAAAAACI/gLe3Y9nJNnk/s1600-h/BILD2798.JPG"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px; height: 240px;" src="http://4.bp.blogspot.com/_NUNlUHL8KjI/Sl8xHFgmkcI/AAAAAAAAACI/gLe3Y9nJNnk/s320/BILD2798.JPG" alt="" id="BLOGGER_PHOTO_ID_5359056079401947586" border="0" /&gt;&lt;/a&gt;, which is the second highest mountain pass in Europe, surpassed only by the Bonnette (albeit through a trick, as mentioned above). 
However, I started at 1677m, so the actual climbing was only 1100m, much better than the 2300m for the Bonnette.&lt;br /&gt;As one can see, I took the obligatory picture of the sign at the top of the pass, to prove that I was there :-)&lt;br /&gt;Well, actually, I'm not in the picture, so... hmmm :-)&lt;br /&gt;&lt;br /&gt;The downhill ride was very pleasant, through &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=val+disere,+France&amp;amp;sll=45.29083,6.92173&amp;amp;sspn=0.209406,0.32238&amp;amp;ie=UTF8&amp;amp;ll=45.407369,6.92276&amp;amp;spn=0.208976,0.32238&amp;amp;z=12"&gt;Val d'Isere&lt;/a&gt;, which is a famous winter sports resort and down to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=montvalezan,+France&amp;amp;sll=45.554208,6.884308&amp;amp;sspn=0.208432,0.32238&amp;amp;ie=UTF8&amp;amp;z=13"&gt;Montvalezan&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The Col de L'Iseran was pass # 7.&lt;br /&gt;&lt;br /&gt;From here, I took a shortcut to the Petit St. Bernard, although it had a steep climb of up to 16%. 
The ride up St Bernard would have been nice because it features a mild climb of mostly 5% or less, but the wind at the top made it hard to reach the old monastery, which is at the top.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl8zeB1XiPI/AAAAAAAAACY/TEZPu0AkSls/s1600-h/BILD2814.JPG"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 240px; height: 320px;" src="http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl8zeB1XiPI/AAAAAAAAACY/TEZPu0AkSls/s320/BILD2814.JPG" alt="" id="BLOGGER_PHOTO_ID_5359058672575547634" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Here's the picture of me at the top.&lt;br /&gt;&lt;br /&gt;The weather forecast had rain and storms for the evening, so I didn't stay too long at the top and made my way down to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=thuile,+italy&amp;amp;sll=45.612096,6.845357&amp;amp;sspn=0.104109,0.16119&amp;amp;ie=UTF8&amp;amp;z=12"&gt;La Thuile&lt;/a&gt;, Italy (the Pt. St. Bernard is the border between France and Italy) and took a hotel.&lt;br /&gt;&lt;br /&gt;Petit St. Bernard was pass #8.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Day 6: WED July 15th 2009&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Unfortunately, it rained the whole night and the forecast for where I wanted to go (Gr. St. 
Bernard and Furka (Valais) in Switzerland) was very bad (rain and thunderstorms), so I decided to finish this tour by heading west.&lt;br /&gt;&lt;br /&gt;When the rain stopped, I rode down to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=pre-saint-didier,+france&amp;amp;sll=45.805802,6.947726&amp;amp;sspn=0.414992,0.64476&amp;amp;ie=UTF8&amp;amp;ll=45.762853,6.984901&amp;amp;spn=0.103828,0.16119&amp;amp;z=13&amp;amp;iwloc=A"&gt;Pre-Saint-Didier&lt;/a&gt; (Italy) and up the hill to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;q=11013+Courmayeur+Aosta,+Aosta+Valley,+Italy&amp;amp;sll=45.762853,6.984901&amp;amp;sspn=0.103828,0.16119&amp;amp;ie=UTF8&amp;amp;cd=1&amp;amp;geocode=FcK9ugIdbmFqAA&amp;amp;split=0&amp;amp;ll=45.789552,6.971855&amp;amp;spn=0.207557,0.32238&amp;amp;z=12&amp;amp;iwloc=A"&gt;Courmayeur&lt;/a&gt; (Italy). There I took a bus through the &lt;a href="http://en.wikipedia.org/wiki/Mont_Blanc_Tunnel"&gt;Mont Blanc tunnel&lt;/a&gt; into &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=chamonix&amp;amp;sll=45.789552,6.971855&amp;amp;sspn=0.207557,0.32238&amp;amp;ie=UTF8&amp;amp;z=11&amp;amp;iwloc=A"&gt;Chamonix&lt;/a&gt; (France again).&lt;br /&gt;&lt;br /&gt;From Chamonix, I rode more or less A40 back to &lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=geneva,+switzerland&amp;amp;sll=46.019853,6.428375&amp;amp;sspn=0.413395,0.64476&amp;amp;ie=UTF8&amp;amp;z=14"&gt;Geneva&lt;/a&gt; (Switzerland), where I took the train back home. The A40 is partially a 2 lane highway, but due to lack of other roads (very narrow valley), they allow bikes to use it. 
Hmm, not very pleasant to be passed by heavy trucks, you going at 65km/h and the trucks going at 100km/h...&lt;br /&gt;&lt;br /&gt;Anyway, this was a great tour, friendly people everywhere, I practiced my French, and toured some beautiful scenery. I might just do it again some day ! Next time though, maybe with an iPhone (or even better, an Android HTC phone !) and lots of plan B's: in some places they had no public transportation (besides the bi-weekly bus :-)) and I would have been stuck in that place had it started to rain...&lt;br /&gt;&lt;br /&gt;If someone is interested in the exact tour, I have it on bikemap.net, let me know.&lt;br /&gt;Cheers,&lt;br /&gt;Bela&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-5023287543359932217?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/5023287543359932217/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/07/bike-tour-nice.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5023287543359932217'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5023287543359932217'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/07/bike-tour-nice.html' title='Bike tour Nice'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/_NUNlUHL8KjI/Sl85HfMeuCI/AAAAAAAAACg/Ss7CvUVa92U/s72-c/BILD2728.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-3475599669181888714</id><published>2009-07-07T14:37:00.002+02:00</published><updated>2009-07-07T14:49:20.714+02:00</updated><title type='text'>Tour de France</title><content type='html'>From July 10th - 18th, I'll be on my own Tour De France: from Nice (France) back to my home town of Kreuzlingen (Switzerland).&lt;br /&gt;&lt;br /&gt;I'm flying to Nice this Friday (July 10th), and then bike (= bicycle) back, over some of the highest passes in Europe, e.g. Restefond-Bonnette (2802m, the highest pass in Europe), Iseran (the 2nd highest), Isoard, Cenis, Vars, Pt. St. Bernhard, Gr. St. Bernhard, Furka, etc...&lt;br /&gt;&lt;br /&gt;Google says this &lt;a href="http://maps.google.com/maps/ms?ie=UTF8&amp;amp;hl=en&amp;amp;msa=0&amp;amp;msid=101283342963662523577.00046b5e3a5a3f822dc1d&amp;amp;z=7"&gt;tour&lt;/a&gt; is ca 1000 kilometers and my guess is this will be over 20'000m of climbing (if anyone knows how to measure the total climbing in Google Earth, please let me know !).&lt;br /&gt;&lt;br /&gt;Last year I biked a similar tour, from Graz (Austria) back home, but that tour was shorter (850km) and the mountains lower...&lt;br /&gt;&lt;br /&gt;I (hope to !) 
be back on the 20th :-)&lt;br /&gt;&lt;br /&gt;Answering questions, bug fixing and all such stuff will be suspended during this time !&lt;br /&gt;Cheers,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-3475599669181888714?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/3475599669181888714/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/07/tour-de-france.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/3475599669181888714'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/3475599669181888714'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/07/tour-de-france.html' title='Tour de France'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8782431340112372653</id><published>2009-06-19T10:05:00.002+02:00</published><updated>2009-06-19T10:11:07.201+02:00</updated><title type='text'>Jazoon and JBossWorld</title><content type='html'>I'll be speaking at the &lt;a href="http://jazoon.com/en/conference/presentations/tl/6040"&gt;Jazoon&lt;/a&gt; (in Zurich, June 24th, next week) and &lt;a href="http://www.jbossworld.com/agenda/tracks/abstracts_decodingthecode.html#528005"&gt;JBossWorld&lt;/a&gt; (in Chicago, Sept 2) conferences. 
At Jazoon, I'll talk about a memcached implementation in Java, at JBossWorld I'll talk about large JBoss clusters.&lt;br /&gt;&lt;br /&gt;Hope to see some of you there !&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8782431340112372653?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8782431340112372653/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/06/jazoon-and-jbossworld.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8782431340112372653'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8782431340112372653'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/06/jazoon-and-jbossworld.html' title='Jazoon and JBossWorld'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6990527145811177984</id><published>2009-06-17T18:43:00.003+02:00</published><updated>2009-06-17T18:54:29.643+02:00</updated><title type='text'>Shunning has been shunned</title><content type='html'>I finally completed the &lt;a href="https://jira.jboss.org/jira/browse/JGRP-937"&gt;MERGE4&lt;/a&gt; functionality, which now handles asymmetrical merges and greatly improves the usefulness of JGroups in mobile networks. 
I've blogged about this earlier this year.&lt;br /&gt;&lt;br /&gt;The new merge functionality also allowed me to trash &lt;a href="http://www.jboss.org/community/docs/DOC-12269"&gt;shunning&lt;/a&gt;, which is great, because I've always had problems explaining the difference between shunning and merging. Merging would usually be needed when we had real network partitions, whereas shunning would be needed when only a single member was expelled from the group (e.g. because it failed to respond to heartbeats, but hasn't really crashed).&lt;br /&gt;&lt;br /&gt;However, with FD_ALL, there could be a scenario where everybody shunned everybody else (a shun-fest :-)), and so all the cluster nodes would leave and re-join the cluster, possibly even multiple times. Clearly not a desirable scenario, even though it didn't lead to incorrect results !&lt;br /&gt;&lt;br /&gt;The new model is now much simpler: we have members join, leave and merge. The latter happens on a network partition, for example. In the old model, when a member was unresponsive, it was shunned and subsequently rejoined. In the new model, there's simply going to be a merge between the group which found that member unresponsive and the (now newly responsive) member.&lt;br /&gt;&lt;br /&gt;Since I also improved merging speed and correctness (with respect to concurrent merges), I suggest you download 2.8.beta2 (which I'll upload to SourceForge shortly) and give it a try.&lt;br /&gt;&lt;br /&gt;One thing that I'll have to talk about (in my next post) is what to do with merging. 
For example, if we have shared state and it diverged during a network partition, how can the application make sure that the merge doesn't cause inconsistent states ?&lt;br /&gt;&lt;br /&gt;More on this later, enjoy,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6990527145811177984?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6990527145811177984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/06/shunning-has-been-shunned.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6990527145811177984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6990527145811177984'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/06/shunning-has-been-shunned.html' title='Shunning has been shunned'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-4772260553176263773</id><published>2009-05-20T15:33:00.003+02:00</published><updated>2009-05-20T15:40:12.187+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Testing JGroups protocols'/><title type='text'>Testing a protocol in isolation</title><content type='html'>Protocols are the most important feature of JGroups. 
They provide the actual functionality in any stack, such as retransmission of lost messages, ordering, flow control, state transfer, fragmentation and so on.&lt;br /&gt;&lt;br /&gt;While we do have a sizeable number of unit tests in JGroups, we don't have many tests which test just 1 protocol in isolation.&lt;br /&gt;&lt;br /&gt;Take a look at &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/tests/junit/org/jgroups/protocols/GMS_MergeTest.java?revision=1.6&amp;amp;view=markup"&gt;GMS_MergeTest (CVS head)&lt;/a&gt;: that's how we need to test protocols in the future. GMS_MergeTest tests (concurrent) merging, and only concurrent merging.&lt;br /&gt;&lt;br /&gt;The following features are in this test:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Injection of views. We inject the subpartitioned views directly into the cluster, rather than waiting for a failure detector to kick in. Remember, we test merging and &lt;span style="font-weight: bold;"&gt;NOT&lt;/span&gt; failure detection. The good thing is that (a) this is much faster and (b) we can really focus on 1 protocol/feature rather than on multiple ones&lt;/li&gt;&lt;li&gt;Injection of MERGE events. Rather than relying on MERGE2 to (after some time) generate a MERGE event, we directly inject MERGE events into the merge leader(s). Same as above: this is much faster and we don't test MERGE2, but GMS/CoordGmsImpl&lt;/li&gt;&lt;li&gt;Temporary enabling/disabling of logging for GMS: injectMerge() enables TRACE logging for GMS, so we can see what's going on &lt;i&gt;selectively&lt;/i&gt;, and don't get a huge TRACE log for the stuff that's going on before&lt;/li&gt;&lt;li&gt;SHARED_LOOPBACK: well, I could just as well have used UDP, but SHARED_LOOPBACK never loses any messages, so there are no retransmissions. 
Again, the focus is on testing merging and not merging with packet loss&lt;/li&gt;&lt;li&gt;No failure detection, no merge protocol in the stack: this allows us to set a breakpoint in a debugger and step around for as long as we want to, without FD suspecting and excluding us&lt;/li&gt;&lt;/ul&gt;The use of these features allows us to really focus on testing &lt;span style="font-weight: bold;"&gt;one&lt;/span&gt; feature in isolation. We need to go further in this direction; therefore, if you have existing tests you want to modify, go ahead ! New tests should follow this paradigm whenever possible.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-4772260553176263773?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/4772260553176263773/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/05/testing-protocol-in-isolation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4772260553176263773'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4772260553176263773'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/05/testing-protocol-in-isolation.html' title='Testing a protocol in isolation'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8729148836231731007</id><published>2009-05-04T12:08:00.003+02:00</published><updated>2009-05-04T12:22:36.163+02:00</updated><title type='text'>First alpha of 2.8</title><content type='html'>I've uploaded a first alpha of 2.8 to SourceForge [&lt;a href="https://sourceforge.net/project/showfiles.php?group_id=6081&amp;amp;package_id=94868#"&gt;1&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;The major new features are &lt;a href="http://belaban.blogspot.com/2009/02/whats-cool-about-logical-addresses.html"&gt;logical addresses&lt;/a&gt;, improved support for &lt;a href="http://belaban.blogspot.com/2009/04/those-damn-edge-cases.html"&gt;asymmetrical merges&lt;/a&gt;, GossipRouter changes, and a new &lt;a href="http://belaban.blogspot.com/2009/04/fileping-new-discovery-protocol-based.html"&gt;shared dir based discovery protocol&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Despite its name, alpha3 is already very stable, and I wanted to release it so I can get some useful feedback to go into beta1. There are still ca 20 JIRA issues open, but I expect we close them and release GA by the summer 09.&lt;br /&gt;&lt;br /&gt;The remaining tasks are hardening of the merging code (e.g. 
to better handle concurrent merges), dropping commons-logging (so we can ship jgroups.jar &lt;span style="font-weight: bold;"&gt;without any other JAR dependency !&lt;/span&gt;), testing asymmetric merges some more and writing documentation for these new features.&lt;br /&gt;&lt;br /&gt;I want to thank Vladimir and Richard for their hard work on 2.8 !&lt;br /&gt;&lt;br /&gt;Enjoy !&lt;br /&gt;&lt;br /&gt;[1] https://sourceforge.net/project/showfiles.php?group_id=6081&amp;amp;package_id=94868#&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8729148836231731007?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8729148836231731007/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/05/first-alpha-of-28.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8729148836231731007'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8729148836231731007'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/05/first-alpha-of-28.html' title='First alpha of 2.8'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-6140907496970794343</id><published>2009-04-24T11:22:00.003+02:00</published><updated>2009-04-24T11:36:38.965+02:00</updated><title type='text'>FILE_PING: new discovery protocol based on shared storage</title><content type='html'>I've 
just created a first version of FILE_PING in 2.6 and 2.8. This is a new discovery protocol which uses a shared directory into which all nodes of a cluster write their addresses.&lt;br /&gt;&lt;br /&gt;New nodes can read the contents of that directory and then send their discovery requests to all nodes found in the dir.&lt;br /&gt;&lt;br /&gt;When a node leaves, it'll remove its address from the directory again.&lt;br /&gt;&lt;br /&gt;When would someone use FILE_PING rather than, e.g., TCPGOSSIP and GossipRouter ?&lt;br /&gt;&lt;br /&gt;When IP multicasting is not enabled, or cannot be used for other reasons, we have to resort to either TCPPING, which lists nodes statically in the config, or TCPGOSSIP, which retrieves initial membership information from external process(es), the GossipRouter(s).&lt;br /&gt;&lt;br /&gt;The latter solution is a bit cumbersome since an additional process has to be maintained.&lt;br /&gt;&lt;br /&gt;FILE_PING is a simple solution to replace GossipRouter, so we don't have to maintain that external process.&lt;br /&gt;&lt;br /&gt;However, note that performance will most likely not be better: a shared directory, e.g. on NFS or SMB, requires a round trip for a read or write, too. So if we have 10 nodes which wrote their information to files, then we have to make 10 round trips via SMB to fetch that information, compared to 1 round trip to the GossipRouter(s) !&lt;br /&gt;&lt;br /&gt;So FILE_PING is an option for developers who prefer to take the perf hit (maybe on the order of a few additional milliseconds per discovery phase) over having to maintain an external GossipRouter process.&lt;br /&gt;&lt;br /&gt;FILE_PING is part of 2.6.10, which will be released early next week, or it can be downloaded from CVS (2.6 branch) or &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/src/org/jgroups/protocols/FILE_PING.java?revision=1.5.2.2&amp;amp;view=markup&amp;amp;pathrev=Branch_JGroups_2_6"&gt;here&lt;/a&gt;. 
In the latter case, place the FILE_PING.java into the src/org/jgroups/protocols directory and execute the 'jar' target in the build.xml Ant script of your JGroups src distro.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-6140907496970794343?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/6140907496970794343/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/04/fileping-new-discovery-protocol-based.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6140907496970794343'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/6140907496970794343'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/04/fileping-new-discovery-protocol-based.html' title='FILE_PING: new discovery protocol based on shared storage'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8827612557769877101</id><published>2009-04-01T13:07:00.006+02:00</published><updated>2009-04-01T13:46:09.323+02:00</updated><title type='text'>Those damn edge cases !</title><content type='html'>While JGroups is over 10 years old and very mature, sometimes I still run into cases that aren't handled. 
While the average user won't run into edge cases because we test the normal cases very well, if you do run into one, in the best case, you have 'undefined' behavior (whatever that means !); in the worst case, you're hosed.&lt;br /&gt;&lt;br /&gt;Here's one.&lt;br /&gt;&lt;br /&gt;The other week I was at a (European) army, for a week of JGroups consulting. They have a system which runs JGroups nodes over flappy links, radio and satellite networks. Sometimes, a link between 2 nodes A and B can even turn asymmetric, meaning A can send to B, but B cannot send to A !&lt;br /&gt;&lt;br /&gt;It turns out that they have a lot of partitions (e.g. when a satellite link goes down), followed by subsequent remerging when the link is restored. Sometimes, members would not be able to communicate with each other after the merge.&lt;br /&gt;&lt;br /&gt;This was caused by an edge case in UNICAST, which doesn't handle &lt;span style="font-style: italic;"&gt;overlapping partitions&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;A non-overlapping partition is a partition where a cluster of {A,B,C,D} falls apart into 2 (or more) subclusters of {A,B} and {C,D}, or {A}, {B}, {C}, {D}. The latter case can easily be reproduced when you kill a switch connecting the 4 nodes.&lt;br /&gt;&lt;br /&gt;An overlapping partition is when the cluster falls apart into subclusters that overlap, e.g. {A,B,C} and {C,D}. This can happen with asymmetrical links (which never happens with a regular switch !), or with FD, when many nodes are killed at the same time and a merge occurs before all dead nodes have been removed from the cluster.&lt;br /&gt;&lt;br /&gt;If this sounds obscure, it actually is !&lt;br /&gt;&lt;br /&gt;But anyway, here's what happens at the UNICAST level.&lt;br /&gt;&lt;br /&gt;Warning: rough road ahead...&lt;br /&gt;&lt;br /&gt;UNICAST keeps state for each connection. E.g. 
if A sends a unicast message to B, A maintains the last sequence number (seqno) &lt;span style="font-weight: bold;"&gt;sent&lt;/span&gt; to B (e.g. #25) and B maintains the highest seqno &lt;span style="font-weight: bold;"&gt;received&lt;/span&gt; from A (#25). The same holds for messages from B to A; let's say B's last message to A was #7.&lt;br /&gt;&lt;br /&gt;Now we have a network partition, which creates a new view {A,B} at A and {B} at B. So, in other words, B unilaterally excluded A from its view, but A didn't exclude B. The reason is that A can communicate with B, but B cannot communicate with A.&lt;br /&gt;&lt;br /&gt;Now, you might ask, wait a minute ! If A can communicate with B, why can't B communicate with A ?&lt;br /&gt;&lt;br /&gt;This doesn't happen with switches, but here we're talking about &lt;span style="font-style: italic;"&gt;separate up and down links over radios&lt;/span&gt;, and if a radio up-link goes down, that just means we cannot send, but can still receive (through the down-link) !&lt;br /&gt;&lt;br /&gt;Let's now look at what happens:&lt;br /&gt;&lt;br /&gt;When B receives the new view {B}, it removes the entry for A from its connection table. It therefore loses the memory that its last message to A was #7.&lt;br /&gt;&lt;br /&gt;On the other side, A doesn't remove its connection entry for B, which is still at #25.&lt;br /&gt;&lt;br /&gt;When the partition heals and a merge ensues, A sends a message to B. The message's seqno is #25, the next message to B will be #26 and so on.&lt;br /&gt;&lt;br /&gt;On the receiver side, B creates a new connection table entry for A &lt;span style="font-weight: bold;"&gt;with seqno #1&lt;/span&gt;. When A#25 and A#26 are received, they're stored in the table, but not passed up to the application because we expect messages #1-#24 from A first.&lt;br /&gt;&lt;br /&gt;This is terrible because A will never send messages #1-#24 ! 
Because B will simply store all messages from A, it will run out of memory at some point, unless there's another view excluding A !&lt;br /&gt;&lt;br /&gt;Doom also lurks in the reverse direction: when B sends #1 to A, A expects #7 and therefore discards messages #1-#6 from B !&lt;br /&gt;&lt;br /&gt;This is bad and caused me to enhance the design for UNICAST. The new design includes connection IDs, so we'll reset an old connection when a new connection ID is received, and receivers asking senders for the initial seqnos if they have no entry for a given sender.&lt;br /&gt;&lt;br /&gt;This change will not affect current users, running systems which are connected via switches/VLANs etc. But it will remove a showstopper for running JGroups in rough environments like the one described above.&lt;br /&gt;&lt;br /&gt;The design for the new enhanced UNICAST protocol can be found at [1].&lt;br /&gt;&lt;br /&gt;[1]&lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/UNICAST.new.txt?revision=1.1.2.5&amp;amp;view=markup&amp;amp;pathrev=Branch_JGroups_2_6_MERGE"&gt; http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/UNICAST.new.txt?revision=1.1.2.5&amp;amp;view=markup&amp;amp;pathrev=Branch_JGroups_2_6_MERGE&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8827612557769877101?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8827612557769877101/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/04/those-damn-edge-cases.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8827612557769877101'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/19835054/posts/default/8827612557769877101'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/04/those-damn-edge-cases.html' title='Those damn edge cases !'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-4212982485782917024</id><published>2009-03-16T10:24:00.003+01:00</published><updated>2009-03-16T10:35:17.268+01:00</updated><title type='text'>Status update and training</title><content type='html'>Here's a quick status update.&lt;br /&gt;&lt;br /&gt;Logical addresses are almost done and work with the UDP and TCP transports. I have yet to support TUNNEL (GossipRouter), but because Vladimir made a fair amount of changes to GR/TUNNEL on CVS head, I'll have to merge the logical addresses branch back to head first, before I can apply logical addresses to CVS head.&lt;br /&gt;&lt;br /&gt;Because I have some other important changes in 2.8 (e.g. anycasting support), I decided to curtail the scope of 2.8 a bit and move some stuff to the newly created 2.9. For example, currently there can only be 1 physical address associated with 1 logical address (UUID). Multiple physical addresses will be supported in 2.9.&lt;br /&gt;&lt;br /&gt;Also, canonicalization of UUIDs into shorts was pushed into 2.9. This is largely an optimization.&lt;br /&gt;&lt;br /&gt;Speaking of optimizations, performance of the branch looks promising ! Although we now ship both dest and src addresses with a Message, which makes the serialized message a bit bigger (for IPv4, *not* IPv6 !), UUIDs take up less space in memory and thus I got a throughput increase from 105MBytes/sec to 113MBytes/sec ! 
These are only preliminary results, and I have yet to run a full perf test.&lt;br /&gt;&lt;br /&gt;I'll probably merge the branch back to head and then make the necessary changes to TUNNEL/GR this week. Then we might release an alpha, for folks to try out logical addresses.&lt;br /&gt;&lt;br /&gt;The thing is that I'll be traveling a bit for the next couple of weeks: next week I'm near Amsterdam and April 6-9 I'll be in Munich, teaching the JBoss Clustering course (JB439). If you want to meet over a beer, or even join the course, drop me an email !&lt;br /&gt;Cheers,&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-4212982485782917024?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/4212982485782917024/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/03/status-update-and-training.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4212982485782917024'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4212982485782917024'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/03/status-update-and-training.html' title='Status update and training'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-5764848406755899609</id><published>2009-02-16T09:47:00.008+01:00</published><updated>2009-02-16T10:28:51.076+01:00</updated><title 
type='text'>What's cool about logical addresses ?</title><content type='html'>Finally, logical addresses (&lt;a href="https://jira.jboss.org/jira/browse/JGRP-129"&gt;https://jira.jboss.org/jira/browse/JGRP-129&lt;/a&gt;) will get implemented (in 2.8) !&lt;br /&gt;&lt;br /&gt;Those of you who've used JGroups will know that the identity of a node was always its IP address and the port on which the receiver thread was listening, e.g. 192.168.1.5:7800.&lt;br /&gt;&lt;br /&gt;While this gives you a relatively compact and readable address (you can deduce from the address which host it resides on), there's also a problem: this type of address is not unique over space and time.&lt;br /&gt;&lt;br /&gt;Let's look at an example.&lt;br /&gt;&lt;br /&gt;Say we have a cluster of {A,B,C}. C's address is 192.168.1.5:7800. Let's assume A has sent 25 messages to C and C has multicast 104 messages. We're using sequence numbers (seqnos), attached to a message via a header, to order messages.&lt;br /&gt;&lt;br /&gt;So the next message that C will multicast is #105 and the next message it expects from A is #26.&lt;br /&gt;&lt;br /&gt;This is &lt;span style="font-weight: bold;"&gt;state&lt;/span&gt; that is maintained by the respective protocols in a JGroups stack.&lt;br /&gt;&lt;br /&gt;Now let's assume C is killed and restarted. Or C is shunned, therefore leaves the channel and then automatically (if configured) reconnects. Let's also assume that the failure detection protocol has not yet kicked in and therefore A and B will not have received a view {A,B} which excludes C.&lt;br /&gt;&lt;br /&gt;Now C rejoins the cluster. Because this is a &lt;span style="font-weight: bold;"&gt;reincarnation&lt;/span&gt; of C, it creates a new protocol stack, and all the state mentioned above is gone. 
The reincarnated C now sends #1 as its next seqno and expects #1 from A as well.&lt;br /&gt;&lt;br /&gt;There are 2 things that happen now:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;When C multicasts its next message with seqno #1, both A and B will drop it. A drops it because it expects C's next message to be #105, not #1. As a matter of fact, A will drop &lt;span style="font-weight: bold;"&gt;the first 104 messages from C&lt;/span&gt; !&lt;/li&gt;&lt;li&gt;A multicasts a message with seqno #26. However, C expects #1 from A and therefore buffers message #26. As a matter of fact, C will buffer &lt;span style="font-weight: bold;"&gt;all&lt;/span&gt; messages from A until it receives #1, which will not happen ! Consequence: C will run out of memory at some point. Even worse: C will prevent stability messages from purging messages seen by all cluster nodes, so in the worst case, &lt;span style="font-weight: bold;"&gt;all&lt;/span&gt; cluster nodes will run out of memory !&lt;/li&gt;&lt;/ol&gt;OK, while this is somewhat of an edge case and can be remedied by (a) waiting some time before restarting a node and/or (b) not pinning down ports, the fact remains that when this happens, it wreaks havoc.&lt;br /&gt;&lt;br /&gt;So how are logical addresses going to change this ?&lt;br /&gt;&lt;br /&gt;A logical address consists of&lt;br /&gt;&lt;ul&gt;&lt;li&gt;an org.jgroups.util.UUID (copied from java.util.UUID and relieved of some useless fields) and&lt;br /&gt;&lt;/li&gt;&lt;li&gt;a logical name&lt;/li&gt;&lt;/ul&gt;The logical name is given to a channel when the channel is created, e.g.&lt;br /&gt;&lt;pre&gt;&lt;span style="font-family:courier new;"&gt;JChannel channel=new JChannel("node-4", "/home/bela/udp.xml");&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This means that the channel's address will always get printed as "node-4". Under the covers, however, we use a UUID (for equals() and hashCode()), which is unique over space and time. 
The UUID is recreated on channel connect, so the above reincarnation issue will not happen.&lt;br /&gt;&lt;br /&gt;The logical name is syntactic sugar: views consisting of UUIDs (16 bytes) are not a pretty sight, so views like {"node-1", "node-2", "node-3", "node-4"} look much better.&lt;br /&gt;&lt;br /&gt;Note that the user will be able to pick whether to see UUIDs or logical names.&lt;br /&gt;&lt;br /&gt;Also, if null is passed as the logical name, JGroups will create a logical name (e.g. using the host name and a counter).&lt;br /&gt;&lt;br /&gt;A UUID will get mapped to one or more physical addresses. The mapping is maintained by the transport and there will be an ARP-like protocol (handled by Discovery) to fetch the initial mappings, and to fetch a mapping if not available.&lt;br /&gt;&lt;br /&gt;The detailed design is described in &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/LogicalAddresses.txt?revision=1.12&amp;amp;view=markup"&gt;http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/LogicalAddresses.txt?revision=1.12&amp;amp;view=markup&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;So the most important aspect of logical addresses is that they decouple the identity of a JGroups node from its physical address. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This opens up interesting possibilities.&lt;br /&gt;&lt;br /&gt;We might for example associate multiple physical addresses with a UUID, and load balance over the physical addresses. We could open multiple sockets, and associate each (receiver) socket's physical address with the UUID. We could even change this at runtime: e.g. 
if a NIC is down and we get exceptions on the socket, simply create another socket, remove the old association across the cluster (there's a call for this) and associate the new physical address with the UUID.&lt;br /&gt;&lt;br /&gt;Another possibility is to implement NATting, firewalling or STUNning this way !&lt;br /&gt;&lt;br /&gt;I'll probably make the picking of a physical address given a UUID pluggable, so developers can even provide their own address translation in the transport !&lt;br /&gt;&lt;br /&gt;This change is overdue and I'm happy that work has finally started on this. If you want to follow this, the branch is Branch_JGroups_2_8_LogicalAddresses.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-5764848406755899609?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/5764848406755899609/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/02/whats-cool-about-logical-addresses.html#comment-form' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5764848406755899609'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/5764848406755899609'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/02/whats-cool-about-logical-addresses.html' title='What&apos;s cool about logical addresses ?'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-8077614627606389169</id><published>2009-01-21T14:53:00.023+01:00</published><updated>2009-01-22T17:19:46.127+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ReplCache JGroups raid distribution replication'/><title type='text'>ReplCache: storing your data in the cloud with variable replication</title><content type='html'>Some time ago, I wrote a &lt;a href="http://www.jgroups.org/memcached.html"&gt;prototype of a cache&lt;/a&gt; which distributes its elements (key-value pairs) across all cluster nodes. This is done by computing the consistent hash of a key K and picking a cluster node based on the hash mod N where N is the cluster size. So any given element will only ever be stored &lt;span style="font-weight: bold;"&gt;once&lt;/span&gt; in the cluster.&lt;br /&gt;&lt;br /&gt;This is great because it maximizes use of the aggregated memory of the 'cloud' (a.k.a. all cluster nodes). For example, if we have 10 nodes, and each node has 1 GB of memory, then the aggregated (cloud) memory is 10 GB. This is similar to a logical volume manager (e.g. LVM in Linux), where we 'see' a virtual volume, the size of which can grow or shrink, and which hides the mapping to physical disks.&lt;br /&gt;&lt;br /&gt;So, if we pick a good consistent hash algorithm, then for 1'000 elements, we can assume that in a cluster of 10 nodes, each node stores on average 100 elements. Also, with consistent hashing, if you pick a good hash algorithm, rehashing on view changes is minimal.&lt;br /&gt;&lt;br /&gt;Now, the question is what we do when a node crashes. 
All elements stored by that node are gone, and have to be re-read from somewhere, usually a database.&lt;br /&gt;&lt;br /&gt;To provide highly available data and minimize access to the database, a common technique is to replicate data. For example, if we &lt;span style="font-weight: bold;"&gt;replicate&lt;/span&gt; K to all 10 nodes, then we can tolerate 9 nodes going down and will still have K available.&lt;br /&gt;&lt;br /&gt;However, this comes at a cost: if everyone replicates all of its elements to all cluster nodes, then we can effectively only use 1/N of the 'cloud memory' (10 GB), which is 1 GB... So we trade access to the large cloud memory for availability.&lt;br /&gt;&lt;br /&gt;This is like RAID: if we have 2 disks of 500 GB each, then we can use them as RAID 0 or JBOD (Just a Bunch of Disks) and have 1 TB available for our data. If one of the disks crashes, we lose data that resides on that disk. If we happen to have a file F with 10 blocks, and 5 were stored on the crashed disk, then F is gone.&lt;br /&gt;&lt;br /&gt;If we use RAID 1, then the contents of disk-1 are mirrored onto disk-2 and vice versa. This is great, because we can now lose 1 disk and still have all of our data available. However, we now have only 500 GB of disk space available for our data !&lt;br /&gt;&lt;br /&gt;Enter &lt;a href="http://www.jgroups.org/replcache.html"&gt;&lt;span style="font-weight: bold;"&gt;ReplCache&lt;/span&gt;&lt;/a&gt;. This is a prototype I've been working on for the last 2 weeks.&lt;br /&gt;&lt;br /&gt;ReplCache allows for &lt;span style="font-weight: bold;"&gt;variable replication&lt;/span&gt;, which means we can tell it on a put(key, value, K) how many copies (replication count) of that element should be stored in the cloud. A replication count K can be:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;K == 1: the element is stored only once. 
This is the same as what PartitionedHashMap does&lt;br /&gt;&lt;/li&gt;&lt;li&gt;K == -1: the element is stored on all nodes in the cluster&lt;/li&gt;&lt;li&gt;K &gt; 1: the element is stored on K nodes only. ReplCache makes sure to always have K instances of an element available, and if the number of copies drops below K because a node leaves or crashes, ReplCache might copy or move the element to bring the count back up to K&lt;/li&gt;&lt;/ul&gt;So why is ReplCache better than PartitionedHashMap ?&lt;br /&gt;&lt;br /&gt;ReplCache is a superset of PartitionedHashMap, which means it can be used as a PartitionedHashMap: just use K == 1 for all elements to be inserted !&lt;br /&gt;&lt;br /&gt;The more important feature, however, is that ReplCache can use &lt;span style="font-style: italic;"&gt;more&lt;/span&gt; of the available cloud memory and that it allows a user to define availability as a quality of service &lt;span style="font-style: italic;"&gt;per data element&lt;/span&gt; ! Data that can be re-read from the DB can be stored with K == 1. 
Data that should be highly available should use K == -1, and data which should be more or less highly available, but can still be read from the DB (but maybe that's costly), should be stored with K &gt; 1.&lt;br /&gt;&lt;br /&gt;Compare this to RAID: once we've configured RAID 1, all data written to disk-1 will always be mirrored to disk-2, even data that could be trashed on a crash, for example data in /tmp.&lt;br /&gt;&lt;br /&gt;With ReplCache, the user (who knows his/her data best) takes control and defines QoS for each element !&lt;br /&gt;&lt;br /&gt;Below is a screenshot of 2 ReplCache instances (started with java org.jgroups.demos.ReplCacheDemo -props /home/bela/udp.xml) which shows that we've added some data:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_NUNlUHL8KjI/SXcyx6VBb0I/AAAAAAAAABY/_GA32D-NUV0/s1600-h/Picture+1.png"&gt;&lt;img style="cursor: pointer; width: 320px; height: 140px;" src="http://2.bp.blogspot.com/_NUNlUHL8KjI/SXcyx6VBb0I/AAAAAAAAABY/_GA32D-NUV0/s320/Picture+1.png" alt="" id="BLOGGER_PHOTO_ID_5293755720049717058" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It shows that both instances have the key "everywhere" because it is replicated to all cluster nodes due to K == -1. The same goes for key "two": because K == 2, it is stored on both instances as we only have 2 cluster nodes.&lt;br /&gt;There are 2 keys with K == 1: "id" and "name". Both are stored on instance 2, but that's a coincidence. For M keys with K == 1 and N cluster nodes, every node should store approximately M/N keys.&lt;br /&gt;&lt;br /&gt;ReplCache is experimental, and serves as a prototype to play with &lt;a href="http://jboss.org/community/docs/DOC-10278"&gt;data partitioning/striping&lt;/a&gt; for JBossCache.&lt;br /&gt;ReplCache is in the JGroups CVS (head) and the code can be downloaded &lt;a href="http://downloads.sourceforge.net/javagroups/replcachedemo.jar"&gt;here&lt;/a&gt;. 
To run the demo, execute:&lt;br /&gt;java -jar replcachedemo.jar&lt;br /&gt;&lt;br /&gt;For the technical details, the design is &lt;a href="http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/ReplCache.txt?revision=1.13&amp;amp;view=markup"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There is a nice 5 minute demo at &lt;a href="http://www.jgroups.org/demos.html"&gt;http://www.jgroups.org/demos.html&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Feedback is appreciated, use the JGroups mailing lists !&lt;br /&gt;&lt;br /&gt;Enjoy !&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-8077614627606389169?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/8077614627606389169/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/01/replcache-storing-your-data-in-cloud.html#comment-form' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8077614627606389169'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/8077614627606389169'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/01/replcache-storing-your-data-in-cloud.html' title='ReplCache: storing your data in the cloud with variable replication'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_NUNlUHL8KjI/SXcyx6VBb0I/AAAAAAAAABY/_GA32D-NUV0/s72-c/Picture+1.png' height='72' 
width='72'/><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-1118074332478978844</id><published>2009-01-05T16:42:00.003+01:00</published><updated>2009-01-05T16:59:41.885+01:00</updated><title type='text'>JGroups 2.7 released</title><content type='html'>Finally, after almost a year of development, I released 2.7.0.GA this morning. It can be downloaded from &lt;a href="http://sourceforge.net/project/showfiles.php?group_id=6081&amp;amp;package_id=94868&amp;amp;release_id=651542"&gt;http://sourceforge.net/project/showfiles.php?group_id=6081&amp;amp;package_id=94868&amp;amp;release_id=651542&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Although 2.7 has 211 JIRA issues (bug fixes, tasks or features), most of the bug fixes have been backported to the 2.6 branch. Why ? Because 2.6.7 is the version that ships with JBoss 5, and we made sure JGroups works optimally with it.&lt;br /&gt;&lt;br /&gt;So what's new ?&lt;br /&gt;&lt;br /&gt;There are almost no new features ! (Can you tell I'm not a marketing person ? :-))&lt;br /&gt;&lt;br /&gt;Most work (besides bug fixes) went into refactoring, e.g. we converted our test suite from JUnit to TestNG, which allows for parallel test execution and reduced our testing time from 2.5 hours to 15 minutes !&lt;br /&gt;&lt;br /&gt;Another change was that all properties are now set using JSR 175 annotations, so we could remove a lot of boilerplate code from protocol implementations. 
In my opinion, the more code I can remove (without impacting functionality), the better !&lt;br /&gt;&lt;br /&gt;Using annotations for properties also allows us to automatically generate documentation for the properties of all protocols.&lt;br /&gt;&lt;br /&gt;I also marked unsupported or experimental classes/methods with @Unsupported or @Experimental annotations.&lt;br /&gt;&lt;br /&gt;We were able to increase performance a bit, compared to 2.6, but 2.6 is already quite fast, so unless you need those additional 5-10%, go for 2.6.7.&lt;br /&gt;&lt;br /&gt;In a nutshell, 2.7 serves as the groundwork for 2.8, which will have many new features.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-1118074332478978844?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/1118074332478978844/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2009/01/jgroups-27-released.html#comment-form' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/1118074332478978844'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/1118074332478978844'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2009/01/jgroups-27-released.html' title='JGroups 2.7 released'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-19835054.post-4224573385546040100</id><published>2008-12-09T12:35:00.002+01:00</published><updated>2008-12-09T12:38:23.891+01:00</updated><title type='text'>Better late than never</title><content type='html'>This is my new blog.&lt;br /&gt;&lt;br /&gt;I know, I said that in 2005 already. This time, though, I'm serious.&lt;br /&gt;&lt;br /&gt;I'll write mostly about JGroups, but when I have thoughts on different topics I'll write them down, too.&lt;br /&gt;&lt;br /&gt;Take my postings with a grain of salt and remember, these are my personal thoughts and not necessarily those of my employer, JBoss / RedHat.&lt;br /&gt;Bela Ban&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19835054-4224573385546040100?l=belaban.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://belaban.blogspot.com/feeds/4224573385546040100/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://belaban.blogspot.com/2008/12/better-late-than-never.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4224573385546040100'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19835054/posts/default/4224573385546040100'/><link rel='alternate' type='text/html' href='http://belaban.blogspot.com/2008/12/better-late-than-never.html' title='Better late than never'/><author><name>Bela Ban</name><uri>http://www.blogger.com/profile/01830789377474906550</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://3.bp.blogspot.com/_NUNlUHL8KjI/ST6MrbBdIaI/AAAAAAAAAAM/bu0j0fvc6sI/S220/bela.jpg'/></author><thr:total>0</thr:total></entry></feed>
