Tuesday, December 31, 2019

Spanning JGroups Kubernetes-based clusters across Google and Amazon clouds

In this (long!) post, I'll provide step-by-step instructions on how to create JGroups clusters in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS), and connect them into one virtual cluster using RELAY2.

Each local cluster is called a site. In this tutorial, we'll call the sites NYC and SFC. We'll start 5 nodes in NYC and 3 in SFC.

The sample deployments and services are defined in YAML and we're using Kubernetes to create the clusters.

To try this yourself, you'll need kubectl, eksctl and gcloud installed, and accounts on both EKS and GKE.

The demo is RelayDemo [1]. It is a simple chat, started in a pod: every typed line appears in all pods across all sites, and every pod sends a response back to the sender, which displays all responses. This way, we know who received our chat message.


Architecture

The setup of this tutorial is as follows:



On the left, we have nodes A,B,C,D,E in site NYC (Amazon EKS) and on the right, X,Y,Z in SFC (Google GKE).

A in NYC and X in SFC assume the role of site master (see [2]). This means that they join a separate JGroups cluster, called the bridge cluster, which connects the two sites, and they relay messages between the sites.

A site master is not a dedicated node; any node can assume the role. For example, when A leaves or crashes, B takes over the site master role, joins the bridge cluster and relays messages between sites NYC and SFC.
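To give a feel for how RELAY2 is wired up, here's a rough sketch of such a configuration for the NYC site (this is not the exact configuration used by the images in this tutorial; file names and paths are illustrative, and the format follows the RELAY2 section of the manual [2]):

<!-- sketch: added at the top of the NYC protocol stack -->
<relay.RELAY2 site="nyc" config="/etc/jgroups/relay.xml"/>

<!-- relay.xml (sketch): lists the sites and the bridge cluster configuration -->
<RelayConfiguration xmlns="urn:jgroups:relay:1.0">
  <sites>
    <site name="nyc">
      <bridges>
        <bridge name="bridge" config="/etc/jgroups/bridge.xml"/>
      </bridges>
    </site>
    <site name="sfc">
      <bridges>
        <bridge name="bridge" config="/etc/jgroups/bridge.xml"/>
      </bridges>
    </site>
  </sites>
</RelayConfiguration>

The referenced bridge.xml is the configuration of the bridge cluster; in this tutorial it uses TUNNEL as its transport (more on that below).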

The problem with Kubernetes / OpenShift is that a pod cannot directly connect to a pod in a different cluster, region, or cloud provider. That is, not without resorting to specific Container Network Interface (CNI) implementations.

To overcome this problem, the above setup uses a GossipRouter and TUNNEL [3]: this way, A and X can communicate across different regions or (in this case) even different cloud providers.

The way this is done is simple: the configuration of the bridge cluster includes TUNNEL as transport and a list of GossipRouters, in this case the ones in NYC and SFC (more details later).

A and X connect to both GossipRouters via TCP, under their respective cluster names. So A connects to GR-NYC and GR-SFC, and X connects to its local GR and to the remote one in NYC.

When A wants to send a message to X, it can use either its local GossipRouter, or the one in SFC (by default, JGroups load-balances requests between the GossipRouters). In any case, the ingress TCP connection established by X to a GossipRouter is used to send egress traffic to X.

This means that we can send messages to any member of the bridge cluster, as long as all GossipRouters are publicly accessible and the members of the bridge cluster can connect to them.

But now let's get cracking! We'll do the following in the next sections:
  • Set up an EKS cluster (NYC)
  • Set up a GKE cluster (SFC)
  • Deploy a GossipRouter service in both sites
  • Deploy 5 pods in NYC and 3 pods in SFC
  • Use one of the pods in each site to talk to the other site with RelayDemo

Set up the NYC cluster in EKS

This can be done via the GUI, the AWS CLI or eksctl [4]. For simplicity, I chose the latter.
To create a cluster "nyc" in the us-east-1 region, execute:

eksctl create cluster --name nyc --region us-east-1 --nodegroup-name nyc-nodegroup --node-type t3.small --nodes 2 --nodes-min 1 --nodes-max 4 --managed

This will take 10-15 minutes.

The local kubeconfig should now point to the AWS cluster. This can be seen with kubectl config get-contexts. If this is not the case, use the AWS CLI to change this, e.g.:

aws eks --region us-east-1 update-kubeconfig --name nyc

This makes kubectl access the NYC cluster by default.

Let's now deploy the GossipRouter in NYC:

kubectl apply -f https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/gossiprouter.yaml

The YAML file [5] contains a deployment of the GossipRouter and a LoadBalancer service. The public address of the GossipRouter service can be seen as follows:

kubectl get svc gossiprouter
NAME           TYPE           CLUSTER-IP     EXTERNAL-IP                                                                     PORT(S)                                         AGE
gossiprouter   LoadBalancer   10.100.28.38   a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com   8787:31598/TCP,9000:30369/TCP,12001:31936/TCP   2m56s


We can see that the public address is a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com. Write this down somewhere, as we'll need to add it to our TUNNEL configuration later.
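As an aside, the Service part of gossiprouter.yaml looks roughly like the following (a condensed sketch, not the verbatim file; the selector label and port names are illustrative, see [5] for the actual definition). The important point is that the GossipRouter port 12001 is exposed through a LoadBalancer, which is what gives us the public address above:

apiVersion: v1
kind: Service
metadata:
  name: gossiprouter
spec:
  type: LoadBalancer
  selector:
    run: gossiprouter        # must match the labels of the GossipRouter deployment
  ports:
    - name: gossiprouter     # the port TUNNEL connects to
      port: 12001
      targetPort: 12001
    - name: port-9000        # 9000 and 8787 are exposed as well; see [5]
      port: 9000
      targetPort: 9000
    - name: port-8787
      port: 8787
      targetPort: 8787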


Set up the SFC cluster in GKE

To create a cluster on GKE, execute:

gcloud container clusters create sfc --num-nodes 2

This will create a cluster in the default region configured in gcloud.

Note that this adds a new context to the kubeconfig and switches to it. If not, manually switch to it, e.g.

kubectl config use-context gke_ispnperftest_us-central1-a_sfc

Now deploy the GossipRouter in SFC (same as above, for NYC):
kubectl apply -f https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/gossiprouter.yaml

Now get the public IP address of the GossipRouter:
kubectl get svc gossiprouter
NAME           TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                         AGE
gossiprouter   LoadBalancer   10.19.247.254   35.232.92.116   8787:30150/TCP,9000:32534/TCP,12001:32455/TCP   101s


The public IP is 35.232.92.116. Take a note of this, as we'll need it later.
We're now ready to deploy the cluster nodes in NYC and SFC.

Deploy the pods in NYC

We'll deploy 5 pods in NYC. To do this, we first need to switch the context back to NYC, e.g. by executing
kubectl config use-context jgroups@nyc.us-east-1.eksctl.io

Next, download the 2 YAML files for NYC and SFC locally (we need to make changes):
mkdir tmp ; cd tmp
curl https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/nyc.yaml > nyc.yaml
curl https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/sfc.yaml > sfc.yaml

Now edit both YAML files and replace the TUNNEL_INITIAL_HOSTS system variable "load-balancer-1[12001],load-balancer-2[12001]" with
"a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com[12001],35.232.92.116[12001]".

This points the TUNNEL protocol to the two publicly accessible GossipRouters in NYC and SFC:

<TUNNEL port_range="${PORT_RANGE:0}"
        gossip_router_hosts="${TUNNEL_INITIAL_HOSTS:127.0.0.1[12001]}"/>

This means that TUNNEL will establish 2 TCP connections, one to the GossipRouter in NYC and the other one to the GossipRouter in SFC.
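After the edit, the relevant part of the container spec in nyc.yaml should look roughly like this (a sketch, assuming the value is passed to the pod as an environment variable named TUNNEL_INITIAL_HOSTS; the image name and surrounding fields are illustrative):

containers:
  - name: nyc
    image: belaban/jgroups        # illustrative
    env:
      - name: TUNNEL_INITIAL_HOSTS
        value: "a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com[12001],35.232.92.116[12001]"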

Now deploy the NYC pods:
> kubectl apply -f tmp/nyc.yaml
deployment.apps/nyc created
service/nyc created


This shows that 1 'nyc' pod has been created (next to the GossipRouter pod):
> kubectl get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP              NODE                             NOMINATED NODE   READINESS GATES
gossiprouter-f65bb6858-jks8q   1/1     Running   0          25m   192.168.36.19   ip-192-168-38-111.ec2.internal   <none>           <none>
nyc-5f4964d444-9v5dm           1/1     Running   0          73s   192.168.26.87   ip-192-168-8-51.ec2.internal     <none>           <none>


Next, scale this to 5:
> kubectl scale --replicas=5 deployment nyc
deployment.extensions/nyc scaled


Listing the pods shows 5 'nyc' pods:
> kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
gossiprouter-f65bb6858-jks8q   1/1     Running   0          27m
nyc-5f4964d444-2ttfp           1/1     Running   0          49s
nyc-5f4964d444-4lccs           1/1     Running   0          49s
nyc-5f4964d444-8622d           1/1     Running   0          49s
nyc-5f4964d444-9v5dm           1/1     Running   0          3m21s
nyc-5f4964d444-tm5h5           1/1     Running   0          49s


Let's exec into one of the pods and make sure that the local cluster has formed:
> kubectl exec nyc-5f4964d444-2ttfp probe.sh
#1 (307 bytes):
local_addr=nyc-5f4964d444-2ttfp-24388
physical_addr=192.168.53.43:7800
view=[nyc-5f4964d444-9v5dm-21647|4] (5) [nyc-5f4964d444-9v5dm-21647, nyc-5f4964d444-tm5h5-64872, nyc-5f4964d444-2ttfp-24388, nyc-5f4964d444-8622d-63103, nyc-5f4964d444-4lccs-4487]
cluster=RelayDemo
version=4.1.9-SNAPSHOT (Mont Ventoux)

1 responses (1 matches, 0 non matches)


This shows a view of 5, so the 5 pods did indeed form a cluster.

Deploy the pods in SFC

Let's now switch the kubeconfig back to SFC (see above) and deploy the SFC cluster:
> kubectl apply -f tmp/sfc.yaml
deployment.apps/sfc created
service/sfc created

>  kubectl scale --replicas=3 deployment/sfc
deployment.extensions/sfc scaled

> kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
gossiprouter-6cfdc58df5-7jph4   1/1     Running   0          21m
sfc-5d6774b647-25tk5            1/1     Running   0          50s
sfc-5d6774b647-sgxsk            1/1     Running   0          50s
sfc-5d6774b647-sjt9k            1/1     Running   0          88s



This shows that we have 3 pods in SFC running.

Run the demo

So, now we can run RelayDemo to see if the virtual cluster across the two clouds is working correctly. To do this, we run a bash in one of the pods:
> kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
gossiprouter-6cfdc58df5-7jph4   1/1     Running   0          28m
sfc-5d6774b647-25tk5            1/1     Running   0          7m50s
sfc-5d6774b647-sgxsk            1/1     Running   0          7m50s
sfc-5d6774b647-sjt9k            1/1     Running   0          8m28s
> kubectl exec -it sfc-5d6774b647-sgxsk bash

bash-4.4$ 

The RelayDemo can be started with relay.sh:
relay.sh -props sfc.xml -name Temp

-------------------------------------------------------------------
GMS: address=Temp, cluster=RelayDemo, physical address=10.16.1.6:7801
-------------------------------------------------------------------
View: [sfc-5d6774b647-sjt9k-37487|9]: sfc-5d6774b647-sjt9k-37487, sfc-5d6774b647-sgxsk-6308, sfc-5d6774b647-25tk5-47315, Temp


We can see that our cluster member named 'Temp' has joined the cluster.

When we send a message, we can see that all 3 members of the (local) SFC cluster and the 5 members of the (remote) NYC cluster are replying (we're also getting a reply from self):
hello
: << response from sfc-5d6774b647-sgxsk-6308
<< response from sfc-5d6774b647-sjt9k-37487
<< response from sfc-5d6774b647-25tk5-47315
<< hello from Temp
<< response from Temp
<< response from nyc-5f4964d444-9v5dm-21647:nyc
<< response from nyc-5f4964d444-2ttfp-24388:nyc
<< response from nyc-5f4964d444-tm5h5-64872:nyc
<< response from nyc-5f4964d444-8622d-63103:nyc
<< response from nyc-5f4964d444-4lccs-4487:nyc


The topology can be shown by typing 'topo' ('help' lists more commands):
: topo

nyc
  nyc-5f4964d444-9v5dm-21647 (192.168.26.87:7800) (me) // site master
  nyc-5f4964d444-tm5h5-64872 (192.168.30.27:7800)
  nyc-5f4964d444-2ttfp-24388 (192.168.53.43:7800)
  nyc-5f4964d444-8622d-63103 (192.168.62.83:7800)
  nyc-5f4964d444-4lccs-4487 (192.168.40.102:7800)

sfc
  sfc-5d6774b647-sjt9k-37487 (10.16.1.5:7800) (me) // site master
  sfc-5d6774b647-sgxsk-6308 (10.16.1.6:7800)
  sfc-5d6774b647-25tk5-47315 (10.16.0.10:7800)
  Temp (10.16.1.6:7801)


This shows the members of both sites, plus their (internal) IP addresses and who the site masters are.

Dump the contents of the GossipRouters

This can be done via a utility program shipped with JGroups:
> java -cp jgroups.jar org.jgroups.tests.RouterStubGet -host 35.232.92.116 -cluster bridge
1: null:nyc, name=_nyc-5f4964d444-9v5dm-21647, addr=192.168.26.87:45275, server
2: null:sfc, name=_sfc-5d6774b647-sjt9k-37487, addr=10.16.1.5:42812, server


This shows the members of the bridge cluster, which registered with both GossipRouters.

Alternatively, the other GossipRouter can be queried; it should list the same members.

Add firewall/ingress rules to make the GossipRouter publicly available

If the GossipRouters cannot be accessed by the above command, a firewall/ingress rule has to be added to allow ingress traffic to port 12001.
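For illustration (these commands are not tied to this exact setup; names and security group IDs are placeholders):

# GKE: allow inbound TCP traffic to the GossipRouter port
gcloud compute firewall-rules create allow-gossiprouter --allow=tcp:12001

# EKS: open the port on the security group used by the load balancer / worker nodes
aws ec2 authorize-security-group-ingress --group-id <sg-id> --protocol tcp --port 12001 --cidr 0.0.0.0/0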

Cross-site replication

The RelayDemo sample application is very basic and not very useful by itself, but the setup can be used for other types of applications, e.g. replication between data centers.

If we have in-memory data in NYC and use SFC as a backup for NYC (and vice versa), then a total loss of the NYC cluster will not lose all the data: clients can be failed over to SFC and continue to work with the data.

This can be done for example by Red Hat Data Grid [6] and cross-site replication; as a matter of fact, all that needs to be done is to change the configuration, as explained in this post!

As usual, send questions and feedback to the JGroups mailing list.

Enjoy!



[1] https://github.com/belaban/JGroups/blob/master/src/org/jgroups/demos/RelayDemo.java
[2] http://www.jgroups.org/manual4/index.html#Relay2Advanced
[3] http://www.jgroups.org/manual4/index.html#TUNNEL_Advanced
[4] https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html
[5] https://github.com/belaban/jgroups-docker/blob/master/yaml/gossiprouter.yaml
[6] https://access.redhat.com/documentation/en-us/red_hat_data_grid/7.3/html/red_hat_data_grid_user_guide/x_site_replication


Wednesday, July 03, 2019

Compiling JGroups to native code with Quarkus/GraalVM

I'm happy to announce the availability of a JGroups extension for Quarkus!


What?


Quarkus is a framework that (among other things) compiles Java code down to native code (using GraalVM [4]), removing code that's not needed at run time.

Quarkus analyzes the code in a build phase, and removes code that's not used at run time, in order to have a small executable that starts up quickly.

This means that reflection cannot be used at run time, as all classes that are not used are removed at build time. However, reflection can be used at build time.

The other limitations that affect JGroups are threads and the creation of sockets. Both cannot be done at build time, but have to be done at run time. (More limitations of JGroups under Quarkus are detailed in [5]).

So what's the point of providing a JGroups extension for Quarkus?

While a JGroups application can be compiled directly to native code (using GraalVM's native-image), it is cumbersome, and the application has to be restructured (see [6] for an example) to accommodate the limitations of native compilation.

In contrast, the JGroups extension provides a JChannel that can be injected into the application. The channel has been created according to a configuration file and connected (= joined the cluster) by the extension. The extension takes care of doing the reflection, the socket creation and the starting of threads at the right time (build- or run-time), and the user doesn't need to worry about this.

How?

So let's take a look at a sample application (available at [2]).

The POM includes the extension: groupId=org.jgroups.quarkus.extension and artifactId=quarkus-jgroups. This provides a JChannel that can be injected.
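Roughly, the dependency declaration looks like this (a sketch; the version is illustrative and needs to match the actual release of the extension):

<dependency>
    <groupId>org.jgroups.quarkus.extension</groupId>
    <artifactId>quarkus-jgroups</artifactId>
    <version>1.0.0-SNAPSHOT</version> <!-- illustrative -->
</dependency>

The main class is ChatResource: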

@ApplicationScoped 
@Path("/chat")
public class ChatResource extends ReceiverAdapter implements Publisher<String> {
    protected final Set<Subscriber<? super String>> subscribers=new HashSet<>();

    @Inject JChannel channel;

    protected void init(@Observes StartupEvent evt) throws Exception {
        channel.setReceiver(this);
        System.out.printf("-- view: %s\n", channel.getView());
    }

    protected void destroy(@Observes ShutdownEvent evt) {
        Util.close(channel);
        subscribers.forEach(Subscriber::onComplete);
        subscribers.clear();
    }

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    @Path("/send/{msg}")
    public String sendMessage(@PathParam("msg") String msg) throws Exception {
        channel.send(null, Objects.requireNonNull(msg).getBytes());
        return String.format("message \"%s\" was sent on channel \n", msg);
    }

    @GET
    @Produces(MediaType.SERVER_SENT_EVENTS)
    @Path("/subscribe")
    public Publisher<String> greeting() {
        return this;
    }

    public void receive(Message msg) {
        onNext(msg);
    }

    public void receive(MessageBatch batch) {
        for(Message msg: batch)
            onNext(msg);
    }

    public void viewAccepted(View view) {
        System.out.printf("-- new view: %s\n", view);
    }

    public void subscribe(Subscriber<? super String> s) {
        if(s != null)
            subscribers.add(s);
    }

    protected void onNext(Message msg) {
        String s=new String(msg.getRawBuffer(), msg.getOffset(), msg.getLength());
        System.out.printf("-- from %s: %s\n", msg.src(), s);
        subscribers.forEach(sub -> sub.onNext(s));
    }
}
 
It has a JChannel channel which is injected by ArC (the dependency injection mechanism used in Quarkus). This channel is fully created and connected when it is injected.

The receive(Message) and receive(MessageBatch) methods receive messages sent by itself or other members in the cluster. It in turn publishes them via the Publisher interface. All subscribers will therefore receive all messages sent in the cluster.

The sendMessage() method is invoked when a URL of the form http://localhost:8080/chat/send/mymessage is received. It takes the string parameter ("mymessage") and uses the injected channel to send it to all members of the cluster.

The URL http://localhost:8080/chat/subscribe (or http://localhost:8080/streaming.html in a web browser) can be used to subscribe to messages being received by the channel.
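For example, from a shell (an illustrative invocation; curl's -N switch disables output buffering, so the server-sent events appear as they arrive):

curl -N http://localhost:8080/chat/subscribe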

Demo

Let's now run a cluster of 2 instances: open 2 shells and type the following commands (output has been edited for brevity):

Shell1:
[belasmac] /Users/bela/quarkus-jgroups-chat$ mvn compile quarkus:dev
...
[INFO] --- quarkus-maven-plugin:0.18.0:dev (default-cli) @ quarkus-jgroups-chat ---
2019-07-03 14:12:05,025 DEBUG [org.jgr.qua.ext.JChannelTemplate] (main) creating channel based on config config=chat-tcp.xml, bind_addr=, initial_hosts=, cluster=quarkus-jgroups-chat

 
-------------------------------------------------------------------
GMS: address=belasmac-19612, cluster=quarkus-jgroups-chat, physical address=127.0.0.1:7800
-------------------------------------------------------------------
-- view: [belasmac-19612|0] (1) [belasmac-19612]


Shell2:
[belasmac] /Users/bela/quarkus-jgroups-chat$ mvn compile quarkus:dev -Dquarkus.http.port=8081
...
[INFO] --- quarkus-maven-plugin:0.18.0:dev (default-cli) @ quarkus-jgroups-chat ---
2019-07-03 14:15:02,463 DEBUG [org.jgr.qua.ext.JChannelTemplate] (main) creating channel based on config config=chat-tcp.xml, bind_addr=, initial_hosts=, cluster=quarkus-jgroups-chat

-------------------------------------------------------------------
GMS: address=belasmac-25898, cluster=quarkus-jgroups-chat, physical address=127.0.0.1:7801
-------------------------------------------------------------------
-- view: [belasmac-19612|1] (2) [belasmac-19612, belasmac-25898]

The system property quarkus.http.port=8081 is needed, or else there would be a port collision, as the default port of 8080 has already been taken by the first application (both apps are started on the same host).

The output shows that there's a cluster of 2 members.

We can now post a message by invoking curl http://localhost:8080/chat/send/hello%20world and curl http://localhost:8081/chat/send/message2.

Both shells show that they received both messages:
-- view: [belasmac-19612|1] (2) [belasmac-19612, belasmac-25898]
-- from belasmac-19612: hello world
-- from belasmac-25898: message2


Of course, we could also use a web browser to send the HTTP GET requests.

When subscribing to the stream of messages in a web browser (http://localhost:8081/streaming.html), this would look as follows:
 


Note that both channels bind to the loopback (127.0.0.1) address. This can be changed by changing bind_addr and initial_hosts in application.properties:

quarkus.channel.config=chat-tcp.xml

quarkus.channel.cluster=quarkus-jgroups-chat

# quarkus.channel.bind_addr=192.168.1.105

# quarkus.channel.initial_hosts=192.168.1.105[7800]

Uncomment the 2 properties and set them accordingly.

Alternatively, we can pass these as system properties, e.g.:

[belasmac] /Users/bela/quarkus-jgroups-chat$ mvn compile quarkus:dev -Dbind_addr=192.168.1.105 -Dinitial_hosts=192.168.1.105[7800],192.168.1.105[7801]
...
[INFO] --- quarkus-maven-plugin:0.18.0:dev (default-cli) @ quarkus-jgroups-chat ---
2019-07-03 14:38:28,258 DEBUG [org.jgr.qua.ext.JChannelTemplate] (main) creating channel based on config config=chat-tcp.xml, bind_addr=, initial_hosts=, cluster=quarkus-jgroups-chat

-------------------------------------------------------------------
GMS: address=belasmac-10738, cluster=quarkus-jgroups-chat, physical address=192.168.1.105:7800
-------------------------------------------------------------------
-- view: [belasmac-10738|0] (1) [belasmac-10738]


Native compilation

To compile the application to native code, the mvn package -Pnative command has to be executed:

[belasmac] /Users/bela/quarkus-jgroups-chat$ mvn package -Pnative
[INFO] Building jar: /Users/bela/quarkus-jgroups-chat/target/quarkus-jgroups-chat-1.0.0-SNAPSHOT.jar
[INFO]
[INFO] --- quarkus-maven-plugin:0.18.0:build (default) @ quarkus-jgroups-chat ---
[INFO] [io.quarkus.deployment.QuarkusAugmentor] Beginning quarkus augmentation
[INFO] [org.jboss.threads] JBoss Threads version 3.0.0.Beta4
[INFO] [io.quarkus.deployment.QuarkusAugmentor] Quarkus augmentation completed in 1343ms
[INFO] [io.quarkus.creator.phase.runnerjar.RunnerJarPhase] Building jar: /Users/bela/quarkus-jgroups-chat/target/quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner.jar
[INFO]
[INFO] --- quarkus-maven-plugin:0.18.0:native-image (default) @ quarkus-jgroups-chat ---
[INFO] [io.quarkus.creator.phase.nativeimage.NativeImagePhase] Running Quarkus native-image plugin on OpenJDK 64-Bit Server VM
[INFO] [io.quarkus.creator.phase.nativeimage.NativeImagePhase] /Users/bela/graalvm/Contents/Home/bin/native-image -J-Djava.util.logging.manager=org.jboss.logmanager.LogManager --initialize-at-build-time= -H:InitialCollectionPolicy=com.oracle.svm.core.genscavenge.CollectionPolicy$BySpaceAndTime -jar quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner.jar -J-Djava.util.concurrent.ForkJoinPool.common.parallelism=1 -H:FallbackThreshold=0 -H:+ReportUnsupportedElementsAtRuntime -H:+ReportExceptionStackTraces -H:+PrintAnalysisCallTree -H:-AddAllCharsets -H:EnableURLProtocols=http -H:-SpawnIsolates -H:+JNI --no-server -H:-UseServiceLoaderFeature -H:+StackTrace
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]    classlist:   6,857.25 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]        (cap):   4,290.72 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]        setup:   6,430.30 ms
14:43:05,540 INFO  [org.jbo.threads] JBoss Threads version 3.0.0.Beta4
14:43:06,468 INFO  [org.xnio] XNIO version 3.7.2.Final
14:43:06,528 INFO  [org.xni.nio] XNIO NIO Implementation Version 3.7.2.Final
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]   (typeflow):  17,331.26 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]    (objects):  24,511.12 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]   (features):   1,194.16 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]     analysis:  44,204.65 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]     (clinit):     579.00 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]     universe:   1,715.40 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]      (parse):   3,315.80 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]     (inline):   4,563.11 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]    (compile):  24,906.58 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]      compile:  34,907.28 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]        image:   4,557.78 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]        write:   2,531.16 ms
[quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner:93574]      [total]: 109,858.54 ms
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:58 min
[INFO] Finished at: 2019-07-03T14:44:40+02:00
 


This uses GraalVM's native-image to generate a native executable. After a while, the resulting executable is in the ./target directory:

Its size is only 27MB and we can see that it is a macOS native executable:
[belasmac] /Users/bela/quarkus-jgroups-chat/target$ ls -lh quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner
-rwxr-xr-x  1 bela  staff    27M Jul  3 14:44 quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner
[belasmac] /Users/bela/quarkus-jgroups-chat/target$ file quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner
quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner: Mach-O 64-bit executable x86_64


To run it:

[belasmac] /Users/bela/quarkus-jgroups-chat/target$ ./quarkus-jgroups-chat-1.0.0-SNAPSHOT-runner

-------------------------------------------------------------------
GMS: address=belasmac-55106, cluster=quarkus-jgroups-chat, physical address=127.0.0.1:7800
-------------------------------------------------------------------
-- view: [belasmac-55106|0] (1) [belasmac-55106]
 


When you run this yourself, you will notice the quick startup time of the second and subsequent members. Why not the first member? The first member has to wait for GMS.join_timeout milliseconds (defined in chat-tcp.xml) to see if it discovers any other members, so it always runs into this timeout.
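If that initial delay matters, join_timeout can be lowered in chat-tcp.xml, e.g. (the value is illustrative; a lower value shortens the first member's wait at the cost of a smaller discovery window):

<pbcast.GMS print_local_addr="true" join_timeout="1000"/>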

To change bind_addr and initial_hosts, application.properties has to be changed before compiling to native code.



Caveats

The quarkus-jgroups extension depends on JGroups-4.1.2-SNAPSHOT, which it may not find unless a snapshot repository has been added to the POM (or settings.xml). Alternatively, git clone https://github.com/belaban/JGroups.git ; cd JGroups ; mvn install will generate and install this artifact in your local maven repo.

Currently, only TCP is supported (chat-tcp.xml). UDP will be supported once MulticastSockets are properly supported by GraalVM (see [5] for details).

For some obscure reason,
<enableJni>true</enableJni>

had to be enabled in the POM, or else native compilation would fail. I hope to remove this once I understand why...

Outlook

This was a relatively quick port of JGroups to native code. For feedback and questions please use the JGroups mailing list.

The following things are on my todo list for further development:
  • Provide more JGroups classes via extensions, e.g. RpcDispatcher (to make remote method calls)
  • Provide docker images with native executables
  • Implement support for UDP
  • Trim down the size of the executable even more

Enjoy!



[1] https://github.com/jgroups-extras/quarkus-jgroups
[2] https://github.com/jgroups-extras/quarkus-jgroups-chat
[3] https://quarkus.io
[4] https://www.graalvm.org
[5] https://github.com/belaban/JGroups/blob/master/doc/design/PortingToGraalVM.txt
[6] https://github.com/belaban/JGroups/blob/master/tests/perf/org/jgroups/tests/perf/ProgrammaticUPerf2.java

Wednesday, June 05, 2019

Network sniffing

Oftentimes, JGroups/Datagrid users capture traffic from the network for analysis. Using tools such as wireshark or tshark, they can look at UDP packets or TCP streams.

There used to be a wireshark plugin, written by Richard Achmatowicz, but since it was written in C, every time the wire format changed, the C code had to be changed, too. It is therefore not maintained any longer.

However, there's a class in JGroups that can be used to read messages from the wire: ParseMessages. Since it uses the same code that reads messages off the wire, it can always parse messages in the version it's shipped with. It is therefore resistant to wire format changes.

In 4.1.0, I changed ParseMessages to be more useful:
  • Reading of TCP streams is now supported
  • It can read packets from stdin (ideal for piping from tshark)
  • Handling of binary data (e.g. from a PCAP capture) is supported
  • Views are parsed and displayed (e.g. in VIEW or JOIN response messages)
  • Logical names can be displayed: e.g. {node-2543, node-2543} instead of {3673e687-fafb-63e0-2ff1-67c0a8a6f8eb, 312aa7da-f3d5-5999-1f5c-227f6e43728e}
 To demonstrate how to use this, I made 4 short videos:
  1. Capture UDP IPv4 traffic with tshark
  2. Capture TCP IPv6 traffic with tshark
  3. Capture with tcpdump and wireshark
The documentation is here.
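As a rough illustration of the piping use case (the exact options here are my assumptions, not taken from the videos; consult the documentation for the authoritative invocations):

# capture JGroups UDP traffic and pipe the capture into ParseMessages
tshark -i eth0 -f "udp port 7800" -w - | java -cp jgroups.jar org.jgroups.tests.ParseMessages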

Happy network sniffing!



Tuesday, May 21, 2019

4.1.0 released

I'm happy to announce that I just released JGroups 4.1.0!

Why the bump to a new minor version?

I had to make some API changes and - as I'm trying to avoid that in micro releases - I decided to bump the version to 4.1.0. The changes involve removing a few (rather exotic) JChannel constructors, but chances are you've never used any of them anyway. The other change is a signature change in Streamable, where I now throw IOExceptions and ClassNotFoundExceptions instead of simple Exceptions.
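For reference, the Streamable interface now looks roughly like this (a sketch from memory, not copied verbatim from the source):

public interface Streamable {
    /** Write the state of this object to the output stream */
    void writeTo(DataOutput out) throws IOException;

    /** Read the state of this object from the input stream */
    void readFrom(DataInput in) throws IOException, ClassNotFoundException;
}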

Here's a list of the major changes:

GraalVM / Quarkus support

  • JGroups can now be compiled into a native executable using GraalVM's native-image. This is a very cool feature; I've used ProgrammaticUPerf2 to start a member in ~1 millisecond!
  • This means JGroups can now be used by other applications to create native binaries. Not yet very polished, and I'll write a Quarkus extension next, but usable by folks who know GraalVM...
  • I'll blog about the port to GraalVM and the Quarkus extension (once it's ready) next, so stay tuned!
  • https://issues.jboss.org/browse/JGRP-2332
  • https://issues.jboss.org/browse/JGRP-2341

 Parsing of network packets

Diagnostics handler without reflection

  • This is related to the GraalVM port: the default DiagnosticsHandler (used by probe.sh) uses reflection, which is not allowed in GraalVM
  • This additional DiagnosticsHandler can be used instead of the default one when creating a native binary of an application. The advantage is that probe.sh still works, even in a native binary.
  • https://issues.jboss.org/browse/JGRP-2337

RpcDispatcher without reflection

 Probe: support when running under TCP

  • Also required when running on GraalVM: since JGroups currently only supports TCP (MulticastSockets don't currently work on GraalVM), probe.sh needs to be given the address of *one* member, to fetch information about all members
  • https://issues.jboss.org/browse/JGRP-2336

 Change in how IPv4/IPv6 addresses are picked

  • The new algorithm centers around bind_addr defined in the transport (UDP, TCP, TCP_NIO2); the value of this address determines how other addresses (such as loopback, site_local, global, localhost or default values) are resolved.
  • Example: if bind_addr=::1, then all other addresses that are not explicitly defined will be IPv6. If bind_addr=127.0.0.1, then all other addresses will be IPv4 addresses.
  • The advantage of this is that you can run a JGroups stack using IPv4 and another one using IPv6 in the same process!
  • https://issues.jboss.org/browse/JGRP-2343
A complete list of JIRA issues is at https://issues.jboss.org/projects/JGRP/versions/12341388
Please post questions/issues to the mailing list at https://groups.google.com/forum/#!forum/jgroups-dev.

Enjoy!

Thursday, March 14, 2019

4.0.19

FYI: I just released 4.0.19.Final.

This changes the way ASYM_ENCRYPT disseminates the private shared group key to members, going from a pull- to a push-based approach [1] [2].

In combination with JGRP-2293 [2], this should help a lot in environments like Kubernetes or OpenShift where pods with JGroups nodes are started/stopped dynamically, and where encryption is required.

The design is at [3].

Check it out and let me know if you run into issues.
Cheers,

[1] https://issues.jboss.org/browse/JGRP-2297
[2] https://issues.jboss.org/browse/JGRP-2293
[3] https://github.com/belaban/JGroups/blob/master/doc/design/ASYM_ENCRYPT.txt

Wednesday, February 13, 2019

and 4.0.17

Sorry for the short release cycle between 4.0.16 and 4.0.17, but there was an important feature in 4.0.17, which I wanted to release as soon as possible: JGRP-2293.

This has better support for concurrent graceful leaves of multiple members (also coordinators), which is important in cloud environments, where pods are started and stopped dynamically by Kubernetes.

Without this enhancement, members would leave gracefully, but that would sometimes not be recognized, and failure detection would have to kick in to exclude those members and install the correct views. This would slow things down, and the installation of new views would be governed by the timeouts in the failure detection protocols (FD_ALL, FD_SOCK). On top of this, in some edge cases, MERGE3 would have to kick in and fix views, further slowing things down.

There's a unit test [1] which tests the various aspects of concurrent leaving, e.g. all coordinators leaving sequentially, multiple members leaving concurrently, etc.

I recommend installing this version as soon as possible, especially if you run in the cloud. Questions / problems etc -> [2]

Cheers,

[1] https://github.com/belaban/JGroups/blob/master/tests/junit-functional/org/jgroups/tests/LeaveTest.java

[2] https://groups.google.com/forum/#!forum/jgroups-dev

Monday, January 21, 2019

JGroups 4.0.16

I just released 4.0.16. The most important features/bug fixes are (for a comprehensive list see [1]):

Enjoy!

[1] https://issues.jboss.org/projects/JGRP/versions/12339241