In this (long!) post, I'll provide step-by-step instructions on how to create JGroups clusters in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS), and connect them into one virtual cluster using RELAY2.
Each local cluster is called a site. In this tutorial, we'll call the sites NYC and SFC. We'll start 5 nodes in NYC and 3 in SFC.
The sample deployments and services are defined in YAML, and we're using Kubernetes to create the clusters. To try this yourself, you'll need kubectl, eksctl and gcloud installed, and accounts on both EKS and GKE.
The demo is RelayDemo [1]: a simple chat application started in a pod. Every typed line appears in all pods across all sites, and every pod sends a response back to the sender, which displays all responses. This way, we know who received our chat message.
Architecture
The setup of this tutorial is as follows: nodes A, B, C, D and E run in site NYC (Amazon EKS), and nodes X, Y and Z run in site SFC (Google GKE).
A in NYC and X in SFC assume the role of site master (see [2]). This means they join a separate JGroups cluster, called the bridge cluster, which connects the two sites, and they relay messages between the sites.
The site master is not a dedicated node; any node can assume the role. For example, when A leaves or crashes, B takes over as site master, joins the bridge cluster and relays messages between sites NYC and SFC.
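For reference, the relaying itself is configured via the RELAY2 protocol at the top of each site's stack, which points to a relay configuration listing all sites and the bridge they share. The snippet below only illustrates the general shape; the actual files ship with the demo image, and the paths and attribute values here are placeholders:
<!-- In the site's stack, e.g. nyc.xml (illustrative) -->
<relay.RELAY2 site="nyc" config="/path/to/relay.xml" relay_multicasts="true"/>
<!-- relay.xml: all sites and the bridge cluster they share (illustrative) -->
<RelayConfiguration xmlns="urn:jgroups:relay:1.0">
    <sites>
        <site name="nyc">
            <bridges>
                <bridge name="bridge" config="/path/to/bridge.xml"/>
            </bridges>
        </site>
        <site name="sfc">
            <bridges>
                <bridge name="bridge" config="/path/to/bridge.xml"/>
            </bridges>
        </site>
    </sites>
</RelayConfiguration>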
The problem with Kubernetes / OpenShift is that a pod cannot directly connect to a pod in a different cluster, region, or cloud provider. That is, without resorting to specific container network interface (CNI) implementations.
To overcome this problem, the above setup uses a GossipRouter and TUNNEL [3]: this way, A and X can communicate across different regions or (in this case) even different cloud providers.
The way this is done is simple: the configuration of the bridge cluster includes TUNNEL as transport and a list of GossipRouters, in this case the ones in NYC and SFC (more details later). A and X connect to both GossipRouters via TCP, under their respective cluster names: A connects to GR-NYC and GR-SFC, and X connects to its local GossipRouter in SFC and the remote one in NYC.
When A wants to send a message to X, it can use either its local GossipRouter or the one in SFC (by default, JGroups load-balances requests between the GossipRouters). In either case, the ingress TCP connection established by X to a GossipRouter is used to send egress traffic to X. This means we can send messages to any member of the bridge cluster, as long as all GossipRouters are publicly accessible and the members of the bridge cluster can connect to them.
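To make this concrete, a TUNNEL-based bridge stack looks roughly like the sketch below. The actual bridge configuration used by the demo image may differ; the protocol list and attributes here are illustrative:
<!-- Illustrative bridge stack: TUNNEL as transport, pointing at both GossipRouters -->
<config xmlns="urn:org:jgroups">
    <TUNNEL gossip_router_hosts="${TUNNEL_INITIAL_HOSTS:127.0.0.1[12001]}"/>
    <PING/>  <!-- discovery requests are routed via the GossipRouters -->
    <MERGE3/>
    <FD_ALL/>
    <VERIFY_SUSPECT/>
    <pbcast.NAKACK2/>
    <UNICAST3/>
    <pbcast.STABLE/>
    <pbcast.GMS/>
    <FRAG2/>
</config>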
But now let's get cracking! We'll do the following in the next sections:
- Set up an EKS cluster (NYC)
- Set up a GKE cluster (SFC)
- Deploy a GossipRouter service in both sites
- Deploy 5 pods in NYC and 3 pods in SFC
- Use one of the pods in each site to talk to the other site with RelayDemo
Set up the NYC cluster in EKS
This can be done via the GUI, the AWS CLI or eksctl [4]. For simplicity, I chose the latter.
To create a cluster "nyc" in the us-east-1 region, execute:
eksctl create cluster --name nyc --region us-east-1 --nodegroup-name nyc-nodegroup --node-type t3.small --nodes 2 --nodes-min 1 --nodes-max 4 --managed
This will take 10-15 minutes.
The local kubeconfig should now point to the AWS cluster. This can be seen with kubectl config get-contexts. If this is not the case, use the AWS CLI to change this, e.g.:
aws eks --region us-east-1 update-kubeconfig --name nyc
This makes kubectl access the NYC cluster by default.
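A quick sanity check (the context name will differ in your environment):
kubectl config current-context   # should show the nyc EKS context
kubectl get nodes                # should list the 2 worker nodes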
Let's now deploy the GossipRouter in NYC:
kubectl apply -f https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/gossiprouter.yaml
The YAML file contains a deployment of the GossipRouter and a LoadBalancer service [5]. The public address of the GossipRouter service can be seen as follows:
kubectl get svc gossiprouter
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
gossiprouter LoadBalancer 10.100.28.38 a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com 8787:31598/TCP,9000:30369/TCP,12001:31936/TCP 2m56s
We can see that the public address is a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com. Write this down somewhere, as we'll need to add it to our TUNNEL configuration later.
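If you prefer to grab that hostname programmatically (purely a convenience; the service and field names below match the output above), something like this works:
NYC_GR=$(kubectl get svc gossiprouter -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo $NYC_GR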
Set up the SFC cluster in GKE
To create a cluster on GKE, execute:
gcloud container clusters create sfc --num-nodes 2
This will create a cluster in the default region configured in gcloud.
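If you'd rather pin the zone explicitly than rely on the gcloud default, you can pass it on the command line (the zone below is just an example):
gcloud container clusters create sfc --num-nodes 2 --zone us-central1-a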
Note that this should have added a new context to the kubeconfig and switched to it. If not, switch manually, e.g.
kubectl config use-context gke_ispnperftest_us-central1-a_sfc
Now deploy the GossipRouter in SFC (same as above, for NYC):
kubectl apply -f https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/gossiprouter.yaml
Now get the public IP address of the GossipRouter:
kubectl get svc gossiprouter
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
gossiprouter LoadBalancer 10.19.247.254 35.232.92.116 8787:30150/TCP,9000:32534/TCP,12001:32455/TCP 101s
The public IP is 35.232.92.116. Take a note of this, as we'll need it later.
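As with NYC, the address can be extracted with jsonpath; note that on GKE the load balancer exposes an IP rather than a hostname:
SFC_GR=$(kubectl get svc gossiprouter -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $SFC_GR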
We're now ready to deploy the cluster nodes in NYC and SFC.
Deploy the pods in NYC
We'll deploy 5 pods in NYC. To do this, we first need to switch the context back to NYC, e.g. by executing
kubectl config use-context jgroups@nyc.us-east-1.eksctl.io
Next, download the 2 YAML files for NYC and SFC locally (we need to make changes):
mkdir tmp
curl https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/nyc.yaml > tmp/nyc.yaml
curl https://raw.githubusercontent.com/belaban/jgroups-docker/master/yaml/sfc.yaml > tmp/sfc.yaml
Now edit both YAML files and replace the TUNNEL_INITIAL_HOSTS value "load-balancer-1[12001],load-balancer-2[12001]" with "a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com[12001],35.232.92.116[12001]".
This points the TUNNEL protocol to the two publicly accessible GossipRouters in NYC and SFC:
<TUNNEL
port_range="${PORT_RANGE:0}" gossip_router_hosts="${TUNNEL_INITIAL_HOSTS:127.0.0.1[12001]}"/>
This means that TUNNEL will establish 2 TCP connections, one to the GossipRouter in NYC and the other one to the GossipRouter in SFC.
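If you'd rather script the replacement than edit the files by hand, a sed one-liner along these lines does the trick (GNU sed shown; on macOS use sed -i ''; substitute your own addresses):
NYC_GR=a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com
SFC_GR=35.232.92.116
sed -i "s/load-balancer-1\[12001\],load-balancer-2\[12001\]/${NYC_GR}[12001],${SFC_GR}[12001]/" tmp/nyc.yaml tmp/sfc.yaml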
Now deploy the NYC pods:
> kubectl apply -f tmp/nyc.yaml
deployment.apps/nyc created
service/nyc created
This shows that 1 'nyc' pod has been created (plus the GossipRouter pod):
> kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gossiprouter-f65bb6858-jks8q 1/1 Running 0 25m 192.168.36.19 ip-192-168-38-111.ec2.internal <none> <none>
nyc-5f4964d444-9v5dm 1/1 Running 0 73s 192.168.26.87 ip-192-168-8-51.ec2.internal <none> <none>
Next, scale this to 5:
> kubectl scale --replicas=5 deployment nyc
deployment.extensions/nyc scaled
Listing the pods shows 5 'nyc' pods:
> kubectl get pods
NAME READY STATUS RESTARTS AGE
gossiprouter-f65bb6858-jks8q 1/1 Running 0 27m
nyc-5f4964d444-2ttfp 1/1 Running 0 49s
nyc-5f4964d444-4lccs 1/1 Running 0 49s
nyc-5f4964d444-8622d 1/1 Running 0 49s
nyc-5f4964d444-9v5dm 1/1 Running 0 3m21s
nyc-5f4964d444-tm5h5 1/1 Running 0 49s
Let's exec into one of them and make sure that the local cluster formed:
> kubectl exec nyc-5f4964d444-2ttfp probe.sh
#1 (307 bytes):
local_addr=nyc-5f4964d444-2ttfp-24388
physical_addr=192.168.53.43:7800
view=[nyc-5f4964d444-9v5dm-21647|4] (5) [nyc-5f4964d444-9v5dm-21647, nyc-5f4964d444-tm5h5-64872, nyc-5f4964d444-2ttfp-24388, nyc-5f4964d444-8622d-63103, nyc-5f4964d444-4lccs-4487]
cluster=RelayDemo
version=4.1.9-SNAPSHOT (Mont Ventoux)
1 responses (1 matches, 0 non matches)
This shows a view of 5, so the 5 pods did indeed form a cluster.
Deploy the pods in SFC
Let's now switch the kubeconfig back to SFC (see above) and deploy the SFC cluster:
> kubectl apply -f tmp/sfc.yaml
deployment.apps/sfc created
service/sfc created
> kubectl scale --replicas=3 deployment/sfc
deployment.extensions/sfc scaled
> kubectl get pods
NAME READY STATUS RESTARTS AGE
gossiprouter-6cfdc58df5-7jph4 1/1 Running 0 21m
sfc-5d6774b647-25tk5 1/1 Running 0 50s
sfc-5d6774b647-sgxsk 1/1 Running 0 50s
sfc-5d6774b647-sjt9k 1/1 Running 0 88s
This shows that we have 3 pods in SFC running.
Run the demo
So, now we can run RelayDemo to see if the virtual cluster across the two clouds is working correctly. To do this, we run a bash in one of the pods:
> kubectl get pods
NAME READY STATUS RESTARTS AGE
gossiprouter-6cfdc58df5-7jph4 1/1 Running 0 28m
sfc-5d6774b647-25tk5 1/1 Running 0 7m50s
sfc-5d6774b647-sgxsk 1/1 Running 0 7m50s
sfc-5d6774b647-sjt9k 1/1 Running 0 8m28s
> kubectl exec -it sfc-5d6774b647-sgxsk bash
bash-4.4$
The RelayDemo can be started with relay.sh:
relay.sh -props sfc.xml -name Temp
-------------------------------------------------------------------
GMS: address=Temp, cluster=RelayDemo, physical address=10.16.1.6:7801
-------------------------------------------------------------------
View: [sfc-5d6774b647-sjt9k-37487|9]: sfc-5d6774b647-sjt9k-37487, sfc-5d6774b647-sgxsk-6308, sfc-5d6774b647-25tk5-47315, Temp
:
We can see that our cluster member named 'Temp' has joined the cluster.
When we send a message, we can see that all 3 members of the (local) SFC cluster and the 5 members of the (remote) NYC cluster are replying (we're also getting a reply from self):
hello
: << response from sfc-5d6774b647-sgxsk-6308
<< response from sfc-5d6774b647-sjt9k-37487
<< response from sfc-5d6774b647-25tk5-47315
<< hello from Temp
<< response from Temp
<< response from nyc-5f4964d444-9v5dm-21647:nyc
<< response from nyc-5f4964d444-2ttfp-24388:nyc
<< response from nyc-5f4964d444-tm5h5-64872:nyc
<< response from nyc-5f4964d444-8622d-63103:nyc
<< response from nyc-5f4964d444-4lccs-4487:nyc
The topology can be shown by typing 'topo' ('help' lists more commands):
: topo
nyc
nyc-5f4964d444-9v5dm-21647 (192.168.26.87:7800) (me) // site master
nyc-5f4964d444-tm5h5-64872 (192.168.30.27:7800)
nyc-5f4964d444-2ttfp-24388 (192.168.53.43:7800)
nyc-5f4964d444-8622d-63103 (192.168.62.83:7800)
nyc-5f4964d444-4lccs-4487 (192.168.40.102:7800)
sfc
sfc-5d6774b647-sjt9k-37487 (10.16.1.5:7800) (me) // site master
sfc-5d6774b647-sgxsk-6308 (10.16.1.6:7800)
sfc-5d6774b647-25tk5-47315 (10.16.0.10:7800)
Temp (10.16.1.6:7801)
This shows the members of both sites, plus their (internal) IP addresses and who the site masters are.
Dump the contents of the GossipRouters
This can be done via a utility program shipped with JGroups:
> java -cp jgroups.jar org.jgroups.tests.RouterStubGet -host 35.232.92.116 -cluster bridge
1: null:nyc, name=_nyc-5f4964d444-9v5dm-21647, addr=192.168.26.87:45275, server
2: null:sfc, name=_sfc-5d6774b647-sjt9k-37487, addr=10.16.1.5:42812, server
This shows the members of the bridge cluster, which registered with both GossipRouters.
Alternatively, the other GossipRouter can be queried; it should list the same members.
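For example, the same check against the NYC GossipRouter (using the ELB hostname we noted earlier) should return the same two entries:
java -cp jgroups.jar org.jgroups.tests.RouterStubGet -host a6abc71e42b2211ea9c3716e7fa74966-862f92ba6a28fd36.elb.us-east-1.amazonaws.com -cluster bridge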
Add firewall/ingress rules to make the GossipRouter publicly available
If the GossipRouters cannot be accessed by the above command, a firewall/ingress rule has to be added to allow ingress traffic on port 12001.
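On GKE, for example, such a rule can be added with gcloud (the rule name and source range below are illustrative; tighten the range for production). On EKS, the equivalent is an inbound rule for port 12001 on the load balancer's security group, added via the AWS console or aws ec2 authorize-security-group-ingress:
gcloud compute firewall-rules create allow-gossiprouter --allow tcp:12001 --source-ranges 0.0.0.0/0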
Cross-site replication
The RelayDemo sample application is very basic and not very useful by itself, but the setup can be used for other types of applications, e.g. replication between data centers.
If we have in-memory data in NYC, and use SFC as a backup for NYC (and vice versa), then a total loss of the NYC cluster will not lose all the data, but clients can be failed over to SFC and will continue to work with the data.
This can be done, for example, with Red Hat Data Grid [6] and its cross-site replication feature; as a matter of fact, all that needs to be done is to change the configuration, as explained in this post!
As usual, send questions and feedback to the JGroups mailing list.
Enjoy!
[1] https://github.com/belaban/JGroups/blob/master/src/org/jgroups/demos/RelayDemo.java
[2] http://www.jgroups.org/manual4/index.html#Relay2Advanced
[3] http://www.jgroups.org/manual4/index.html#TUNNEL_Advanced
[4] https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html
[5] https://github.com/belaban/jgroups-docker/blob/master/yaml/gossiprouter.yaml
[6] https://access.redhat.com/documentation/en-us/red_hat_data_grid/7.3/html/red_hat_data_grid_user_guide/x_site_replication