Friday, July 17, 2020

Double your performance: virtual threads (fibers) and JDK 15/16!

If you use UDP as transport and want to double your performance: read on!

If you use TCP, your performance won't change much. You might still be interested in what more recent JDKs and virtual threads (used to be called 'fibers') will bring to the table.

Virtual threads


Virtual threads are lightweight threads, similar in concept to the old Green Threads, and are managed by the JVM rather than the kernel. Many virtual threads can map to the same OS native (carrier) thread (only one at a time, of course), so we can have millions of virtual threads.

Virtual threads are implemented with continuations, but that's a detail. What's important is that all blocking calls in the JDK (LockSupport.park() etc) have been modified to yield rather than block. This means that we don't waste the precious native carrier thread, but simply go to a non-RUNNING state. When the block is over, the thread is simply marked as RUNNABLE again and the scheduler continues the continuation where it left off.

Main advantages:
  • Blocking calls don't need to be changed, e.g. into reactive calls
  • No need for thread pools: simply create a virtual thread
  • Fewer context switches (reduced/eliminated blocking calls)
  • We can have lots of virtual threads
It will be a while until virtual threads show up in your JDK, but JGroups has already added support for it: just set use_fibers="true" in the transport. If the JVM supports virtual threads, they will be used, otherwise we fall back to regular native threads.


UDP: networking improvements 

While virtual threads bring advantages to JGroups, the other performance increase can be had by trying a more recent JDK.

Starting in JDK 15, the implementation of DatagramSockets and MulticastSockets has been changed to delegate to DatagramChannels and MulticastChannels. In addition, virtual threads are supported.

This increases the performance of UDP which uses DatagramChannels and MulticastChannels.

The combination of networking code improvements and virtual threads leads to astonishing results for UDP, read below.

Performance

I used UPerf for testing on a cluster of 8 (physical) boxes (1 GBit ethernet), with JDKs 11, 16-ea5 and 16-loom+2-14. The former two use native threads, the latter uses virtual threads.

As can be seen in [1], UDP's performance goes from 44'691 on JDK 11 to 81'402 on JDK 16-ea5; that's a whopping 82% increase! Enabling virtual threads increases the performance between 16-ea5 and 16-loom+2-14 to 88'252, that's another 8%!

The performance difference between JDK 11 and 16-loom is 97%!

The difference in TCP's performance is miniscule; I guess because the TCP code was already optimized in JDK 11.

Running in JDK 16-loom+2-14 shows that UDP's performance is now on par with TCP, as a matter of fact, UDP is even 3% faster than TCP!

If you want to try for yourself: head over to the JGroups Github repo and create the JAR (ant jar). Or wait a bit: I will soon release 5.0.0.Final which contains the changes.

Not sure if I want to backport the changes to the 4.x branch...

Enjoy!

[1] https://drive.google.com/file/d/1Ars1LOM7cEf6AWpPwZHeIfu_kKLa9gv0/view?usp=sharing

[2] http://openjdk.java.net/jeps/373