RHPool - Rendezvous Hashing for fast DSR Load Balancing

July 23, 2025

algorithms, Rendezvous Hashing, C, bngx, IXDP

Updated to reflect rhpool.test.load which instantiates one testing RHPool instance and makes the additional testing RPCL words available (operating on that instance).

Overview #

RHPool implements a Rendezvous Hash Pool, which holds multiple possible destinations (or servers) at a given time.
RHPool is esentially a skiplist with that uint64_t as the key.
For DSR load balancing (the intended purpose), a MAC address is mapped to the first 6 bytes in memory order, the remaining 2 bytes allow to insert the same MAC address multiple times to allow weighting a destination.
A RHPool entry contains as the sole value a time stamp, which is also an uint64_t.
RHPool uses the tabulation hash function f2568() to determine the HRW (Highest Random Weight) result for a 64 bit hash of the desired connection tuple inside an accepted packet.
A combination of Linux/POSIX rwlocks and atomic uint64_t access allows a maximum of concurrency during regular operation.
Locking for write access is only needed when a new entry is inserted or when a cleanup sweep to remove outdated entries is performed.

The RHPool API #

RHPool* RHPoolCreate();

void RHPoolDestroy(RHPool* rhpool);

void RHPoolInsert(RHPool* rhpool, uint64_t destination);

uint64_t RHPoolSelect(RHPool* rhpool, uint64_t tuple_hash);

void RHPoolCleanup(RHPool* rhpool, uint32_t maxage);

RPCL Testing Control Words #

rhpool.test.load                 loads rhpool.test words ( - )

rhpool.test.load instatiates one RHPool for testing and loads the following words operating on that:

rhpool.test.cleanup              remove outdated nodes ( maxage - )
rhpool.test.dtest                distribution test ( nservers lookups - )
rhpool.test.info                 print rhpool info ( - )
rhpool.test.insert               insert node ( uint64_t - )
rhpool.test.load                 loads rhpool.test words ( - )
rhpool.test.select               select node ( uint64_t - uint64_t )

Distribution Testing #

The rather small machine that these tests have been run on has 8 cores with the following data:

vendor_id     : GenuineIntel
cpu family    : 6
model         : 158
model name    : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
stepping      : 9
microcode     : 0x8e
cpu MHz       : 800.000
cache size    : 8192 KB

1 Entry #

ok timer-start 1 10000000 rhpool.test.dtest timer-show 
0000000000000000  10000000 100.0000%
expected    10000000.0000
chi_squared 0.0000
327973 microseconds elapsed
ok ▂

30,490,314 selections/second

2 Entries #

ok timer-start 2 10000000 rhpool.test.dtest timer-show 
0000000000000000   4997282  49.9728%
0000000000000001   5002718  50.0272%
expected    5000000.0000
chi_squared 2.9550
490679 microseconds elapsed
ok ▂

20,379,922 selections/second

10 Entries #

ok timer-start 10 10000000 rhpool.test.dtest timer-show 
0000000000000000   1000089  10.0009%
0000000000000001   1001166  10.0117%
0000000000000002    997538   9.9754%
0000000000000003    999905   9.9990%
0000000000000004   1001008  10.0101%
0000000000000005    999991   9.9999%
0000000000000006    998782   9.9878%
0000000000000007   1000610  10.0061%
0000000000000008   1001388  10.0139%
0000000000000009    999523   9.9952%
expected    1000000.0000
chi_squared 12.4638
1534562 microseconds elapsed
ok ▂

6,516,517 selections/second

100 Entries #

ok timer-start 100 10000000 rhpool.test.dtest timer-show 
0000000000000000    100191   1.0019%
0000000000000001    100101   1.0010%
... lines omitted ...
0000000000000062    100229   1.0023%
0000000000000063    100702   1.0070%
expected    100000.0000
chi_squared 92.4308
12174253 microseconds elapsed
ok ▂

821,405 selections/second

1000 Entries #

ok timer-start 100 10000000 rhpool.test.dtest timer-show 
0000000000000000     10022   0.1002%
0000000000000001      9889   0.0989%
... lines omitted ...
00000000000003e6     10073   0.1007%
00000000000003e7     10030   0.1003%
expected    10000.0000
chi_squared 986.0008
122478626 microseconds elapsed
ok ▂

81,646 selections/second

Observations #

The strong f2568() properties have been confirmed.
The lookup time increases linear with the number of destinations (not surprising).
The combination of POSIX rwlocks and atomic timestamp updates allow any parallel concurrency.
The distribution is perfect according to the Chi-Squared results.

Conclusion #

This principle will be the core distribution method for our new products and projects.

Resources #

Here’s a good explanation how to calculate the chi-squared value and to test a null hypothesis with a Chi-Squared distribution table: