Re: consistent hashing using split_clients

October 31, 2012 12:46PM
Maxim Dounin Wrote:
>
> Percentage values are stored in fixed point with 2 digits after
> the point. Configuration parsing will complain if you try to
> specify more digits after the point.
>
> > How many "buckets" does the hash table for split_clients
> > have (it doesn't seem to be configurable)?
>
> The split_clients algorithm doesn't use buckets, as it's not a
> hash table. Instead, it calculates hash function of the
> original value, and selects resulting value based on a hash
> function result. See http://nginx.org/r/split_clients for
> details.
>
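
For anyone following along, here is a rough Python sketch of the range-based selection described above, as I understand it. It is not nginx code: `zlib.crc32` is only a stand-in for whatever hash nginx really uses, and the names `build_ranges` and `select_value` are made up for illustration. Percentages are kept in fixed point with two digits after the point, and a 32-bit hash space is assumed.

```python
import zlib

HASH_SPACE = 2**32  # assumption: the hash result is a 32-bit unsigned integer


def build_ranges(splits):
    """Turn [(percent, value), ...] into cumulative upper bounds.

    percent is a string such as "0.5" or "*"; at most two digits are
    allowed after the point, mirroring the fixed-point storage.
    """
    ranges = []
    upper = 0
    for percent, value in splits:
        if percent == "*":
            upper = HASH_SPACE  # "*" takes whatever is left of the hash space
        else:
            whole, _, frac = percent.partition(".")
            if len(frac) > 2:
                raise ValueError("too many digits after the point: " + percent)
            # Percentage expressed in hundredths of a percent, e.g. "0.5" -> 50.
            hundredths = int(whole or "0") * 100 + int(frac.ljust(2, "0"))
            upper += HASH_SPACE * hundredths // 10000
        ranges.append((upper, value))
    return ranges


def select_value(key, ranges):
    """Hash the key and return the value whose range contains the hash."""
    h = zlib.crc32(key.encode())  # stand-in hash, 0 <= h < 2**32
    for upper, value in ranges:
        if h < upper:
            return value
    return ""  # no range matched (percentages sum to less than 100, no "*")
```

The point is that there are no buckets at all, only a handful of contiguous ranges, so this gives a stable percentage split but nothing resembling a consistent-hash table over thousands of upstreams.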

So clearly I am down the wrong path here, and split_clients just cannot do what I need. I will have to rethink things.

The third-party ngx_http_consistent_hash module appears to be unmaintained and uncommented. It also uses a binary search to find an upstream instead of a hash table lookup, making it O(log(n)) for each request. My C skills haven't been used in anger since about 1997, so updating or maintaining it myself would probably be a fruitless exercise.
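
To illustrate what I mean by the binary-search approach, here is a generic hash-ring lookup in Python. This is not the module's code: the MD5-based `h32` helper, the 160 points per server, and the function names are all assumptions picked for the sketch.

```python
import bisect
import hashlib


def h32(s):
    # Stand-in 32-bit hash; the module's actual hash function is not assumed.
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")


def build_ring(servers, points_per_server=160):
    """Place several points per server on the ring and sort them by hash."""
    points = sorted(
        (h32(f"{name}-{i}"), name)
        for name in servers
        for i in range(points_per_server)
    )
    return [p[0] for p in points], [p[1] for p in points]


def lookup(ring, key):
    """Find the first point at or after the key's hash: O(log n) per request."""
    hashes, names = ring
    i = bisect.bisect_left(hashes, h32(key)) % len(hashes)  # wrap past the end
    return names[i]
```

Every request pays for the `bisect` call here, whereas a precomputed bucket table would reduce the lookup to a single array index.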

Perhaps I will have to fall back to using Perl to get a hash bucket for the time being. I assume 4096 upstreams are not a problem for nginx, given that it is used widely by CDNs.

A long time ago Igor mentioned he was working on a variable-based upstream hashing module using MurmurHash3:
http://forum.nginx.org/read.php?29,212712,212739#msg-212739

I suppose other work took priority. Maybe Igor has some code stashed somewhere that just needs testing and polishing.

If not, it seems that the current "ip_hash" scheme used in nginx could be easily adapted to fast consistent hashing by simply:
- using MurmurHash3 or similar instead of the current simple multiply+modulo scheme
- allowing arbitrary nginx variables as hash input, instead of just the IP address, during upstream selection
- building a hash table at initialization with 4096 buckets (or some configurable number)
- filling each bucket of the hash table by sorting the server array on murmurhash3(bucket_number + server_name + server_weight_counter) and taking the first server
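
A minimal Python sketch of the bucket-table idea in the list above. To be clear about the assumptions: BLAKE2b from `hashlib` stands in for MurmurHash3 (which is not in the Python standard library), the `bucket:server:counter` key format is just one way to combine the three inputs, and `build_table` and `pick_server` are hypothetical names. Taking the minimum is equivalent to sorting and taking the first server.

```python
import hashlib


def h64(s):
    # Stand-in for MurmurHash3: any well-mixed hash works for the sketch.
    return int.from_bytes(hashlib.blake2b(s.encode(), digest_size=8).digest(), "big")


def build_table(servers, buckets=4096):
    """servers maps server name -> integer weight.

    Each server gets one candidate entry per unit of weight. For every
    bucket, the candidate with the smallest hash wins the bucket.
    """
    candidates = [
        (name, counter)
        for name, weight in servers.items()
        for counter in range(weight)
    ]
    table = []
    for bucket in range(buckets):
        winner = min(candidates, key=lambda c: h64(f"{bucket}:{c[0]}:{c[1]}"))
        table.append(winner[0])
    return table


def pick_server(table, key):
    """Per-request lookup: one hash and one array index, no search."""
    return table[h64(key) % len(table)]
```

Because each bucket's winner depends only on the hashes of the servers that exist, removing a server only reassigns the buckets that server owned, and adding one only steals the buckets it wins. The table costs buckets × total weight hash computations to build, but that happens once at configuration time.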

Is there a mechanism for sponsoring development along these lines and getting it into the official nginx distribution? Consistent hashing is the one commonly used proxy server function that nginx seems to be missing.

Subject | Author | Posted
consistent hashing using split_clients | rmalayter | October 31, 2012 10:31AM
Re: consistent hashing using split_clients | Maxim Dounin | October 31, 2012 10:52AM
Re: consistent hashing using split_clients | rmalayter | October 31, 2012 12:46PM
Re: consistent hashing using split_clients | 姚伟斌 | November 01, 2012 01:46AM
Re: consistent hashing using split_clients | Maxim Dounin | November 01, 2012 06:00AM


