Reducing Redis memory consumption for 20-50 character string keys

I have a key that is generated by concatenating several components:

[one of 15,000 unique strings] + [:] + [one of 5 unique strings] + [:] + [1 or 0] + [:] + [one of 15,000 unique strings] + [:] + [one of 5 unique strings] + [:] + [1 or 0] = a string of 20 to 50 characters (for example: Vancouver:temp:1:Kelowna:high:0)

According to my calculations, there will be about 1 billion combinations, each of which will be a key. The Redis documentation ( http://redis.io/topics/memory-optimization ) recommends storing keys in hashes: e.g. the key "object:11558960" => "1" can become the hash "object:1155" with the field "8960" => "1".

I am trying to work out the best way to apply this memory optimization. My first idea is to create a numeric representation for the strings: I would use MySQL to build a lookup table in which each string has a corresponding integer ID. That way I could hash more appropriately, since numbers are easier to split than strings. The numbers would also make the keys shorter, which I think would save memory. The problem is that with 1 billion keys this adds a lot of MySQL overhead, since I would have to open connections and so on.

Another solution I read about is to take the string I am building and compress it with something like PHP's gzcompress before inserting it into Redis ( http://labs.octivi.com/how-we-cut-down-memory-usage-by-82/ ).

Are there any optimizations I could use to reduce Redis's memory consumption, which is currently still too high? I am willing to trade processing power for memory. My values will only be one- or two-digit integers from 0 to 50.



1 answer


A lookup table is completely out; don't even bother. The hash solution sounds like a good fit for your needs. You want to split your key right around one of the 15,000 unique strings, so that each hash gets enough hash keys to make it worth the effort.

So instead of:

SET Vancouver:temp:1:Kelowna:high:0 10

      

You would use:

HSET Vancouver:temp:1 Kelowna:high:0 10

      

Now everything after the first [1 or 0] becomes the hash key (the field inside the hash), so you have about 150,000 possible keys per hash (15,000 * 5 * 2).
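The split described above can be sketched in Python. This is only an illustration: `split_key` and its `prefix_parts` parameter are hypothetical names, and the code merely builds the arguments you would pass to HSET without talking to a Redis server.

```python
def split_key(full_key, prefix_parts=3, sep=":"):
    """Split a colon-delimited key into (redis_key, hash_field).

    prefix_parts is the number of leading components kept in the Redis key;
    the remaining components become the field inside the hash.
    """
    parts = full_key.split(sep)
    if not 0 < prefix_parts < len(parts):
        raise ValueError("prefix_parts must leave at least one component on each side")
    return sep.join(parts[:prefix_parts]), sep.join(parts[prefix_parts:])


redis_key, field = split_key("Vancouver:temp:1:Kelowna:high:0")
print("HSET", redis_key, field, 10)
```

Changing `prefix_parts` moves the split point, which is the knob the rest of this answer discusses.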

My calculation of your total key space comes out a bit different from yours:

15000 * 5 * 2 * 15000 * 5 * 2 == 22500000000 (22.5 billion)

      

So you have 150,000 possible redis keys with 150,000 possible hash keys each.

The earlier you split between the Redis key and the hash key, the more the numbers are skewed toward hash keys. For example, if you split it like this:

HSET Vancouver:temp 1:Kelowna:high:0 10

      

Then you have 75,000 Redis keys for hashes, and each hash can contain up to 300,000 key/value pairs.
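To compare split points, the counts can be computed directly from the number of possible values per component. A small sketch, assuming the cardinalities given in the question; `CARDINALITIES` and `split_sizes` are names made up for this example, and `math.prod` needs Python 3.8 or later.

```python
from math import prod

# Number of possible values for each key component, in order:
# city, attribute, flag, city, attribute, flag.
CARDINALITIES = [15000, 5, 2, 15000, 5, 2]


def split_sizes(prefix_parts):
    """Return (possible Redis keys, possible hash keys per hash) for a split point."""
    return prod(CARDINALITIES[:prefix_parts]), prod(CARDINALITIES[prefix_parts:])


for prefix_parts in range(1, len(CARDINALITIES)):
    redis_keys, hash_keys = split_sizes(prefix_parts)
    # The product is the total key space and does not depend on the split.
    print(prefix_parts, redis_keys, hash_keys, redis_keys * hash_keys)
```

Whatever the split, the product of the two numbers stays at the 22.5 billion total; only the balance between Redis keys and hash keys changes.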




Another way you could do this is to use an integer value for your key. If you had integer mappings for each of the two sets of 15,000 unique strings and 5 unique strings, you could use a total of 36 bits to represent any key. For example:

 00000000000000   000   0   00000000000000   000   0
|      14      | | 3 | |1| |      14      | | 3 | |1|

14 bits gives you a range of 0-16383 (which covers 1-15,000; 13 bits would only reach 8191), 3 bits gives you a range of 0-7 (which covers 1-5), and 1 bit gives you the binary 1 or 0.

So, assuming these values: Vancouver == 9987, temp == 3, Kelowna == 3454, high == 2

You will have:

(9987 << 22) + (3 << 19) + (1 << 18) + (3454 << 4) + (2 << 1) + (0 << 0)
==
41890404324

To get the values back from a given key, you simply shift and mask:

41890404324 >> 22
9987

(41890404324 >> 4) & ((1 << 14) - 1)
3454

Here is a simple Python script that converts values to an int and the int back to values:

values = [9987, 3, 1, 3454, 2, 0]
shifts = [22, 19, 18, 4, 1, 0]  # bit offset of each field
TOTAL_BITS = 36

# A list rather than a bare zip(), so it can be iterated more than once.
values_and_shifts = list(zip(values, shifts))


def key_from_values(values_and_shifts):
    # Move each value to its bit offset and add them up.
    return sum(value << shift for value, shift in values_and_shifts)


def extract_values(key, values_and_shifts):
    # A field's width is the gap between its shift and the previous one.
    last_shift = TOTAL_BITS
    for value, shift in values_and_shifts:
        print("Value should be:", value)
        print("Value extracted:", (key >> shift) & ((1 << (last_shift - shift)) - 1))
        print()
        last_shift = shift


key = key_from_values(values_and_shifts)
print("Using value of:", key)
print()

extract_values(key, values_and_shifts)

OUTPUT

Using value of: 41890404324

Value should be: 9987
Value extracted: 9987

Value should be: 3
Value extracted: 3

Value should be: 1
Value extracted: 1

Value should be: 3454
Value extracted: 3454

Value should be: 2
Value extracted: 2

Value should be: 0
Value extracted: 0
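Going one step further, the string-to-integer mapping and the bit packing can be combined into one round trip. The following is a rough sketch rather than a finished implementation: `FIELD_BITS`, `CITY_IDS`, `ATTR_IDS`, `pack`, `unpack` and `encode` are all hypothetical names, the lookup tables hold only the example IDs used above, and it assumes 14 bits per city ID because 15,000 distinct values do not fit in 13 bits.

```python
# Field widths in bits, most significant first. Assumption: 14 bits per city
# ID, since 15,000 distinct values do not fit in 13 bits (2**13 == 8192).
FIELD_BITS = [14, 3, 1, 14, 3, 1]

# Illustrative lookup tables holding only the example IDs; a real deployment
# would contain all 15,000 cities and all 5 attributes.
CITY_IDS = {"Vancouver": 9987, "Kelowna": 3454}
ATTR_IDS = {"temp": 3, "high": 2}


def pack(fields):
    """Pack integer fields into one integer, first field in the highest bits."""
    key = 0
    for value, bits in zip(fields, FIELD_BITS):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
        key = (key << bits) | value
    return key


def unpack(key):
    """Recover the list of integer fields from a packed key."""
    fields = []
    for bits in reversed(FIELD_BITS):
        fields.append(key & ((1 << bits) - 1))
        key >>= bits
    return fields[::-1]


def encode(string_key):
    """Turn 'City:attr:flag:City:attr:flag' into a packed integer key."""
    c1, a1, f1, c2, a2, f2 = string_key.split(":")
    return pack([CITY_IDS[c1], ATTR_IDS[a1], int(f1),
                 CITY_IDS[c2], ATTR_IDS[a2], int(f2)])


key = encode("Vancouver:temp:1:Kelowna:high:0")
print(key, unpack(key))
```

The range check in `pack` is there so that an ID too large for its field raises an error instead of silently overwriting the neighbouring bits.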

      
