Pulling cache from Django Memcached

I had my django app set up with memcached and everything worked smoothly.

I am trying to populate the cache over time by adding to it as new data is coming from external APIs. Here is the gist of what I am doing:

basic view

api_query, more_results = apiQuery(**params)
cache_key = "mystring"
cache.set(cache_key, data_list, 600)

if more_results:
    # pass the function itself, not a string, as the thread target
    t = Thread(target=apiMoreResultsQuery, args=(param1, param2, param3))
    t.daemon = True
    t.start()


results function

cache_key = "mystring"
my_cache = cache.get(cache_key)
api_query, more_results = apiQuery(**params)
new_cache = my_cache + api_query
cache.set(cache_key, new_cache, 600)

if more_results:
    apiMoreResultsQuery(param1, param2, param3)


This method works for multiple iterations through apiMoreResultsQuery, but at some point cache.get returns None, causing the entire loop to crash. I tried increasing the cache timeout, but that didn't change anything. Why does the cache suddenly disappear?

For clarification, I run apiMoreResultsQuery on a separate thread because I need to return the response from the initial call quickly; the full dataset is then populated in the background while the response can still be returned.



1 answer


When you set a specific cache key and the item you are setting is larger than the size allocated for a cached item, the set fails and your key ends up returning None. (I know this because I got bitten by it.)

Memcached uses pickle to cache objects, so at some point new_cache gets pickled, and the pickle is simply larger than the size allocated for cached items.

Memcached's default maximum item size is 1 MB. You can increase it, but the bigger problem, which seems a bit strange, is that you use the same key over and over again, so your single cached item just keeps growing.
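If you do decide to raise the limit, memcached exposes it via its -I flag (a 5 MB limit is shown here purely as an example):

```shell
# Start memcached with a 5 MB max item size instead of the default 1 MB
# (-m sets the total memory budget in MB).
memcached -I 5m -m 64
```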

Wouldn't it be better to set new items under new keys, making sure each item is small enough to be cached?
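A minimal sketch of that idea: store each batch of API results under its own numbered key instead of growing one giant value. The key format and the page counter here are made up for illustration, and a plain dict stands in for django.core.cache.cache (in a real Django app you would call cache.set with a timeout):

```python
# `cache` is a stand-in for django.core.cache.cache.
cache = {}

def store_batch(base_key, page, results):
    """Cache one batch under a per-page key, e.g. 'mystring:3'."""
    cache[f"{base_key}:{page}"] = results
    cache[f"{base_key}:pages"] = page + 1  # remember how many pages exist

def load_all(base_key):
    """Reassemble the full dataset from the per-page keys."""
    pages = cache.get(f"{base_key}:pages", 0)
    data = []
    for page in range(pages):
        batch = cache.get(f"{base_key}:{page}")
        if batch is None:  # this page expired or was evicted; skip it
            continue
        data.extend(batch)
    return data

store_batch("mystring", 0, [1, 2])
store_batch("mystring", 1, [3, 4])
print(load_all("mystring"))  # [1, 2, 3, 4]
```

Each batch stays small enough to fit under the item-size limit, and losing one page no longer takes the whole dataset down with it.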

Anyway, if you want to see how big your item is, so you can check whether it will fit in the cache, you can do something like the following:



>>> import pickle
>>> some_object = [1, 2, 3]
>>> len(pickle.dumps(some_object, -1))
22
>>> new_object = list(range(1000000))
>>> len(pickle.dumps(new_object, -1))
4871352   # Wow, that got pretty big!


Note that the size can increase significantly if you are caching Django model instances, in which case it is probably a good idea to fetch only the values you want from the instance.

See the following answer for more information:

How do I get the size of a Python object in bytes on Google AppEngine?
