REST Service and Memcache

I am considering adding Memcache support to my large-scale REST service. However, I have some questions about the best way to use such a key-value store.

Setting:

  • A database wrapper that has functions to select, update, etc.
  • A REST structure that contains all the API functions (getUser, createUser, etc.).

In my head, the ideal approach would be to integrate Memcache into the database wrapper so that, for example, every SQL query is md5-hashed and its result is stored in the cache under that hash (which is what most online resources suggest, by the way). However, there is clearly a problem with this approach: if a search query is cached and one of the users in the search result is updated afterwards, the change will not be reflected the next time the query runs (since the result now comes from the cache).
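To make the problem concrete, here is a minimal sketch of query-hash caching. The `$cache` array, the `$db` array and the `cachedQuery` helper are hypothetical stand-ins (not a real Memcache client or database) so that the example runs on its own.

```php
<?php
// Hypothetical in-memory stand-in so the sketch runs without a Memcache server.
$cache = [];
// Hypothetical "database": one users table keyed by id.
$db = [1 => ['id' => 1, 'name' => 'alice']];

// Wrapper-level caching: key every query by the md5 of its SQL text.
function cachedQuery(string $sql, callable $runQuery, array &$cache)
{
    $key = 'sql:' . md5($sql);
    if (array_key_exists($key, $cache)) {
        return $cache[$key];          // hit: the database is never consulted
    }
    $result = $runQuery();            // miss: run the real query
    $cache[$key] = $result;
    return $result;
}

$sql = "SELECT * FROM users WHERE name LIKE 'a%'";
$search = function () use (&$db) {
    return array_values(array_filter($db, fn($u) => $u['name'][0] === 'a'));
};

$first = cachedQuery($sql, $search, $cache);
$db[1]['name'] = 'alicia';            // the user is updated after the search was cached
$second = cachedQuery($sql, $search, $cache);

echo $second[0]['name'], "\n";        // prints "alice": the stale search result
```

The cache key is derived only from the SQL text, so nothing connects the cached search result to user 1, and the update cannot invalidate it.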

As I see it, I have several ways around this:

  • Implement Memcache in the REST layer for each function (getUser, createUser, etc.) and explicitly handle updating the cache when users are updated. This can lead to redundant code.
  • Let the cached values expire very quickly and live with some queries showing stale cached values.
  • Build a more complex Memcache integration into the database wrapper so that it can identify which cached entries (e.g. search results) are affected when some data (e.g. a user) is updated.
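One possible reading of the third option is to put a per-table version number into every query key and bump it on each write. The sketch below assumes that reading; `ArrayCache`, `queryKey` and `invalidateTable` are hypothetical names, and the in-memory class only stands in for a real cache client.

```php
<?php
// Hypothetical in-memory stand-in for the cache; a Memcache client would replace this.
final class ArrayCache
{
    private array $items = [];
    public function get(string $key) { return $this->items[$key] ?? false; }
    public function set(string $key, $value): void { $this->items[$key] = $value; }
}

// Key for a query that reads the given table: the table's current version is part of the key.
function queryKey(ArrayCache $cache, string $table, string $sql): string
{
    $version = $cache->get("version:$table");
    if ($version === false) {
        $version = 1;
        $cache->set("version:$table", $version);
    }
    return "sql:$table:v$version:" . md5($sql);
}

// Called by the wrapper after any INSERT/UPDATE/DELETE on the table.
function invalidateTable(ArrayCache $cache, string $table): void
{
    $version = $cache->get("version:$table");
    $cache->set("version:$table", ($version === false ? 1 : $version) + 1);
}

$cache = new ArrayCache();
$sql = "SELECT * FROM users WHERE name LIKE 'a%'";

$before = queryKey($cache, 'users', $sql);
$cache->set($before, [['id' => 1, 'name' => 'alice']]);

invalidateTable($cache, 'users');            // e.g. after updating user 1
$after = queryKey($cache, 'users', $sql);

var_dump($cache->get($after));               // bool(false): the old entry is no longer reachable
```

This trades precision for simplicity: any write to the table makes every cached query on that table miss, and the old entries simply stay in the cache until they are evicted.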

Could you point me toward one of these, or toward some other approach? Thank you in advance.

1 answer


Adding caching to a web application is not easy.

You may have done this already, but I recommend that you first set a target based on business needs or a forecast (e.g. the service should handle 1000 requests per second), then stress test your system properly so that you have numbers in front of you before you start changing anything, and then identify the bottleneck.

I usually use a profiling tool such as XHProf (from Facebook).

Caching all of your data so that the cache mirrors the database may not be the best approach.

Find out how much memory you can set aside for your cache. If your architecture only allows you to allocate 100MB to Memcache, that will affect your decisions about what you cache and for how long.

The best cache is one that never expires. But we all know that data changes. You can start by caching the data that is requested most often and is the most expensive to fetch.

Always make sure that you are not spending effort on something that will yield only a small improvement.

Without understanding your architecture in depth, it would be dangerous for anyone to recommend the caching strategy that best suits your needs.



Maybe you should cache the reusable output of your web services? For example, by using a reverse proxy (as @Darrel mentions) or by using output buffering ...
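As a rough sketch of the output-buffering variant: capture what a handler echoes once and reuse it for later requests to the same URI. The `cachedOutput` helper, the `$cache` array and the `/users/1` handler are hypothetical; only `ob_start()` and `ob_get_clean()` are real PHP functions.

```php
<?php
// Hypothetical in-memory stand-in for the cache.
$cache = [];

// Capture whatever $render echoes and reuse it for later requests to the same URI.
function cachedOutput(string $uri, callable $render, array &$cache): string
{
    $key = 'out:' . md5($uri);
    if (isset($cache[$key])) {
        return $cache[$key];         // reuse the previously rendered body
    }
    ob_start();                      // start capturing output
    $render();
    $body = ob_get_clean();          // stop capturing and fetch what was echoed
    $cache[$key] = $body;
    return $body;
}

$calls = 0;
$renderUser = function () use (&$calls) {
    $calls++;                        // counts how often the real handler runs
    echo json_encode(['id' => 1, 'name' => 'alice']);
};

$a = cachedOutput('/users/1', $renderUser, $cache);
$b = cachedOutput('/users/1', $renderUser, $cache);   // served from the cache
```

After both calls the handler has run only once. The same staleness question applies here as with query caching, so this suits responses that may safely be a little out of date.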

Optimize your database queries before thinking about caching. Make sure you are using a PHP opcode cache (such as APC) and everything else that is standard practice.

If you want to cache data without serving old or stale data, the trick is to identify your data (perhaps by its primary key?), and when the data is updated or deleted, to delete or update the cache entry for that id.

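<?php
// After inserting the user into the DB, you can also put it in the cache.
// Prefixing the key keeps user IDs from colliding with IDs of other entity types.
$memcache->set("user:$userId", $userData);

// After updating or deleting the user, remove the cached copy so it is not served stale.
$memcache->delete("user:$userId");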
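The snippet above covers the write side. A matching read path might look like the sketch below, assuming the cache returns `false` on a miss. `ArrayCache`, `getUser`, `updateUser` and the `$users` array are hypothetical stand-ins so the example runs without a cache server or a database.

```php
<?php
// Hypothetical in-memory stand-in exposing get/set/delete calls like those used above.
final class ArrayCache
{
    private array $items = [];
    public function get(string $key) { return $this->items[$key] ?? false; }
    public function set(string $key, $value): bool { $this->items[$key] = $value; return true; }
    public function delete(string $key): bool { unset($this->items[$key]); return true; }
}

// Hypothetical "database" table.
$users = [7 => ['id' => 7, 'name' => 'bob', 'reputation' => 15]];
$cache = new ArrayCache();

function getUser(int $id, array &$users, ArrayCache $cache)
{
    $cached = $cache->get("user:$id");
    if ($cached !== false) {
        return $cached;                       // served from the cache
    }
    $row = $users[$id] ?? null;               // fall back to the database
    if ($row !== null) {
        $cache->set("user:$id", $row);        // remember it for the next read
    }
    return $row;
}

function updateUser(int $id, array $changes, array &$users, ArrayCache $cache): void
{
    $users[$id] = array_merge($users[$id], $changes);   // write to the database first
    $cache->delete("user:$id");                          // then drop the stale entry
}

getUser(7, $users, $cache);                              // fills the cache
updateUser(7, ['reputation' => 20], $users, $cache);
echo getUser(7, $users, $cache)['reputation'], "\n";     // prints 20, read fresh from the database
```

Deleting on write rather than overwriting keeps the update path simple: the next read repopulates the entry from the database.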

      

Many sites show stale data. When I am on Stack Overflow and my reputation goes up and I then open the Stack Overflow chat, the chat shows my old reputation. When I reached 20 reputation (the amount needed to chat), I still could not chat for another 5 minutes, because the chat system had my old reputation data and did not yet know that my reputation had increased enough to let me chat. Some data can be out of date, while other types of data should never be. Keep that in mind when caching data.

Conclusion

Any of your approaches may be valid, depending on the factors I mentioned above. In fact, you can use a combination of them, depending on the types of data you want to cache and on how long it is acceptable to display old data for each. Perhaps categories or a list of countries (since they do not change often) can be cached for a long time, whereas reputation (or any data that changes all the time for all users) should only be cached for a short period.
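A per-type expiry policy could be sketched as follows. The TTL values, the `CACHE_TTL` table and the `TtlCache` class are illustrative assumptions, and the clock is passed in explicitly so that the example is deterministic.

```php
<?php
// Hypothetical policy: seconds each kind of data may be served from the cache.
const CACHE_TTL = [
    'countries'  => 86400,   // rarely changes: one day
    'categories' => 3600,    // changes occasionally: one hour
    'reputation' => 30,      // changes constantly: keep staleness short
];

// Hypothetical in-memory stand-in that honours an expiry time per entry.
final class TtlCache
{
    private array $items = [];

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    public function get(string $key, int $now)
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || $item['expires'] <= $now) {
            return false;                     // missing or expired
        }
        return $item['value'];
    }
}

$cache = new TtlCache();
$now = 1000;                                  // fixed clock for the example
$cache->set('countries', ['DK', 'SE'], CACHE_TTL['countries'], $now);
$cache->set('reputation:42', 15, CACHE_TTL['reputation'], $now);

$later = $now + 60;                           // one minute afterwards
var_dump($cache->get('countries', $later) !== false);   // bool(true): still cached
var_dump($cache->get('reputation:42', $later));         // bool(false): expired after 30 seconds
```

With the Memcached extension the expiry would normally be passed as the third argument to `set`, and the server would handle expiration itself.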
