BLPOP stops processing the queue after a while
In my organization, we have a number of Redis workers for our critical tasks. Usually, once or twice a day, our workers stop processing their queues.
The code essentially looks like this:
while ($item = $redis->blpop(array('someQueue', 'anotherQueue'), 3600)) {
    someFunction();
}
As you can see, not much is happening in the code, but from time to time the queue starts to grow and the worker does not pop any items from it. Setting a timeout for blpop does not help at all, as we assume the problem is with the Redis client connection.
At this point we have created several listeners that warn us when the queue builds up, and then we restart the workers, but the problem still persists. We can also set a timeout for our Redis client, but again this is not a perfect solution.
- Has anyone else encountered this?
- What could be the problem?
- Are we doing something wrong?
Our question is similar to "Error while implementing message queue using redis, errors when using BLPOP", but we are not getting any errors. The worker just stops abruptly.
Information
Redis Server: 2.8.2
PHP Redis: phpredis
Update #1
Workers that have been running for a long time have stopped processing the queue. On running CLIENT LIST, we noticed that these workers have a high idle time compared to the rest, and their flag is set to N instead of b. What could be causing this?
Update #2
The problem was in someFunction(). A piece of code in it kept the function from ever returning control, so the client sat idle for a long time, which explains the N flag when running CLIENT LIST.
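A handler that never returns is hard to spot from the Redis side alone. One way to make it visible is to log when each item starts and ends, so that a "start" line without a matching "end" line points at the stuck call. This is only a minimal sketch: runTraced and the log format are hypothetical names introduced for illustration, not part of the original worker.

```php
<?php

/**
 * Run $handler on $item and report start and end through $log.
 * A "start" line with no matching "end" line in the log suggests
 * that the handler never returned control.
 * (runTraced is a hypothetical helper, not part of any library.)
 */
function runTraced(callable $handler, $item, callable $log)
{
    $startedAt = microtime(true);
    $log('start item=' . json_encode($item));
    try {
        return $handler($item);
    } finally {
        // Runs on normal return and on exceptions, but not if the handler hangs.
        $log(sprintf('end item=%s seconds=%.3f', json_encode($item), microtime(true) - $startedAt));
    }
}
```

In the worker loop this could wrap the existing call, for example runTraced('someFunction', $item, 'error_log'), assuming someFunction accepts the popped item.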
I suggest checking whether there is a real issue and reporting it to the Redis project in case you find something server-side. However, the following steps will help you track down the problem even if it is in some other part of your stack (which is likely, since there are no known issues like the one described above).
Steps to check what's going on:
- Wait for one client to stop.
- Verify that there are actually items in the list using the LLEN command.
- Check with CLIENT LIST that your client is actually listed as blocked (you will see the command name), and check the size of the pending reply to see whether it is your client that is not actually consuming the replies it receives.
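The CLIENT LIST check above can be sketched in code. The example below assumes the raw text reply (one client per line, space-separated key=value fields, as printed by redis-cli) and flags clients that are not blocked and have been idle for a long time. The function names, the idle threshold, and the sample lines are all made up for illustration.

```php
<?php

/**
 * Parse the raw text reply of CLIENT LIST into an array of field maps,
 * one map per client line.
 */
function parseClientList(string $raw): array
{
    $clients = [];
    foreach (preg_split('/\r?\n/', trim($raw)) as $line) {
        if ($line === '') {
            continue;
        }
        $fields = [];
        foreach (explode(' ', $line) as $pair) {
            // Split on the first '=' only, so empty values such as "name=" are kept.
            $parts = explode('=', $pair, 2);
            if (count($parts) === 2) {
                $fields[$parts[0]] = $parts[1];
            }
        }
        $clients[] = $fields;
    }
    return $clients;
}

/**
 * Return clients without the 'b' (blocked) flag that have been idle for
 * at least $minIdleSeconds: candidates for a worker stuck outside BLPOP.
 */
function findStalledClients(array $clients, int $minIdleSeconds): array
{
    return array_values(array_filter($clients, function (array $c) use ($minIdleSeconds): bool {
        $flags = $c['flags'] ?? '';
        $idle = (int) ($c['idle'] ?? 0);
        return strpos($flags, 'b') === false && $idle >= $minIdleSeconds;
    }));
}

// Hypothetical sample reply, invented for illustration only.
$sample = "addr=10.0.0.5:51234 fd=7 name= age=90000 idle=2 flags=b db=0 cmd=blpop\n"
        . "addr=10.0.0.6:51301 fd=8 name= age=90000 idle=41000 flags=N db=0 cmd=blpop\n";

foreach (findStalledClients(parseClientList($sample), 3600) as $client) {
    echo $client['addr'], ' idle=', $client['idle'], ' flags=', $client['flags'], PHP_EOL;
}
```

Note that other long-lived connections that are not workers would also match this filter, so the result is a list of candidates to inspect rather than a verdict.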
Random remarks:
- Redis 2.8.2 is too old; upgrading is recommended.
- phpredis might have bugs that could cause this if it is as old as the Redis server.
We had a different problem: if the application server loses contact with the Redis server for a moment, the Redis handle becomes invalid (we expect this, by the way; it is not a bug). While your problem is different, the workaround we used might work for you too:
You can do something like this:
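while (true) {
    // The factory method below checks whether the handle is valid; if not, it creates a new one and returns it.
    $redis = MyRedisFactory::getInstance();
    $item = $redis->blpop(array('someQueue', 'anotherQueue'), 3600);
    if (empty($item)) {
        continue; // blpop timed out with nothing to process; re-check the handle and wait again
    }
    someFunction();
}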
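The answer does not show MyRedisFactory itself. As a rough sketch of how such a factory might work, the class below caches one handle and replaces it when a health check fails. Everything here is an assumption for illustration: the class name ReconnectingFactory, the injected $connect and $isAlive callables, and the phpredis wiring in the trailing comment (host, port, and a ping-based check).

```php
<?php

/**
 * Hypothetical sketch of a factory in the spirit of MyRedisFactory:
 * it caches one connection handle and creates a new one whenever the
 * health check fails. The connect and health-check steps are injected
 * so the reconnect logic does not depend on a specific client library.
 */
final class ReconnectingFactory
{
    private $connect;
    private $isAlive;
    private $handle = null;

    public function __construct(callable $connect, callable $isAlive)
    {
        $this->connect = $connect;
        $this->isAlive = $isAlive;
    }

    public function getInstance()
    {
        if ($this->handle === null || !$this->checkAlive($this->handle)) {
            $this->handle = ($this->connect)();
        }
        return $this->handle;
    }

    private function checkAlive($handle): bool
    {
        try {
            return (bool) ($this->isAlive)($handle);
        } catch (\Throwable $e) {
            // An exception from the health check is treated as a dead handle.
            return false;
        }
    }
}

// Possible wiring with phpredis (assumed host and port, not from the original post):
//
// $factory = new ReconnectingFactory(
//     function () { $r = new Redis(); $r->connect('127.0.0.1', 6379); return $r; },
//     function ($r) { $r->ping(); return true; } // ping() throws if the connection is gone
// );
```

The health check costs one extra round trip per loop iteration, which is small next to a blocking pop with a one-hour timeout.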