Make supervisor stop celery workers correctly
I see a lot of strange behavior when using celery. For example, I update tasks.py and run supervisorctl reload (restart), but the workers still run the wrong tasks; some tasks disappear, and so on.
Today I found the cause: supervisorctl stop all does not actually stop all of the celery workers. Only kill -9 `pgrep python` kills them all.
situation:
root@ubuntu12:/data/www/article_fetcher# supervisorctl
celery_beat RUNNING pid 29597, uptime 0:52:18
celery_worker1 RUNNING pid 29556, uptime 0:52:20
celery_worker2 RUNNING pid 29570, uptime 0:52:19
celery_worker3 RUNNING pid 29557, uptime 0:52:20
celery_worker4 RUNNING pid 29586, uptime 0:52:18
uwsgi RUNNING pid 29604, uptime 0:52:18
supervisor> stop all
celery_beat: stopped
celery_worker2: stopped
celery_worker4: stopped
celery_worker3: stopped
uwsgi: stopped
celery_worker1: stopped
supervisor> status
celery_beat STOPPED Aug 04 11:05 AM
celery_worker1 STOPPED Aug 04 11:05 AM
celery_worker2 STOPPED Aug 04 11:05 AM
celery_worker3 STOPPED Aug 04 11:05 AM
celery_worker4 STOPPED Aug 04 11:05 AM
uwsgi STOPPED Aug 04 11:05 AM
processes:
root@ubuntu12:~# ps -aux|grep 'python'
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root 8683 0.0 0.1 61420 11768 ? Ss Aug03 0:27 /usr/bin/python /usr/bin/supervisord
root 29310 0.1 0.1 57120 11344 pts/2 S+ 11:05 0:00 /usr/bin/python /usr/bin/supervisorctl
nobody 29556 2.2 0.5 132484 45988 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody 29557 2.2 0.5 132480 45996 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29570 2.4 0.5 132740 45996 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
nobody 29571 26.9 1.4 217688 115804 ? R 11:06 0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29572 33.7 0.7 158396 59808 ? R 11:06 0:12 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29573 29.6 1.4 215176 115928 ? R 11:06 0:10 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody 29574 27.2 1.4 218244 118180 ? R 11:06 0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
......
......
......
I found this question: "Stopping Supervisor doesn't stop Celery workers", but it asks about something else, and its accepted answer does not actually make supervisorctl stop all work. So I decided to find the right way.
I looked at the supervisor docs and found this:
killasgroup
If true, when resorting to send SIGKILL to the program to terminate it, send it to its whole process group instead, taking care of its children as well. Useful e.g. with Python programs using multiprocessing.
Default: false
Required: No.
Introduced: 3.0a11
My guess is that each worker creates 4 child processes (one per CPU core), and together they form a process group. Supervisor only signals the parent process, so supervisorctl stop all leaves the children running.
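To illustrate the process-group reasoning, here is a minimal sketch (Linux/POSIX only). It does not use celery or supervisor: the "worker" is a hypothetical stand-in script that starts one child and sleeps, and the helper names (spawn_worker, pipe_closed, demo) are made up for this example. It shows that a SIGKILL sent to the parent pid alone leaves the child running, while a SIGKILL sent to the whole process group, which is roughly what killasgroup asks supervisor to do, takes the child down too.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical stand-in for a worker: a parent that starts one child, prints
# the child's pid, and then sleeps. The child inherits the parent's stdout.
PARENT_SRC = """
import subprocess, sys, time
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(child.pid, flush=True)
time.sleep(60)
"""


def spawn_worker():
    """Start the stand-in parent in its own process group."""
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC],
        stdout=subprocess.PIPE,
        bufsize=0,               # raw pipe, so select() sees exactly what is unread
        start_new_session=True,  # new session: process group id == parent.pid
    )
    child_pid = int(parent.stdout.readline().decode().strip())
    return parent, child_pid


def pipe_closed(stream, timeout):
    """True once every process holding the pipe's write end has exited (EOF)."""
    ready, _, _ = select.select([stream], [], [], timeout)
    return bool(ready) and os.read(stream.fileno(), 1) == b""


def demo():
    parent, _child_pid = spawn_worker()

    # Signal only the parent pid, as a plain stop does.
    os.kill(parent.pid, signal.SIGKILL)
    parent.wait()
    # The child still holds the pipe open, so no EOF arrives: it survived.
    child_survived = not pipe_closed(parent.stdout, timeout=1.0)

    # Signal the whole process group; the group id is the old parent pid.
    os.killpg(parent.pid, signal.SIGKILL)
    group_dead = pipe_closed(parent.stdout, timeout=5.0)

    parent.stdout.close()
    return child_survived, group_dead


if __name__ == "__main__":
    survived, dead = demo()
    print("child survived killing the parent only:", survived)
    print("child gone after killing the process group:", dead)
```

The pipe check is used instead of probing the child's pid because a killed but not yet reaped process can still look alive to a pid probe.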
So I added killasgroup to supervisord.conf:
[program:celery_worker1]
; Set full path to celery program if using virtualenv
directory=/data/www/article_fetcher
command=/data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
user=nobody
numprocs=1
stdout_logfile=/data/www/article_fetcher/logs/celery.log
stderr_logfile=/data/www/article_fetcher/logs/celery.log
autostart=true
autorestart=true
startsecs=5
killasgroup=true
.....
.....
Now supervisorctl stop all really stops celery! Very good ~
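To double-check the result without reading ps output by eye, a small script along these lines could be run after supervisorctl stop all. This is only a sketch: the function names are made up, the "celery worker" search string is taken from the command lines shown in the ps listing in the question, and it assumes a ps that accepts -eo pid,args (as procps on Linux does).

```python
import subprocess


def find_leftovers(ps_output, needle="celery worker"):
    """Return (pid, command) pairs from `ps -eo pid,args` text that contain needle."""
    leftovers = []
    for line in ps_output.splitlines()[1:]:  # skip the "PID COMMAND" header row
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            leftovers.append((int(parts[0]), parts[1]))
    return leftovers


def current_leftovers(needle="celery worker"):
    """Run ps and return the matching processes that are still alive."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return find_leftovers(result.stdout, needle)


if __name__ == "__main__":
    remaining = current_leftovers()
    if remaining:
        for pid, command in remaining:
            print(pid, command)
    else:
        print("no celery worker processes left")
```

An empty result after stop all suggests that the worker children went down together with their parents.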