Ryan Showalter
2011-10-14 08:11:05 UTC
Hey everyone,
I've been running into an issue lately where after my server has been
running for a few hours, the uWSGI master process will start killing
off workers for seemingly no reason. When this happens, spool jobs
also fail to be created, but no failures seem to occur in the logs.
What's weird is that this error seems to have only begun manifesting
itself in the last couple of weeks. I'm not sure if this is related
to increased traffic or what. My memory usage is not outrageous; I
still have plenty of physical memory and swap available. Also, CPU
usage is very minimal. Restarting uWSGI fixes the issue for a few
hours before it begins to creep up again. At first just one or two
workers are killed off every few minutes, and eventually it gets
to the point where at least one worker is killed on every request.
Below is my configuration, along with an excerpt from my log file
which shows several workers being killed and respawned within less
than a minute. I'm running uWSGI 0.9.9.2 with debug enabled (this
issue also happens with debug disabled).
Any help is much appreciated!
[program:uwsgi-www]
command=/opt/webapps/www.example.com/bin/uwsgi
--master
--processes 16
--memory-report
--harakiri 7200
--vacuum
--max-requests 500
--optimize 2
--enable-threads
--cache 1000
--import example-app.custom.uwsgi_utils
--spooler /opt/webapps/www.example.com/spooler
--home /opt/webapps/www.example.com/
--pythonpath /opt/webapps/www.example.com
--spooler-import example-app.custom.uwsgi_utils
--socket /opt/webapps/www.example.com/sock/uwsgi.sock
--pidfile /opt/webapps/www.example.com/pid/wsgi.pid
--wsgi-file /opt/webapps/www.example.com/django.wsgi
directory=/opt/webapps/www.example.com/example-app
environment=DJANGO_SETTINGS_MODULE='example-app.settings'
stdout_logfile=/var/log/uwsgi/www.example.com_supervisord.log
stderr_logfile=/var/log/uwsgi/www.example.com_supervisord.err
user=www-data
autostart=true
autorestart=true
stopsignal=QUIT
DAMN ! worker 12 (pid: 446) died :( trying respawn ...
Respawned uWSGI worker 12 (new pid: 1174)
DAMN ! worker 9 (pid: 893) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 9 (new pid: 1175)
DAMN ! worker 10 (pid: 1074) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 10 (new pid: 1176)
DAMN ! worker 14 (pid: 1101) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 14 (new pid: 1177)
DAMN ! worker 13 (pid: 1123) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 13 (new pid: 1178)
DAMN ! worker 1 (pid: 1134) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 1 (new pid: 1179)
DAMN ! worker 8 (pid: 1148) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 8 (new pid: 1180)
DAMN ! worker 5 (pid: 1149) died :( trying respawn ...
Respawned uWSGI worker 5 (new pid: 1181)
DAMN ! worker 6 (pid: 1157) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2ec3bec 0xb2f43b0c 1
{address space usage: 111661056 bytes/106MB} {rss usage: 19304448 bytes/18MB} [pid: 1159|app: 0|req: 176/70] 50.23.94.74 () {30 vars in 390 bytes} [Fri Oct 14 00:04:46 2011] GET / => generated 8514 bytes in 227 msecs (HTTP/1.0 200) 4 headers in 333 bytes (1 switches on core 0)
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 6 (new pid: 1182)
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Fri Oct 14 00:04:57 2011 - master sent signal 63 to worker 8
Fri Oct 14 00:04:58 2011 - master sent signal 40 to worker 1
DAMN ! worker 8 (pid: 1180) died :( trying respawn ...
Respawned uWSGI worker 8 (new pid: 1183)
DAMN ! worker 1 (pid: 1179) died :( trying respawn ...
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Respawned uWSGI worker 1 (new pid: 1184)
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Fri Oct 14 00:04:59 2011 - master sent signal 47 to worker 9
DAMN ! worker 9 (pid: 1175) died :( trying respawn ...
Respawned uWSGI worker 9 (new pid: 1185)
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
Fri Oct 14 00:05:02 2011 - master sent signal 42 to worker 9
DAMN ! worker 9 (pid: 1185) died :( trying respawn ...
Respawned uWSGI worker 9 (new pid: 1191)
[uWSGI DEBUG] called 0xb2e9d534 0xb2f4702c 6950
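To make the pattern in an excerpt like the one above easier to see, the respawns and master signals can be tallied per worker. This is only a minimal sketch, assuming the log lines look exactly like the ones shown here; the names `tally`, `DIED` and `SIGNAL` are made up for illustration and are not part of uWSGI.

```python
import re
import sys
from collections import Counter

# Patterns assumed from the log excerpt above; other uWSGI versions may
# word these messages differently.
DIED = re.compile(r"worker (\d+) \(pid: (\d+)\) died")
SIGNAL = re.compile(r"master sent signal (\d+) to worker (\d+)")


def tally(lines):
    """Count worker deaths per worker id and (signal, worker) pairs."""
    deaths = Counter()
    signals = Counter()
    for line in lines:
        match = DIED.search(line)
        if match:
            deaths[int(match.group(1))] += 1
            continue
        match = SIGNAL.search(line)
        if match:
            signals[(int(match.group(1)), int(match.group(2)))] += 1
    return deaths, signals


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python tally_workers.py /path/to/logfile
    with open(sys.argv[1]) as handle:
        deaths, signals = tally(handle)
    for worker, count in sorted(deaths.items()):
        print("worker %d died %d time(s)" % (worker, count))
    for (signum, worker), count in sorted(signals.items()):
        print("signal %d -> worker %d: %d time(s)" % (signum, worker, count))
```

Run against the full log file, this shows whether the deaths cluster on particular workers and whether each death is preceded by a master signal.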
Ryan-