Berkay Koyutürk
2018-09-25 11:45:29 UTC
I am using CentOS 7 with uwsgi(2.0.17.1), uwsgi-plugin-python2 (2.0.17)
and Flask (1.0.2).
I have a flask application that runs many shell commands. Some of them
hangs indefinitely. I am sharing one of them as an example below, it is
easy to reproduce.
I am using uwsgi in threaded mode.
*Â Â Â #!/usr/bin/python**
**Â Â Â from flask import Flask**
**Â Â Â import subprocess**
****
**Â Â Â app = Flask(__name__)**
****
**Â Â Â @app.route("/test/")**
**Â Â Â def test():**
**Â Â Â Â Â Â Â print 'state 1'**
**Â Â Â Â Â Â Â p = subprocess.Popen(['ntpdate', '-u',
'0.pool.ntp.org'],close_fds=True)**
**Â Â Â Â Â Â Â print 'state 2'**
**Â Â Â Â Â Â Â out, err = p.communicate()**
**Â Â Â Â Â Â Â print 'state 3'**
****
**Â Â Â Â Â Â Â print 'output: {}'.format(out)**
**Â Â Â Â Â Â Â print 'error: {}'.format(err)**
**Â Â Â Â Â Â Â return 'finish'*
uwsgi command:
*> \# uwsgi --ini uwsgi.conf --http-socket :8001*
uwsgi conf file:
*Â Â [uwsgi]**
**Â Â Â #virtual env's base folder**
**Â Â Â base = /root/plaground**
**Â Â Â home = /usr/**
****
**Â Â Â #cmd options**
**Â Â Â bin = /usr/sbin/uwsgi**
****
**Â Â Â #python module to import**
**Â Â Â module = app:app**
****
**Â Â Â pythonpath = /root/playground**
****
**Â Â Â catch-exceptions = true**
****
**Â Â Â plugins-dir = /usr/lib64/uwsgi/**
**Â Â Â plugins = python**
****
**Â Â Â need-app = true**
****
**Â Â Â master = true**
**Â Â Â threads = 10*
After starting uwsgi, first request to /test route completes
succesfully, Second one stucking on command line execution and holds
response forever.
The first call output:
*state 1**
**Â Â Â state 2**
**Â Â Â 21 Sep 16:02:48 ntpdate[19943]: adjust time server 78.46.53.11
offset -0.000224 sec**
**Â Â Â state 3**
**Â Â Â output: None**
**Â Â Â error: None*
The second call output:
*state 1**
**Â Â Â state 2*
ps aux output after second call:
*# ps aux | grep ntpdate**
**   root    20820 0.0 0.0 23632 1516 pts/3   S<+ 16:03  0:00
ntpdate -u 0.pool.ntp.org*
strace -p 20820 output:
*Â Â # strace -p 20820**
**Â Â Â strace: Process 20820 attached**
**Â Â Â restart_syscall(<... resuming interrupted call ...>) = 0 **
**Â Â Â poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, 60000) = 0
(Timeout)**
**Â Â Â poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, 60000) = 0
(Timeout*)
process fd list:
*Â Â Â # ls -l /proc/20820/fd**
**Â Â Â total 0**
**Â Â Â lr-x------ 1 root root 64 Eyl 21 16:05 0 -> /dev/null**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:05 1 -> /dev/pts/3**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:03 2 -> /dev/pts/3**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:05 3 -> socket:[463358]**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:05 4 -> socket:[463359]*
lsof output:
*# lsof -p 20820**
**Â Â Â COMMANDÂ Â PID USERÂ Â FDÂ Â TYPE DEVICE SIZE/OFFÂ Â Â Â NODE NAME**
**   ntpdate 20820 root cwd   DIR 253,0      53 17726086
/root/playground**
**   ntpdate 20820 root rtd   DIR 253,0     244      64 / **
**   ntpdate 20820 root txt   REG 253,0  110240  224413
/usr/sbin/ntpdate **
**   ntpdate 20820 root mem   REG 253,0  106848  361769
/usr/lib64/libresolv-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   31824   31549
/usr/lib64/libnss_dns-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   62184   31551
/usr/lib64/libnss_files-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   19896  142588
/usr/lib64/libattr.so.1.1.0 **
**   ntpdate 20820 root mem   REG 253,0   90664  142317
/usr/lib64/libz.so.1.2.7 **
**   ntpdate 20820 root mem   REG 253,0   19776   95386
/usr/lib64/libdl-2.17.so **
**   ntpdate 20820 root mem   REG 253,0 2173512   95380
/usr/lib64/libc-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   20032  142590
/usr/lib64/libcap.so.2.22 **
**   ntpdate 20820 root mem   REG 253,0 2512832  190324
/usr/lib64/libcrypto.so.1.0.2k **
**   ntpdate 20820 root mem   REG 253,0  164240   95374
/usr/lib64/ld-2.17.so **
**   ntpdate 20820 root   0r  CHR   1,3     0t0    5078 /dev/null **
**   ntpdate 20820 root   1u  CHR 136,3     0t0       6 /dev/pts/3 **
**   ntpdate 20820 root   2u  CHR 136,3     0t0       6 /dev/pts/3 **
**   ntpdate 20820 root   3u sock   0,7     0t0  463358 protocol:
UDP **
**   ntpdate 20820 root   4u sock   0,7     0t0  463359 protocol:
UDPv6 *
If I try ntpdate command with shell or flask builtin web server, this
problem is not occuring.
I found similar problem on uwsgi github
(https://github.com/unbit/uwsgi/issues/1034) page but this fix does not
solve my problem.
I also tried it with uwsgi parameters close-on-exec and close-on-exec2.
No luck with these also.
Thanks for helping.
and Flask (1.0.2).
I have a flask application that runs many shell commands. Some of them
hangs indefinitely. I am sharing one of them as an example below, it is
easy to reproduce.
I am using uwsgi in threaded mode.
*Â Â Â #!/usr/bin/python**
**Â Â Â from flask import Flask**
**Â Â Â import subprocess**
****
**Â Â Â app = Flask(__name__)**
****
**Â Â Â @app.route("/test/")**
**Â Â Â def test():**
**Â Â Â Â Â Â Â print 'state 1'**
**Â Â Â Â Â Â Â p = subprocess.Popen(['ntpdate', '-u',
'0.pool.ntp.org'],close_fds=True)**
**Â Â Â Â Â Â Â print 'state 2'**
**Â Â Â Â Â Â Â out, err = p.communicate()**
**Â Â Â Â Â Â Â print 'state 3'**
****
**Â Â Â Â Â Â Â print 'output: {}'.format(out)**
**Â Â Â Â Â Â Â print 'error: {}'.format(err)**
**Â Â Â Â Â Â Â return 'finish'*
uwsgi command:
*> \# uwsgi --ini uwsgi.conf --http-socket :8001*
uwsgi conf file:
*Â Â [uwsgi]**
**Â Â Â #virtual env's base folder**
**Â Â Â base = /root/plaground**
**Â Â Â home = /usr/**
****
**Â Â Â #cmd options**
**Â Â Â bin = /usr/sbin/uwsgi**
****
**Â Â Â #python module to import**
**Â Â Â module = app:app**
****
**Â Â Â pythonpath = /root/playground**
****
**Â Â Â catch-exceptions = true**
****
**Â Â Â plugins-dir = /usr/lib64/uwsgi/**
**Â Â Â plugins = python**
****
**Â Â Â need-app = true**
****
**Â Â Â master = true**
**Â Â Â threads = 10*
After starting uwsgi, first request to /test route completes
succesfully, Second one stucking on command line execution and holds
response forever.
The first call output:
*state 1**
**Â Â Â state 2**
**Â Â Â 21 Sep 16:02:48 ntpdate[19943]: adjust time server 78.46.53.11
offset -0.000224 sec**
**Â Â Â state 3**
**Â Â Â output: None**
**Â Â Â error: None*
The second call output:
*state 1**
**Â Â Â state 2*
ps aux output after second call:
*# ps aux | grep ntpdate**
**   root    20820 0.0 0.0 23632 1516 pts/3   S<+ 16:03  0:00
ntpdate -u 0.pool.ntp.org*
strace -p 20820 output:
*Â Â # strace -p 20820**
**Â Â Â strace: Process 20820 attached**
**Â Â Â restart_syscall(<... resuming interrupted call ...>) = 0 **
**Â Â Â poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, 60000) = 0
(Timeout)**
**Â Â Â poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, 60000) = 0
(Timeout*)
process fd list:
*Â Â Â # ls -l /proc/20820/fd**
**Â Â Â total 0**
**Â Â Â lr-x------ 1 root root 64 Eyl 21 16:05 0 -> /dev/null**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:05 1 -> /dev/pts/3**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:03 2 -> /dev/pts/3**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:05 3 -> socket:[463358]**
**Â Â Â lrwx------ 1 root root 64 Eyl 21 16:05 4 -> socket:[463359]*
lsof output:
*# lsof -p 20820**
**Â Â Â COMMANDÂ Â PID USERÂ Â FDÂ Â TYPE DEVICE SIZE/OFFÂ Â Â Â NODE NAME**
**   ntpdate 20820 root cwd   DIR 253,0      53 17726086
/root/playground**
**   ntpdate 20820 root rtd   DIR 253,0     244      64 / **
**   ntpdate 20820 root txt   REG 253,0  110240  224413
/usr/sbin/ntpdate **
**   ntpdate 20820 root mem   REG 253,0  106848  361769
/usr/lib64/libresolv-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   31824   31549
/usr/lib64/libnss_dns-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   62184   31551
/usr/lib64/libnss_files-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   19896  142588
/usr/lib64/libattr.so.1.1.0 **
**   ntpdate 20820 root mem   REG 253,0   90664  142317
/usr/lib64/libz.so.1.2.7 **
**   ntpdate 20820 root mem   REG 253,0   19776   95386
/usr/lib64/libdl-2.17.so **
**   ntpdate 20820 root mem   REG 253,0 2173512   95380
/usr/lib64/libc-2.17.so **
**   ntpdate 20820 root mem   REG 253,0   20032  142590
/usr/lib64/libcap.so.2.22 **
**   ntpdate 20820 root mem   REG 253,0 2512832  190324
/usr/lib64/libcrypto.so.1.0.2k **
**   ntpdate 20820 root mem   REG 253,0  164240   95374
/usr/lib64/ld-2.17.so **
**   ntpdate 20820 root   0r  CHR   1,3     0t0    5078 /dev/null **
**   ntpdate 20820 root   1u  CHR 136,3     0t0       6 /dev/pts/3 **
**   ntpdate 20820 root   2u  CHR 136,3     0t0       6 /dev/pts/3 **
**   ntpdate 20820 root   3u sock   0,7     0t0  463358 protocol:
UDP **
**   ntpdate 20820 root   4u sock   0,7     0t0  463359 protocol:
UDPv6 *
If I try ntpdate command with shell or flask builtin web server, this
problem is not occuring.
I found similar problem on uwsgi github
(https://github.com/unbit/uwsgi/issues/1034) page but this fix does not
solve my problem.
I also tried it with uwsgi parameters close-on-exec and close-on-exec2.
No luck with these also.
Thanks for helping.