Discussion:
[uWSGI] process created in UWSGI does not finish in threaded mode
Berkay Koyutürk
2018-09-25 11:45:29 UTC
Permalink
I am using CentOS 7 with uwsgi(2.0.17.1), uwsgi-plugin-python2 (2.0.17)
and Flask (1.0.2).

I have a flask application that runs many shell commands. Some of them
hangs indefinitely. I am sharing one of them as an example below, it is
easy to reproduce.
I am using uwsgi in threaded mode.

*    #!/usr/bin/python**
**    from flask import Flask**
**    import subprocess**
****
**    app = Flask(__name__)**
****
**    @app.route("/test/")**
**    def test():**
**        print 'state 1'**
**        p = subprocess.Popen(['ntpdate', '-u',
'0.pool.ntp.org'],close_fds=True)**
**        print 'state 2'**
**        out, err = p.communicate()**
**        print 'state 3'**
****
**        print 'output: {}'.format(out)**
**        print 'error: {}'.format(err)**
**        return 'finish'*

uwsgi command:

*> \# uwsgi --ini uwsgi.conf --http-socket :8001*

uwsgi conf file:

*   [uwsgi]**
**    #virtual env's base folder**
**    base = /root/plaground**
**    home = /usr/**
****
**    #cmd options**
**    bin = /usr/sbin/uwsgi**
****
**    #python module to import**
**    module = app:app**
****
**    pythonpath = /root/playground**
****
**    catch-exceptions = true**
****
**    plugins-dir = /usr/lib64/uwsgi/**
**    plugins = python**
****
**    need-app = true**
****
**    master = true**
**    threads = 10*

After starting uwsgi, first request to /test route completes
succesfully, Second one stucking on command line execution and holds
response forever.

The first call output:

*state 1**
**    state 2**
**    21 Sep 16:02:48 ntpdate[19943]: adjust time server 78.46.53.11
offset -0.000224 sec**
**    state 3**
**    output: None**
**    error: None*

The second call output:

*state 1**
**    state 2*


ps aux output after second call:

*# ps aux | grep ntpdate**
**    root     20820  0.0  0.0  23632  1516 pts/3    S<+ 16:03   0:00
ntpdate -u 0.pool.ntp.org*

strace -p 20820 output:

*   # strace -p 20820**
**    strace: Process 20820 attached**
**    restart_syscall(<... resuming interrupted call ...>) = 0 **
**    poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, 60000) = 0
(Timeout)**
**    poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, 60000) = 0
(Timeout*)

process fd list:

*    # ls -l /proc/20820/fd**
**    total 0**
**    lr-x------ 1 root root 64 Eyl 21 16:05 0 -> /dev/null**
**    lrwx------ 1 root root 64 Eyl 21 16:05 1 -> /dev/pts/3**
**    lrwx------ 1 root root 64 Eyl 21 16:03 2 -> /dev/pts/3**
**    lrwx------ 1 root root 64 Eyl 21 16:05 3 -> socket:[463358]**
**    lrwx------ 1 root root 64 Eyl 21 16:05 4 -> socket:[463359]*

lsof output:

*# lsof -p 20820**
**    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME**
**    ntpdate 20820 root  cwd    DIR  253,0       53 17726086
/root/playground**
**    ntpdate 20820 root  rtd    DIR  253,0      244       64 / **
**    ntpdate 20820 root  txt    REG  253,0   110240   224413
/usr/sbin/ntpdate **
**    ntpdate 20820 root  mem    REG  253,0   106848   361769
/usr/lib64/libresolv-2.17.so **
**    ntpdate 20820 root  mem    REG  253,0    31824    31549
/usr/lib64/libnss_dns-2.17.so **
**    ntpdate 20820 root  mem    REG  253,0    62184    31551
/usr/lib64/libnss_files-2.17.so **
**    ntpdate 20820 root  mem    REG  253,0    19896   142588
/usr/lib64/libattr.so.1.1.0 **
**    ntpdate 20820 root  mem    REG  253,0    90664   142317
/usr/lib64/libz.so.1.2.7 **
**    ntpdate 20820 root  mem    REG  253,0    19776    95386
/usr/lib64/libdl-2.17.so **
**    ntpdate 20820 root  mem    REG  253,0  2173512    95380
/usr/lib64/libc-2.17.so **
**    ntpdate 20820 root  mem    REG  253,0    20032   142590
/usr/lib64/libcap.so.2.22 **
**    ntpdate 20820 root  mem    REG  253,0  2512832   190324
/usr/lib64/libcrypto.so.1.0.2k **
**    ntpdate 20820 root  mem    REG  253,0   164240    95374
/usr/lib64/ld-2.17.so **
**    ntpdate 20820 root    0r   CHR    1,3      0t0     5078 /dev/null **
**    ntpdate 20820 root    1u   CHR  136,3      0t0        6 /dev/pts/3 **
**    ntpdate 20820 root    2u   CHR  136,3      0t0        6 /dev/pts/3 **
**    ntpdate 20820 root    3u  sock    0,7      0t0   463358 protocol:
UDP **
**    ntpdate 20820 root    4u  sock    0,7      0t0   463359 protocol:
UDPv6 *

If I try ntpdate command with shell or flask builtin web server, this
problem is not occuring.
I found similar problem on uwsgi github
(https://github.com/unbit/uwsgi/issues/1034) page but this fix does not
solve my problem.

I also tried it with uwsgi parameters close-on-exec and close-on-exec2.
No luck with these also.

Thanks for helping.
Roberto De Ioris
2018-09-25 14:01:15 UTC
Permalink
Post by Berkay Koyutürk
I am using CentOS 7 with uwsgi(2.0.17.1), uwsgi-plugin-python2 (2.0.17)
and Flask (1.0.2).
I have a flask application that runs many shell commands. Some of them
hangs indefinitely. I am sharing one of them as an example below, it is
easy to reproduce.
I am using uwsgi in threaded mode.
*    #!/usr/bin/python**
**    from flask import Flask**
**    import subprocess**
****
**    app = Flask(__name__)**
****
**    def test():**
**        print 'state 1'**
**        p = subprocess.Popen(['ntpdate', '-u',
'0.pool.ntp.org'],close_fds=True)**
**        print 'state 2'**
**        out, err = p.communicate()**
**        print 'state 3'**
****
**        print 'output: {}'.format(out)**
**        print 'error: {}'.format(err)**
**        return 'finish'*
*> \# uwsgi --ini uwsgi.conf --http-socket :8001*
*   [uwsgi]**
**    #virtual env's base folder**
**    base = /root/plaground**
**    home = /usr/**
****
**    #cmd options**
**    bin = /usr/sbin/uwsgi**
****
**    #python module to import**
**    module = app:app**
****
**    pythonpath = /root/playground**
****
**    catch-exceptions = true**
****
**    plugins-dir = /usr/lib64/uwsgi/**
**    plugins = python**
****
**    need-app = true**
****
**    master = true**
**    threads = 10*
After starting uwsgi, first request to /test route completes
succesfully, Second one stucking on command line execution and holds
response forever.
*state 1**
**    state 2**
**    21 Sep 16:02:48 ntpdate[19943]: adjust time server 78.46.53.11
offset -0.000224 sec**
**    state 3**
**    output: None**
**    error: None*
*state 1**
**    state 2*
Hi, from the strace output it looks like the spawned process is blocked
waiting for network data. Are you sure you do not have something like
selinux (or at firewall level) that is blocking traffic ?
--
Roberto De Ioris
http://unbit.com
Loading...