Discussion:
[uWSGI] Monitoring backlog queue
Jean-Baptiste Quenot
2011-02-01 13:36:28 UTC
Permalink
Hi there,

Is it possible to monitor the size of the backlog queue (number of
incoming requests waiting for a worker to handle the request)? And
the max wait time?

I'm looking for stats similar to pgbouncer's 'show pools'
http://pgbouncer.projects.postgresql.org/doc/usage.html

As stated in pgbouncer's doc: maxwait == How long the first (oldest)
client in queue has waited, in seconds. If this starts increasing,
then the current pool of servers does not handle requests quick
enough. Reason may be either overloaded server or just too small of a
pool_size setting.

Thanks,
--
Jean-Baptiste Quenot
Roberto De Ioris
2011-02-01 13:56:50 UTC
Permalink
Post by Jean-Baptiste Quenot
Hi there,
Is it possible to monitor the size of the backlog queue (number of
incoming requests waiting for a worker to handle the request)?
This is already done in the master process if you use TCP sockets on Linux
(there is no way to do it for UNIX sockets). It alerts you when the backlog
is getting full, and I can easily export this value to the uwsgi API.
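
For reference, a minimal sketch of how such a backlog check can be done on
Linux (reading struct tcp_info via getsockopt(); the field offsets assume
the layout from linux/tcp.h, and this is only an illustration, not uWSGI's
actual code):

import socket
import struct

def listen_queue_stats(sock):
    # For a listening TCP socket on Linux the kernel reports the current
    # backlog length in tcpi_unacked and the configured limit in
    # tcpi_sacked, two 32-bit fields at offsets 24 and 28 of
    # struct tcp_info (layout assumed from linux/tcp.h).
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 32)
    queued, limit = struct.unpack_from("II", info, 24)
    return queued, limit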

Post by Jean-Baptiste Quenot
And the max wait time?
This is harder, as the web server must pass the timestamp at which it
started sending the request. Some months ago I sent a patch to the nginx
list to export a $time variable, but it got rejected :(
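
For illustration, assuming the web server could be configured to pass a
request-start timestamp (for example via a uwsgi_param; the X_REQUEST_START
name below is hypothetical), the application could compute the wait time
itself:

import time

def application(environ, start_response):
    # Hypothetical: the web server passes the epoch time at which it
    # received the request, e.g. nginx: uwsgi_param X_REQUEST_START $msec;
    # (whether $msec is usable there depends on the nginx version).
    started = environ.get("X_REQUEST_START")
    if started:
        queue_wait = time.time() - float(started)
        environ["wsgi.errors"].write("queue wait: %.3fs\n" % queue_wait)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]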
--
Roberto De Ioris
http://unbit.it
t***@smail.ee
2011-02-01 14:24:34 UTC
Permalink
Hello Roberto,

can you also export the currently used core count ("busy cores" during
async)? Or is that already done?
Roberto De Ioris
2011-02-03 20:18:05 UTC
Permalink
Post by t***@smail.ee
Hello Roberto,
can you also export the currently used core count ("busy cores" during
async)? Or is that already done?
I do not think it would be "reliable".

Take this situation:

0 running
1 running
2 unused
3 running
4 unused
5 running
6 unused

The value would report 7 cores running, even though only 4 are actually
busy...

It would be better to add another counter that increments after every
accept() and decrements after every close().

I will add this in 0.9.7 (I think its async stack will be finished in a
couple of days).
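
For illustration, a minimal sketch of that accept()/close() counter idea
(the names are illustrative, not uWSGI's internals); unlike scanning core
slots, the counter always reflects the number of requests actually in
flight:

class BusyCounter:
    # Incremented on accept(), decremented on close(): the difference is
    # the number of in-flight requests, regardless of which async core
    # slots they happen to occupy.
    def __init__(self):
        self.in_flight = 0

    def on_accept(self):
        self.in_flight += 1

    def on_close(self):
        self.in_flight -= 1

counter = BusyCounter()
counter.on_accept()   # a request arrives on some core slot
counter.on_accept()   # another one, on a non-adjacent slot
counter.on_close()    # the first request finishes
print(counter.in_flight)   # -> 1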
--
Roberto De Ioris
http://unbit.it
Jean-Baptiste Quenot
2011-02-14 14:50:41 UTC
Permalink
Post by Roberto De Ioris
Post by Jean-Baptiste Quenot
Is it possible to monitor the size of the backlog queue (number of
incoming requests waiting for a worker to handle the request)?
This is already done in the master process if you use TCP sockets on Linux
(there is no way to do it for UNIX sockets). It alerts you when the backlog
is getting full, and I can easily export this value to the uwsgi API.
I switched back to TCP sockets and got this in the uWSGI output in production:

Mon Feb 14 15:18:27 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:28 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:29 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:30 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:31 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:32 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:33 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:34 2011 - *** uWSGI listen queue of socket 4 full !!!
(128/128) ***
SIGPIPE: writing to a closed pipe/socket/fd (probably the client
disconnected) on request /status !!!
write(): Broken pipe [plugins/python/wsgi_subhandler.c line 186]

I'd like to monitor the listen queue, in order to diagnose whether we are
constantly "near the limit" or whether it was just an "accident". Can you
export the variable, please?

BTW what's the status of the syslog support? Can I use --log-syslog
reliably, and get the above messages at warn or err level? This would
be really nice!

Thanks in advance,
--
Jean-Baptiste Quenot
Roberto De Ioris
2011-02-14 19:50:15 UTC
Permalink
Post by Jean-Baptiste Quenot
Post by Roberto De Ioris
Post by Jean-Baptiste Quenot
Is it possible to monitor the size of the backlog queue (number of
incoming requests waiting for a worker to handle the request)?
This is already done in the master process if you use TCP sockets on Linux
(there is no way to do it for UNIX sockets). It alerts you when the backlog
is getting full, and I can easily export this value to the uwsgi API.
Mon Feb 14 15:18:27 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:28 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:29 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:30 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:31 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:32 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:33 2011 - *** uWSGI listen queue of socket 4 full !!!
(129/128) ***
Mon Feb 14 15:18:34 2011 - *** uWSGI listen queue of socket 4 full !!!
(128/128) ***
SIGPIPE: writing to a closed pipe/socket/fd (probably the client
disconnected) on request /status !!!
write(): Broken pipe [plugins/python/wsgi_subhandler.c line 186]
I'd like to monitor the listen queue, in order to diagnose whether we are
constantly "near the limit" or whether it was just an "accident". Can you
export the variable, please?
If I remember correctly you run with full logging disabled (so reporting
it in the logs would be useless). Maybe the master could print the 'percent'
variation? Something like

*** listen queue is now 10 % full
*** listen queue is now 20 % full
*** listen queue is now 10 % full
*** listen queue is now 30 % full

Or do you want to manage it in your code, accessing something like
env['uwsgi.listen_queue']?
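
For illustration, a minimal sketch of the "print only on percent change"
idea (the 10% bucketing is an assumption; uWSGI's master would do this
in C):

last_bucket = None

def report_listen_queue(queued, limit):
    # Log only when the fill level crosses into a new 10% bucket, so a
    # queue hovering at the same level does not flood the logs.
    global last_bucket
    bucket = min(queued * 100 // limit, 100) // 10 * 10
    if bucket != last_bucket:
        last_bucket = bucket
        print("*** listen queue is now %d %% full" % bucket)

for q in (13, 27, 14, 39):
    report_listen_queue(q, 128)   # prints 10, 20, 10, 30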
Post by Jean-Baptiste Quenot
BTW what's the status of the syslog support? Can I use --log-syslog
reliably, and get the above messages at warn or err level? This would
be really nice!
Oh my god, I had totally forgotten it :P

Honestly, I have to recheck that part of the code, but it is very simple.

It practically creates a socketpair between the workers and the master.
Every time a worker writes to fd 2, the data is sent to the master, which
forwards it to syslog.
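
For illustration, a self-contained toy version of that mechanism (not
uWSGI's code): the master creates a socketpair, the worker dups its end
over fd 2, and whatever the worker writes to stderr is relayed to syslog:

import os
import socket
import syslog

master_end, worker_end = socket.socketpair()

pid = os.fork()
if pid == 0:
    # Worker: replace fd 2 (stderr) with its end of the socketpair, so
    # anything written to stderr travels to the master.
    os.dup2(worker_end.fileno(), 2)
    os.write(2, b"worker says: something happened\n")
    os._exit(0)

# Master: relay whatever arrives from the worker to syslog.
syslog.openlog("uwsgi-sketch")
syslog.syslog(syslog.LOG_WARNING, master_end.recv(4096).decode(errors="replace"))
os.waitpid(pid, 0)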
--
Roberto De Ioris
http://unbit.it
Jean-Baptiste Quenot
2011-02-15 09:32:52 UTC
Permalink
Post by Roberto De Ioris
If I remember correctly you run with full logging disabled (so reporting
it in the logs would be useless). Maybe the master could print the 'percent'
variation? Something like
*** listen queue is now 10 % full
*** listen queue is now 20 % full
*** listen queue is now 10 % full
*** listen queue is now 30 % full
Or do you want to manage it in your code, accessing something like
env['uwsgi.listen_queue']?
Yes, I'd like to manage it with something like
uwsgi.stats['listen_queue']. A 'stats' dictionary in the uwsgi module
sounds better than WSGI variables.
--
Jean-Baptiste Quenot
Roberto De Ioris
2011-02-21 06:27:14 UTC
Permalink
Post by Jean-Baptiste Quenot
Post by Roberto De Ioris
If I remember correctly you run with full logging disabled (so reporting
it in the logs would be useless). Maybe the master could print the 'percent'
variation? Something like
*** listen queue is now 10 % full
*** listen queue is now 20 % full
*** listen queue is now 10 % full
*** listen queue is now 30 % full
Or do you want to manage it in your code, accessing something like
env['uwsgi.listen_queue']?
Yes, I'd like to manage it with something like
uwsgi.stats['listen_queue']. A 'stats' dictionary in the uwsgi module
sounds better than WSGI variables.
--
Jean-Baptiste Quenot
Sadly, I cannot go with dictionaries for now, as in threading mode they
get clobbered (and I do not want to add another lock to the stack ;) )

I will export a uwsgi.listen_queue() function that will return the value
taken from a shared area (no locking is needed, as the value is updated
only by the master).
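
For illustration, a sketch of how an application could expose that value
once the function exists (guarding for builds that do not export it yet):

import uwsgi   # only importable inside a uWSGI process

def application(environ, start_response):
    # Read the master-maintained listen queue value; fall back gracefully
    # on uWSGI builds without the function.
    try:
        queued = uwsgi.listen_queue()
    except AttributeError:
        queued = -1
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("listen_queue: %d\n" % queued).encode()]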
--
Roberto De Ioris
http://unbit.it
Jean-Baptiste Quenot
2011-05-31 13:23:36 UTC
Permalink
Post by Roberto De Ioris
Post by Jean-Baptiste Quenot
Post by Roberto De Ioris
If I remember correctly you run with full logging disabled (so reporting
it in the logs would be useless). Maybe the master could print the 'percent'
variation? Something like
*** listen queue is now 10 % full
*** listen queue is now 20 % full
*** listen queue is now 10 % full
*** listen queue is now 30 % full
Or do you want to manage it in your code, accessing something like
env['uwsgi.listen_queue']?
Yes, I'd like to manage it with something like
uwsgi.stats['listen_queue']. A 'stats' dictionary in the uwsgi module
sounds better than WSGI variables.
Sadly, I cannot go with dictionaries for now, as in threading mode they
get clobbered (and I do not want to add another lock to the stack ;) )
I will export a uwsgi.listen_queue() function that will return the value
taken from a shared area (no locking is needed, as the value is updated
only by the master).
Hi Roberto,

I'm thinking again about monitoring the listen queue. There are
various problems:

1) listen_queue() will never give numbers above the maximum queue
size, so it can only be useful for knowing that the queue is full.
So maybe providing a boolean listen_queue_full() instead would make
more sense.

2) as a matter of fact, the error message that is printed in the logs
does not need to mention tcpi_unacked, as it will never exceed the
queue size + 1. For example, with only one worker, only one slot in
the listen queue and many concurrent requests, here is what the
messages look like:

Tue May 31 10:32:06 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:07 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:08 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:09 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:10 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:11 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:12 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:13 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:14 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:15 2011 - *** uWSGI listen queue of socket 4 full !!! (1/1) ***

3) I tried tweaking the error message to mention the tcpi_lost field,
but it is always zero.

So to sum up, when the queue is getting full, the sysadmin cannot
determine the severity of the downtime just by looking at the size of
the listen queue, because it remains constant at the maximum size (e.g.
at 128), and checking an instantaneous value periodically is not
reliable. Knowing the number of dropped requests would be great, but
this information is not available on Linux (it may be available on
FreeBSD). So I believe the only option is to add a new counter in the
master, incremented every time the listen queue is found full.

I would monitor this counter and use a DERIVE data source in RRD to
graph the severity of my uWSGI app's downtime.

Does it make sense?
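
For illustration, a sketch of the RRD side of that idea using the
python-rrdtool bindings (assumed installed; the file and data source
names are made up):

import rrdtool   # python-rrdtool bindings, assumed installed

# A DERIVE data source turns the ever-increasing "queue was full"
# counter into a rate: how often the queue filled up per second.
rrdtool.create(
    "uwsgi_backlog.rrd",
    "--step", "60",
    "DS:queue_full:DERIVE:120:0:U",
    "RRA:AVERAGE:0.5:1:1440",
)

counter_value = 42   # would be read from the uWSGI master
rrdtool.update("uwsgi_backlog.rrd", "N:%d" % counter_value)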

And last but not least, querying the uwsgi python module involves
doing an HTTP request and getting a worker to serve this internal
request just to learn the value of a counter. Or do you have another
way in mind to access the number? (Sending a signal to the master, as
Java does for getting a stack trace; providing a UNIX socket
connection to fetch the data with socat, as haproxy does; or
providing a monitoring port that talks HTTP?)
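
For illustration, a toy sketch of the haproxy-style option: a UNIX socket
that dumps the counter on connect, queryable with socat (the path and the
output format below are made up):

import os
import socket

SOCK_PATH = "/tmp/uwsgi-stats.sock"   # hypothetical path
if os.path.exists(SOCK_PATH):
    os.unlink(SOCK_PATH)

srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(SOCK_PATH)
srv.listen(1)

queue_full_counter = 0   # would live in the master's shared area

# Each connection receives the current value and is closed, so a check
# from the shell is:  socat - UNIX-CONNECT:/tmp/uwsgi-stats.sock
while True:
    conn, _ = srv.accept()
    conn.sendall(("listen_queue_full: %d\n" % queue_full_counter).encode())
    conn.close()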

Thanks in advance,
--
Jean-Baptiste Quenot
Roberto De Ioris
2011-06-01 08:13:38 UTC
Permalink
Post by Jean-Baptiste Quenot
Post by Roberto De Ioris
Post by Jean-Baptiste Quenot
Post by Roberto De Ioris
If I remember correctly you run with full logging disabled (so reporting
it in the logs would be useless). Maybe the master could print the 'percent'
variation? Something like
*** listen queue is now 10 % full
*** listen queue is now 20 % full
*** listen queue is now 10 % full
*** listen queue is now 30 % full
Or do you want to manage it in your code, accessing something like
env['uwsgi.listen_queue']?
Yes, I'd like to manage it with something like
uwsgi.stats['listen_queue']. A 'stats' dictionary in the uwsgi module
sounds better than WSGI variables.
Sadly, I cannot go with dictionaries for now, as in threading mode they
get clobbered (and I do not want to add another lock to the stack ;) )
I will export a uwsgi.listen_queue() function that will return the value
taken from a shared area (no locking is needed, as the value is updated
only by the master).
Hi Roberto,
I'm thinking again about monitoring the listen queue.
1) listen_queue() will never give numbers above the maximum queue
size, so it can only be useful for knowing that the queue is full.
So maybe providing a boolean listen_queue_full() instead would make
more sense.
2) as a matter of fact, the error message that is printed in the logs
does not need to mention tcpi_unacked, as it will never exceed the
queue size + 1. For example, with only one worker, only one slot in
the listen queue and many concurrent requests, here is what the
messages look like:
Tue May 31 10:32:06 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:07 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:08 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:09 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:10 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:11 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:12 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:13 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:14 2011 - *** uWSGI listen queue of socket 4 full !!! (2/1) ***
Tue May 31 10:32:15 2011 - *** uWSGI listen queue of socket 4 full !!! (1/1) ***
Hi, I report this because users often do not know that even if they create a
very large listen queue, the limit must also be tuned on the kernel side (via
sysctl and so on). This way they can see that something is wrong.
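
For illustration, on Linux the backlog passed to listen() is silently
capped by net.core.somaxconn, so a quick check of the kernel-side limit
could look like this:

# Backlogs passed to listen() are capped at this value on Linux; raise
# it with e.g.: sysctl -w net.core.somaxconn=4096
with open("/proc/sys/net/core/somaxconn") as f:
    print("kernel backlog cap:", f.read().strip())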
Post by Jean-Baptiste Quenot
3) I tried tweaking the error message to mention the tcpi_lost field,
but it is always zero.
I tried to check this field some time ago, but I think it refers to the TCP
window size, and even for that purpose it does not look very useful.
Post by Jean-Baptiste Quenot
So to sum up, when the queue is getting full, the sysadmin cannot
determine the severity of the downtime just by looking at the size of
the listen queue, because it remains constant at the maximum size (eg
at 128), and it is not reliable to check an instantaneous value
periodically. Knowing the number of dropped requests would be great,
but this information is not available on Linux (may be available on
FreeBSD). So I believe the only option is to add a new counter in the
master, that will be incremented everytime the listen queue has been
checked full.
I would monitor this counter and use a DERIVE data source in RRD to
graph the severity of my uWSGI app's downtime.
Does it make sense?
And last but not least, querying the uwsgi python module involves
doing an HTTP request and getting a worker to serve this internal
request just to learn the value of a counter. Or do you have another
way in mind to access the number? (Sending a signal to the master, as
Java does for getting a stack trace; providing a UNIX socket
connection to fetch the data with socat, as haproxy does; or
providing a monitoring port that talks HTTP?)
Look at this:

http://projects.unbit.it/uwsgi/wiki/ManagementFlag

I have added two flags for the backlog queue (the second one is what you asked for).

You need the admin plugin to use management flags:

python uwsgiconfig --plugin plugins/admin

then add "--plugins admin" to your command line/config

(or you can embed it in the build profile)

Look at this script if you want to make a monitor app:

http://projects.unbit.it/uwsgi/browser/myadmin.py

(In the future I will add something like --admin-get and --admin-set, as I did with --ping.)


--
Roberto De Ioris
http://unbit.it
