[uWSGI] pyuwsgi and environ handling under macOS
Nate Coraor
2018-03-20 16:15:54 UTC
Because of the... unique... way that macOS handles the standard `environ`
when running as a shared library[1], uWSGI segfaults under certain
conditions on macOS when built as a CPython extension (pyuwsgi).

What many projects faced with this issue do is something like:

#if defined(__APPLE__) && defined(UWSGI_AS_SHARED_LIBRARY)
#include <crt_externs.h>
#define environ (*_NSGetEnviron())
#else
extern char **environ;
#endif

uWSGI does something similar, but instead of the preprocessor define it
assigns the value obtained from `_NSGetEnviron()` to `environ` once, at
runtime. However, this approach doesn't work: a `setenv()` can move the
environment array, after which the address stored in `environ` is no longer
valid and `_NSGetEnviron()` needs to be called again. That's why the define
works: under that method, every use of `environ` re-evaluates
`_NSGetEnviron()` and so always sees the current address.
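
To illustrate the difference, here is a minimal sketch (not uWSGI code) that
compares a pointer captured once with a fresh lookup. `CURRENT_ENVIRON`,
`lookup_in()` and `environ_moved_after_setenv()` are hypothetical names made
up for this example. Whether the array actually moves depends on the libc, so
the function only reports what it observed.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#if defined(__APPLE__)
#include <crt_externs.h>
/* Re-evaluated on every use, like the define shown earlier. */
#define CURRENT_ENVIRON (*_NSGetEnviron())
#else
extern char **environ;
#define CURRENT_ENVIRON environ
#endif

/* Hypothetical helper: find NAME by walking the given environment array. */
static const char *lookup_in(char **env, const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    for (; env != NULL && *env != NULL; env++) {
        if (strncmp(*env, name, len) == 0 && (*env)[len] == '=')
            return *env + len + 1;
    }
    return NULL;
}

/* Returns 1 if the environment array moved while variables were added. */
static int environ_moved_after_setenv(void)
{
    char **cached = CURRENT_ENVIRON;  /* what a one-time assignment captures */
    char name[32];

    for (int i = 0; i < 64; i++) {
        snprintf(name, sizeof name, "DEMO_VAR_%d", i);
        setenv(name, "x", 1);
    }
    /* Only compare addresses here; `cached` may point at freed memory,
     * so it must not be dereferenced. */
    return cached != CURRENT_ENVIRON;
}
```

If `environ_moved_after_setenv()` returns 1, reading through the old pointer
at that point would be undefined behavior, while
`lookup_in(CURRENT_ENVIRON, "DEMO_VAR_0")` still finds the value.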

I made a naive attempt at addressing the problem[2] by reinitializing
`environ` after the environment is manipulated, but it's not sufficient.
Rather than hunting through the code and reinitializing `environ`
everywhere, which is sure to be error prone and likely to introduce more
bugs in the future, I tried to just switch to using the define method.

Unfortunately, this doesn't work as the `uwsgi_server` and `uwsgi_app`
structs have an `environ` member, so there's a name conflict with the
defined `environ`. I have two possible solutions but I am not sure how
they'd be received by the uWSGI developers, so I wanted to get some
guidance before committing any effort:

1. Rename the struct member.
2. Replace access to `environ` throughout the code with a function call
returning either `environ` or `_NSGetEnviron()` as appropriate.
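
Option 2 could look roughly like the sketch below. `uwsgi_get_environ()` and
`set_and_count()` are hypothetical names, not existing uWSGI functions, and
the `UWSGI_AS_SHARED_LIBRARY` guard is simply carried over from the snippet
above. Since nothing here defines a macro named `environ`, the struct members
of the same name would be left alone.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(__APPLE__) && defined(UWSGI_AS_SHARED_LIBRARY)
#include <crt_externs.h>
#else
extern char **environ;
#endif

/* Hypothetical accessor: always returns the current environment array. */
static char **uwsgi_get_environ(void)
{
#if defined(__APPLE__) && defined(UWSGI_AS_SHARED_LIBRARY)
    return *_NSGetEnviron();
#else
    return environ;
#endif
}

/* Set a variable, then count entries through a fresh lookup rather than
 * through a pointer saved before the setenv() call. Returns 0 on failure. */
static size_t set_and_count(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    if (setenv(name, value, 1) != 0)
        return 0;

    size_t n = 0;
    for (char **e = uwsgi_get_environ(); e != NULL && *e != NULL; e++)
        n++;
    return n;
}
```

Callers would then write `uwsgi_get_environ()` wherever the global `environ`
is read today. As noted, the weakness is that nothing stops new code from
using `environ` directly.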

The first option seems the more future-proof, as it avoids any future cases
where developers forget to use the function rather than `environ` directly,
but it might also be frustrating to have such a core name changed. There are
probably simpler solutions, however, as I am by no means a C expert with a
lot of tricks up my sleeve.

Thanks,
--nate

[1] https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man7/environ.7.html
[2] https://github.com/unbit/uwsgi/pull/1680
Roberto De Ioris
2018-03-20 18:24:06 UTC
Hi Nate, this is an interesting issue. Would you like to move it to a
GitHub issue to reach a wider audience?

By the way, neither of the two solutions will be "zero-cost" (we need to
retain ABI backward compatibility), so I think we need to find another way.

Thanks a lot
--
Roberto De Ioris
http://unbit.com
Nate Coraor
2018-03-20 20:08:52 UTC
Hi Roberto,

Thanks for the response, I created an issue here:
https://github.com/unbit/uwsgi/issues/1762

--nate