Welcome! Log In Create A New Profile

Advanced

call ngx_localtime may cause deadlock?

Posted by georgexsh 
call ngx_localtime may cause deadlock?
March 04, 2010 05:09AM
I am writing a module based on nginx 0.7.65, ngx_localtime() is called in the module.
Problem is, if set "timer_resolution 100ms;" in config file, nginx workers(*4) will hangs randomly. then if "timer_resolution" command was turned off , everything works just fine.

here is the backtrace (coredump generated with gcore(1) on a hanged worker process):
#0  0x007347a2 in ?? ()
#1  0x00823a9e in pthread_cond_init@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
#2  0x007c942d in __wcsmbs_named_conv () from /lib/tls/libc.so.6
#3  0x080611c6 in ngx_localtime (s=1267569890, tm=0x9ba2048) at src/os/unix/ngx_time.c:59
#4  0x08055223 in ngx_time_update (sec=1267569890, msec=5) at src/core/ngx_times.c:114
#5  0x0805e51a in ngx_timer_signal_handler (signo=14) at src/event/ngx_event.c:565
#6  0x00775898 in __open_catalog () from /lib/tls/libc.so.6
#7  0x007cc043 in __tzfile_read () from /lib/tls/libc.so.6
#8  0x007cb217 in tzset_internal () from /lib/tls/libc.so.6
#9  0x007c942d in __wcsmbs_named_conv () from /lib/tls/libc.so.6
#10 0x080611c6 in ngx_localtime (s=1267569889, tm=0x875ff4) at src/os/unix/ngx_time.c:59
#11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28, time_str=0xbfe18450, t=-4) at ./ngx_foo_module//uti.c:111
#12 0x08087d1a in ngx_http_foo_record (r=0x9dcdc28, request_arg=0x9dcddb0) at ./ngx_foo_module//foo.c:124
#13 0x0808788c in ngx_http_foo_handler (r=0x9dcdc28) at ./ngx_foo_module//ngx_http_foo_module.c:74
#14 0x0806d4f9 in ngx_http_core_content_phase (r=0x9dcdc28, ph=0x9bb9938) at src/http/ngx_http_core_module.c:1262

strace shows (not on the same worker, but similar) :
futex(0x2c2eec, FUTEX_WAIT, 2, NULL)

I guess this situation is a deadlock, happens beneath localtime_r() of glibc.
While my module is calling localtime_r, a time update signal trigged localtime at the same time, then deadlocked. Is this correct?

Now I employ ngx_gmtime() to get formated time instead of ngx_localtime(), works stable for weeks. Maybe ngx_localtime() should not be used outside ngx_time_update().
Maxim Dounin
Re: call ngx_localtime may cause deadlock?
March 04, 2010 07:18AM
Hello!

On Thu, Mar 04, 2010 at 05:09:33AM -0500, georgexsh wrote:

> I am writing a module based on nginx 0.7.65, ngx_localtime() is
> called in the module.
> Problem is, if set "timer_resolution 100ms;" in config file,
> nginx workers(*4) will hangs randomly. then if
> "timer_resolution" command was turned off , everything works
> just fine.
>
> here is the backtrace (coredump generated with gcore(1) on a
> hanged worker process):
>
> #0 0x007347a2 in ?? ()
> #1 0x00823a9e in pthread_cond_init@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
> #2 0x007c942d in __wcsmbs_named_conv () from /lib/tls/libc.so.6
> #3 0x080611c6 in ngx_localtime (s=1267569890, tm=0x9ba2048) at src/os/unix/ngx_time.c:59
> #4 0x08055223 in ngx_time_update (sec=1267569890, msec=5) at src/core/ngx_times.c:114
> #5 0x0805e51a in ngx_timer_signal_handler (signo=14) at src/event/ngx_event.c:565
> #6 0x00775898 in __open_catalog () from /lib/tls/libc.so.6
> #7 0x007cc043 in __tzfile_read () from /lib/tls/libc.so.6
> #8 0x007cb217 in tzset_internal () from /lib/tls/libc.so.6
> #9 0x007c942d in __wcsmbs_named_conv () from /lib/tls/libc.so.6
> #10 0x080611c6 in ngx_localtime (s=1267569889, tm=0x875ff4) at src/os/unix/ngx_time.c:59
> #11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28, time_str=0xbfe18450, t=-4) at ./ngx_foo_module//uti.c:111
> #12 0x08087d1a in ngx_http_foo_record (r=0x9dcdc28, request_arg=0x9dcddb0) at ./ngx_foo_module//foo.c:124
> #13 0x0808788c in ngx_http_foo_handler (r=0x9dcdc28) at ./ngx_foo_module//ngx_http_foo_module.c:74
> #14 0x0806d4f9 in ngx_http_core_content_phase (r=0x9dcdc28, ph=0x9bb9938) at src/http/ngx_http_core_module.c:1262
>
>
> strace shows (not on the same worker, but similar) :
>
> futex(0x2c2eec, FUTEX_WAIT, 2, NULL)
>
> I guess this situation is a deadlock, happens beneath
> localtime_r() of glibc.
> While my module is calling localtime_r, a time update signal
> trigged localtime at the same time, then deadlocked. Is this
> correct?

Yes, your analysis is correct. With timer_resolution set and for
event methods which does not support timers (all but kqueue,
eventport) nginx uses localtime_r() in signal handler. This isn't
generally safe thing to do as POSIX doesn't list it as
async-signal-safe function. Any other usage of localtime_r() in
the same process may cause deadlock.

In fact it may even deadlock by itself as soon as new
signal arrives before localtime_r() will finish execution (though
unlikely).

I'm a bit surprised that this actually happens as nginx isn't
threaded and I see no reason for glibc to use locks. But anyway
it's not glibc's fault.

> Now I employ ngx_gmtime() to get formated time instead of
> ngx_localtime(), works stable for weeks. Maybe ngx_localtime()
> should not be used outside ngx_time_update().

You should use ngx_time()/ngx_timeofday() instead in your module.

Maxim Dounin

_______________________________________________
nginx mailing list
nginx@nginx.org
http://nginx.org/mailman/listinfo/nginx
Re: call ngx_localtime may cause deadlock?
March 04, 2010 07:26AM
On Thu, Mar 04, 2010 at 03:14:53PM +0300, Maxim Dounin wrote:

> Hello!
>
> On Thu, Mar 04, 2010 at 05:09:33AM -0500, georgexsh wrote:
>
> > I am writing a module based on nginx 0.7.65, ngx_localtime() is
> > called in the module.
> > Problem is, if set "timer_resolution 100ms;" in config file,
> > nginx workers(*4) will hangs randomly. then if
> > "timer_resolution" command was turned off , everything works
> > just fine.
> >
> > here is the backtrace (coredump generated with gcore(1) on a
> > hanged worker process):
> >
> > #0 0x007347a2 in ?? ()
> > #1 0x00823a9e in pthread_cond_init@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
> > #2 0x007c942d in __wcsmbs_named_conv () from /lib/tls/libc.so.6
> > #3 0x080611c6 in ngx_localtime (s=1267569890, tm=0x9ba2048) at src/os/unix/ngx_time.c:59
> > #4 0x08055223 in ngx_time_update (sec=1267569890, msec=5) at src/core/ngx_times.c:114
> > #5 0x0805e51a in ngx_timer_signal_handler (signo=14) at src/event/ngx_event.c:565
> > #6 0x00775898 in __open_catalog () from /lib/tls/libc.so.6
> > #7 0x007cc043 in __tzfile_read () from /lib/tls/libc.so.6
> > #8 0x007cb217 in tzset_internal () from /lib/tls/libc.so.6
> > #9 0x007c942d in __wcsmbs_named_conv () from /lib/tls/libc.so.6
> > #10 0x080611c6 in ngx_localtime (s=1267569889, tm=0x875ff4) at src/os/unix/ngx_time.c:59
> > #11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28, time_str=0xbfe18450, t=-4) at ./ngx_foo_module//uti.c:111
> > #12 0x08087d1a in ngx_http_foo_record (r=0x9dcdc28, request_arg=0x9dcddb0) at ./ngx_foo_module//foo.c:124
> > #13 0x0808788c in ngx_http_foo_handler (r=0x9dcdc28) at ./ngx_foo_module//ngx_http_foo_module.c:74
> > #14 0x0806d4f9 in ngx_http_core_content_phase (r=0x9dcdc28, ph=0x9bb9938) at src/http/ngx_http_core_module.c:1262
> >
> >
> > strace shows (not on the same worker, but similar) :
> >
> > futex(0x2c2eec, FUTEX_WAIT, 2, NULL)
> >
> > I guess this situation is a deadlock, happens beneath
> > localtime_r() of glibc.
> > While my module is calling localtime_r, a time update signal
> > trigged localtime at the same time, then deadlocked. Is this
> > correct?
>
> Yes, your analysis is correct. With timer_resolution set and for
> event methods which does not support timers (all but kqueue,
> eventport) nginx uses localtime_r() in signal handler. This isn't
> generally safe thing to do as POSIX doesn't list it as
> async-signal-safe function. Any other usage of localtime_r() in
> the same process may cause deadlock.
>
> In fact it may even deadlock by itself as soon as new
> signal arrives before localtime_r() will finish execution (though
> unlikely).
>
> I'm a bit surprised that this actually happens as nginx isn't
> threaded and I see no reason for glibc to use locks. But anyway
> it's not glibc's fault.

Here is third-party module which probably uses threads:

#11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28, time_str=0xbfe18450, t=-4) at ./ngx_foo_module//uti.c:111
#12 0x08087d1a in ngx_http_foo_record (r=0x9dcdc28, request_arg=0x9dcddb0) at ./ngx_foo_module//foo.c:124
#13 0x0808788c in ngx_http_foo_handler (r=0x9dcdc28) at ./ngx_foo_module//ngx_http_foo_module.c:74

> > Now I employ ngx_gmtime() to get formated time instead of
> > ngx_localtime(), works stable for weeks. Maybe ngx_localtime()
> > should not be used outside ngx_time_update().
>
> You should use ngx_time()/ngx_timeofday() instead in your module.


--
Igor Sysoev
http://sysoev.ru/en/

_______________________________________________
nginx mailing list
nginx@nginx.org
http://nginx.org/mailman/listinfo/nginx
Re: call ngx_localtime may cause deadlock?
March 04, 2010 10:20AM
Igor Sysoev Wrote:
-------------------------------------------------------

> Here is third-party module which probably uses
> threads:
>
> #11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28,
> time_str=0xbfe18450, t=-4) at
> ./ngx_foo_module//uti.c:111
> #12 0x08087d1a in ngx_http_foo_record
> (r=0x9dcdc28, request_arg=0x9dcddb0) at
> ./ngx_foo_module//foo.c:124
> #13 0x0808788c in ngx_http_foo_handler
> (r=0x9dcdc28) at
> ./ngx_foo_module//ngx_http_foo_module.c:74
>


No, ngx_foo_module/* is my module which does't use threads ethier.

ref:
http://blogs.inxsasia.com/hiran/2009/07/stupid-futex-and-glibc.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=301511
Maxim Dounin
Re: call ngx_localtime may cause deadlock?
March 04, 2010 11:54AM
Hello!

On Thu, Mar 04, 2010 at 10:20:29AM -0500, georgexsh wrote:

> Igor Sysoev Wrote:
> -------------------------------------------------------
>
> > Here is third-party module which probably uses
> > threads:
> >
> > #11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28,
> > time_str=0xbfe18450, t=-4) at
> > ./ngx_foo_module//uti.c:111
> > #12 0x08087d1a in ngx_http_foo_record
> > (r=0x9dcdc28, request_arg=0x9dcddb0) at
> > ./ngx_foo_module//foo.c:124
> > #13 0x0808788c in ngx_http_foo_handler
> > (r=0x9dcdc28) at
> > ./ngx_foo_module//ngx_http_foo_module.c:74
> >
>
>
> No, ngx_foo_module/* is my module which does't use threads ethier.
>
> ref:
> http://blogs.inxsasia.com/hiran/2009/07/stupid-futex-and-glibc.html
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=301511

Ok, second link suggest that recent glibc uses futex even when
there are no threads in application.

This isn't really relevant though as underlying problem is in
nginx and should be fixed anyway. Only async-signal-safe
functions may be used in signal handlers, anything else leads to
undefined behaviour.

For now - use ngx_time() and don't use timer_resolution unless you
have kqueue or eventport.

Maxim Dounin

_______________________________________________
nginx mailing list
nginx@nginx.org
http://nginx.org/mailman/listinfo/nginx
Re: call ngx_localtime may cause deadlock?
March 04, 2010 02:58PM
On Thu, Mar 04, 2010 at 07:49:57PM +0300, Maxim Dounin wrote:

> Hello!
>
> On Thu, Mar 04, 2010 at 10:20:29AM -0500, georgexsh wrote:
>
> > Igor Sysoev Wrote:
> > -------------------------------------------------------
> >
> > > Here is third-party module which probably uses
> > > threads:
> > >
> > > #11 0x08088c9f in ngx_foo_time_str (r=0x9dcdc28,
> > > time_str=0xbfe18450, t=-4) at
> > > ./ngx_foo_module//uti.c:111
> > > #12 0x08087d1a in ngx_http_foo_record
> > > (r=0x9dcdc28, request_arg=0x9dcddb0) at
> > > ./ngx_foo_module//foo.c:124
> > > #13 0x0808788c in ngx_http_foo_handler
> > > (r=0x9dcdc28) at
> > > ./ngx_foo_module//ngx_http_foo_module.c:74
> > >
> >
> >
> > No, ngx_foo_module/* is my module which does't use threads ethier.
> >
> > ref:
> > http://blogs.inxsasia.com/hiran/2009/07/stupid-futex-and-glibc.html
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=301511
>
> Ok, second link suggest that recent glibc uses futex even when
> there are no threads in application.
>
> This isn't really relevant though as underlying problem is in
> nginx and should be fixed anyway. Only async-signal-safe
> functions may be used in signal handlers, anything else leads to
> undefined behaviour.

Strickly speaking the generic signal handler calls localtime/_r() too:
ngx_time_update(0, 0);

And if timer_resolution bug can be fixed easy, nevertheless, I do not
know how to log local time in singal handler without localtime().


--
Igor Sysoev
http://sysoev.ru/en/

_______________________________________________
nginx mailing list
nginx@nginx.org
http://nginx.org/mailman/listinfo/nginx
Maxim Dounin
Re: call ngx_localtime may cause deadlock?
March 04, 2010 08:02PM
Hello!

On Thu, Mar 04, 2010 at 10:54:56PM +0300, Igor Sysoev wrote:

> On Thu, Mar 04, 2010 at 07:49:57PM +0300, Maxim Dounin wrote:

[...]

> > This isn't really relevant though as underlying problem is in
> > nginx and should be fixed anyway. Only async-signal-safe
> > functions may be used in signal handlers, anything else leads to
> > undefined behaviour.
>
> Strickly speaking the generic signal handler calls localtime/_r() too:
> ngx_time_update(0, 0);
>
> And if timer_resolution bug can be fixed easy, nevertheless, I do not
> know how to log local time in singal handler without localtime().

Well, I see 3 basic options:

1. Handle signals in the event loop (at least kqueue supports it,
for other methods we may use pipe() to deliver signals to event
loop safely). This is basically how libevent does this, btw.

2. Limit signal handlers to use only async-signal-safe functions,
and nothing more. This means that time won't be correct (no
localtime()), errno (if any) won't be translated to text (no
strerror_r()) etc.

3. Pretend we never talked about it and problems shouldn't happen
as generic signals are rare (and we already try to follow (2),
though not completely).

Maxim Dounin

_______________________________________________
nginx mailing list
nginx@nginx.org
http://nginx.org/mailman/listinfo/nginx
Sorry, only registered users may post in this forum.

Click here to login

Online Users

Guests: 78
Record Number of Users: 10 on August 27, 2010
Record Number of Guests: 177 on August 21, 2010
Powered by nginx    Powered by FreeBSD    PHP Powered    Powered by MySQL