EINTR on accept for Solaris 10's ndo2db

Ton Voon ton.voon at altinity.com
Wed Feb 27 15:26:43 CET 2008


Hi!

We've found a bug in ndo2db on Solaris 10: the accept() call in the
parent fails with EINTR when a child exits, and ndo2db treats this as
an error.

We've fixed it by placing a loop around the accept call so that it is
retried if EINTR is received. The truss output shows the interrupted
call and the retry:

24531:  open("/var/run/syslog_door", O_RDONLY)          = 6
24531:  door_info(6, 0x080466E0)                        = 0
24531:  getpid()                                        = 24531 [24522]
24531:  door_call(6, 0x08046718)                        = 0
24531:  close(6)                                        = 0
24531:  close(5)                                        = 0
24531:  mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xD2480000
24531:  munmap(0xD2480000, 4096)                        = 0
24531:  _exit(0)
24522:      Received signal #18, SIGCLD, in accept() [caught]
24522:        siginfo: SIGCLD CLD_EXITED pid=24531 status=0x0000
24522:  accept(4, 0x08047AC0, 0x08047AAC, SOV_DEFAULT)  Err#4 EINTR
24522:  lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
24522:  write(2, " I n   p a r e n t h a n".., 17)      = 17
24522:  waitid(P_ALL, 0, 0x08047700, WEXITED|WTRAPPED|WNOHANG) = 0
24522:  setcontext(0x080475E0)
24522:  write(2, " E I N T R   r e c e i v".., 14)      = 14
24522:  write(2, " :  ", 2)                             = 2
24522:  write(2, " I n t e r r u p t e d  ".., 23)      = 23
24522:  write(2, "\n", 1)                               = 1
24522:  accept(4, 0x08047AC0, 0x08047AAC, SOV_DEFAULT) (sleeping...)

Note the _exit(0) from pid 24531, followed by the SIGCLD signal
arriving in the parent (pid 24522). The accept call then fails with
EINTR before the parent's signal handler does its processing
[write(2, "In parenthan...") is a debug line we put in].
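The retry loop described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the helper name accept_retry_eintr() is hypothetical and does not exist in ndo2db, and the sketch assumes that simply calling accept() again is the right response to EINTR.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of ndo2db): call accept() again whenever
 * it is interrupted by a signal, and pass any other result through. */
static int accept_retry_eintr(int listen_fd, struct sockaddr *addr,
                              socklen_t *addrlen)
{
    /* addrlen is a value-result argument, so remember the caller's
     * buffer size and restore it before every attempt. */
    socklen_t initial_len = (addrlen != NULL) ? *addrlen : 0;
    int fd;

    do {
        if (addrlen != NULL)
            *addrlen = initial_len;
        fd = accept(listen_fd, addr, addrlen);
    } while (fd < 0 && errno == EINTR);

    /* Either a connected socket, or -1 with errno set to a real error. */
    return fd;
}
```

A caller would use it in place of a bare accept() and still has to handle the remaining error cases, since only EINTR is retried.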

Two other things that we've noticed:
   * ndo2db_handle_client_connection() has a check for EINTR too,
but this edge case could still fall through into the main processing
if result=-1 and errno=EINTR
   * the parent's signal handler doesn't appear to be re-installed
after it runs, as subsequent child exits do not cause the parent to
catch the SIGCLD (which probably explains why the socket is left
lying around)

This is for ndoutils 1.4b3.

Ton

http://www.altinity.com
UK: +44 (0)870 787 9243
US: +1 866 879 9184
Fax: +44 (0)845 280 1725
Skype: tonvoon


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ndoutils_solaris_eintr_in_accept.patch.txt
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20080227/6159e52a/attachment.txt>
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel