[PATCH] NRPE wait for client to close

Leo Baltus Leo.Baltus at omroep.nl
Mon Apr 2 11:10:58 CEST 2012


Hi,

We have been investigating an issue with nrpe with regard to reuse of
port numbers and client hickups (check_nrpe).

It seems that the closing of the network sockets has a race condition
which can confuse firewalls, NAT-gateways etc. I have seen reports about
this issue in several forms:

- http://permalink.gmane.org/gmane.network.nagios.devel/3037 1)
- http://tracker.nagios.org/view.php?id=305
- http://sourceforge.net/mailarchive/message.php?msg_id=29054957

With my IPv6 patch the bug exposes itself more often if used 
in combination with -b (bind) on the client. I had to consult Stevens' Unix
network programming seriously to find this one.

It appears that if a client is connected with a server that the most
optimal way of closing the connection is that the client takes the
initiative to close() as most modern network protocols do. The server
also has en open socket so it has to close() is at some time too. The
one who closes first sends a FIN to the other side ultimately resulting
in its socket to become in TIME_WAIT state for a few minutes.

The nrpe server did a close() too at the end of its logic resulting in a
race of FIN packets being sent from both client and server resulting in
sometimes TIME_WAIT on the client and sometimes on the server side
depending on who won the race.

When making the next connection the connection table is consulted to
choose a new source port, thereby skipping all TIME_WAIT states.

Using IPV6 in combination with bind() on the client the closing of a
connection seems to take slightly longer, I don't know why. Making
the server win the race more often resulting in more TIME_WAIT states on
the server.

The client doesn't see these entries in its connection table
increasing the chance to reuse a source port which is still in TIME_WAIT
on the server. The server will refuse by sending a RST. Strangely the
connection table changes to SYN_RCVD and as soon as the SYN tables
overflows it starts reporting synflooding.

Long story short: semantically the server should wait with a read() on
a end-of-file (0 bytes read) and then close() its side of the connection.
The end-of-file is caused by the FIN of the client (the so called
active close) The close() of the server will then be a passive close so
only one FIN will be sent, resulting in only LISTEN on the server and
all TIME_WAIT's on the client.

The patch introduces a read with timeout of 10 seconds just in case you
have to deal with high latency. In practice however the server receives
the FIN instanly so this timeout is just in case the FIN gets lost.

I have taken the liberty to also remove some logic to force sending FIN
introduced in 2006 in 1).

The patch also introduces -Wall n the Makefile to find and fix several
compiletime errors.

-- 
Leo Baltus, internetbeheerder                         /\
NPO ICT Internet Services                            /NPO/\
Sumatralaan 45, 1217 GP Hilversum, Filmcentrum, west \  /\/
beheer at omroep.nl, 035-6773555                         \/
-------------- next part --------------
diff -ruN nrpe-2.12.fc8_64_ipv6/Makefile.in nrpe-2.12.fc8_64_wait_for_client_close/Makefile.in
--- nrpe-2.12.fc8_64_ipv6/Makefile.in	2007-03-14 16:30:05.000000000 +0100
+++ nrpe-2.12.fc8_64_wait_for_client_close/Makefile.in	2011-09-05 10:11:23.000000000 +0200
@@ -10,7 +10,7 @@
 SRC_INCLUDE=./include/
 
 CC=@CC@
-CFLAGS=@CFLAGS@ @DEFS@
+CFLAGS=@CFLAGS@ @DEFS@ -Wall -D_GNU_SOURCE
 LDFLAGS=@LDFLAGS@ @LIBS@
 
 prefix=@prefix@
diff -ruN nrpe-2.12.fc8_64_ipv6/include/nrpe.h nrpe-2.12.fc8_64_wait_for_client_close/include/nrpe.h
--- nrpe-2.12.fc8_64_ipv6/include/nrpe.h	2007-11-23 18:31:23.000000000 +0100
+++ nrpe-2.12.fc8_64_wait_for_client_close/include/nrpe.h	2011-09-05 10:11:23.000000000 +0200
@@ -54,6 +54,7 @@
 int my_system(char *,int,int *,char *,int);            	/* executes a command via popen(), but also protects against timeouts */
 void my_system_sighandler(int);				/* handles timeouts when executing commands via my_system() */
 void my_connection_sighandler(int);			/* handles timeouts of connection */
+void close_client_sighandler(int sig);			/* handles timeouts of closing the connection */
 
 void sighandler(int);
 void child_sighandler(int);
diff -ruN nrpe-2.12.fc8_64_ipv6/src/Makefile.in nrpe-2.12.fc8_64_wait_for_client_close/src/Makefile.in
--- nrpe-2.12.fc8_64_ipv6/src/Makefile.in	2007-08-13 19:10:07.000000000 +0200
+++ nrpe-2.12.fc8_64_wait_for_client_close/src/Makefile.in	2011-09-05 10:11:23.000000000 +0200
@@ -9,7 +9,7 @@
 SRC_INCLUDE=../include
 
 CC=@CC@
-CFLAGS=@CFLAGS@ @DEFS@
+CFLAGS=@CFLAGS@ @DEFS@ -Wall -D_GNU_SOURCE
 LDFLAGS=@LDFLAGS@ @LIBS@
 SOCKETLIBS=@SOCKETLIBS@
 LIBWRAPLIBS=@LIBWRAPLIBS@
diff -ruN nrpe-2.12.fc8_64_ipv6/src/check_nrpe.c nrpe-2.12.fc8_64_wait_for_client_close/src/check_nrpe.c
--- nrpe-2.12.fc8_64_ipv6/src/check_nrpe.c	2011-09-05 10:10:48.000000000 +0200
+++ nrpe-2.12.fc8_64_wait_for_client_close/src/check_nrpe.c	2011-09-05 10:11:23.000000000 +0200
@@ -51,7 +51,6 @@
 
 int process_arguments(int,char **);
 void alarm_handler(int);
-int graceful_close(int,int);
 
 
 
@@ -60,7 +59,7 @@
         u_int32_t packet_crc32;
         u_int32_t calculated_crc32;
 	int16_t result;
-	int rc;
+	int rc=0;
 	packet send_packet;
 	packet receive_packet;
 	int bytes_to_send;
@@ -248,7 +247,7 @@
 			SSL_CTX_free(ctx);
 	                }
 #endif
-		graceful_close(sd,1000);
+		close(sd);
 
 		/* recv() error */
 		if(rc<0){
@@ -264,7 +263,7 @@
 
 		/* receive underflow */
 		else if(bytes_to_recv<sizeof(receive_packet)){
-			printf("CHECK_NRPE: Receive underflow - only %d bytes received (%d expected).\n",bytes_to_recv,sizeof(receive_packet));
+			printf("CHECK_NRPE: Receive underflow - only %d bytes received (%d expected).\n",bytes_to_recv,(int)sizeof(receive_packet));
 			return STATE_UNKNOWN;
 		        }
 
@@ -446,36 +445,3 @@
 	exit(timeout_return_code);
         }
 
-
-/* submitted by Mark Plaksin 08/31/2006 */
-int graceful_close(int sd, int timeout){
-        fd_set in;
-        struct timeval tv;
-        char buf[1000];
-
-	/* send FIN packet */
-        shutdown(sd,SHUT_WR);  
-        for(;;){
-
-                FD_ZERO(&in);
-                FD_SET(sd,&in);
-                tv.tv_sec=timeout/1000;
-                tv.tv_usec=(timeout % 1000)*1000;
-
-		/* timeout or error */
-                if(1!=select(sd+1,&in,NULL,NULL,&tv))
-			break;   
-
-		/* no more data (FIN or RST) */
-                if(0>=recv(sd,buf,sizeof(buf),0))
-			break;
-		}
-
-#ifdef HAVE_CLOSESOCKET
-        closesocket(sd);
-#else
-	close(sd);
-#endif
-
-	return OK;
-	}
diff -ruN nrpe-2.12.fc8_64_ipv6/src/nrpe.c nrpe-2.12.fc8_64_wait_for_client_close/src/nrpe.c
--- nrpe-2.12.fc8_64_ipv6/src/nrpe.c	2011-09-05 10:10:48.000000000 +0200
+++ nrpe-2.12.fc8_64_wait_for_client_close/src/nrpe.c	2011-09-05 10:15:58.000000000 +0200
@@ -718,6 +718,30 @@
 }
 
 
+/* 
+ * wait for the client to close the connection
+ */
+int
+wait_for_client_to_close(int sock)
+{
+	int n;
+	/* set connection handler */
+	signal(SIGALRM,close_client_sighandler);
+	alarm(10);
+
+	char buf[MAX_INPUT_BUFFER];
+
+	if (n = read (sock, buf, sizeof(buf)) < 0 ) {
+		syslog(LOG_DEBUG,"read failed: %s\n", strerror(errno));
+		return -1;
+	} else if ( n == 0 ) {
+		return 0;
+	} else {
+		syslog(LOG_DEBUG,"Extraneous data received");
+		return -1;
+	}
+} 
+
 /* wait for incoming connection requests */
 void wait_for_connections(void){
         struct addrinfo *ai;
@@ -732,21 +756,17 @@
 	int i;
 	int r;
 
-	struct sockaddr_in myname;
-	struct sockaddr_in *nptr;
 	struct sockaddr_storage addr;
 	int rc;
-	int sock, new_sd;
+	int new_sd;
 	socklen_t addrlen;
 	pid_t pid;
 	int flag=1;
-	fd_set fdread;
-	struct timeval timeout;
+	struct timeval;
 	int retval;
 #ifdef HAVE_LIBWRAP
 	struct request_info req;
 #endif
-	
 	add_listen_addr(&listen_addrs, address_family, 
 		(strcmp(server_address,"")==0)?NULL:server_address, server_port);
 
@@ -789,7 +809,7 @@
                 if (ai->ai_family == AF_INET6) {
                         if (setsockopt(listen_sock, IPPROTO_IPV6, IPV6_V6ONLY,
                             &flag, sizeof(flag)) == -1)
-                                error("setsockopt IPV6_V6ONLY: %s",
+                                syslog(LOG_ERR, "setsockopt IPV6_V6ONLY: %s",
                                     strerror(errno));
                 }
 #endif
@@ -882,10 +902,12 @@
 			// }}}
 // {{{
 			/* child process should handle the connection */
-			if(pid=fork() == 0) {
+			if((pid=fork()) == 0) { // child
+				/* child does not need to listen for connections */
+				close_listen_socks();
 
 				/* fork again so we don't create zombies */
-				if(pid=fork() == 0) {
+				if((pid=fork()) == 0) { //child
 
 // {{{
 				/* handle signals */
@@ -893,9 +915,6 @@
 				signal(SIGTERM,child_sighandler);
 				signal(SIGHUP,child_sighandler);
 
-				/* grandchild does not need to listen for connections */
-				close_listen_socks();
-
 				/* find out who just connected... */
 				addrlen=sizeof(addr);
 				rc=getpeername(new_sd, (struct sockaddr *)&addr, &addrlen);
@@ -962,14 +981,16 @@
 					/* handle the client connection */
 					handle_connection(new_sd);
 				
-					/* log info to syslog facility */
+					/* wait for client to close connection */
+					if (wait_for_client_to_close(new_sd) == -1 ) {
+						close(new_sd);
+						exit(STATE_CRITICAL);
+					}
 					if(debug==TRUE)
 						syslog(LOG_DEBUG,"Connection from %s closed.",ipstr);
-				
-					/* close socket prior to exiting */
 					close(new_sd);
-				
 					exit(STATE_OK);
+				
     			 	} else {
 					/* first child returns immediately, grandchild is inherited by INIT process -> no zombies... */
 					exit(STATE_OK);
@@ -998,10 +1019,7 @@
 	char *temp_buffer=NULL;
 	char *temp_ptr=NULL;
 	int result=0;
-        struct hostent *myhost;
-	char **pptr=NULL;
 	char *save_connecting_host=NULL;
-	struct in_addr addr;
         int gaierr;
         struct addrinfo hints, *res, *ai;
 
@@ -1069,7 +1087,6 @@
         }
 
 
-
 /* handles a client connection */
 void handle_connection(int sock){
         u_int32_t calculated_crc32;
@@ -1083,7 +1100,7 @@
 	char processed_command[MAX_INPUT_BUFFER];
 	int result=STATE_OK;
 	int early_timeout=FALSE;
-	int rc;
+	int rc=0;
 	int x;
 #ifdef DEBUG
 	FILE *errfp;
@@ -1554,6 +1571,11 @@
 	exit(STATE_CRITICAL);
 	}
 
+/* handle closing connection */
+void close_client_sighandler(int sig) {
+	syslog(LOG_ERR,"Cliented waited too long to close. Exiting...");
+	exit(STATE_CRITICAL);
+}
 
 /* drops privileges */
 int drop_privileges(char *user, char *group){
diff -ruN nrpe-2.12.fc8_64_ipv6/src/utils.c nrpe-2.12.fc8_64_wait_for_client_close/src/utils.c
--- nrpe-2.12.fc8_64_ipv6/src/utils.c	2011-09-05 10:10:48.000000000 +0200
+++ nrpe-2.12.fc8_64_wait_for_client_close/src/utils.c	2011-09-05 10:11:23.000000000 +0200
@@ -121,6 +121,43 @@
 # define NI_MAXHOST 1025
 #endif
 
+/* Creates a socket for the connection. */
+int my_create_socket(struct addrinfo *ai, const char *bind_address)
+{
+        int sock, gaierr;
+        struct addrinfo hints, *res;
+
+        sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+        if (sock < 0)
+                fprintf(stderr,"socket: %.100s\n", strerror(errno));
+	fcntl(sock, F_SETFD, FD_CLOEXEC);
+
+        /* Bind the socket to an alternative local IP address */
+        if (bind_address == NULL)
+                return sock;
+
+        memset(&hints, 0, sizeof(hints));
+        hints.ai_family = ai->ai_family;
+        hints.ai_socktype = SOCK_STREAM; //ai->ai_socktype;
+        hints.ai_protocol = IPPROTO_TCP; //ai->ai_protocol;
+        hints.ai_flags = AI_PASSIVE;
+        gaierr = getaddrinfo(bind_address, NULL, &hints, &res);
+        if (gaierr) {
+                fprintf(stderr, "getaddrinfo: %s: %s\n", bind_address,
+                    gai_strerror(gaierr));
+                close(sock);
+                return -1;
+        }
+        if (bind(sock, res->ai_addr, res->ai_addrlen) < 0) {
+                fprintf(stderr, "bind: %s: %s\n", bind_address, strerror(errno));
+                close(sock);
+                freeaddrinfo(res);
+                return -1;
+        }
+        freeaddrinfo(res);
+        return sock;
+}
+
 /* opens a connection to a remote host */
 int my_connect(const char *host, struct sockaddr_storage * hostaddr, u_short port, int address_family, const char *bind_address){
         int gaierr;
@@ -174,7 +211,7 @@
 
         /* Return failure if we didn't get a successful connection. */
         if (sock == -1) {
-		error("connect to host %s port %s: %s",
+		fprintf(stderr,"connect to host %s port %s: %s",
 			host, strport, strerror(errno));
 		return -1;
 	}
@@ -182,42 +219,6 @@
 	return sock;
 }
 
-/* Creates a socket for the connection. */
-int my_create_socket(struct addrinfo *ai, const char *bind_address)
-{
-        int sock, gaierr;
-        struct addrinfo hints, *res;
-
-        sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
-        if (sock < 0)
-                fprintf(stderr,"socket: %.100s\n", strerror(errno));
-
-        /* Bind the socket to an alternative local IP address */
-        if (bind_address == NULL)
-                return sock;
-
-        memset(&hints, 0, sizeof(hints));
-        hints.ai_family = ai->ai_family;
-        hints.ai_socktype = ai->ai_socktype;
-        hints.ai_protocol = ai->ai_protocol;
-        hints.ai_flags = AI_PASSIVE;
-        gaierr = getaddrinfo(bind_address, NULL, &hints, &res);
-        if (gaierr) {
-                fprintf(stderr, "getaddrinfo: %s: %s\n", bind_address,
-                    gai_strerror(gaierr));
-                close(sock);
-                return -1;
-        }
-        if (bind(sock, res->ai_addr, res->ai_addrlen) < 0) {
-                fprintf(stderr, "bind: %s: %s\n", bind_address, strerror(errno));
-                close(sock);
-                freeaddrinfo(res);
-                return -1;
-        }
-        freeaddrinfo(res);
-        return sock;
-}
-
 void add_listen_addr(struct addrinfo **listen_addrs, int address_family, char *addr, int port)
 {
         struct addrinfo hints, *ai, *aitop;
-------------- next part --------------
------------------------------------------------------------------------------
This SF email is sponsosred by:
Try Windows Azure free for 90 days Click Here 
http://p.sf.net/sfu/sfd2d-msazure
-------------- next part --------------
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel


More information about the Developers mailing list