Archive:KEEP ALIVE



THE CONTENT OF THIS PAGE IS OUTDATED
SEP AG has discontinued support for obsolete SEP sesam versions. Instructions are still available for these SEP sesam products, however, SEP AG accepts no responsibility or liability for any errors or inaccuracies in the instructions or for the incorrect operation of obsolete SEP sesam software. It is strongly recommended that you update your SEP sesam software to the latest version. For the latest version of SEP sesam documentation, see documentation home.

Sesam and KEEP_ALIVE

Customers often have a firewall between the SEP sesam Master Server and the backup client. For example, a customer may want to back up a server located in a DMZ or an office at a remote site.

After the SEP sesam server issues the backup command, a long time may pass during which nothing is sent over the connection: the target server may be backing up a large amount of data and not yet return the expected reply to the server. A firewall may close the idle connection in the meantime, which interrupts the backup task and causes it to fail. To prevent this from happening, SEP sesam uses the Keep Alive function.

As of version 4.0, SEP sesam supports the STPD options on the server side. In the file

<SESAM_INSTALL_DIR>/var/ini/stpd.ini 

in the [STPD_Thread] section, the option KEEP_ALIVE has to be set to TRUE.

With this option set, SEP sesam opens the TCP connection for the STPD with the socket option SO_KEEPALIVE. This option causes the operating system's TCP stack to send periodic keepalive probes over the TCP connection. This is very useful when backing up clients that are behind a firewall which enforces timeouts on TCP connections.
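To illustrate what SO_KEEPALIVE means at the socket level, the following minimal Python sketch enables the option on a plain TCP socket. It is not SEP sesam code: the function name make_keepalive_socket is made up for this example, and only the standard socket module is used.

```python
import socket


def make_keepalive_socket():
    """Create a TCP socket with SO_KEEPALIVE enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the operating system to send keepalive probes on this connection
    # once it has been idle for the system-wide keepalive time.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


sock = make_keepalive_socket()
# getsockopt returns a non-zero value when the option is switched on.
keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print("SO_KEEPALIVE enabled:", keepalive_enabled)
sock.close()
```

The option only switches keepalive on. When the first probe is sent is still decided by the operating system settings described in the sections for Linux and Windows.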

To understand how TCP keepalives are defined and how they work, please refer to RFC 1122. Activating the KEEP_ALIVE option is not the only step required: you must also configure the keepalive settings of the operating system on the target.

For additional information also see http://www.starquest.com/Supportdocs/techStarLicense/SL002_TCPKeepAlive.shtml


KEEP ALIVE for Linux

The following document describes how keepalive is implemented on Linux:

http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/

There are multiple sysctl parameters that control "KeepAlive". These can be checked using:

root@cefix:~/Desktop# sysctl -a  | grep keep
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75

Note that the default value of net.ipv4.tcp_keepalive_time is 7200 seconds, or two (2) hours.

These values are described as follows:

tcp_keepalive_time
the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further.

and:

This means that the keepalive routines wait for two hours (7200 secs) before sending the first keepalive probe, and then resend it every 75 seconds. If no ACK response is received for nine consecutive times the connection is marked as broken.
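With the default values listed above, the time until an unanswered connection is marked as broken can be estimated as follows. This is only a rough sketch of the arithmetic; the function name is hypothetical and the kernel's exact timing may differ slightly.

```python
def seconds_until_marked_broken(keepalive_time, keepalive_intvl, keepalive_probes):
    """Estimate the idle seconds until the kernel gives up on a dead peer.

    The first probe is sent after keepalive_time seconds; each of the
    keepalive_probes probes is then given keepalive_intvl seconds to be answered.
    """
    return keepalive_time + keepalive_intvl * keepalive_probes


total = seconds_until_marked_broken(7200, 75, 9)
hours, rest = divmod(total, 3600)
minutes, seconds = divmod(rest, 60)
print(f"{total} s = {hours} h {minutes} min {seconds} s")
# prints: 7875 s = 2 h 11 min 15 s
```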

In practice, once the backup command has been sent over the CTRL connection, nothing further is transferred over this connection, so the Linux kernel sends the first keepalive probe only after 7200 seconds (2 hours).

This delay is best changed using:

sysctl -w net.ipv4.tcp_keepalive_time=60

and must be set to a value BELOW the timeout setting of the firewall!
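To check whether a host's current setting is low enough, the value can be read from /proc/sys and compared with the firewall timeout. The sketch below assumes a Linux system; the function names and the example timeout of 900 seconds are illustrative and not part of SEP sesam.

```python
from pathlib import Path


def read_keepalive_time(proc_sys="/proc/sys"):
    """Return net.ipv4.tcp_keepalive_time in seconds (Linux only)."""
    path = Path(proc_sys) / "net" / "ipv4" / "tcp_keepalive_time"
    return int(path.read_text().strip())


def probe_sent_before_timeout(keepalive_time, firewall_timeout):
    """True if the first keepalive probe goes out before the firewall's idle timeout."""
    return keepalive_time < firewall_timeout


# Example with an assumed firewall timeout of 900 seconds:
print(probe_sent_before_timeout(7200, 900))  # False: the default is too late
print(probe_sent_before_timeout(60, 900))    # True: 60 seconds is early enough
```

On a Linux host, probe_sent_before_timeout(read_keepalive_time(), 900) would apply the same check to the live setting.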

KEEP ALIVE for Windows

Windows uses a keepalive mechanism similar to that of Linux, with the same default value of 2 hours. Further information can be found at: http://msdn.microsoft.com/en-us/library/ms819735.aspx

To change the timeout, take the following steps:

Using the Windows Registry Editor, open the key

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters 

in the registry of the target server (client).

Create the value KeepAliveTime of type REG_DWORD and enter the time in milliseconds that a connection may be idle before a keepalive packet is sent.

Sample Values:

* 120000 Decimal  =  2 Minutes
* 300000 Decimal  =  5 Minutes
* 600000 Decimal  = 10 Minutes
* 7200000 Decimal =  2 Hours  (default value, used when no KeepAliveTime entry exists)
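The sample values above follow from a simple conversion of minutes to milliseconds. A small sketch that reproduces them (the helper name is hypothetical):

```python
def minutes_to_keepalive_ms(minutes):
    """Convert minutes to the millisecond value expected by KeepAliveTime."""
    return int(minutes * 60 * 1000)


for minutes in (2, 5, 10, 120):
    print(f"{minutes_to_keepalive_ms(minutes)} Decimal = {minutes} Minutes")
```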

In Practice

Problems arise when firewalls close connections too soon; for example, some firewalls close idle connections after 900 seconds. In this case no data can be transferred and keepalive with the default settings does not help: the firewall closes the connection before the Linux kernel can send the first keepalive probe!

This often happens in customer environments. If you have to use the sysctl option, set it to a value below the firewall timeout (below 900 in this example) so that the backup can run!