Source:Troubleshooting Tips for Backup

From SEPsesam
Revision as of 23:12, 29 April 2018 by THU (talk | contribs)


Copyright © SEP AG 1999-2024. All rights reserved.

Any form of reproduction of the contents or parts of this manual is allowed only with the express written permission from SEP AG. When compiling and designing user documentation SEP AG uses great diligence and attempts to deliver accurate and correct information. However, SEP AG cannot issue a guarantee for the contents of this manual.

Welcome to the latest SEP sesam documentation version 4.4.3 to 4.4.3 Tigon V2. For previous documentation version(s), check Documentation archive.


Overview

Analyzing SEP sesam log files helps to identify the operation(s) that caused errors or malfunctions, for example, after an unsuccessful backup.

SEP sesam creates two log files (protocols) for each backup day: the status file (<date of day>.status) and the day log (<date of day>.prt). The error log (<date of day>) is a subset of the day log that records only error messages.

Log files can be printed or sent by email. The directory where the log files are stored is <SEPsesam>/VAR/prot. You can check backup logs (state, day or error) in the GUI (Main Selection -> Logging -> State/Day Log/Error Log).
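As a rough illustration of working with that directory outside the GUI, the following Python sketch lists the log files in a given directory sorted by modification time, oldest first. The function name `list_logs_oldest_first` and its parameters are hypothetical and not part of SEP sesam; the suffixes are the ones named in this article, and the directory must be supplied by the caller.

```python
from pathlib import Path


def list_logs_oldest_first(log_dir, suffixes=(".status", ".prt", ".prt.err")):
    """Return the log files in log_dir sorted by modification time, oldest first.

    Hypothetical helper: only the file suffixes come from this article;
    the name and signature are assumptions made for illustration.
    """
    wanted = tuple(suffixes)  # str.endswith() needs a tuple, not a list
    files = [
        entry
        for entry in Path(log_dir).iterdir()
        if entry.is_file() and entry.name.endswith(wanted)
    ]
    # The most recently written file ends up last in the returned list.
    return sorted(files, key=lambda entry: entry.stat().st_mtime)
```

Called with the path of the prot directory of an installation, the last entry of the returned list is the most recently written log, which is usually the first one to inspect.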

Tips for backup troubleshooting

In the case of an unsuccessful backup, you should follow these tips:

  • Find out when the problem occurred using the day log (.prt) and the status log (.status). The day log shows the causal progression of all SEP sesam activities of the backup day. Files ending in .prt.err contain only the error messages from the day log.
  • Display the directory files chronologically (with ls -lart on Linux).
  • Read log files backward, starting from the end of the file. If a backup has failed, the errors and their causes are usually found at the end of the respective log file.
  • Compare non-working and working backups:
    • Check when the last successful backup of this task was.
    • Detect the differences in the not and bck logs by comparing the two backups.
    • Find out if there were any changes in the network or on the client.
  • The values returned by database calls in DB_ACCESS have the following meanings:
    1. result = 1: The database access is OK.
    2. msg > 0: The number of results is greater than 0.
  • If the data throughput is very low and a backup is not running, the communication between the hardware and RDS may have stopped. Use netstat to check whether the connection over the STP ports (11001, 11002, etc.) still exists, and check whether RDS is still reachable.
  • If a process hangs while attempting to write to the hardware device, using the command kill -9 on Linux will not help, because the process is waiting for I/O and the kernel cannot stop it. The only solution is to restart the server. These processes usually take only a split second; however, they hang if there are any hardware problems.
  • SEP sesam neither uses kernel functions nor accesses the kernel while processing. All calls are made only via GLIBC (GNU C Library). The command that goes the deepest into the system is slu (SCSI Loader Utility), which accesses the SCSI interface directly. Only loader and tape mover commands are affected by this. While a backup is running, SEP sesam has no direct access to the kernel or the hardware. For details on the command, see Using slu topology for detecting devices.
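Two of the tips above, reading a log from the end and comparing a working backup with a failed one, can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names `last_lines` and `diff_logs` are hypothetical, and the sketch assumes the logs are plain text files whose paths you supply yourself.

```python
import difflib
from collections import deque


def last_lines(path, count=20):
    """Return the last `count` lines of a log file, newest line first."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines in memory.
        tail = deque(handle, maxlen=count)
    return [line.rstrip("\n") for line in reversed(tail)]


def diff_logs(good_path, bad_path):
    """Return a unified diff between a working and a failed backup log."""
    with open(good_path, encoding="utf-8", errors="replace") as handle:
        good = handle.readlines()
    with open(bad_path, encoding="utf-8", errors="replace") as handle:
        bad = handle.readlines()
    return list(
        difflib.unified_diff(
            good, bad, fromfile=str(good_path), tofile=str(bad_path)
        )
    )
```

Lines that differ only in timestamps or run identifiers will also show up in the diff, so in practice it may help to strip such fields before comparing.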