Troubleshooting Si3 Deduplication
Si3 Deduplication
Unable to establish connection to S3 data store
Problem
The Si3 NG data store may be unable to establish a secure connection to S3 storage and report the following error:
Error: Could not access data store.
Server Status: 2023-03-30 10:17:10: ERROR Not started due to error: S3 is not connected
Server Status: 2023-03-30 10:17:10: ERROR Not started due to error: S3 is not connected
Cause
If the Si3 NG data store connects to a storage provider that uses a self-signed certificate, the certificate is not trusted by default because it was not issued by a trusted certificate authority. As a result, the connection can be denied, and the log files in /var/opt/sesam/var/log/sms may contain a message similar to this:
[...default-dispatcher-6] ERROR S3 - Unexpected error: {}, cause: {} software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: javax.net.ssl.SSLHandshakeException: General OpenSslEngine problem
⇒ Solution
To solve this problem, use the keytool utility to import the public.crt certificate into the server certificate store. This allows the Si3 server to trust the S3 storage provider's certificate and establish a secure connection.
- Obtain the public certificate. Note that you can export it from the browser.
- Locate the cacerts file on your server. This file is your JVM certificate keystore.
- Import the public.crt certificate into the JVM's certificate keystore with the following command:
- on Linux:
keytool -import -trustcacerts -keystore /var/lib/ca-certificates/java-cacerts -storepass changeit -noprompt -alias <storage backend endpoint URL> -file /<path_to_certificate>/public.crt
- on Windows:
C:\Program Files\ojdkbuild\java-11-openjdk-11.0.15-1\bin>keytool -import -trustcacerts -keystore "C:\Program Files\ojdkbuild\java-11-openjdk-11.0.15-1\lib\security\cacerts" -storepass changeit -noprompt -alias <storage backend endpoint URL> -file <path_to_certificate>\public.crt
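If exporting from a browser is inconvenient, the certificate can also be fetched on the command line. The following is a minimal sketch for Linux, assuming openssl is installed and the storage endpoint is reachable from the server. The helper names endpoint_hostport, fetch_cert and verify_import are hypothetical and not part of SEP sesam; the keystore password is the one used in the import command above.

```shell
#!/bin/sh
# Hypothetical helper: reduce an endpoint URL to host:port (port 443 if none is given).
endpoint_hostport() {
    hp="${1#*://}"   # drop the scheme, if any
    hp="${hp%%/*}"   # drop any path
    case "$hp" in
        *:*) printf '%s\n' "$hp" ;;
        *)   printf '%s:443\n' "$hp" ;;
    esac
}

# Hypothetical helper: save the certificate presented by the endpoint as PEM.
# $1 = endpoint URL, $2 = output file (for example public.crt)
fetch_cert() {
    hostport="$(endpoint_hostport "$1")"
    openssl s_client -connect "$hostport" -servername "${hostport%%:*}" \
        </dev/null 2>/dev/null | openssl x509 -outform PEM > "$2"
}

# Hypothetical helper: show the imported entry to confirm the import worked.
# $1 = keystore path, $2 = alias used during the import
verify_import() {
    keytool -list -keystore "$1" -storepass changeit -alias "$2"
}
```

For example, fetch_cert https://s3.example.com:9000 /tmp/public.crt (a placeholder endpoint) writes the certificate to a file that can then be passed to the keytool import command, and verify_import with the same keystore and alias should list the new entry afterwards.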
Issues with S3 or S3-compatible storage
Problem
- A Si3 NG data store that uses S3 or S3-compatible storage can experience various issues, depending on the cloud storage provider. These issues can affect backups, migrations, and replications. In addition, the sanity state check of Si3 NG may report errors that have a similar root cause.
Cause
- Some cloud storage providers (for example, Wasabi) restrict the request rate, that is, the number of HTTP(S) requests allowed per second. In addition, on local storage with the S3 option enabled, multiple RDSs accessing the same local S3 storage can generate a large number of IOPS (I/O operations per second).
⇒ Solution
- You can adjust the settings on the affected Si3 NG data store:
- In the Main selection -> Components, click Data Stores to display the data store contents frame.
- Right-click the selected Si3 NG data store and then click Properties.
- Double-click a drive to open the Drive Properties dialog, and then enter the following in the Options field:
dedup.s3.timeoutInSeconds=1200,dedup.s3.page.workers=2,dedup.maxAsyncRequests=50
- This will increase the timeout period, active page workers and request rate.
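Because the options are entered as one comma-separated string, a typo is easy to miss. The following sketch splits the string into one key=value pair per line for review before pasting it into the Options field. The helpers split_options and check_options are hypothetical, and the check only tests for the key=value shape used in the example above; it is not the product's own validation.

```shell
#!/bin/sh
# Options string taken from the text above.
opts='dedup.s3.timeoutInSeconds=1200,dedup.s3.page.workers=2,dedup.maxAsyncRequests=50'

# Hypothetical helper: print one key=value pair per line.
split_options() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Hypothetical helper: return 0 only if every item looks like key=value
# with no spaces (an assumption based on the example string).
check_options() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | grep -Evq '^[A-Za-z0-9_.]+=[^ =]+$' && return 1
    return 0
}
```

Running split_options "$opts" lists the three settings separately, and check_options "$opts" returns a non-zero status if an item is missing its equals sign or contains a space.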
Si3 remains in "shutting down" state
Problem
- Manually stopping Garbage Collection (GC) fails and consequently Si3 remains in the "shutting down" state.
⇒ Solution
- Restart the Si3 daemon by using sm_main restart sds. For more details on stopping and starting the SEP sesam services, see How to Start and Stop SEP sesam.
Si3 deduplication may not work with NFSv4
Problem
- Si3 deduplication may not work with Network File System version 4 (NFSv4).
Cause
- SEP sesam operations, such as backup, restore and migration, may fail due to Java problems with NFSv4.
⇒ Solution
- To avoid this problem, connect your backup devices via NFSv3.
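To find out whether a Linux host currently mounts anything via NFSv4, the mount table can be inspected. This is a minimal sketch, assuming a Linux system where NFSv4 mounts appear with the file system type nfs4 in /proc/mounts; the helper name list_nfs4_mounts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the mount points of all NFSv4 mounts.
# $1 = mount table to read (defaults to /proc/mounts)
list_nfs4_mounts() {
    awk '$3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}
```

Any mount point printed by list_nfs4_mounts is a candidate for remounting with NFSv3, for example with the mount option vers=3 (server name and export path depend on your environment).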
Repairing corrupted Si3 NG data store
You can repair the Si3 store when pages or objects get corrupted.
- First determine the scope of corruption:
- To get the list of corrupted objects use:
sm_dedup_interface -d <datastore> corruptedobjects
- To get the list of corrupted pages use:
sm_dedup_interface -d <datastore> corruptedpages
- Use the following command to replace a page in the /pages directory with an older version from the /pages-trash directory:
sm_dedup_interface -d <datastore> repair pages
The pages in trash contain all chunks deleted on previous GC. The oldest version of a page takes priority.
- Use the following command to search for and recover the missing chunks in /pages-trash directory:
sm_dedup_interface -d <datastore> repair start
During the repair process a new page is created, which contains all chunks from the current page (page affected by 'missing chunks' issue) and all chunks found in the trash.
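The repair steps above can be wrapped in a small script. This is a sketch only: the function names corruption_scope and repair_store and the SM_DEDUP variable are hypothetical, and running both repair commands one after the other is an assumption; depending on the kind of corruption, only one of them may be needed.

```shell
#!/bin/sh
# The command name can be overridden for a dry run; the default is the tool from the text.
SM_DEDUP="${SM_DEDUP:-sm_dedup_interface}"

# Determine the scope of corruption: list corrupted objects, then corrupted pages.
# $1 = data store name
corruption_scope() {
    "$SM_DEDUP" -d "$1" corruptedobjects
    "$SM_DEDUP" -d "$1" corruptedpages
}

# Restore pages from trash, then search for and recover missing chunks.
# The second command runs only if the first one succeeds.
# $1 = data store name
repair_store() {
    "$SM_DEDUP" -d "$1" repair pages &&
    "$SM_DEDUP" -d "$1" repair start
}
```

Call corruption_scope <datastore> first and review its output before running repair_store <datastore>.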
Cleanup of unrecoverable Si3 store
Warning
You should use the commands described in this section only in case the corrupted store cannot be recovered.
If corruptions in the Si3 store persist, either the initial page version has already been purged from trash or fatal errors occurred during backup or restore. In this case, broken pages or missing chunks cannot be recovered.
Cleanup can be performed by deleting unrecoverable objects manually or by using the automatic cleanup function.
- Deleting objects
When there are only a few unrecoverable objects, delete each object with the following commands:
sm_dedup_interface -d <datastore> delete corrupted_object_id_1
...
sm_dedup_interface -d <datastore> delete corrupted_object_id_Nth
If there are many corruptions, you can delete all corrupted objects using the following command:
sm_dedup_interface -d <datastore> fsck purge
- Garbage collection
When you have deleted all unrecoverable objects, run garbage collection (gc):
sm_dedup_interface -d <datastore> gc start
- Automatic cleanup function
To start an automatic cleanup function, use the following command:
sm_dedup_interface ... fsck purge auto
The automatic cleanup function runs the following sequence of commands: PCCK start -> OCCK start -> Delete all corrupted objects -> GC start.
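For the manual path (delete each unrecoverable object, then run garbage collection), a loop avoids typing the delete command once per object. This is a minimal sketch; the function name manual_cleanup and the SM_DEDUP variable are hypothetical, and the object IDs must come from your own corruptedobjects output.

```shell
#!/bin/sh
# The command name can be overridden for a dry run; the default is the tool from the text.
SM_DEDUP="${SM_DEDUP:-sm_dedup_interface}"

# Delete each given object, then start garbage collection.
# Stops at the first failed delete so that GC is not started on a partial cleanup.
# $1 = data store name, remaining arguments = corrupted object IDs
manual_cleanup() {
    datastore="$1"
    shift
    for object_id in "$@"; do
        "$SM_DEDUP" -d "$datastore" delete "$object_id" || return 1
    done
    "$SM_DEDUP" -d "$datastore" gc start
}
```

Because these commands remove data permanently, it may be worth running the function once with SM_DEDUP=echo to print the commands before executing them for real.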
Logging
The logging function uses the logback library. For more information, see Logback Project. Note that this information is intended for advanced users only.
- Logging info
- gv_rw_ini:sm_sds.xml (/var/opt/sesam/var/ini/sm_sds.xml)
- /var/opt/sesam/var/log/sms contains the following log files:
- sm_dedup_server_info-<drive>.log: Log level INFO and higher.
- sm_dedup_server-<drive>.log: Log level DEBUG and higher. This file can become quite large.
- sm_dedup_gc-<drive>.log: garbage collection log.
- sm_dedup_fsck-<drive>.log: file system check log.
- Log files are rotated automatically when they reach 100 MB.
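When troubleshooting, it is often enough to pull the ERROR entries out of the INFO-level logs instead of reading the much larger DEBUG files. The following is a minimal sketch, assuming the log directory and file naming listed above; the function name si3_errors and the LOG_DIR variable are hypothetical.

```shell
#!/bin/sh
# Default log directory from the text above; can be overridden.
LOG_DIR="${LOG_DIR:-/var/opt/sesam/var/log/sms}"

# Print all lines containing ERROR from the INFO-level logs of every drive.
# $1 = log directory (optional, defaults to LOG_DIR)
si3_errors() {
    grep -h 'ERROR' "${1:-$LOG_DIR}"/sm_dedup_server_info-*.log
}
```

For example, si3_errors | tail -n 20 shows the most recent entries, which is where messages such as the SSLHandshakeException described earlier would be expected to appear.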
Files and directories
- Objects
For every SEP sesam saveset, three objects (files) are stored in the Si3 store:
- <ssid>.data
- <ssid>.info
- <ssid>.info2
The .data and .info files are identical to those of a normal data store. The .info2 file is required for the data to be appended to a Si3 object. All database information that is not available before a backup is completed is written to this file.
- Directories