Block repositories
This kind of repository presents a block storage device to a Windows or Linux repository server. A block storage device can be a local disk (Direct Attached Storage) or a LUN provided over a Storage Area Network, for example via iSCSI or Fibre Channel.
Block sizes
Field experience shows that the optimal value for the stripe size is often between 128KB and 256KB. However, it is highly recommended to test this prior to deployment whenever possible.
During the backup process, data blocks are processed in chunks and stored inside backup files in the backup repository. You can customize the block size during the job configuration using the storage optimization setting of the backup job.
By default, Veeam’s block size is set to local target, which is 1MB before compression. Since the compression ratio is very often around 2x, Veeam will write around 512KB or less to the repository per Veeam block with this setting.
This value can be used to better configure storage arrays; especially low-end storage systems can greatly benefit from an optimized stripe size.
There are three layers where the block size can be configured: Veeam block size for the backup files, the filesystem and the storage volumes.
Let’s use a quick example:
The Veeam block size (after compression) of 512KB is going to be written to the underlying filesystem, which has a block size of 64KB. This means that one Veeam block will consume eight blocks at the filesystem level, but no block will be wasted, as the two are aligned. If possible, set the block size at the filesystem layer as close as possible to the expected Veeam block size.
For the storage array underneath, configure a stripe size of 128KB or 256KB for best performance. This aligns best with the Veeam I/O profile and reduces write overhead.
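The alignment reasoning in the example above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a Veeam tool: the function name `blocks_consumed` is hypothetical, and the 2x compression ratio is the typical value assumed in the text, not a guarantee.

```python
KB = 1024

def blocks_consumed(veeam_block_bytes, fs_block_bytes):
    """Return (filesystem blocks used, bytes wasted in the last block)."""
    used = -(-veeam_block_bytes // fs_block_bytes)  # ceiling division
    wasted = used * fs_block_bytes - veeam_block_bytes
    return used, wasted

veeam_block = 1024 * KB                      # "local target" size before compression
compression_ratio = 2                        # assumed typical ratio
on_disk = veeam_block // compression_ratio   # about 512KB written per Veeam block

used, wasted = blocks_consumed(on_disk, 64 * KB)
print(used, wasted)  # 8 0

# Check whether the on-disk block is a whole multiple of each candidate stripe size.
for stripe in (128 * KB, 256 * KB):
    print(stripe // KB, on_disk % stripe == 0)
```

A non-zero second value would indicate that part of the last filesystem block stays unused for every Veeam block written. Real compression ratios vary per workload, so treat the result as an estimate only.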
For more information, refer to this blog post at virtualtothecore.com
Volume sizes
Since ReFS and XFS support was released, we have seen more and more successful implementations of very large filesystems without any issues reported from the field.
Even though filesystem size limits are far beyond petabytes, we advise limiting the volume size to 500 TB.
This limit is not technical, but it helps contain the filesystem failure domain and keeps operations such as SOBR evacuation or repository rescan manageable. It also avoids locking issues caused by an unmanageable amount of data. For example, keep in mind that evacuating a 500 TB SOBR extent at 1 GB/s read throughput would take around 6 days.
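The evacuation estimate can be reproduced as follows. This is a rough sketch assuming decimal units (1 TB = 1000 GB) and a sustained, constant read throughput, which real evacuations rarely achieve; the function name `evacuation_days` is hypothetical.

```python
def evacuation_days(capacity_tb, throughput_gb_per_s):
    """Estimate days needed to read capacity_tb at a constant throughput."""
    seconds = capacity_tb * 1000 / throughput_gb_per_s  # 1 TB = 1000 GB (decimal)
    return seconds / 86400                              # 86400 seconds per day

print(round(evacuation_days(500, 1.0), 1))  # 5.8
```

Halving the throughput doubles the duration, so it is worth running the same calculation with the read rate actually measured on the extent.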
Another important point is to avoid filling filesystems above 80 to 90 percent, which seems to cause a performance drop. This means a 500 TB filesystem should hold no more than 400 TB.
To go for bigger repositories, we recommend using a Scale-Out Backup Repository with multiple extents.
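As a sizing aid, the volume limit and the fill threshold can be combined to estimate how many extents a given amount of backup data needs. This is a minimal sketch under stated assumptions: equally sized extents, the conservative 80 percent fill limit, and data that distributes evenly across extents. The function names are hypothetical.

```python
import math

def usable_tb(volume_tb, fill_limit_percent=80):
    """Capacity we plan to actually fill, in TB."""
    return volume_tb * fill_limit_percent / 100

def extents_needed(backup_data_tb, volume_tb=500, fill_limit_percent=80):
    """Number of equally sized SOBR extents needed to hold the data."""
    return math.ceil(backup_data_tb / usable_tb(volume_tb, fill_limit_percent))

print(usable_tb(500))        # 400.0
print(extents_needed(1500))  # 4
```

The result is a lower bound; growth, retention changes, and uneven placement all argue for additional headroom.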
RAID controller caching
To get the best out of a synthetic backup and enhance the performance, it is recommended to use a write-back cache. Read and write request processing with write-back cache utilization is shown in the figure below.
Windows or Linux?
There are only small differences between Linux and Windows which result from using different schedulers for I/O and compute. The main difference for block-based repositories is the choice of available file systems.
XFS considerations
XFS Data Block Sharing (Reflink) provides the same benefits as ReFS in terms of speed and space consumption. Veeam leverages it to implement the Fast Clone functionality. Since all transformation tasks are done via metadata operations, synthetic full backups get a huge performance boost and they don’t take up any additional capacity.
To make use of the XFS reflink feature for fast cloning, the filesystem must be formatted with a 4KB block size. This has no impact on other best practices such as RAID stripe size, nor on performance.
While XFS can scale up to 8 exabytes, you should check the supported limits from your Linux vendor (e.g. RHEL8/9 support for XFS file systems is limited to 1PB) and consider other best practice factors like RAID rebuild and evacuation times when sizing your filesystem.
Using LVM with XFS is fine if you need more flexibility for volume management.
You can also consider XFS/Fast Clone if your data is encrypted, as Veeam knows which metadata and data blocks inside the encrypted backup files correspond to which source data blocks.
As with ReFS, XFS can be affected by fragmentation. There are tools that defragment XFS, but they should be used with caution.
NTFS or ReFS?
You can use both Microsoft filesystems for a Veeam repository. Both have advantages and disadvantages in different backup situations.
ReFS enables the Veeam Fast Clone feature, which allows very fast synthetic operations for full backups and merges.
To work around the fragmentation that ReFS can suffer from, you can schedule regular active full backups. These require the space of a complete full backup on disk, whereas synthetic fulls take only the additional space of an incremental backup.
NTFS
When using NTFS, please make sure that
- the volume is formatted with 64KB block size
- you use the “Large File” switch /L to format the volume to avoid file size limits
The following command will quick format volume D accordingly: format D: /FS:NTFS /L /A:64K /Q
ReFS
ReFS uses linked clone technology. This is ideal for synthetic operations and dramatically reduces the I/O and throughput required during operations like merges or creating synthetic fulls.
Follow these best practices when using ReFS:
- Format the volume with 64KB block size
- Configure 256KB block size on LUNs (Storage or RAID controller)
- Never bring linked clone space savings into your calculation for required storage space
- “All ReFS supported configurations must use Windows Server Catalog certified hardware” - please contact your hardware vendor
- Never use any shared LUN concept with ReFS and a Veeam Repository
- Check the existing driver version of ReFS:
  - The minimum should be ReFS.sys 10.0.14393.2457 on Windows Server 2016.
  - The minimum should be ReFS.sys 10.0.17763.1369 on Windows Server 2019.
  - General recommendation: fully patched with the most recent Windows Updates.
Maximum path length limitation
Regardless of whether you use NTFS or ReFS, Windows operating systems used to have a limit of 260 characters as the maximum length of a file’s path name. Although this limitation has been removed in newer versions starting with Windows 10 version 1607, this feature is still disabled by default within the Windows API (see Maximum Path Length Limitation). Veeam is capable of handling long path names, but this needs to be enabled in the OS first. That’s why it is recommended to enable the LongPathsEnabled setting on all Windows servers hosting Veeam roles, either within the registry or via group policy.
Path | HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem |
Key | LongPathsEnabled |
Type | REG_DWORD (32-bit) |
Value | 1 = enable the option |
The group policy controlling this registry key can be found at Computer Configuration > Administrative Templates > System > Filesystem > Enable Win32 long paths
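To judge whether an existing server is already affected, it can help to list paths that exceed the legacy limit. The following is a minimal sketch using only the Python standard library. The function name `find_long_paths` is hypothetical, and the 260 character threshold is taken from the text above; the exact boundary the Windows API enforces may differ slightly. On a system where long paths are not yet enabled, the directory walk itself may be unable to descend into over-long directories, so results there can be incomplete.

```python
import os

MAX_PATH = 260  # legacy Windows path length limit referred to above

def find_long_paths(root, limit=MAX_PATH):
    """Yield absolute paths under root whose length exceeds limit."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) > limit:
                yield full
```

For example, `for p in find_long_paths(r"D:\Backups"): print(len(p), p)` would print each offending path with its length; the repository path here is only an illustration.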
Windows server deduplication
Deduplication will take away some of the benefits of block storage repositories but provides efficient use of disk resources.
Follow the recommendations provided in the configuration guidelines above; here is the summary:
- Use the latest available Windows version (minimum requirement is Windows Server 2012 R2) and apply all patches (some roll-ups contain improvements to deduplication). Having the most up-to-date system is critical for ensuring data safety.
- Format the disk using the command line /L option (for “large size file records”) and 64KB cluster size (use parameters /Q /L /A:64K)
- Follow compression and deduplication guidelines for non-integrated deduplication storage.
- (For Windows Server 2016 and later) the Virtualized Backup Server deduplication profile is to be preferred
- Modify garbage collection schedule to run daily rather than weekly.
- Use backup jobs configured to perform Active full with Incrementals.
- If possible, spread active full backups over the entire week.
- If Windows deduplication is used on ReFS with Windows Server 2016 or newer, Veeam Backup & Replication turns off block cloning automatically for performance reasons. It is recommended to keep this default setting.
- Large files take a long time to deduplicate and will have to be fully reprocessed if the process is interrupted.
- Configure deduplication process to run once a day, and for as long as possible.
References
- Alliance Partner Integrations & Qualifications - Disk Backup Target
- Helpcenter - Fast Clone
- Microsoft ReFS Overview
- Deduplication Appliances
- Virtualized Backup Server
- Maximum Path Length Limitation