Link Menu Expand (external link) Document Search Copy Copied

NAS cache repository

The cache repository was introduced in Veeam Backup & Replication v10 and holds a cache of folders information in the structure of the protected NAS share. It is used for NAS backup only and it is crucial to accelerate the incremental backup process, allowing to identify any object that has changed since the last backup. It is also the controlling unit of a NAS backup job.

Whenever a NAS backup job runs, it is the cache repository that requests the File Proxy to retrieve data from the source share. The file proxy sends back the top level structures hashes, the cache repository loads them into memory and initiates a CRC check on the data stored in the cache (generated with the previous job run). If there is a match to the cache value, no file has to be processed and the file proxy will be instructed to iterate and repeat this process on deeper levels of the share to find folders with a mismatch.

Note that there can be only one cache repository per source file share.

Best practice

Ideally, the cache repository should sit as close as possible to the File Proxy and source data. The bigger the share, the more the traffic between file proxies and cache repository. At the best, they should be one hop away from each other.

No disk sizing is required unless using backup to Object Storage: usually the size of the cache files does not exceed 4GB and due to the optimized size of the cache, data will be loaded into memory when needed. This also makes fast disks not necessary.

If you have multiple proxy or repository servers, you can distribute the load over all machines.

The cache repository is always a standard repository. It cannot be part of a Scale-out Backup Repository, but you can create an extra standard repository on a machine that also hosts Scale-out Backup Repository extents.

NAS Backup and Object Storage repository

Starting with v12, the NAS Backups can be sent directly to Object Storage. The main difference in compare with backup to other kind of repositories, is that the Active Metadata is kept in the cache repository, and a copy of this Metadata is kept in the Object Storage with the backup data. For other kind of repositories, the metadata files are kept in the repository itself (See NAS Backup Repository).

Because of this, it’s recommended to provision enough disk space for metadata in Cache repository. The recommendation is to provision at least 5% of space of source data, and using SSD disks to provide a proper performance.

Sizing Guidelines

Sizing Formula
Cores The cache repository needs 2/3rd of the File Proxy calculated cores, or simply “File Proxy CPU # cores” / 1.5
Memory The cache repository requires 4GB of RAM per cache repository core.
Disk (when backup to Object Storage is used) Source Data x 5%

Let’s review the same examples discussed for File Proxy sizing

Example 1: 3PB source data, 10M files, 40 NFS shares, 2% change rate, 8 hours backup window.

Sizing Formula Size
Required File Proxy resources 24 cores and 48GB of RAM  
Cache repository required core count 24 / 1.5 16 Cores
Cache repository required RAM 16 x 4 64 GB
Cache repository required disk (if using backup to Object Storage) 3PB x 5% 153,6TB

Example 2: 50TB source data, 300M files, 2 SMB shares, 3% change rate, 8 hours backup window.

Sizing Formula Size
Required File Proxy resources 8 cores and 16GB of RAM  
Cache repository required core count 8 / 1.5 6 Cores
Cache repository required RAM 6 x 4 24 GB
Cache repository required disk (if using backup to Object Storage) 50TB x 5% 2.5TB

References


Back to top

Copyright © 2023 Solutions Architects, Veeam Software.