Unstructured Data Backup

Managing the backup of unstructured data presents a considerable challenge for any organization. The modern world generates enormous volumes of unstructured data every day. Yet in the face of this data flow, there is a dedicated workforce, you, protecting that data with backups.

If you are the lucky one responsible for the availability and recovery of your organization’s unstructured data, this blog post is for you.

First, let’s define what unstructured data is:

Unstructured data is typically text-heavy data without a predefined structure or model. Unlike databases, this type of data is difficult to understand and manage for backups because it does not follow a pattern or structure. This absence of a pattern adds to the complexity of search, management and backup.

If you are following the Veeam Software news, you may have noticed that two weeks ago Veeam released the RTM build of version 10, a long-awaited release.

Version 10 introduced many features, and one of the most talked-about is File Shares Backup.

So let’s get started, beginning with:

Veeam File Shares Backup Architecture

As we know, Veeam R&D works hard to simplify the workflow of its features while adding intelligence to the underlying algorithms. The File Shares backup feature is no different, and the concept is as follows:

Veeam v10 introduces a new server role called File Backup Proxy. By default, this role is installed together with the Veeam Backup server, but you can deploy additional proxies to improve performance and run more concurrent tasks. In that case, the Veeam Backup server distributes the workload among the available file proxy servers, speeding up the backup.

Deploying an additional File Backup Proxy is a straightforward task, as shown in the following screenshot:

Screen Shot 2020-02-09 at 7.37.56 am

The File Backup Proxy is assigned to the File Backup job to process the files and folders on the file share. It constructs an index file of the entire folder tree structure, with a CRC value for each entry.
Once the index tree and CRC values are created, the index file is stored in the cache repository under the File Backup cache folder:

vShareBackup

Every time the file backup job runs, a new index tree and new CRC values are built and compared with the stored ones. If there is a difference, the cache repository instructs the file proxy to read and back up only the changed data from the file share. It does this by creating data packages of the backed-up files; each data package is 64 MB in size.
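To make the idea concrete, here is a minimal Python sketch of an index-and-compare pass. It is only an illustration of the concept, not Veeam's implementation: the function names (folder_crc, build_index, changed_folders, demo) are hypothetical, and the assumption that each folder's CRC is computed over the names, sizes and modification times of its files is mine, not something the product documentation confirms.

```python
import os
import tempfile
import zlib


def folder_crc(folder):
    """CRC32 over the sorted (name, size, mtime) entries of the files directly in a folder."""
    with os.scandir(folder) as entries:
        files = sorted(
            (entry.name, entry.stat().st_size, entry.stat().st_mtime_ns)
            for entry in entries
            if entry.is_file()
        )
    crc = 0
    for name, size, mtime in files:
        crc = zlib.crc32(f"{name}|{size}|{mtime}".encode("utf-8"), crc)
    return crc


def build_index(root):
    """Map every folder (relative to root) to the CRC32 of its file listing."""
    index = {}
    for folder, _subfolders, _files in os.walk(root):
        index[os.path.relpath(folder, root)] = folder_crc(folder)
    return index


def changed_folders(old_index, new_index):
    """Return folders that are new or whose CRC differs from the previous run."""
    return sorted(
        folder for folder, crc in new_index.items() if old_index.get(folder) != crc
    )


def demo():
    """Index a small tree, add one subfolder with one file, and report what changed."""
    with tempfile.TemporaryDirectory() as root:
        for folder, name in (("docs", "report.txt"), ("photos", "cat.jpg")):
            os.makedirs(os.path.join(root, folder))
            with open(os.path.join(root, folder, name), "w") as handle:
                handle.write("original content")
        old_index = build_index(root)

        # Simulate a change between two job runs: one new subfolder, one new file.
        new_folder = os.path.join(root, "docs", "archive")
        os.makedirs(new_folder)
        with open(os.path.join(new_folder, "note.txt"), "w") as handle:
            handle.write("hello")
        new_index = build_index(root)

        return changed_folders(old_index, new_index)


if __name__ == "__main__":
    print(demo())
```

In this sketch only the new subfolder is reported as changed, so a backup pass would only need to read that folder. A metadata-based CRC like this one would miss an edit that keeps both the size and the timestamp of a file, which is one reason to treat it as a teaching aid rather than a description of the product.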

Why Veeam File Shares Backup?

As a technologist, I’m sure you are asking yourself, “Why all this excitement about the File Share backup that Veeam v10 introduces?”

I don’t blame you for asking. Other backup applications have offered this for years as basic functionality of their products. Well, let me tell you why I and others are excited about this old-new feature.

In the previous architecture section, I highlighted the underlying architecture of the Veeam File Share Backup. To summarize the reason for all this excitement, let’s break it down into two reasons:

  • The intelligence the File Share Proxy Server Role introduces; and
  • The way the product scans the shares and picks up new changes.

In a word: speed. As mentioned above, Veeam R&D introduced a new server role called the File Backup Proxy. This server role is the workhorse of the entire solution. When a file share backup job runs, this server is assigned to the job. It processes the files and folders on the file share by creating the index tree file and the CRC values for the folders and files. On the next incremental run, a new index tree and new CRC values are created and then compared with the old ones. If files or folders have changed, the cache repository instructs the file proxy server to read the changed data and back it up. This method is smarter and faster than the traditional method, where the server scans each file and folder to check for file or directory attribute changes since the last backup.

The second reason is how the File Proxy Server works when you deploy multiple proxies. The proxies work together to stream backups of the file share in parallel, which speeds up the backup process further.
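The following Python sketch shows the general shape of spreading work across several proxies. It is an assumption-laden illustration: the post does not say how Veeam divides a share between proxies, so the round-robin assignment, the proxy names and the process_folder and distribute functions are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def process_folder(proxy_name, folder):
    """Stand-in for a proxy reading one folder; returns who handled what."""
    return (proxy_name, folder)


def distribute(folders, proxies):
    """Assign folders to proxies round-robin and process the assignments concurrently."""
    assignments = [
        (proxies[position % len(proxies)], folder)
        for position, folder in enumerate(folders)
    ]
    # One worker per proxy; map() returns results in the order of the assignments.
    with ThreadPoolExecutor(max_workers=len(proxies)) as pool:
        return list(pool.map(lambda pair: process_folder(*pair), assignments))


if __name__ == "__main__":
    for proxy, folder in distribute(["docs", "photos", "slides"], ["proxy-1", "proxy-2"]):
        print(f"{proxy} handles {folder}")
```

With two proxies and three folders, the first proxy receives two folders and the second receives one. A real scheduler would more likely balance by data volume or proxy load than by simple position in a list.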

Prepare for File Shares Backup

To get started with the new File Shares backup, you must first add a file share to the Veeam infrastructure under Inventory. The source types supported by Veeam are file servers, SMB (Server Message Block) shares, and NFS (Network File System) shares.

Screen Shot 2020-02-09 at 9.24.54 am

On the first screen of the guided wizard, you enter your share URL and credentials:

Screen Shot 2020-02-09 at 9.28.02 am

Click the Advanced button to reveal the advanced options the product offers for File Shares backup.

At the Processing step, you select the File Proxy and the Cache Repository introduced in the architecture section. You can also fine-tune the backup I/O control from here:

Screen Shot 2020-02-09 at 9.31.35 am

After you finish configuring your file share, you are ready for the next step.

Create File Shares Backup Job

Creating a File Share Backup job is no different from creating other Veeam backup jobs. You start by creating a new backup job and then selecting the File Share job type:

Screen Shot 2020-02-09 at 9.39.54 am

After assigning a name and selecting the shared folder you wish to back up, you can choose the repository where the backup is saved and set a retention policy.

At this point in the setup, you will notice that Veeam offers great retention options for your unstructured data. See the following screenshot:

Screen Shot 2020-02-09 at 9.44.41 am

If you want to encrypt your backup, choose how file and folder ACLs are handled during the backup, and more, click the Advanced Settings option:

Screen Shot 2020-02-09 at 9.53.35 am

Test

To test the capability of the File Share backup and its available settings, I used a FreeNAS appliance containing years of files. The files are of varying types, including photos, documents, presentations and more; a real unstructured data scenario you can probably find in your organization.

I configured a test run in Veeam File Share backup on one folder to see the outcome. Here are my findings:

First, the full backup of the folder I selected took 3 minutes and 50 seconds to run; it processed 2707 files in 216 folders, with a backup size of 8.9 GB.

Screen Shot 2020-02-09 at 9.15.49 am

Then I created a subfolder under another subfolder. I then created a new .txt file and ran another test backup, this time incremental. The following screenshot shows you the result:

Screen Shot 2020-02-09 at 9.17.49 am

The job took 34 seconds and backed up only ONE folder and ONE file.
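For a rough sense of scale, the two runs can be compared with a few lines of Python. This is back-of-the-envelope arithmetic on the figures from my test, under stated assumptions: 1 GB is taken as 1024 MB, and the rate is based on the 8.9 GB backup size rather than on the amount of source data read.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes:seconds duration to seconds."""
    return minutes * 60 + seconds


full_seconds = to_seconds(3, 50)         # full backup: 3 min 50 s
incremental_seconds = to_seconds(0, 34)  # incremental backup: 34 s
backup_size_mb = 8.9 * 1024              # 8.9 GB, assuming 1 GB = 1024 MB

# Average rate of the full run, measured against the backup size.
full_rate_mb_s = backup_size_mb / full_seconds

# How many times shorter the incremental run was than the full run.
speedup = full_seconds / incremental_seconds

print(f"Full backup rate: {full_rate_mb_s:.1f} MB/s")
print(f"Incremental run was {speedup:.1f}x shorter than the full run")
```

Keep in mind that a large part of those 34 seconds is probably fixed job overhead rather than data transfer, so the ratio understates how little data the incremental run actually had to read.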

I think the screenshot explains it well enough. It demonstrated a super-intelligent algorithm, and it was fast. It didn’t require a rescan of unchanged folders or files to check for changes, as is required for traditional File Share backup products.

File Restore

Veeam offers three restore types for file shares:

a.   Entire file share;

b.   Rollback to a point in time; and

c.   Files and folders.

See the following screenshot:

Screen Shot 2020-02-09 at 10.31.35 am

To test the restore functionality, I deleted the folder and the file I created in the previous backup demonstration. I then set up and ran a new files and folders restore to recover the deleted folder and file. In no time at all, the file and the folder were restored to their original location:

Screen Shot 2020-02-09 at 10.35.26 am

I got the same result when I restored the entire backup to a different folder and server, using the Entire File Share restore option:

Screen Shot 2020-02-09 at 10.44.01 am

Mine is not an ideal infrastructure for testing, but from the screenshot above you can see that the restore of the entire share backup took only 4 minutes and 15 seconds.

Summary

In this blog post, I covered some of the fundamentals of the new Veeam File Share Backup, but there are many other great options that it offers, such as retention and archiving of your unstructured data. I am sure I will be revisiting this in one of my future posts.

The new Veeam File Share backup feature is excellent, and it is easy and straightforward to use. I am sure that the way Veeam R&D architects have designed the solution will revolutionize how unstructured data is backed up in the years to come.

If you want to learn more about this topic, or suggest any areas you would like to see demonstrated, drop me a note in the reply section of the blog and I will try to accommodate your request.
