Have your static files and storage space reached a critical point? Is there a cheaper way to archive the data without paying a high price for an archive solution? Or is there a free way to solve this challenge? Read on to see how you can use your backup tools to solve your archiving challenges.
We are living in an era of continuous data being generated and consumed at an incredible rate.
The management of this data explosion is a major concern for all IT managers looking for new ways to control it. One of the best strategies to meet the challenge of gaining control of your data is to archive the Static Data.
Static Data is historical data that is no longer going to change. It is not needed for the daily operation of the business, but it is still important for accounting, compliance or archives.
As the backup administrator, you are responsible for the retention of, and providing ready access to, your static data; it is your historical records. You shouldn’t have to run backups on your static data every time you run a backup; it was the same yesterday, and will be again tomorrow. Historical data repositories can be huge, and you will find when shopping for an archiving solution that it can be very expensive.
In this post, I am going to show you a workaround: how my company used Veeam Backup to archive our static data and reduce our storage footprint.
As a mining company, we run several Common Internet File System (CIFS) shares on Network Attached Storage (NAS) devices. Each file share contains images, diagrams, maps, and so on. These files are generated by each project; they are big and consume a lot of storage space.
Our company management was shopping around for a solution to archive these files, but quickly realised the challenges. The products cost lots of money, take up expensive resources, but more importantly, introduce additional layers of complexity. However, in spite of all these hurdles, one of our backup engineers came up with an innovative solution.
Our backup engineer suggested we use our Veeam backup solution to back up the historical files to Tape, Repository, Virtual Tape Library (VTL), etc… and then delete those files from the CIFS on the NASs.
His solution was tested and proved to be up to the job. It requires the following items:
- Linux VM to mount the CIFS Shares;
- A temporary VMDK file to host the Share files;
- Linux Unison;
- Pre and Post scripts; and
- the Veeam Backup and Replication product.
Linux Server Preparation
The first step to the Linux Server Preparation is to download and install Ubuntu Linux Server, though any other Linux version can be used. After downloading and installing the Linux Server, follow the steps below to complete the configuration:
1. Add a VMDK file (it must be the same size as, or larger than, the CIFS share).
2. Attach the disk to the Linux server.
3. Log in to the Linux shell
4. Run “sudo fdisk -l” to get the address of the new disk. In this example it is /dev/sdc.
5. Initialise the disk: “sudo fdisk /dev/sdc”
6. To create the disk partition, press “n”, then choose primary “p”, and then “1”; we need only one partition.
7. Press “Enter” to accept the default values. When the partition is defined, press “w” to write the partition table and exit (“q” would quit without saving the changes).
Format the New Disk
1. Format the partition we just created. Run:
sudo mkfs -t ext4 /dev/sdc1
The /dev/sdc1 string is the first partition on the new disk, whose address we obtained with fdisk in the previous section.
2. Create a mount point:
sudo mkdir /mnt/newdisk
3. Automatic mount at boot: open /etc/fstab in an editor, for example:
sudo vi /etc/fstab
4. Add the following line:
/dev/sdc1 /mnt/newdisk ext4 defaults 0 2
5. Install the Unison File Synchronizer package by running:
sudo apt-get -y install unison
6. Mount the CIFS share on the Linux server (the /mnt/share mount point must already exist):
sudo mount -t cifs -o username=USERNAME //Shareaddress/shares /mnt/share
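The fstab line from step 4 only needs to be added once, so rerunning the setup should not append it a second time. As a minimal sketch (the function name add_fstab_entry and the /dev/sdX1 device are placeholders, not part of any tool), the following shell helper appends the entry only when the mount point is not already listed. It takes the file path as an argument so it can be tried on a scratch file before touching /etc/fstab.

```shell
#!/bin/sh
# Hypothetical helper: append an ext4 fstab entry unless the mount point
# is already listed in the given file.
add_fstab_entry() {
    fstab_file=$1
    device=$2
    mount_point=$3
    # Look for a non-comment line whose second field is the mount point.
    if awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab_file"
    then
        echo "entry for $mount_point already present"
        return 0
    fi
    printf '%s %s ext4 defaults 0 2\n' "$device" "$mount_point" >> "$fstab_file"
}

# Try it on a scratch file first; /dev/sdX1 stands in for your real device.
scratch=$(mktemp)
add_fstab_entry "$scratch" /dev/sdX1 /mnt/newdisk
add_fstab_entry "$scratch" /dev/sdX1 /mnt/newdisk   # second call changes nothing
cat "$scratch"
```

Run as root against /etc/fstab with your real device name, and followed by “sudo mount -a”, this mounts the new disk without waiting for a reboot.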
After completing the above steps, we have a new disk on which to host the files we want to archive, and a mount point for the CIFS share folder. Our intention is to use Unison to sync the shares to the temporary disk (newdisk) so that Veeam can back it up; we can then remove the old files from the CIFS share to free up storage space.
For this example, we have two groups of files under the CIFS folder:
a. one group with the time stamp of 10/12/2010; and
b. the second group with the time stamp of 21/5/2016.
The Demonstration Run
The demonstration of our solution is as follows:
1. Run Unison to synchronise the first group of files with the time stamp of 10/12/2010 to the new disk.
2. Back up the newdisk with the files using Veeam.
3. Delete the first group of files from the source storage (CIFS).
4. Clean up
1. Configure Unison to sync the shares from the CIFS to newdisk.
The configuration file is “defaults.prf”, in the .unison folder under root. From that folder, open it by typing:
sudo vi defaults.prf
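The contents of the profile are not reproduced in this post, so here is a minimal sketch of what such a profile might contain. Everything in it is an assumption for illustration: the helper name write_unison_profile is made up, the two roots are the mount points used earlier, and the choice of preferences (batch, times, force) is one reasonable setup rather than our exact configuration. The force line turns Unison into a one-way mirror from the share, so files removed from newdisk are never deleted from the share by a later sync.

```shell
#!/bin/sh
# Hypothetical helper: write a one-way Unison profile into the given folder.
write_unison_profile() {
    profile_dir=$1          # for example root's .unison folder
    mkdir -p "$profile_dir"
    cat > "$profile_dir/defaults.prf" <<'EOF'
# Source: the mounted CIFS share. Target: the temporary archive disk.
root = /mnt/share
root = /mnt/newdisk
# Run without prompts, as required inside a backup pre-script.
batch = true
# Propagate modification times so that the later find -newer test works.
times = true
# Assumption: always resolve differences in favour of the share.
force = /mnt/share
EOF
}

# Write the profile to a scratch folder and show it.
scratch_dir=$(mktemp -d)
write_unison_profile "$scratch_dir"
cat "$scratch_dir/defaults.prf"
```

With the profile in place, “sudo unison defaults” runs the sync. Note that with force set this way, files deleted from the share later are also removed from newdisk on the next sync, which is acceptable here because they have already been backed up.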
After the data synchronises to “newdisk”, it is time to remove the new files from it and keep only the files we need to archive. In our example, we keep the group of files dated 10/12/2010. To do this, run the following commands.
2. Set the cut-off date for the old files. The commands shown below keep all files on newdisk dated 10/12/2010 17:30 or earlier, and delete any files newer than that:
touch -t 1012101730 /tmp/test
find /mnt/newdisk -type f -newer /tmp/test -delete
3. Back up with Veeam.
4. Optional: clean up the source storage, keeping only the new files on the CIFS share:
find /mnt/share -type f ! -newer /tmp/test -delete
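Because -delete is irreversible, it helps to see which files each find expression selects before running it for real. The sketch below rebuilds the two groups in a throwaway directory (the file names are invented for the example) and uses -print in place of -delete. It also shows the boundary case: a file stamped exactly at the cut-off is not "newer", so it stays in the archive set.

```shell
#!/bin/sh
# Build a throwaway copy of the two file groups.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/files"
touch -t 201012101730 "$demo_dir/files/old_map.tif"   # Group A: 10/12/2010 17:30
touch -t 201605211449 "$demo_dir/files/new_map.tif"   # Group B: 21/5/2016 14:49

# Reference file carrying the cut-off time stamp.
touch -t 201012101730 "$demo_dir/cutoff"

echo "Newer than the cut-off (removed from newdisk before the backup):"
find "$demo_dir/files" -type f -newer "$demo_dir/cutoff" -print

echo "Not newer than the cut-off (archived, then removed from the share):"
find "$demo_dir/files" -type f ! -newer "$demo_dir/cutoff" -print
```

Once the listings look right, swapping -print for -delete gives the destructive version.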
From the previous steps, you can see that it is easy to back up and clean up files using their time stamps. Now let's script all the commands together and let Veeam run the job to archive the old files and then delete them, while keeping the new files.
In this example, we will have two groups of files:
a. Group A: old files 10/12/2010 17:30
b. Group B: newer files 21/5/2016 14:49
The pre-script will:
a. Sync the CIFS share with “newdisk”
b. Clean up Group B files from “newdisk”
The post-script will:
a. Clean up Group A from the CIFS share after the backup
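The two scripts described above can be sketched as a pair of shell functions. This is a simplified illustration rather than our production scripts: the variable names (SHARE, ARCHIVE, CUTOFF, SYNC_CMD, DRY_RUN) are made up for the example, the paths are the mount points used earlier, and the Unison profile is assumed to be called defaults. With DRY_RUN left at 1, the functions only list the files they would delete.

```shell
#!/bin/sh
# Illustrative settings; none of these names come from Veeam or Unison.
SHARE=${SHARE:-/mnt/share}            # mounted CIFS share
ARCHIVE=${ARCHIVE:-/mnt/newdisk}      # temporary disk that Veeam backs up
CUTOFF=${CUTOFF:-/tmp/oldfiles}       # reference file holding the cut-off time
SYNC_CMD=${SYNC_CMD:-unison defaults -batch}
DRY_RUN=${DRY_RUN:-1}                 # 1 = list files only, 0 = really delete

# Choose the find action according to DRY_RUN.
find_action() {
    if [ "$DRY_RUN" = 1 ]; then echo "-print"; else echo "-delete"; fi
}

# Pre-script: sync the share to the archive disk, then drop Group B from it.
pre_script() {
    touch -t 1012101730 "$CUTOFF"     # 10/12/2010 17:30
    $SYNC_CMD || return 1             # never delete anything after a failed sync
    find "$ARCHIVE" -type f -newer "$CUTOFF" "$(find_action)"
}

# Post-script: after the backup, drop Group A from the share.
post_script() {
    find "$SHARE" -type f ! -newer "$CUTOFF" "$(find_action)"
}
```

Keeping the sync command in a variable makes it easy to rehearse the whole flow against test folders before pointing it at the real share.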
1. Let’s create the scripts on the Linux server, starting with the pre-script:
vi prescript
Add the following lines (in vi, the command :wq saves and quits):
touch -t 1012101730 /tmp/oldfiles
find /mnt/newdisk -type f -newer /tmp/oldfiles -delete
2. The next step, creating a post-script, is optional:
find /mnt/share -type f ! -newer /tmp/oldfiles -delete
3. After the scripts are created, copy them to the Veeam backup server.
**Important note: Do not open these scripts with an MS Windows editor; it may convert the line endings to CRLF, which stops the scripts from running correctly on Linux.
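If a script has already passed through a Windows editor, the stray carriage returns can be detected and removed on the Linux side. The helpers below are a small sketch; the names has_crlf and strip_crlf are invented for this example. The clean-up writes back into the original file, so its permissions (including the executable bit) are preserved.

```shell
#!/bin/sh
# Return success if the file contains at least one carriage return.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Remove every carriage return from the file, keeping its permissions.
strip_crlf() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Example on a throwaway script saved with Windows line endings.
sample=$(mktemp)
printf 'touch -t 1012101730 /tmp/oldfiles\r\n' > "$sample"
if has_crlf "$sample"; then
    echo "CRLF line endings found, cleaning up"
    strip_crlf "$sample"
fi
```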
Veeam Job Creation
1. Name the Job. In this example it is “FileArchivesJob”.
2. Choose the VM and the Exclusions:
3. In this procedure, we are going to back up only the “newdisk” vmdk file.
4. Add the scripts under the Application processing:
5. Provide a username with root (sudo) privileges. In this example, it is veeambackup.
Note: In the event you are having difficulty authenticating with the Linux server using the provided username and password, you can create a user for this exercise with the following commands:
sudo adduser veeambackup
sudo gpasswd -a veeambackup sudo
After those commands, modify the sudoers file with visudo to add the following line:
veeambackup ALL=(ALL) NOPASSWD: ALL
The above workaround allowed our mining company to reduce the size of our backup runs by archiving our old data to tape, disk and VTL.
In addition, it allowed us to back up our CIFS shares on a daily basis without needing to acquire additional software.
I hope our experience, and our workaround, will help you solve a similar challenge. Please don’t hesitate to provide your feedback. Share and contact me for more info.
** Important note: You MUST test all your scripts and workflows before you implement any of these workarounds on your production environment. This is especially important when you are deploying the post script to delete the old files from the CIFS share.