Backing up to the Virtual Tape System
1. Background and Philosophy
1.1 Purpose
This script is designed to back laptops up to the EECIS smb2.eecis.udel.edu Virtual Tape System, though it could be used to back up to any machine with rsync and SSH access. It does not attempt to perform file "versioning"; rather, it relies on the underlying server file system to keep older versions of files (smb2.eecis.udel.edu uses ZFS). This script merely duplicates the files from the local machine to the remote machine.
1.2 Robustness in simplicity
A backup system is, ideally, simple enough to be robust. This script re-uses existing tools to the greatest extent possible. It is written in bash, the remote connection is made with ssh, and the synchronization is performed using rsync. These are all tools with substantial history in the UNIX arena. Mac OS X has them installed as part of the base system, and they can be installed under Windows using the Cygwin environment.
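As a rough illustration of how little is needed, the sketch below prints the kind of rsync-over-ssh command such a script might build. It is not the command EECISbak.sh actually runs: the chosen options and the remote directory layout (a folder named after the machine) are assumptions, and nothing is executed.

```bash
# rsync_command_line SRC REMOTE_USER MACHINE
# Print (do not run) a minimal rsync-over-ssh backup command.
# The destination layout "MACHINE/" is an illustrative assumption.
rsync_command_line() {
    # --archive  preserve permissions, times, and symlinks
    # --compress compress data in transit
    # --partial  keep partially transferred files so a later run can resume
    # --rsh=ssh  use ssh as the transport
    printf '%s\n' "rsync --archive --compress --partial --rsh=ssh $1 $2@smb2.eecis.udel.edu:$3/"
}

rsync_command_line "$HOME/" USERNAME MACHINENAME
```

USERNAME and MACHINENAME are placeholders, as elsewhere in this document.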
1.3 Client-side periodic invocation
The script is intended to be invoked approximately once per hour by the operating-system-dependent periodic-command scheduler (cron for Macs and Task Scheduler for Windows). The script checks a time-stamp file to see when a backup was last performed, and if it has been long enough (12 hours by default), a backup is made.
Laptops come and go from the network, so by trying frequently, there's a good chance that an attempt will eventually be made while the laptop is able to reach the server. The rsync software only transfers the parts of files that have changed, so after the first full copy, updates should be fast even over a relatively slow link. Further, it does a good job of picking up where it left off if it is not able to finish the backup before the laptop is shut down or disconnected from the network.
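The time-stamp check can be sketched as follows. This is a minimal illustration, not the script's own code: it assumes the time-stamp file holds the completion time in seconds since the epoch, and the function names and file path are hypothetical.

```bash
# backup_due STAMP_FILE MIN_HOURS [NOW_EPOCH]
# Succeeds when no stamp exists or the recorded time is at least
# MIN_HOURS old. Assumes the stamp file contains epoch seconds.
backup_due() {
    now=${3:-$(date +%s)}
    [ -f "$1" ] || return 0
    last=$(cat "$1")
    [ $(( now - last )) -ge $(( $2 * 3600 )) ]
}

# record_backup STAMP_FILE [NOW_EPOCH]
# Store the completion time so later hourly runs can skip the backup.
record_backup() {
    printf '%s\n' "${2:-$(date +%s)}" > "$1"
}

# Typical hourly use (hypothetical stamp path):
#   if backup_due "$HOME/EECISbak/timestamp" 12; then
#       ...run the backup, then: record_backup "$HOME/EECISbak/timestamp"
#   fi
```

Recording the stamp only after a successful run means an interrupted backup is retried on the next hourly invocation.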
2. Configuration
2.1 Location and types of files
All files connected to the script should be kept under the EECISbak directory as directed in installation. Files with "sys" in the filename should not be modified by end users since updates to this software may overwrite them.
Note for Windows laptops: Files that end in ".txt" use DOS/Windows-style line breaks and those without use UNIX-style line breaks. Under Windows, either may be modified; if you prefer to use Windows-native software for text-file editing, you will probably find the ".txt" versions easier to work with, and the reverse if you prefer to use Emacs or Vi under Cygwin. Changes will be propagated from the .txt to the non-.txt or vice-versa automatically when the EECISbak.sh script runs.
Configuring default behavior
The config file is "sourced" by the main script and can be used to set the environment for the backup. While complex shell-script logic can be included, it's best to use simple "VARIABLE=value" lines. The sysconfig file shows default values (all commented out with #); if required, this file can be copied to the "config" file and then modified.
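The sourcing mechanism can be sketched like this. The variable name BACKUP_INTERVAL_HOURS and the function load_config are hypothetical, chosen only for illustration; the real variable names are the ones listed in the sysconfig file.

```bash
# Hypothetical built-in default; see sysconfig for the real names.
BACKUP_INTERVAL_HOURS=12

# load_config FILE
# Source FILE if it exists, so that simple VARIABLE=value lines in it
# override the defaults set above.
load_config() {
    if [ -f "$1" ]; then
        . "$1"
    fi
}

load_config "$HOME/EECISbak/config"
echo "interval: $BACKUP_INTERVAL_HOURS hours"
```

Because the file is executed as shell code, a line such as BACKUP_INTERVAL_HOURS=24 in config is all that is needed to change a setting.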
Choosing files and folders to back-up
Configuring which files are included is done by the include, sysinclude, exclude, and sysexclude files. Note that "includes" and "excludes" may be listed in any file by using "+ " and "- " respectively (see the sysexclude file for examples). Each file and directory on the system is matched against the list of patterns in these files, and the first match is applied, so if the following two lines are found:
+ /Users/roosen/
- /Users/*
then the directory /Users/roosen and all contents will be included (unless excluded later), and all other contents of /Users/ will be excluded. If the two lines are reordered, however:
- /Users/*
+ /Users/roosen/
then since /Users/roosen/ will match the first line, it will be excluded. Lines ending in / will only match directories, and lines beginning with / will be matched against the full file pathname.
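The first-match rule can be illustrated with a small stand-alone sketch. It is a simplification and not how rsync is implemented: the hypothetical first_match function treats patterns as ordinary shell globs (where "*" also matches "/", unlike rsync), and directories are written with a trailing slash.

```bash
# first_match PATH RULE...
# Each RULE is "+ PATTERN" or "- PATTERN". Print "include" or "exclude"
# according to the first rule whose pattern matches PATH. A path that
# matches no rule is included.
first_match() {
    path=$1
    shift
    for rule in "$@"; do
        action=${rule%% *}     # text before the first space: "+" or "-"
        pattern=${rule#* }     # text after the first space
        case $path in
            $pattern)
                if [ "$action" = "+" ]; then echo include; else echo exclude; fi
                return 0
                ;;
        esac
    done
    echo include
}

first_match /Users/roosen/ "+ /Users/roosen/" "- /Users/*"   # prints: include
first_match /Users/roosen/ "- /Users/*" "+ /Users/roosen/"   # prints: exclude
first_match /Users/guest/  "+ /Users/roosen/" "- /Users/*"   # prints: exclude
```

The second call shows why order matters: the broad exclude is reached first, so the later include never gets a chance to apply.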
3. Installation
3.1 Windows
Install Cygwin
The backup script depends on several Cygwin packages. Cygwin is easily obtained and installed, but you must be running as an administrator. If Cygwin is already installed, make sure the required packages have been installed.
- Begin by going to http://cygwin.com and clicking on the "Install Cygwin Now" button in the upper-right.
- Run the setup.exe you just downloaded to install (or reconfigure) Cygwin.
- Accept all the defaults (keep clicking next). Select http://mirror.mcs.anl.gov as the mirror and click next.
- In the "select packages" panel, click view until it shows the Full view. Locate and ensure that bash, cygutils, openssh, and rsync are selected for installation. Click next, and wait.
Install EECISbak package
Download http://www.eecis.udel.edu/~roosen/EECISbak.zip. Unzip it in your folder under Documents and Settings so that all the files are stored in an EECISbak folder.
Files include:
- EECISbak.sh: the main backup script (bash shell).
- EECISbak.bat: a DOS script that invokes the above for ease of running directly from Windows.
- EECISbka.bat: a DOS script that invokes the main script for ease of running non-interactively under Windows using Scheduled Tasks.
- readme.txt: this help.
- sysconfig, sysconfig.txt: configuration files for use by system staff in "UNIX" and "DOS" formats. Copy to config or config.txt for your own modifications.
- sysexclude, sysexclude.txt: rsync exclude files for system staff defaults in "UNIX" and "DOS" formats.
- crontab: sample crontab entry for use under Mac OS X.
- host_key: smb2.eecis.udel.edu public SSH key.
Invoke the script for the first time.
Run the EECISbak.bat script (you can just double-click on it in Windows Explorer). When it finishes the first time, it will write a file called authorized_keys.txt. Post the contents of that file into an EECIS Help Request ticket asking for authorization to back up. If you have more than one laptop to back up, note that they need to be given different names in the server authorized_keys.
Configure files to back-up
Look at sysexclude.txt for examples, then modify the exclude.txt file to fit your environment as necessary. Note that full paths should be prefixed by /cygdrive/c/ to indicate files under C:, /cygdrive/d/ for files under D:, etc. Path components should be separated by "/" instead of "\"!
Test run and initial backup.
Once your Help Request has been answered, run the EECISbak.bat once to ensure that backups will go through and to perform a baseline backup of your system. This may take a considerable amount of time and require substantial network traffic (later runs will be much faster).
Configure automatic periodic execution
The EECISbka.bat [sic] script should be set to run hourly to ensure that regular backups will occur.
- Select Start->All Programs->Accessories->System Tools->Scheduled Tasks, then double-click Add Scheduled Task. A "wizard" will start up. Browse to the EECISbka.bat file as the program to be run. The basic wizard doesn't let you set it to run hourly, so just choose daily. Otherwise, accept all the defaults, and be sure to enter your username and password (of your local/laptop account!) where asked. In the last page, select Open Advanced Properties for this Task when I finish.
- In the Advanced Properties pane, select the Schedule tab. Make sure it starts Daily at 12PM. Hit the Advanced button. Click on Repeat Task and set it to every 1 hour with a duration of 24 hours. Click OK.
- In the Settings tab, un-check both boxes under Power Management.
Check logs
The script generates an activity log in EECISbak\log-YYYYMMDDHHMM.txt (with the year, month, day, hour, and minute substituted appropriately). You should check those logs occasionally to make sure all is working properly.
3.2 Mac OS X
All required supporting utilities are installed by default. To install the backup script:
Install EECISbak package
Download http://www.eecis.udel.edu/~roosen/EECISbak.zip and unzip it into your home directory so that the script will be found in "~/EECISbak/EECISbak.sh". Contents are listed above.
Customize the list of files to back up.
Modify ~/EECISbak/exclude to fit your needs (it is just a text file).
Run the script to generate authorized_keys.
Open a terminal (Applications->Utilities->Terminal.app) and run ~/EECISbak/EECISbak.sh. When it finishes the first time, it will write a file called "authorized_keys.txt". Post the contents of that file into an EECIS Help Request ticket asking for authorization to back up. If you have more than one laptop to back up, note that they need to be given different names in the server authorized_keys.
Test run and initial backup.
Once your Help Request has been answered, run EECISbak.sh once more from the terminal to ensure that backups will go through and to perform a baseline backup of your system. This may take a considerable amount of time and require substantial network traffic (later runs will be much faster).
Configure automatic periodic execution.
If you already have a crontab file (run "crontab -l" to check), add the line in EECISbak/crontab to it. Otherwise, run "crontab ~/EECISbak/crontab".
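For those who prefer to script the first case, one way to append the entry without duplicating it is sketched below. The merge_crontab function and the sample ENTRY line are hypothetical; use the line shipped in ~/EECISbak/crontab rather than the one shown here.

```bash
# merge_crontab EXISTING_TEXT NEW_ENTRY
# Print EXISTING_TEXT, followed by NEW_ENTRY unless that exact line is
# already present.
merge_crontab() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    fi
    if ! printf '%s\n' "$1" | grep -qxF -- "$2"; then
        printf '%s\n' "$2"
    fi
}

# Hypothetical hourly entry, for illustration only.
ENTRY='0 * * * * $HOME/EECISbak/EECISbak.sh'

# To install the merged table (not run here):
#   merge_crontab "$(crontab -l 2>/dev/null)" "$ENTRY" | crontab -
```

Running the install line a second time leaves the crontab unchanged, since the entry is only added when it is missing.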
Check logs
The script generates an activity log in EECISbak/log-YYYYMMDDHHMM.txt (with the year, month, day, hour, and minute substituted appropriately). You should check those logs occasionally to make sure all is working properly.
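A quick way to find the most recent log is sketched below. The latest_log function is a hypothetical helper, not part of the package; it relies only on the log-YYYYMMDDHHMM.txt naming described above, which sorts chronologically when sorted as text.

```bash
# latest_log DIR
# Print the newest log-YYYYMMDDHHMM.txt in DIR, or nothing if none exist.
latest_log() {
    newest=
    for f in "$1"/log-[0-9]*.txt; do
        [ -e "$f" ] || continue
        newest=$f    # globs expand in sorted order, so the last match is the newest
    done
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    fi
}

# Example: show the end of the most recent log.
#   tail -n 20 "$(latest_log "$HOME/EECISbak")"
```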
4. Command-line switches
A few command-line switches are available to change the way the script operates.
-i: run in "interactive mode". Rsync logs are printed to the terminal and the script pauses at the end of operation for user acknowledgement.
-f: ignore the existing time stamp. Usually the script will check for the existence of a time-stamp file and exit immediately if insufficient time has passed. This flag will force the script to continue.
-c <filename>: read basic configuration from <filename>. There's probably no reason to use this.
The EECISbak.bat script always uses -i -f.
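Switches like these are commonly handled with the shell's getopts builtin. The sketch below shows one way to do it; the function and variable names are hypothetical and the real script may parse its arguments differently.

```bash
# parse_switches [-i] [-f] [-c filename]
# Set INTERACTIVE, FORCE, and CONFIG_FILE from the given switches.
parse_switches() {
    INTERACTIVE=0
    FORCE=0
    CONFIG_FILE=config      # assumed default: the "config" file described earlier
    OPTIND=1                # reset so the function can be called more than once
    while getopts "ifc:" opt; do
        case $opt in
            i) INTERACTIVE=1 ;;
            f) FORCE=1 ;;
            c) CONFIG_FILE=$OPTARG ;;
            *) echo "usage: EECISbak.sh [-i] [-f] [-c filename]" >&2
               return 2 ;;
        esac
    done
}

parse_switches -i -f
echo "interactive=$INTERACTIVE force=$FORCE config=$CONFIG_FILE"
# prints: interactive=1 force=1 config=config
```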
5. Restoring lost files.
To restore files from the Virtual Tape System, you can either mount the VTS on your computer or recover files from /backups/USERNAME on any EECIS UNIX system (where USERNAME is your EECIS login name).
5.1 Mounting the VTS on a client
Mounting the VTS on a Windows client
Right-click My Computer from the Start menu and select Map Network Drive. Choose an unused drive letter and enter \\smb2.eecis.udel.edu\USERNAME as the Folder. Note that USERNAME must be your EECIS login user name, and if your login name on your laptop is different, you'll need to select a different user name and enter the appropriate information in the popup.
Mounting the VTS on a Mac client
Select Go->Connect to Server from the Finder menu bar, and enter smb://smb2.eecis.udel.edu/USERNAME as the server address. Be sure to select Registered User and enter your EECIS user name and password as your name and password (your "full name" may be incorrectly pre-filled).
5.2 Older file versions
Regardless of how you locate your backups, you should normally see two folders. One folder will be named for your machine, and is where the most recent backup files are found (e.g., Z:\MACHINENAME\cygdrive\c). The other folder is named Archive and points to a hierarchy of archived copies (e.g., Z:\Archive\snapshot). The hierarchy is as follows: 12 and 18 are the previous Noon and 6pm copies, Mon through Sun are the Noon copies from those days this week, w0 through w7 are the Friday Noon copies from the previous 8 weeks, and m0 through m2 are the end-of-month Noon copies for the previous 3 months. Older versions are not stored.
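When working from an EECIS UNIX system, the archived copies of one file could be located with a sketch like the following. The function names are hypothetical, and the sketch assumes that each snapshot directory mirrors the layout of the current backup, which should be checked against what you actually see under Archive.

```bash
# snapshot_names
# Print every snapshot directory name in the hierarchy described above.
snapshot_names() {
    echo 12 18
    echo Mon Tue Wed Thu Fri Sat Sun
    echo w0 w1 w2 w3 w4 w5 w6 w7
    echo m0 m1 m2
}

# list_versions SNAPSHOT_ROOT RELATIVE_PATH
# Print each archived copy of RELATIVE_PATH that exists under SNAPSHOT_ROOT.
list_versions() {
    for snap in $(snapshot_names); do
        if [ -e "$1/$snap/$2" ]; then
            printf '%s\n' "$1/$snap/$2"
        fi
    done
}

# Example (placeholder names and an assumed path):
#   list_versions /backups/USERNAME/Archive/snapshot MACHINENAME/path/to/file
```

Once the wanted version is found, it can be copied back with ordinary tools such as cp.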
6. Quirks
Under Windows, a window pops up every time the .bat file is run even if there is no output. I'm sure there's a better way to handle running the script that doesn't do that.
Cron is outdated for MacOS: launchd is the "right way to do it".