Difference between revisions of "Rsync - synchronizes files and directories from one location to another"

From NAS-Central Buffalo - The Linkstation Wiki
m (LinkStations)
 
(7 intermediate revisions by 5 users not shown)

Latest revision as of 16:57, 9 March 2009



Background

rsync[1] is a free software computer program for Unix systems which synchronizes files and directories from one location to another while minimizing data transfer using delta encoding when appropriate. An important feature of rsync not found in most similar programs/protocols is that the mirroring takes place with only one transmission in each direction.[2] [3]

rsync can copy or display directory contents and copy files, optionally using compression and recursion.

rsyncd, the rsync protocol daemon, listens on TCP port 873 by default. rsync can also synchronize local directories, or work through a remote shell such as RSH or SSH. In the latter case, the rsync executable must be installed on both the local and the remote host (the computer running the remote shell daemon). There is also a utility called rdiff[4], which can be used for incremental backups.

For Mac OS X there is a special version, rsyncX[5][6], which allows transferring resource forks. To run rsync on Microsoft Windows, the Cygwin package is necessary[7] to provide the expected system interfaces. A bundle that includes rsync, Cygwin and an installer is available, which makes setup easier and more familiar for Windows users[8].

There are several well-written tutorials on using rsync[9][10][11][12].

If you are interested in fully automated, ssh-secured backups from one machine to another, see the third example below and read the Debian Admin HOW-TO on setting up ssh/RSA keys (linked in the section on unattended ssh login), so that your computer can securely back itself up to your LinkStation while you are not there.

Installation

Compile from source

On any distribution (FreeLink or OpenLink)

  1. Make sure you have installed the Precompiled C development environment, running on the LS first.
  2. Get the source, make and install.
wget http://samba.org/ftp/rsync/rsync-3.0.2.tar.gz
tar xfzv rsync-3.0.2.tar.gz 
cd rsync-3.0.2
./configure 
make 
su root 
make install
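
The steps above repeat the version number in several places, which makes it easy for the tarball name and the directory name to drift apart. A minimal sketch, assuming a POSIX shell: the names are derived from a single variable. `RSYNC_VERSION`, `TARBALL`, `SRC_DIR` and `build_rsync` are names chosen here for illustration and are not part of the original instructions.

```shell
#!/bin/sh
# Illustrative only: keep the version number in one place.
RSYNC_VERSION="3.0.2"
TARBALL="rsync-${RSYNC_VERSION}.tar.gz"
SRC_DIR="${TARBALL%.tar.gz}"      # strips the suffix, leaving rsync-3.0.2

# Hypothetical helper wrapping the steps from this section.
# It is only defined here; call it yourself when you want to build.
build_rsync() {
    wget "http://samba.org/ftp/rsync/${TARBALL}" &&
    tar xzvf "${TARBALL}" &&
    cd "${SRC_DIR}" &&
    ./configure &&
    make
    # 'make install' still has to be run as root afterwards
}

echo "would unpack ${TARBALL} into ${SRC_DIR}"
```

Calling `build_rsync` performs the download and build; installation as root remains a separate, deliberate step.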

Raw Binaries

PowerPC

From the Yahoo! Linkstation General Group [13]

wget http://ls.jcedata.net/rsync
chmod a+x rsync
cp rsync /usr/bin

FreeLink

Use apt-get to install rsync

apt-get install rsync

OpenLink (Ipkg)

PowerPC

ipkg install rsync

MIPSel

Alexander Skwar has created a fairly extensive selection of Ipkg packages for the MIPSel (LS2) LinkStation. Install Ipkg and enable his feed, then run:

ipkg install rsync

LinkStations

The manufacturer (Buffalo) of LinkStations has installed sshd and rsync on these machines but has not enabled them, probably because it is difficult to maintain a warranty once a backup device has been converted into a general-purpose computer. It is relatively easy to enable this software if you are willing to void your warranty; see Enabling Rsync on a Linkstation Mini.

Examples

  • Create a backup named with today's date (formatted yyyy-mm-dd)
rsync -a /SOURCE/ /DEST/`date +%Y-%m-%d`/
  • A script to create a backup, named by date, which will save space by creating hard links to files which are already backed up. It requires a symbolic link to the most recently created backup dir (similar to the last line of the script)
#!/bin/sh
BACKUP_DATE=`date +%Y-%m-%d`
rsync -a --delete --link-dest=/DEST/most-recent-backup /SOURCE/ /DEST/$BACKUP_DATE/
rm -f /DEST/most-recent-backup
ln -s /DEST/$BACKUP_DATE /DEST/most-recent-backup
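
How the most-recent-backup link gets moved to the newest dated directory can be tried safely in a scratch directory. This is only a sketch: it assumes `mktemp -d`, `readlink` and `ln -sfn` are available (they are in GNU coreutils and BusyBox, but that is an assumption about your firmware), the directories are made up, and rsync itself is not invoked.

```shell
#!/bin/sh
# Scratch area standing in for /DEST (illustrative paths only).
DEST=$(mktemp -d)
BACKUP_DATE=$(date +%Y-%m-%d)

mkdir "$DEST/1999-12-31" "$DEST/$BACKUP_DATE"
ln -s "$DEST/1999-12-31" "$DEST/most-recent-backup"   # link left by an earlier run

# -f replaces the old link; -n stops ln from descending into the
# directory that the old link points to.
ln -sfn "$DEST/$BACKUP_DATE" "$DEST/most-recent-backup"

LINK_TARGET=$(readlink "$DEST/most-recent-backup")
echo "most-recent-backup -> $LINK_TARGET"
```

Replacing the link in one `ln -sfn` step avoids the short window in which no most-recent-backup link exists at all.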


  • This one-liner is incorporated into my crontab, giving me a daily backup of my KuroboxHG's MP3s as it rips them, and a log of the backup. Use it either as a one-liner, or split it into separate lines and use it as a bash script. It does the following:
    • prints a header with a timestamp
    • executes a secure, ssh-keyed machine-to-machine transfer and logs the rsync output, including a record of which files were updated and the average transfer speed of the backup
    • prints a footer with a timestamp, and then a blank line
    • appends all the output, so that the log is cumulative (I also have log rotation enabled)
echo ========kurohg mediarippertunes backup started $(date) ====== >> /var/log/musicbackuplog.txt  ;
rsync -av --rsh=ssh /mnt/mediarippertunes/ 10.0.1.10:/mnt/share/mediarippertunes >> /var/log/musicbackuplog.txt ;
echo  ======== end of backup run at $(date) ========  >> /var/log/musicbackuplog.txt ;
echo    >>  /var/log/musicbackuplog.txt     


  • Here is another one-liner which, using cron, gives me a weekly backup of my LS-HG's data partition hda3 onto a USB hard drive and logs everything. Use it either as a one-liner, or split it into separate lines and use it as a bash script. It does the following:
    • prints a header with a timestamp
    • executes the backup and logs the rsync output, including a record of which files were updated and the average transfer speed of the backup
    • prints a footer with a timestamp, and then a blank line
    • appends all the output, so that the log is cumulative (I also have log rotation enabled)
echo ========FreeLink /dev/hda3 JFS backup started $(date) ====== >> /var/log/weeklybackuplog.txt  ;
rsync -av /mnt/share/shareddirs/ /mnt/share/usb0/backups/shareddirs >> /var/log/weeklybackuplog.txt ;
echo  ======end of backup run at $(date) =======  >> /var/log/weeklybackuplog.txt ;
echo >>  /var/log/weeklybackuplog.txt
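
The header, command and footer pattern shared by the two one-liners above could be wrapped in a small function. This is a hedged sketch rather than the author's script: `run_logged` is a hypothetical helper name, the log file is a temporary file, and a harmless `echo` stands in for rsync so the example can be run anywhere.

```shell
#!/bin/sh
# Hypothetical helper: run a command and append a timestamped header,
# the command's output and a footer to a log file.
run_logged() {
    log_file=$1
    shift
    {
        echo "======== backup started $(date) ========"
        "$@"
        status=$?
        echo "======== end of backup run at $(date), exit status $status ========"
        echo
    } >> "$log_file" 2>&1
    return "$status"
}

# Demonstration with a harmless command instead of rsync.
LOG_FILE=$(mktemp)
run_logged "$LOG_FILE" echo "pretend rsync output"
cat "$LOG_FILE"
```

In a crontab entry the call might look like `run_logged /var/log/weeklybackuplog.txt rsync -av /mnt/share/shareddirs/ /mnt/share/usb0/backups/shareddirs`. Unlike the one-liners, this variant also records the exit status and captures error messages in the log.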


Using keygen and other utils for automatic, unattended login via ssh

If you want to have your rsync happen automatically, then set up keys for ssh. Read the following for instructions:

http://freebsd.peon.net/quickies/21/

http://www.debianadmin.com/ssh-your-debian-servers-without-password.html
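
In outline, the linked guides come down to generating a key pair without a passphrase and installing the public half on the target machine. A minimal sketch, assuming an OpenSSH client is installed; the key location is a scratch directory chosen for illustration, and the user, host and paths in the final comment are placeholders.

```shell
#!/bin/sh
# Illustrative key location; a real key would normally live in ~/.ssh.
KEY_DIR=$(mktemp -d)
KEY_FILE="$KEY_DIR/backup_key"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -N "" sets an empty passphrase so that cron jobs can use the key,
    # -q keeps the output quiet, -f names the private key file.
    ssh-keygen -q -t rsa -N "" -f "$KEY_FILE"
    echo "public key to append to ~/.ssh/authorized_keys on the target:"
    cat "$KEY_FILE.pub"
else
    echo "ssh-keygen not found; install an OpenSSH client first"
fi

# Afterwards rsync can be pointed at the key, for example:
#   rsync -av -e "ssh -i $KEY_FILE" /SOURCE/ user@host:/DEST/
```

A key without a passphrase gives anyone who can read the private key file access to the target, so keep that file readable only by the account that runs the backup.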

Editing Rsync rsnapshot backup for easy LS archiving

rsnapshot is a script that uses rsync to create a space-saving archive of data (see http://www.rsnapshot.org/ for more details). Usually you configure rsnapshot to do periodic backups of certain disk areas. The backup destination may even be a remote site. Anybody who has a remote Unix server available does not need to read the following lines; you are better off doing a straight rsnapshot backup to your Unix machine.

If you only have a Windows PC available, however, there are some limitations. Because Unix and Windows filesystems differ (ext3 vs. NTFS), a 1:1 backup from the LS to a Windows PC is not possible. Even if you run Cygwin, for example, some permissions and ownerships will get lost if you simply rsnapshot to your PC.

The following script works around this. It assumes that you have an rsync receiver available at the backup target; the Cygwin tools are a very good choice for providing one. What rsync_rsnapshot does is package all rsnapshot backups on the Unix side (your LS) and transfer only a single tar file to the Windows PC. This has one major drawback that needs to be mentioned: rsync's great strength is that it transfers only data that has changed, and it does this very well when comparing individual files. Applying the delta algorithm to a single compressed tar file is far less efficient. On the other hand, the network connection between the LS and the backup destination is usually fast, so transferring larger volumes of data does not matter too much.
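
The packaging idea can be tried on a small scale before touching real snapshots. The sketch below assumes GNU tar and uses made-up scratch paths in place of RS_ROOT and TAR_FILE; unlike the script in this section it stores relative paths (via `-C`) rather than absolute ones.

```shell
#!/bin/sh
# Stand-ins for RS_ROOT and TAR_FILE (illustrative scratch paths).
WORK=$(mktemp -d)
SNAP_ROOT="$WORK/rsnapshot"
ARCHIVE="$WORK/rs_demo.tgz"

mkdir -p "$SNAP_ROOT/weekly.0"
echo "some data" > "$SNAP_ROOT/weekly.0/file.txt"
chmod 640 "$SNAP_ROOT/weekly.0/file.txt"

# Owners and modes are recorded inside the archive, so the Windows
# filesystem never has to represent them. -C keeps the stored paths relative.
tar --numeric-owner -czf "$ARCHIVE" -C "$WORK" rsnapshot

# Listing the archive shows the recorded mode and owner of each entry.
LISTING=$(tar -tzvf "$ARCHIVE")
echo "$LISTING"
```

Because the permissions travel inside the archive, they survive on NTFS and come back when the archive is unpacked on a Unix system.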

You need to configure rsnapshot to create a local backup first. The following script is parameterized by setting some global variables:

  • RS_ROOT: The root folder for your archive on the LS (you may need to allow a considerable amount of space for this).
  • TAR_FILE: The name of the tar file on the local (LS) destination.
  • RSYNC_SERVER_DEST: The FQDN of the PC running the Cygwin tools with a listening SSHD and rsync installed (any Unix box would do as well).
  • RS_MOST_RECENT: The subfolder of the rsnapshot tree that is updated most often. If you're doing daily backups, this is usually "daily.0"; if running weekly backups, it should be "weekly.0".
  • RSYNC_USER_DEST: The name of the user on the destination box.
  • RSYNC_FOLDER_DEST: The destination folder on the Cygwin box.
  • RSYNC_SSH_DSA: Name of the key file for the DSA private ssh key.
#!/bin/bash
RS_ROOT='/share/backup/rsnapshot'
RS_MOST_RECENT='weekly.0'
RS_LOCK_FILE='/var/run/rsnapshot.pid'
MY_LOCK_FILE='/var/run/rsync_rsnapshot.pid'
MY_NAME=`basename $0`
TAR_FILE='/share/backup/rsnapshot/rs_nas1.tgz'
LOG_TAG='rsync_rsnapshot'
RSYNC_SERVER_DEST='dest3'
RSYNC_USER_DEST='sshrsync'
RSYNC_FOLDER_DEST='/cygdrive/n/backup/nas/'
RSYNC_SSH_DSA='dest3_backup.id_dsa'

# check for existence of own lockfile
if [ -e "${MY_LOCK_FILE}" ]; then
        # check if pid from lockfile does still exist
        LOCK_PID=`cat ${MY_LOCK_FILE}`
        LOCK_PGM=`ps -p ${LOCK_PID} -o comm=`

        # check if the lock pid is valid
        if [ "${LOCK_PGM}" = "${MY_NAME}" ]; then
                logger -p syslog.info -t ${LOG_TAG} "previous run is still active: ${LOCK_PID}"
                exit;
        else
                # write log record
                logger -p syslog.error -t ${LOG_TAG} "removing old lock file with dead or wrong process: ${LOCK_PGM}"
                rm ${MY_LOCK_FILE}
        fi
fi

# create lockfile
echo $$ > ${MY_LOCK_FILE}

# check if everything went ok
LOCK_PID=`cat ${MY_LOCK_FILE}`
if [ $$ -ne ${LOCK_PID} ]; then
        # lockfile could not be created properly
        logger -p syslog.error -t ${LOG_TAG} "could not create lock file, conflict with pid=${LOCK_PID}"
        exit 1;
fi


# check if tar file exists or if folder has newer timestamp than the tar file
if [ ${RS_ROOT}/${RS_MOST_RECENT} -nt ${TAR_FILE} ]; then
        logger -p syslog.info -t ${LOG_TAG} "start tar file creation"

        # exit if rsnapshot pid file exists
        if [ -e "${RS_LOCK_FILE}" ]; then
                # write log record
                logger -p syslog.warning -t ${LOG_TAG} "rsnapshot run in progress, exiting..."
                rm ${MY_LOCK_FILE}
                exit;
        fi

        # generate tar file (overwrite old one if it exists)
        nice tar --numeric-owner --one-file-system --preserve --exclude=${TAR_FILE} -czPf ${TAR_FILE} ${RS_ROOT}
        chmod 600 ${TAR_FILE}

        # write log record
        logger -p syslog.info -t ${LOG_TAG} "new tar file created"
fi

# rsync tar archive to destination server
logger -p syslog.info -t ${LOG_TAG} "start/check for rsync transfer"
RSYNC_OUT=`nice rsync -e "ssh -i $HOME/.ssh/${RSYNC_SSH_DSA}" -av --delete-excluded --timeout=30 --partial --whole-file ${TAR_FILE} ${RSYNC_USER_DEST}@${RSYNC_SERVER_DEST}:${RSYNC_FOLDER_DEST}`
# save the exit status at once; every later command overwrites $?
RSYNC_RC=$?

# write log record
if [ ${RSYNC_RC} -eq 0 ]; then
        logger -p syslog.info -t ${LOG_TAG} ${RSYNC_OUT}
else
        logger -p syslog.error -t ${LOG_TAG} "rsync failed (${RSYNC_RC})"
        rm ${MY_LOCK_FILE}
        exit ${RSYNC_RC}
fi

# remove lock file
rm ${MY_LOCK_FILE}
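
The lock-file handling in the script can be exercised on its own. The following is a simplified sketch, not the script's exact logic: `acquire_lock` is a hypothetical helper, the lock lives in a scratch directory, and liveness is tested with `kill -0` instead of comparing process names with `ps`. It is not atomic, so two runs starting at the same instant could still collide.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if no live process holds the lock.
acquire_lock() {
    lock_file=$1
    if [ -e "$lock_file" ]; then
        old_pid=$(cat "$lock_file")
        # kill -0 sends no signal; it only tests whether the pid exists.
        if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
            return 1            # previous run still active
        fi
        rm -f "$lock_file"      # stale lock left by a dead process
    fi
    echo $$ > "$lock_file"
}

LOCK_DIR=$(mktemp -d)           # scratch location (illustrative)
LOCK="$LOCK_DIR/demo.pid"

acquire_lock "$LOCK" && FIRST=acquired     # no lock file yet
acquire_lock "$LOCK" || SECOND=refused     # our own live pid holds it
echo 999999999 > "$LOCK"                   # simulate a stale lock
acquire_lock "$LOCK" && THIRD=acquired     # stale lock is replaced
rm -f "$LOCK"
echo "$FIRST $SECOND $THIRD"
```

The `ps`-based name comparison used in the script additionally guards against a recycled pid that now belongs to an unrelated program, which `kill -0` alone cannot detect.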

The script tries to connect to the backup destination all day long. Whenever the destination is up and running, the script starts transferring the latest tar file. So the level of safety depends on the availability of your backup box.

This gives you a convenient way to create a historical archive of your data on the LS and to transfer the whole archive to another machine, usually your working PC. Personally, I use rsync_rsnapshot to keep an up-to-date mirror of my system partition. For very large data volumes, an rsync approach tends to back up more slowly and its suitability should be questioned. rsync_rsnapshot is at its best providing archive and history information for fast-changing environments.

References

  1. Wikipedia: rsync
  2. rsync homepage
  3. rsync algorithm
  4. rdiff-backup homepage
  5. RsyncX - Frontend for rsync under Mac OS X
  6. Fixing rsync on MacOS X 10.4 (Tiger) - http://www.onthenet.com.au/~q/rsync/
  7. Rsync for Windows - using Cygwin
  8. NasBackup rsync Windows GUI
  9. Tutorial: Using rsync
  10. Tutorial: Mirroring with rsync
  11. Tutorial: Backing up files with rsync
  12. Tutorials (with screenshots) for setup of Rsync on Windows, Rsync on Linux/Unix/BSD and Rsync on Mac OS X
  13. From the Yahoo! Linkstation General Group: rsync binaries anyone / SSH without password as root INTO LinkStation