How To Use Rsync Between Computers

If you are new to Rsync, please visit our How To Use Rsync – The Basics post. In it we break down what Rsync is and its basic usage. It will provide you with a good background to understand the details of using Rsync between computers.

Rsync Between Remote Computers

Although Rsync does a great job of synchronizing files between local folders, it really shines when working between remote computers. And if you are familiar with using SSH from the command line, you will find Rsync relatively easy to use.

The basic command is pretty simple, and as long as you have SSH available and rsync installed on the remote machine(s), these formats will work.

From a remote source:

rsync [options] [user]@[source computer]:[source folder] [destination folder]

Or to a remote source:

rsync [options] [source folder] [user]@[destination computer]:[destination folder]

Or between two remote computers:

rsync [options] [user]@[source computer]:[source folder] [user]@[destination computer]:[destination folder]

Rsync Between Remote Computers with SSH Examples

rsync user@192.168.1.1:~/source/file /home/user/destination/
rsync /home/bdoga/source/file user@192.168.1.2:~/destination/
rsync user@192.168.1.1:~/source/file user@192.168.1.2:~/destination/

In these examples, “file” will be placed in the destination directory on either the local or remote computer. Notice that for the remote machines a single “:” colon was used. This tells rsync to use a remote shell, typically SSH, to make the connection; it will then start rsync on the remote side of the connection to handle the details. Alternatively, you can force the connection to use an rsync daemon by specifying a “::” double colon instead.
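
As a quick illustration of the daemon syntax, a command like the following should work; note that the “backups” module name here is hypothetical, and would have to be defined in the remote machine’s rsyncd.conf:

rsync -av user@192.168.1.1::backups/source/ /home/user/destination/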

Using the native rsync protocol alone is a little faster because it avoids the SSH connection overhead, but the connection is not encrypted, so there are trade-offs to either option. I generally use the SSH option, since I already have SSH available and configured on my servers.

Some more useful options

I already discussed the “-a” (archive) option in my Rsync Basics post, and it is my go-to option for ensuring an exact copy, permissions and all, is made. Now that we are connecting to a remote machine, the “-z” (compress) option gets the chance to shine a bit. When you are transferring data over the internet you may not always have a fast connection, and compression can ensure that much less bandwidth is required to transfer your data.
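
One related tweak worth knowing: compressing data that is already compressed just burns CPU for no gain. If your transfer includes such files, something like the following should help; the suffix list here is only an example, as rsync’s --skip-compress option takes a slash-separated list of file suffixes to send uncompressed:

rsync -avz --skip-compress=gz/zip/jpg/mp4 user@192.168.1.1:/home/user/source/ /home/user/destination/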

Another option that is sometimes useful with remote connections is “-P” (shorthand for “--partial --progress”). It displays the current progress of the file being copied, and it keeps partial copies of files if a transfer is interrupted during the sync. In my opinion the progress display is great when you are transferring larger files, but if you are moving lots of little files the output is not very useful, and the overhead of producing it can cause a noticeable slowdown in a transfer.
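
If you expect interruptions, a variant like this can make resuming cleaner by keeping partial files in their own directory rather than in place; the “.rsync-partial” name is just a common convention, not a requirement:

rsync -avzh --partial-dir=.rsync-partial user@192.168.1.1:/home/user/source/ /home/user/destination/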

One additional pair of options are “--include” and “--exclude”. They are pretty self-explanatory, in that they allow you to include or exclude specific files from your sync. These options can be used to fine-tune what you are copying from a directory and ensure you only get what you want, as the examples below show.

More Remote Computer Rsync Examples

rsync -avzhP user@192.168.1.1:/home/user/source/ /home/user/destination/
rsync -avzhP --exclude 'dbname*.sql' --include '*.sql' --exclude '*' user@192.168.1.1:/home/user/source/ /home/user/destination/

In the second example only .sql files are copied from the source, except for any whose name starts with “dbname”. Note that rsync checks filter rules in the order given and the first matching pattern wins, so the more specific “dbname*.sql” exclude has to come before the general “*.sql” include, and the final “--exclude '*'” is what filters out everything else. You can chain as many of these rules as needed to get all the files you want in one go.

rsync -avzhP --include '*.html' --include '*.php' --exclude '*' user@192.168.1.1:/home/user/source/ /home/user/destination/

In this next example, all .html and .php files will be copied, but nothing else.
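
One caveat with a trailing “--exclude '*'”: it also excludes directories, so rsync will not descend into subfolders. If you want matching files from the whole tree, a sketch like this should work; the “--include '*/'” rule lets rsync recurse, and “-m” prunes directories that end up empty:

rsync -avzhPm --include '*/' --include '*.html' --include '*.php' --exclude '*' user@192.168.1.1:/home/user/source/ /home/user/destination/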

Conclusion

Rsync continues to be a super useful utility in your systems administration toolkit. Now that you have a good understanding of its usage, you are ready to tackle some of Rsync’s more advanced features, or learn how other programs like Rdiff-backup build upon it to create awesome tools.

How To Use Rsync – The Basics

Rsync is one of the most useful tools for a systems administrator, regardless of what your specific role or responsibility is. At some point you are going to need to copy data from one place to another, and Rsync is the tool that will help ensure you quickly and accurately make a copy of your data. So in this post I hope to convey how to use Rsync, focusing on the basic uses that I find most helpful each day.

What is Rsync?

Rsync was initially built as a basic clone of “rcp” (remote copy), but with a handful of additional features. That handful of features has expanded over the years and made Rsync an indispensable tool. It can be used to copy files between directories on a local computer, or to copy files to and from remote systems. My favorite part of Rsync is its ability to quickly compare the source and target locations, so that only new, updated, or otherwise changed files are transferred, helping you save time and bandwidth when copying large numbers of files.

So How do I Use Rsync?

The basic command is pretty simple, rsync [options] [source] [destination], and in this simple form you can easily copy data between local directories, e.g.:

rsync /home/bdoga/source/file /home/bdoga/destination/

This command will take “file” and place it inside the “/home/bdoga/destination/” directory. If you would instead like to copy all of the contents of one directory into another, you simply need to add the “-r” (recursive) option, e.g.:

rsync -r /home/bdoga/source/ /home/bdoga/destination/

Thus all of the contents of “/home/bdoga/source/” will now be copied into “/home/bdoga/destination”. It is important to note that if a file with an identical name exists in the destination, it will be overwritten. In addition, the “-r” option does not preserve ownership, permissions, or access/modification timestamps. That is where the next option, “-a” (archive), comes in.

It is also important to note that if you want to copy just the contents of the source directory, you must end the source path with a trailing “/”. If you leave off the trailing “/”, Rsync will copy the specified directory itself, as well as its contents, into the destination, rather than just the contents of the directory.
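
To illustrate with the earlier paths:

rsync -r /home/bdoga/source/ /home/bdoga/destination/
rsync -r /home/bdoga/source /home/bdoga/destination/

The first command places the contents of “source” directly inside “/home/bdoga/destination/”, while the second creates “/home/bdoga/destination/source/” and copies the contents into that.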

The most useful options

The “-a” (archive) option not only copies the files recursively, but also preserves the file ownership, permissions, and timestamps. I find this the most useful option, because when I copy a source directory I typically want to be able to restore it with the permissions intact.

Another option that is sometimes useful, depending on the scenario, is the “-z” (compress) option. It instructs Rsync to compress the files in transit so they use less bandwidth. It is not always useful when copying files over a Gigabit or faster LAN, but it can be helpful over a slower internet connection.

The next most useful option I frequently use is “-v” (verbose), which tells Rsync to give you more information about the files being transferred. This can be useful to see exactly what is being transferred, and it lets you know exactly what was and was not copied if there is an issue.

And then there is the “-h” (human-readable) option, which makes sure that all numbers/sizes are printed in an easily readable format. For instance, rather than reporting that 856342348 bytes were transferred, it would report 816.67MB.

All of these options can be used together as needed, as in this example, which recursively transfers the files while preserving their permissions and timestamps, gives verbose output, and compresses the data in transit:

rsync -avzh /home/bdoga/source/ /home/bdoga/destination/

Sample Command Output

 
 root@bdoga:~/test# ls -lah test1
 total 4.0K
 drwxr-xr-x 3 root  root    76 Dec 21 18:33 .
 drwxr-xr-x 4 root  root    32 Dec 21 17:45 ..
 -rw-r--r-- 1 bdoga bdoga    7 Dec 21 17:47 bob
 -rw-r--r-- 1 bdoga bdoga    0 Dec 21 17:46 doug
 drwxr-xr-x 2 root  root    18 Dec 21 17:46 subdir
 -rw-r--r-- 1 bdoga bdoga  10M Dec 21 18:33 test.img
 -rw-r--r-- 1 bdoga bdoga 100M Dec 21 18:33 test2.img
 
 root@bdoga:~/test# rsync -avh ./test1/ ./test2
 sending incremental file list
 ./
 bob
 doug
 test.img
 test2.img
 subdir/
 subdir/file
 
 sent 115.37M bytes  received 122 bytes  46.15M bytes/sec
 total size is 115.34M  speedup is 1.00

 root@bdoga:~/test# rm -rf test2/*
 root@bdoga:~/test# rsync -avzh ./test1/ ./test2
 sending incremental file list
 ./
 bob
 doug
 test.img
 test2.img
 subdir/
 subdir/file
 
 sent 112.61K bytes  received 122 bytes  25.05K bytes/sec
 total size is 115.34M  speedup is 1,023.21
 
 root@bdoga:~/test# ls -lah test2
 total 111M
 drwxr-xr-x 3 root  root    76 Dec 21 18:33 .
 drwxr-xr-x 4 root  root    32 Dec 21 17:45 ..
 -rw-r--r-- 1 bdoga bdoga    7 Dec 21 17:47 bob
 -rw-r--r-- 1 bdoga bdoga    0 Dec 21 17:46 doug
 drwxr-xr-x 2 root  root    18 Dec 21 17:46 subdir
 -rw-r--r-- 1 bdoga bdoga  10M Dec 21 18:33 test.img
 -rw-r--r-- 1 bdoga bdoga 100M Dec 21 18:33 test2.img 

The above command output shows the contents of the source and destination directories, as well as the difference between running rsync with and without the “-z” option.

Conclusion

Rsync will become a super useful part of your systems administration toolkit. Now that you have a basic understanding of how to use Rsync, you are ready to see how to connect to a remote computer, or learn how other programs like Rdiff-backup build upon it to create awesome tools.

Recursively Count the Number of Files in a Directory

Why would you want to recursively count the number of files or folders in a directory? There could be a lot of different reasons. For myself, I had a client that repeatedly added new directories to a folder. Some of those directories had unique contents, and some were copies of other folders. The folders contained text documents, zip files, images, database files; you name it, it was in there. Running a recursive ‘du’ command on the root folder showed a size of approximately 50GB, and it was obvious that there were thousands of folders and subfolders to check.

One might think of using ‘ls’ (list) to count the number of files in a directory. But running an ‘ls’ command alone will only show you the files in the directory; it won’t count them for you. You can, however, pair it with the ‘wc’ (word count) command to count the number of lines returned. A command like this will give you the number of entries in your current working directory:

ls -1 | wc -l

But that only counts the files and folders directly in the current directory. It will not give you an accurate picture of the number of files or folders inside the subfolders of your current working directory.
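
As an aside, if you only want regular files in the current directory, without the folders, find can do that too; the -maxdepth option used here assumes GNU find:

find . -maxdepth 1 -type f | wc -l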

How To Recursively Count the Number of Files in a Directory

Since the “ls” command won’t give us a recursive listing of files or folders, we have to turn to the “find” utility. Find searches recursively through a directory tree for files matching the names or attributes you specify, which makes it exactly the tool for the job. For example, the following command will search recursively through your current directory tree and return a list of all the files it finds.

find . -type f

And likewise you can do the same to specify searching for only directories.

find . -type d

Or removing the “-type” option will return all files and folders in this folder and its children.

find .

So now that we have the list of all files or folders in this directory and its subdirectories, we can count them up by adding our old friend “wc” again. With a command like this we get a count of all the files in your current working directory and its children:

find . -type f | wc -l

or for directories only:

find . -type d | wc -l
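
One caveat: filenames can legally contain newline characters, which would inflate a line-based count. If that is a concern, a NUL-delimited variant like this should be more robust; it assumes GNU find and tr, and counts the NUL separators instead of lines:

find . -type f -print0 | tr -dc '\0' | wc -c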

Now you can quickly count the files and folders in a given directory to easily assess how many files you are dealing with.


Make a Full Disk Backup with DD

Recently I had a drive that was showing the early warning signs of failure, so I decided I had better make a backup copy of the drive, and then push that image onto another drive before anything was lost. As it turned out, the drive was fine; it was the SATA cable that was failing. But the process reminded me what a useful tool dd is, refreshed my knowledge of how to use this remarkable tool, and reminded me how to make a full disk backup with dd.

What is DD?

DD is commonly said to stand for “Data Definition” (a nod to the IBM JCL statement of the same name), and it has been around since about 1974. It can be used to read, write, and convert data between files, partitions, and other block-level devices. As a result, dd can be used effectively for copying the contents of a partition, obtaining a fixed amount of random data from /dev/random, or performing a byte-order transformation on data.

So Let's Make a Full Disk Backup with DD

I will start with the command I used to make a full disk backup with dd. And then give you a breakdown of the different command elements to help you understand what it is doing.

dd if=/dev/sdc conv=sync,noerror status=progress bs=64K | gzip -c > backup_image.img.gz

The command options break down like this:

if=/dev/sdc defines the “input file”, which in this case is the full drive “/dev/sdc”. You could do the same with a single partition, like “/dev/sdc1”, but I wanted all the partitions on the drive stored in the same image.

conv=sync,noerror the “sync” part tells dd to pad any short read out to the full block size with NUL bytes, so the output stays the same size as the input and block offsets remain aligned. The “noerror” portion prevents dd from stopping when a read error is encountered. The two options are almost always used together.

status=progress tells dd to regularly report how much data has been copied. Without this option the command will still run, but it won’t give any output until it is complete, so a backup of a very large drive could sit for hours before letting you know it is done. With this option, a line like the following is continuously updated to let you know how far along the process is:

1993998336 bytes (2.0 GB, 1.9 GiB) copied, 59.5038 s, 33.5 MB/s

bs=64K specifies that the “block size” of each chunk of data processed will be 64 kilobytes. The block size can greatly affect the speed of the copy: a larger block size will typically accelerate it, unless the block size is so large that it overwhelms the amount of RAM on your computer.
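
If you want to experiment with block sizes, a rough benchmark is to read a fixed amount from the drive and discard it; for example, at bs=64K a count of 16384 reads exactly 1 GiB. The device name is just the one from the example above, so adjust it for your system:

dd if=/dev/sdc of=/dev/null bs=64K count=16384 status=progress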

Making a compressed backup image file

At this point you could use the “of=/dev/sdb” option to write the contents directly to another drive, /dev/sdb. But I opted to make an image file of the drive instead, and piping the dd output through gzip allowed me to compress the result into a much smaller image file.

| gzip -c pipes the output of dd into the gzip command and writes the compressed data to stdout. Other options could be added here to change the compression ratio, but the default compression was sufficient for my needs.

> backup_image.img.gz redirects the output of the gzip command into the backup_image.img.gz file.

With that command complete, I had copied my 115GB drive into a 585MB compressed image. Most of the drive had been empty space; without the compression, the image would have been the full 115GB. So this approach can make a lot of sense if you are planning on keeping the image around. If you are just copying from one drive to another, no compression is needed.
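
Before relying on the image, it may be worth letting gzip verify that the compressed file is intact; its built-in test mode reads the whole archive and reports any corruption:

gzip -t backup_image.img.gz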

So there you have it, the process of making a full disk backup with dd. But I guess that is only half the story, so now I will share the command I used to restore that image file to another drive with dd.

Restoring a Full Drive Backup with DD

Fortunately, the dd restore process is a bit more straightforward than the backup process. So without further ado, here is the command.

gunzip -c backup_image.img.gz | dd of=/dev/sdc status=progress

gunzip -c backup_image.img.gz right off the bat, “gunzip” starts decompressing the file “backup_image.img.gz”, and the “-c” flag sends the decompressed output to stdout.

| dd of=/dev/sdc pipes the output from gunzip into the dd command which is only specifying the “output file” of “/dev/sdc”.

status=progress again this option displays some useful stats about how the dd process is proceeding.
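
As with the backup, adding a block size to the restore can speed it up; here is a variant of the same command, where 64K is just the value used earlier, not a magic number:

gunzip -c backup_image.img.gz | dd of=/dev/sdc bs=64K status=progress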

Once dd has completed the transfer you should be good to go, but there are a couple of caveats to remember. First, the drive you restore to should be the same size as, or larger than, the drive you backed up. Second, if the restore drive is larger, you will end up with unallocated space after the restore is complete; e.g. a 115GB image restored to a 200GB drive will result in the first 115GB of the drive being usable, with 85GB of empty space at the end of the drive. So you may want to expand the restored partition(s) to fill up the extra space on the new drive with parted or a similar tool, as shown below. Lastly, if you restore to a smaller drive, dd will not warn you that the image won’t fit; it will just start copying and fail when it runs out of space.
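
For example, growing the last partition and its filesystem might look like this with parted and resize2fs; the device name, the partition number 1, and an ext4 filesystem are all assumptions here, so check your actual layout with lsblk first:

parted /dev/sdc resizepart 1 100%
resize2fs /dev/sdc1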

Conclusion

DD is an amazing tool that has been around for a while. And it continues to be relevant and useful each day. It can get you out of a bind and save your data, so give it a whirl and see what it can help you with today.
