Linux: Backup and Recovery

Basics

. refers to the current directory

.. refers to the parent directory of the current directory

~ refers to the home directory

Archiving

Archiving combines multiple files into one package, making them easier to back up, transfer, or organize. tar and cpio are popular archiving tools.

tar

tar packages multiple files or directories into a single archive file. The syntax is tar [OPTIONS] <ARCHIVE NAME> [FILE1 FILE2 DIR1...]. Common options include:

  • -c to create an archive
  • -x to extract files
  • -t to list the contents of an archive
  • -v for verbose output
  • -r to append files to an existing archive
  • -f to specify the archive file name
  • -z to compress with gzip
  • -j to compress with bzip2
  • -J to compress with xz

ex:

tar -czvf backup.tar.gz data/ # to create an archive of the data/ directory using gzip
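The options above combine into a create, list, extract round trip. The following is a minimal sketch, assuming tar and gzip are installed; the names data/, notes.txt, backup.tar.gz, and restore/ are hypothetical, and everything runs in a throwaway temporary directory. It also uses -C (extract into a given directory), which is not covered in the list above.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical sample data to archive.
mkdir -p data/sub
echo "hello" > data/notes.txt
echo "world" > data/sub/more.txt

# -c create, -z gzip, -f archive name (-f must be followed directly by the name).
tar -czf backup.tar.gz data/

# -t lists the archive contents without extracting anything.
listing=$(tar -tzf backup.tar.gz)
echo "$listing"

# -x extracts; -C chooses the directory to extract into.
mkdir restore
tar -xzf backup.tar.gz -C restore

# The restored tree should match the original.
diff -r data restore/data && echo "round trip OK"
```

Listing with -t before extracting is a cheap way to confirm what an archive holds and where its paths will land.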

cpio (copy in/out)

cpio gets the list of files to archive from another command such as find or ls. The general syntax with find is find [FILES] | cpio -ov > [ARCHIVE NAME].cpio. The three main modes are:

  • -o to create an archive (copy-out)
  • -i to extract an archive (copy-in)
  • -p to copy files to another directory (copy-pass)

Additional options:

  • -d to create directories as needed
  • -v for verbose output
  • -u to overwrite existing files unconditionally, even if they are newer
  • -t to list archive content

ex:

find /configs -type f | cpio -o > config_bk.cpio # to create an archive

cpio -id < backup.cpio # to extract an archive

cpio -it < backup.cpio # to list the content of the archive

find data/ -name "*.conf" | cpio -pvd /backups/configs # to copy files
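The three modes are easier to follow as one round trip. This is a minimal sketch, assuming cpio is installed (many minimal systems do not ship it); the configs/ tree, the file names, and restore/ are hypothetical.

```shell
#!/bin/sh
# Sketch: cpio copy-out, list, and copy-in inside a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical configuration files to archive.
mkdir -p configs/app
echo "port=8080" > configs/app/server.conf
echo "debug=false" > configs/app/client.conf

# Copy-out: find prints relative paths, cpio -o reads them from stdin.
find configs -type f | cpio -o > config_bk.cpio

# List what the archive holds.
cpio -it < config_bk.cpio

# Copy-in: extract inside a separate directory; -d creates parent directories.
mkdir restore
(cd restore && cpio -id < ../config_bk.cpio)

diff -r configs restore/configs && echo "round trip OK"
```

Because find was given a relative path, the archive stores relative names and can be extracted anywhere. Archiving from an absolute path such as /configs stores names relative to / instead.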

Compression Tools

Compression tools reduce file size.

gzip

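gzip is widely used for its speed and simplicity. It uses the .gz format. For backups, it is recommended to combine tar with gzip (tar -czvf). The -f option must come last in the group because it takes the archive name as its argument. Common options include: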

  • -d to decompress files
  • -f to overwrite existing files without asking
  • -k to keep the original file
  • -n to skip storing the original file name and timestamp
  • -N to save the original file name and timestamp
  • -q for quiet mode
  • -r to compress directories recursively
  • -l to list compression statistics for a compressed file
  • -t to test the integrity of the compressed file
  • -v for verbose mode
  • -1 to -9 to set the compression level (fastest to best)

ex:

gzip myfile.txt # to compress a file and delete the original

gzip -k myfile.txt # to compress a file and keep the original

gzip -k myfile1.txt myfile2.txt myfile3.txt # to compress several files and keep the originals

gzip -vr /var/log/ # to compress the content of the folder with verbose output

gzip -9 image.iso # to compress with the maximum level (levels range from 1 to 9; the default is 6)

zcat myfile.txt.gz # to view the content of a compressed file

gunzip myfile.txt.gz # to decompress a file
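The compression levels and the integrity test can be tried on throwaway data. The following is a minimal sketch, assuming GNU gzip; app.log, fast.gz, and best.gz are hypothetical names. It uses -c (write to standard output), which is not in the list above, so that each output file can be named freely and the original stays in place.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical sample: 2000 repetitive log lines, which compress well.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > app.log

# -c writes to stdout, so the original is left untouched.
gzip -1 -c app.log > fast.gz
gzip -9 -c app.log > best.gz

# -l reports compressed size, uncompressed size, and ratio for each file.
gzip -l fast.gz best.gz

# -t checks integrity without writing anything.
gzip -t best.gz && echo "best.gz is intact"

# Decompressing to stdout should reproduce the original byte for byte.
gzip -dc best.gz | cmp - app.log && echo "content matches"
```

Comparing the two rows printed by gzip -l shows how much the higher level gains on a given input; on small or already compressed files the difference is often negligible.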

bzip2

bzip2 usually compresses better than gzip but takes longer to complete. The syntax is bzip2 [OPTIONS] <FILE NAME>. The package provides several related commands:

  • bzip2 to compress files
  • bunzip2 to decompress files
  • bzcat to view content of a compressed file without extracting it
  • bzip2recover to attempt to recover data from a damaged archive
  • bzless and bzmore to scroll through compressed text files one page at a time

ex:

bzip2 myfile.txt # to compress a file and delete the original file

bzip2 -k myfile.txt # to compress a file and keep the original file

bunzip2 myfile.txt.bz2 # to decompress a file

bzip2 -t myfile.txt.bz2 # to test the integrity of a compressed file

bzcat myfile.txt.bz2 # to view the content of a compressed file

xz

xz is a newer compression tool that typically achieves a higher compression ratio than gzip and bzip2 but is slower at compressing. It is well suited to archiving files that do not change often. The syntax is xz [OPTIONS] <FILE NAME>. Common options include:

  • -d to decompress a compressed file
  • -f to overwrite existing files
  • -q for quiet mode
  • -v for verbose mode
  • -t to test the integrity of a compressed file

ex:

xz myfile.txt # to compress a file and delete the original file

xz -k myfile.txt # to compress a file and keep the original file

xz -d myfile.txt.xz # to decompress a file

unxz myfile.txt.xz # to decompress a file

xz -t myfile.txt.xz # to test the integrity of a compressed file

xz -l myfile.txt.xz # to list information about a compressed file (sizes, ratio, check type)
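The size trade-off between gzip, bzip2, and xz depends heavily on the input, so it is worth measuring on your own data. The following is a minimal sketch under that assumption; sample.log and the generated lines are hypothetical, and tools that are not installed are skipped. All three tools accept -c (write to standard output) and -d (decompress).

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical sample input: 5000 similar log lines.
i=0
while [ "$i" -lt 5000 ]; do
    echo "2024-01-01 host$((i % 7)) service[$i]: request completed"
    i=$((i + 1))
done > sample.log

echo "original: $(wc -c < sample.log) bytes"

verified=0
for tool in gzip bzip2 xz; do
    # Skip tools that are not installed on this machine.
    command -v "$tool" >/dev/null 2>&1 || continue

    # Compress to stdout at the tool's default level.
    "$tool" -c sample.log > "sample.$tool"
    echo "$tool: $(wc -c < "sample.$tool") bytes"

    # Verify that the compressed copy decompresses back to the original.
    if "$tool" -dc < "sample.$tool" | cmp -s - sample.log; then
        verified=$((verified + 1))
    fi
done
echo "$verified tool(s) verified"
```

To compare speed as well, each compression line can be prefixed with the shell's time command.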

7-Zip

7-Zip is useful where compatibility with Windows systems is needed. It is flexible because it handles multiple archive formats such as .7z, .zip, and .tar. On Linux it is usually available through the p7zip package. Unlike the tools above, 7z takes a command letter without a leading dash. Common commands include:

  • a to add files to an archive
  • x to extract files from an archive with full paths
  • l to list archive content
  • t to test an archive
  • d to delete files from an archive

ex:

7z a backup.7z file1 file2 data/ # to create a compressed archive

7z x backup.7z # to extract a compressed file

7z l backup.7z # to list the content of a compressed file

7z t backup.7z # to test a compressed file

7z a -mx=9 backup.7z image.iso # to create an archive with the maximum compression level

Data Recovery

dd (data duplicator)

dd copies data at the block level and is useful for creating exact images of disks or partitions. It is commonly used for cloning disks, creating bootable USB drives, backing up and restoring, and wiping disks. The basic syntax is dd if=<INPUT FILE> of=<OUTPUT FILE> [OPTIONS]. Common options include:

  • if= input file/device
  • of= output file/device
  • bs= block size. The default is 512 bytes
  • count= number of blocks to copy
  • skip= number of input blocks to skip
  • seek= number of output blocks to skip before writing
  • status=progress to show progress
  • conv=noerror,sync to continue past read errors and pad unreadable blocks with zeros

ex:

dd if=image.iso of=/dev/sdb bs=4M status=progress # to create a bootable USB drive (write to the whole device, not a partition)

dd if=/dev/sda of=diskA.img bs=1M status=progress # to create a disk image

dd if=diskA.img of=/dev/sda bs=1M status=progress # to restore a disk from an image

dd if=/dev/zero of=/dev/sdb bs=1M status=progress # to completely erase a disk

dd if=/dev/zero of=test_file bs=1G count=1 oflag=dsync # to test the write speed of a disk
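The bs, count, skip, and seek options are easiest to understand on ordinary files, where a mistake costs nothing. The following is a minimal sketch that uses a 4 KiB file as a stand-in for a disk; disk.img, copy.img, block_c.bin, and shifted.img are hypothetical names, and no real device is touched.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Build a 4 KiB stand-in "disk" from four 1 KiB blocks filled with A, B, C, D.
for letter in A B C D; do
    head -c 1024 /dev/zero | tr '\0' "$letter"
done > disk.img

# Full image copy: the same call shape as imaging a device, but on a file.
dd if=disk.img of=copy.img bs=1024 2>/dev/null
cmp disk.img copy.img && echo "image copy is identical"

# skip= skips input blocks: skip 2 blocks, then read 1, which is the C block.
dd if=disk.img of=block_c.bin bs=1024 skip=2 count=1 2>/dev/null
head -c 8 block_c.bin; echo    # prints CCCCCCCC

# seek= skips output blocks: the data lands at offset 1024 in the output,
# and the skipped region reads back as zero bytes.
dd if=block_c.bin of=shifted.img bs=1024 seek=1 2>/dev/null
wc -c < shifted.img            # prints 2048
```

Both skip and seek count in units of bs, so changing the block size changes the byte offset they refer to.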

ddrescue

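ddrescue is used to recover data from damaged drives. The basic syntax is ddrescue [OPTIONS] <INPUT FILE> <OUTPUT FILE> <LOG FILE>. The log file (called the mapfile in current versions) records which blocks have been recovered, so an interrupted run can be resumed.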

ex:

ddrescue /dev/sdb damaged.img rescue.log # to attempt rescuing /dev/sdb

rsync

rsync synchronizes files and directories, either locally or over the network. After the first copy, it transfers only the differences on subsequent runs. The basic syntax is rsync [OPTIONS] <SOURCE> <DESTINATION>. Important options are:

  • -r to copy recursively
  • -a to copy in archive mode, preserving permissions, symlinks, and timestamps
  • -n to see what would be copied (dry run)
  • -z to enable compression during transfer
  • -h for human-readable output
  • -v for verbose mode
  • --progress to show progress
  • --delete to remove files in the destination that are not present in the source

ex:

rsync -avh /home/user/ /mnt/backup/user # to copy user directory with all attributes preserved

rsync -avh user@server:/data/ /home/user/data/ # to sync from remote server to local

rsync -avh --bwlimit=4000 /home/user/ user@server:/backup/ # to sync to a remote server with a bandwidth limit of 4000 KiB/s
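A dry run before a --delete sync is a useful habit, since --delete removes files from the destination. The following is a minimal local sketch, assuming rsync is installed; src/, backup/, and the file names are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p src/docs backup
echo "v1" > src/docs/report.txt
echo "keep" > src/readme.txt

# Initial copy. The trailing slash on src/ means "the contents of src",
# so files land directly in backup/ rather than in backup/src/.
rsync -a src/ backup/

# Change the source: rewrite one file (with a different size) and remove another.
echo "version 2" > src/docs/report.txt
rm src/readme.txt

# Dry run first: -n with -v and --delete lists what would change.
rsync -avn --delete src/ backup/

# Real run: backup/ now mirrors src/, including the deletion.
rsync -a --delete src/ backup/
diff -r src backup && echo "backup mirrors source"
```

The rewritten file deliberately has a different size: by default rsync decides whether a file changed from its size and modification time, so an edit that keeps both identical would be skipped.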

Compressed File Operations

zcat

zcat displays the full content of a compressed file.

zcat myfile.txt.gz # to show the content of the compressed file

zless

zless allows scrolling through the content of a compressed file interactively.

zless myfile.txt.gz # to show the content of the compressed file in a scrollable mode

zgrep

zgrep allows searching through compressed data. The syntax is zgrep [OPTIONS] <SEARCH PATTERN> <FILE NAME>. Common options include:

  • -i to make the search case-insensitive
  • -n to show line numbers
  • -v to show lines that do not match the query

ex:

zgrep "ERROR" logs.gz # to search for lines containing text 'ERROR'

zgrep -i "failed password" /var/log/auth.log.1.gz # to find failed login attempts, ignoring case
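The search options can be tried on a small compressed file. The following is a minimal sketch, assuming the zgrep and zcat scripts that ship with gzip; the log lines and the auth.log name are made up for illustration. It also uses -c (count matching lines), which zgrep passes through to grep and which is not in the list above.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical log content.
cat > auth.log <<'EOF'
Jan 10 10:00:01 host sshd[101]: Accepted password for alice
Jan 10 10:00:05 host sshd[102]: Failed password for bob
Jan 10 10:00:09 host sshd[103]: FAILED PASSWORD for carol
Jan 10 10:00:12 host sshd[104]: Accepted publickey for dave
EOF
gzip auth.log    # leaves auth.log.gz

# Case-sensitive search with line numbers: only the "Failed" line matches.
zgrep -n "Failed password" auth.log.gz

# -i ignores case, -c counts matching lines instead of printing them.
failed=$(zgrep -ic "failed password" auth.log.gz)
echo "failed attempts: $failed"

# The same count with zcat and a plain grep pipeline.
zcat auth.log.gz | grep -ic "failed password"
```

zgrep saves the explicit decompression step, which matters most when searching across many rotated logs at once.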