Backups, yet again #
Continuing on the topic of backups, I want to share a script I wrote for the sole purpose of copying the contents of mass storage devices that contain media files. The typical case is an SD card from a digital camera, but the script works with any removable block device.
Automation #
Use of metadata #
The script, which I called auto_media_backup.sh, works on a simple principle: it uses the metadata of the original file to compute the destination directory of the copied file, like this:
- get the base destination directory from the configuration file (variable DST_PATH)
- get the device’s UUID using $ lsblk -o name,uuid (variable uuid)
- get the year and month of the media file, either from the EXIF data, if available, or from the filesystem’s metadata related to the last modification or last access (variables year, month and filename)

I opted for the ${DST_PATH}/${uuid}/${year}/${month}/${filename} scheme.
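As a rough illustration of the steps above, the following sketch picks the first candidate date that looks like YYYY-MM-DD and assembles the destination path from it. It is not taken from the script: the helper names first_valid_date and build_dst_path are hypothetical, and the candidate dates are passed in as plain strings instead of being read from EXIF data or from the filesystem.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# Print the first argument shaped like YYYY-MM-DD, or fail if there is none.
# Only the shape is checked, not whether the date actually exists.
first_valid_date()
{
    local candidate

    for candidate in "$@"; do
        case "${candidate}" in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
                printf '%s\n' "${candidate}"
                return 0
                ;;
        esac
    done
    return 1
}

# Build ${DST_PATH}/${uuid}/${year}/${month}/${filename}.
build_dst_path()
{
    local dst_path="${1}"
    local uuid="${2}"
    local media_date="${3}"
    local filename="${4}"
    local year month

    year="${media_date%%-*}"    # 2018-07-14 -> 2018
    month="${media_date#*-}"    # 2018-07-14 -> 07-14
    month="${month%%-*}"        # 07-14 -> 07
    printf '%s/%s/%s/%s/%s\n' "${dst_path}" "${uuid}" "${year}" "${month}" "${filename}"
}

# The empty string stands for a missing EXIF date, so the second candidate wins.
media_date="$(first_valid_date "" "2018-07-14" "2018-09-01")"
build_dst_path "/home/user/media_backup" "0000-0001" "${media_date}" "IMG_0001.JPG"
```

With the sample values above the last command prints /home/user/media_backup/0000-0001/2018/07/IMG_0001.JPG.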
Block device UUID monitoring #
Another important aspect is that when I insert the card in the reader the synchronization starts automatically:
[...] udevadm monitor --udev -s block [...]
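To give an idea of what the script does with each line printed by udevadm monitor, here is a small sketch that splits a line into fields and keeps only the action and the device path. The helper name parse_udev_line and the sample line are made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: split one "udevadm monitor" line the same way the
# script's "read -r -- _ _ event devpath _" does, then print the kept fields.
parse_udev_line()
{
    local line="${1}"
    local event devpath

    # Field 1 is the source ("UDEV"), field 2 the timestamp, field 3 the
    # action and field 4 the device path. The rest (the subsystem) is dropped.
    read -r _ _ event devpath _ <<EOF
${line}
EOF
    printf '%s %s\n' "${event}" "${devpath}"
}

# Made-up sample line in the format printed by udevadm monitor.
parse_udev_line "UDEV  [1234.567890] add      /devices/virtual/block/loop0 (block)"
```

The sample call prints "add /devices/virtual/block/loop0". The script only reacts to the add action and ignores everything else.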
To avoid synchronizing every removable block device, I opted for a whitelist based on the UUID retrieved earlier. The following extract shows the check:
[...]
for uuid in ${WHITELIST_UUID}; do
    if [ "${uuid}" = "$(get_uuid "${devname}")" ]; then
        sync_started_message "${uuid}"
[...]
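The same check can be tried in isolation, without any device attached. This is only a sketch: the function name is_whitelisted is hypothetical and the UUIDs are sample values.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the candidate UUID is listed in the
# whitespace-separated whitelist.
is_whitelisted()
{
    local candidate="${1}"
    local whitelist="${2}"
    local uuid

    # Unquoted on purpose: the whitelist must be split on whitespace.
    for uuid in ${whitelist}; do
        if [ "${uuid}" = "${candidate}" ]; then
            return 0
        fi
    done
    return 1
}

WHITELIST_UUID="0000-0001 0000-0002"    # sample values
for candidate in 0000-0002 0000-0003; do
    if is_whitelisted "${candidate}" "${WHITELIST_UUID}"; then
        echo "${candidate}: sync"
    else
        echo "${candidate}: ignore"
    fi
done
```

A device without a UUID yields an empty candidate, which never matches, so such a device is ignored as well.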
Logging #
There are four ways to know whether the operation started and how it finished. Each logging system is independent of the others, so it is possible to enable or disable them individually. These systems are:
- logging to stdout
- logging to a text file
- emailing using the program mail
- beeping using the program beep

If you use the mail facility, make sure it is already configured and working. In my case I used the s-nail package along with msmtp as the SMTP client.
Beeps give immediate feedback: you do not need to be in front of the computer’s monitor or to check your email. Think about how microwaves, fridges, ovens, washing machines and other domestic appliances work.
The script logs when a synchronization starts, finishes or fails, but it does not report progress in between.
Loop #
The script keeps monitoring for new devices while idle and does not quit even if the previous synchronization failed. The configuration file is read only once at startup, so if you edit it be sure to restart the script.
Synchronization #
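Rsync was an obvious choice here because it provides fast and incremental file copying with just a handful of options. The script calls it with -a (archive mode) and -b together with --backup-dir, so that a file which would be overwritten in the destination is first moved to a backup directory.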
The script #
I saved the following code listing as auto_media_backup.sh. Since a small part of the script was taken and modified from the Arch Wiki, this code is licensed under the GNU Free Documentation License 1.3 or later. Other small parts were taken from a GitHub gist which was put in the public domain.
#!/bin/bash
# auto_media_backup.sh
#
# Copyright (C) 2018 Franco Masotti <franco.masotti@live.com>.
# Permission is granted to copy, distribute and/or modify this document
# under the terms of the GNU Free Documentation License, Version 1.3
# or any later version published by the Free Software Foundation;
# with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
# A copy of the license is included in the section entitled "GNU
# Free Documentation License".
# Put the full path if you are calling the script from a different directory.
. ./auto_media_backup.conf
# Check that all the software is installed.
# All the tools used in this script should be pretty standard.
check_tools()
{
    which \
        bash \
        mail \
        lsblk \
        grep \
        awk \
        sed \
        find \
        udevadm \
        udisksctl \
        stat \
        stdbuf \
        rsync \
        rm \
        beep \
        exiftool
} 1>&2

# Print the UUID of the given block device (for example /dev/sdb1).
get_uuid()
{
    local kernel_device_name="${1}"

    lsblk -o name,uuid | grep "${kernel_device_name/\/dev\//}" \
        | awk '{print $2}'
}

# Translate a udev device path into the name of its device node.
pathtoname()
{
    local kernel_device_name="${1}"

    udevadm info -p /sys/"${kernel_device_name}" \
        | awk -v FS== '/DEVNAME/ {print $2}'
}

# Print the mount point of the given block device as reported by udisks.
get_mountpoint()
{
    local kernel_device_name="${1}"

    udisksctl info --block-device "${kernel_device_name}" \
        | grep "MountPoints" | awk '{$1=""; print $0}' | sed -e 's/^[ \t]*//'
}
log()
{
    local message="${1}"
    local log_file_preamble="${2}"
    local log_email_subject="${3}"
    local log_beep_args="${4}"

    if [ "${LOG_TO_BEEP}" = "true" ]; then
        # Unquoted on purpose: the beep options must be split into words.
        beep ${log_beep_args}
    fi
    if [ "${LOG_TO_STDOUT}" = "true" ]; then
        echo -e "${message}"
    fi
    if [ "${LOG_TO_FILE}" = "true" ]; then
        echo -e "${log_file_preamble} ${message}" >> "${LOG_FILE}"
    fi
    if [ "${LOG_TO_EMAIL}" = "true" ]; then
        echo "${message}" | mail -r "${EMAIL_SENDER}" -s "${log_email_subject}" "${EMAIL_RECIPIENT}"
    fi
}

sync_started_message()
{
    local uuid="${1}"
    local message="Starting sync for ${uuid}"

    log "${message}" \
        "[$(date '+%Y-%m-%d, %T') START auto_media_backup]" \
        "Media transfer starting" \
        "-f 1000 -l 400 -D 20"
}

sync_completed_message()
{
    local uuid="${1}"
    local start="${2}"
    local end="${3}"
    local number_of_files="${4}"
    # Use the function's own argument instead of the caller's global variable.
    local message="Sync successful for ${uuid}, in $((end - start))s. ${number_of_files} synced files"

    log "${message}" \
        "[$(date '+%Y-%m-%d, %T') OK auto_media_backup]" \
        "Media transfer complete" \
        "-r 3 -f 2000 -l 400 -D 20"
}

sync_failed_message()
{
    local uuid="${1}"
    local retval="${2}"
    local message="Sync for ${uuid} failed with ${retval}"

    log "${message}" \
        "[$(date '+%Y-%m-%d, %T') FAILED auto_media_backup]" \
        "Media transfer failed" \
        "-f 500 -r 4 -d 200"
}
# Since we are going to use the shooting date to organize the directories,
# it needs to be computed using EXIF data.
# If the file does not have EXIF data with a coherent date field, use
# filesystem timestamps instead.
# See https://en.wikipedia.org/wiki/Comparison_of_file_systems#Metadata
# for a list of supported timestamps for the most common filesystems.
# 0. Use EXIF data DateTimeOriginal
# 1. Use EXIF data MediaCreateDate
# 2. Use last modification time
# 3. Use last access time
# 4. Bail out
compute_media_date()
{
    local media="${1}"
    local computed_date

    computed_date="$(exiftool -quiet -s -s -s -tab -dateformat "%Y-%m-%d" \
        -DateTimeOriginal "${media}")"
    if [ -z "${computed_date}" ]; then
        computed_date="$(exiftool -quiet -s -s -s -tab -dateformat "%Y-%m-%d" \
            -MediaCreateDate "${media}")"
    fi
    if [ -z "${computed_date}" ]; then
        computed_date="$(stat -c%y "${media}" | awk '{print $1}')"
    fi
    if [ -z "${computed_date}" ]; then
        computed_date="$(stat -c%x "${media}" | awk '{print $1}')"
    fi
    if [ -z "${computed_date}" ]; then
        return 1
    fi
    echo "${computed_date}"
}

compute_media_dst_top()
{
    local computed_date="${1}"
    local dst_base_path="${2}"
    local month year

    month="$(date -d "${computed_date}" "+%m")"
    year="$(date -d "${computed_date}" "+%Y")"
    echo "${dst_base_path}/${year}/${month}"
}

compute_media_dst()
{
    local media_base_dst="${1}"
    # The media path is the second argument passed by get_media.
    local media="${2}"

    echo "${media_base_dst}/$(basename "${media}")"
}
rsync_media()
{
    local src="${1}"
    local dst="${2}"
    local dst_top

    # The destination directory is the parent of the destination file.
    dst_top="$(dirname "${dst}")"
    mkdir -p "${dst_top}"
    rsync -ab --backup-dir="${dst_top}"_backup "${src}" "${dst}"
}

# See https://en.wikipedia.org/wiki/Design_rule_for_Camera_File_system
get_media()
{
    local src="${1}"
    local dst_base_path="${2}"
    local media dst_top dst
    local num_of_files=0

    # See https://gist.github.com/jvhaarst/2343281
    # which was put in public domain.
    # The name tests are grouped so that the "._*" exclusion applies to all of
    # them. find's output is NUL-separated and read from file descriptor 3, so
    # that file names containing whitespace are handled correctly.
    while IFS= read -r -d '' -u 3 media; do
        dst_top="$(compute_media_dst_top "$(compute_media_date "${media}")" "${dst_base_path}")"
        dst="$(compute_media_dst "${dst_top}" "${media}")"
        rsync_media "${media}" "${dst}" || return 1
        num_of_files=$((num_of_files + 1))
    done 3< <(find "${src}" -not -wholename "*._*" \( -iname "*.JPG" \
        -or -iname "*.JPEG" -or -iname "*.CRW" -or -iname "*.THM" -or -iname "*.RW2" \
        -or -iname "*.ARW" -or -iname "*.AVI" -or -iname "*.MOV" -or -iname "*.MP4" \
        -or -iname "*.MTS" -or -iname "*.PNG" \) -print0)
    echo "${num_of_files}"
}
# DANGEROUS. Disabled by default.
delete_media()
{
    local src="${1}"

    echo "NOOP. Source file deletion is disabled because the author does not \
want to be liable for any data loss"
    echo "Edit the \"delete_media\" function to enable it"
    # rm -rf "${src}"/*
}
loop()
{
    local event="${1}"
    local devpath="${2}"
    local devname uuid start end mountpoint final_dst_path synced_files

    if [ "${event}" = add ]; then
        devname=$(pathtoname "${devpath}")
        for uuid in ${WHITELIST_UUID}; do
            if [ "${uuid}" = "$(get_uuid "${devname}")" ]; then
                sync_started_message "${uuid}"
                start=$(date +%s)
                udisksctl mount --block-device "${devname}" --no-user-interaction \
                    || { sync_failed_message "${uuid}" "$?"; return 1; }
                mountpoint="$(get_mountpoint "${devname}")"
                final_dst_path="${DST_PATH}/${uuid}"
                synced_files=$(get_media "${mountpoint}" "${final_dst_path}") \
                    || { sync_failed_message "${uuid}" "$?"; return 1; }
                # Just in case.
                sync
                if [ "${DELETE_SRC_MEDIA_ON_SYNC_SUCCESS}" = "true" ]; then
                    delete_media "${mountpoint}"
                fi
                udisksctl unmount --block-device "${devname}" --no-user-interaction
                end=$(date +%s)
                sync_completed_message "${uuid}" "${start}" "${end}" "${synced_files}"
            fi
        done
    fi
}

# Original source for the automount part:
# https://wiki.archlinux.org/index.php/Udisks#udevadm_monitor
check_tools || exit 1

printf "%s\n" "Monitoring for these UUIDs:"
printf "%s\n" "==========================="
for uuid in ${WHITELIST_UUID}; do
    printf "%s\n" "${uuid}"
done

# Note that this is a loop.
stdbuf -oL -- udevadm monitor --udev -s block | while read -r -- _ _ event devpath _; do
    loop "${event}" "${devpath}"
done
The configuration file #
Save the following as auto_media_backup.conf and edit it to suit your setup.
This file must be placed in the same directory as the script.
The variables should be self-explanatory.
Please note that the script does not validate these variables in any way. You, as the user, must make sure that their values are correct.
# Put your UUIDs separated by a white space character
WHITELIST_UUID="0000-0001 0000-0002"
DST_PATH="/home/user/media_backup"
LOG_TO_STDOUT="true"
LOG_TO_FILE="true"
LOG_FILE="/var/log/auto_media_backup.log"
LOG_TO_EMAIL="true"
EMAIL_SENDER="Computer <your.email@address.com>"
# You can use aliases if your mailing system is set up correctly.
EMAIL_RECIPIENT="all"
LOG_TO_BEEP="true"
# If you want to "recycle" your devices set the following to true.
# See the script for a disclaimer and how to really enable it.
DELETE_SRC_MEDIA_ON_SYNC_SUCCESS="false"
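Since the script does not validate these variables, a few sanity checks could be run right after the configuration file is sourced. The following is only a sketch of one possible approach: the function validate_config is hypothetical and is not part of the script, and the values assigned at the bottom are the sample ones from the listing above.

```shell
#!/bin/bash
# Hypothetical helper: check a few configuration variables before starting.
validate_config()
{
    local name value status=0

    # These must not be empty.
    for name in WHITELIST_UUID DST_PATH; do
        eval "value=\"\${${name}:-}\""
        if [ -z "${value}" ]; then
            echo "${name} is missing or empty" 1>&2
            status=1
        fi
    done

    # The script compares these against the literal string "true".
    for name in LOG_TO_STDOUT LOG_TO_FILE LOG_TO_EMAIL LOG_TO_BEEP \
        DELETE_SRC_MEDIA_ON_SYNC_SUCCESS; do
        eval "value=\"\${${name}:-}\""
        case "${value}" in
            true|false) ;;
            *)
                echo "${name} must be \"true\" or \"false\"" 1>&2
                status=1
                ;;
        esac
    done

    # A log file path is only needed when file logging is enabled.
    if [ "${LOG_TO_FILE:-}" = "true" ] && [ -z "${LOG_FILE:-}" ]; then
        echo "LOG_FILE is missing or empty" 1>&2
        status=1
    fi

    return "${status}"
}

# Sample values, normally set by sourcing auto_media_backup.conf.
WHITELIST_UUID="0000-0001 0000-0002"
DST_PATH="/home/user/media_backup"
LOG_TO_STDOUT="true"
LOG_TO_FILE="true"
LOG_FILE="/var/log/auto_media_backup.log"
LOG_TO_EMAIL="true"
LOG_TO_BEEP="true"
DELETE_SRC_MEDIA_ON_SYNC_SUCCESS="false"

validate_config && echo "Configuration looks sane"
```

In the script itself a call such as validate_config || exit 1 could follow the line that sources the configuration file.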
Running #
To run the script in the background simply do:
$ chmod +x auto_media_backup.sh
$ ./auto_media_backup.sh &
Future steps #
- More photo compression and/or scaling to save precious space.
- Same for videos.
~
Till next time.