Backup Storage

Mixtape uses a standardized backup storage layout that is helpful to understand. This document outlines the basics.

File system location

By default, Mixtape backups are stored in and accessed from the following location:

/backup/<host>/mixtape

The --backup-dir=<dir> option (accepted by all commands) replaces the /backup part of this location. Similarly, the --mixtape-dir=<dir> option replaces the whole location with the specified directory.
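
The precedence between the two options can be sketched as follows. This is only an illustration of the rule described above, not Mixtape's actual code; the function name resolve_mixtape_dir and its parameters are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_mixtape_dir(host: str,
                        backup_dir: str = "/backup",
                        mixtape_dir: Optional[str] = None) -> PurePosixPath:
    """Return the Mixtape directory for a host (hypothetical helper)."""
    if mixtape_dir is not None:
        # --mixtape-dir=<dir> replaces the whole location.
        return PurePosixPath(mixtape_dir)
    # --backup-dir=<dir> replaces only the leading /backup part.
    return PurePosixPath(backup_dir) / host / "mixtape"


print(resolve_mixtape_dir("myhost"))
print(resolve_mixtape_dir("myhost", backup_dir="/mnt/usb"))
print(resolve_mixtape_dir("myhost", mixtape_dir="/srv/mixtape"))
```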

Directory layout

Below is an example of a small Mixtape backup directory:

+- config
+- index/
|   +- index.2017-01-19-0846.txt.xz
|   +- index.2017-01-20-1149.txt.xz
+- data/
    +- 33d/
    |   +- 847/
    |       +- image.jpg
    +- 68e/
    |   +- 24f/
    |       +- long.txt.xz
    +- files/
        +- 2017-01/
            +- files.2017-01-19-0846.tar.xz
            +- files.2017-01-20-1149.tar.xz

Configuration file

The config file in the Mixtape directory lists all the backup paths to be stored (one per line). Paths should be absolute, e.g. /opt/subdir.

Prefixing a line with - excludes the specified path from the backup. If a line prefixed with - doesn’t contain a / path separator, all files and directories with the specified name are excluded.

Below is a default config file that contains only name-based excludes (.git, .hg and .svn):

# Configuration of backup includes and excludes.
#
# /dir             includes '/dir' into backup
# - /dir/subdir    excludes '/dir/subdir' from backup
# - name           excludes all 'name' files and dirs

- .git
- .hg
- .svn

The config file can be edited with any text editor. Changes take effect on the next run of mixtape-backup. Changes will not have any effect on previous backups.
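
The config rules above can be sketched as a small parser. This is a minimal illustration, not Mixtape's own implementation: the names BackupConfig and parse_config are hypothetical, and treating blank lines and lines starting with # as ignorable is an assumption drawn from the example file.

```python
from typing import List, NamedTuple


class BackupConfig(NamedTuple):
    includes: List[str]       # paths to back up
    exclude_paths: List[str]  # "- /dir/subdir" entries
    exclude_names: List[str]  # "- name" entries (no "/" separator)


def parse_config(text: str) -> BackupConfig:
    """Split config lines into includes, path excludes and name excludes."""
    includes, exclude_paths, exclude_names = [], [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        if line.startswith("-"):
            target = line[1:].strip()
            if not target:
                continue  # a lone "-" excludes nothing
            if "/" in target:
                exclude_paths.append(target)
            else:
                exclude_names.append(target)
        else:
            includes.append(line)
    return BackupConfig(includes, exclude_paths, exclude_names)


example = "# comment\n/opt/subdir\n- /opt/subdir/cache\n\n- .git\n"
print(parse_config(example))
```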

Index files

The index files contain a list of all files and directories backed up at a specific point in time. They are tab-separated text files with one column for each piece of saved file metadata, including the sha256sum of each file and its location inside the backup data directory.

Below is a small excerpt from an example index file (with a few line breaks added for readability):

drwxr-xr-x	root	root	2017-01-19 08:44:48	4	/test
-rw-r--r--	root	root	2017-01-18 08:01:16	4	/test/README.txt
	dcf1ab049b0a5c9bad7555a64bc3ea1d625c658d432bef956178b063d665b172
	files/2017-03/files.2017-03-03-0647.tar.xz
drwxr-xr-x	root	root	2017-01-19 08:44:53	4	/test/subdir
-rw-r--r--	root	root	2017-01-19 08:34:01	264	/test/subdir/loremipsum.txt
	68e24fba182bb7272ba855b59be46f9fb59fb5e60757d4fd6c6c481e39f0e507
	68e/24f/loremipsum.txt.xz
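
A single index line can be read with a short sketch like the one below. The column names (mode, user, group, modified, size, path) are assumptions inferred from the excerpt rather than documented names, and directory entries are assumed to have no checksum or location columns. The IndexEntry type and parse_index_line function are hypothetical.

```python
from typing import NamedTuple, Optional


class IndexEntry(NamedTuple):
    mode: str
    user: str
    group: str
    modified: str            # e.g. "2017-01-19 08:34:01", kept as text
    size: int                # assumption: the fifth column is a size
    path: str
    sha256: Optional[str]    # missing for directories in the excerpt
    location: Optional[str]  # path relative to the data directory


def parse_index_line(line: str) -> IndexEntry:
    """Parse one tab-separated index line (without the added line breaks)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")
    mode, user, group, modified, size, path = fields[:6]
    sha256 = fields[6] if len(fields) > 6 and fields[6] else None
    location = fields[7] if len(fields) > 7 and fields[7] else None
    return IndexEntry(mode, user, group, modified, int(size), path,
                      sha256, location)


line = ("-rw-r--r--\troot\troot\t2017-01-19 08:34:01\t264\t"
        "/test/subdir/loremipsum.txt\t"
        "68e24fba182bb7272ba855b59be46f9fb59fb5e60757d4fd6c6c481e39f0e507\t"
        "68e/24f/loremipsum.txt.xz")
print(parse_index_line(line).location)
```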

A new index file is created on each run of mixtape-backup, which (depending on backup frequency) may lead to many such files being present. Storing too many index files will slow down searches and use more storage. Use the mixtape-gc tool or plain rm to remove older index files. Note that mixtape-gc also removes files from the data directory that are no longer referenced by any index.
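
The idea behind removing unreferenced data files can be sketched as a set difference. This is not how mixtape-gc is implemented; it only illustrates the principle, assuming that the data location is the eighth tab-separated column of an index line. Both function names are hypothetical.

```python
from typing import Iterable, Set


def referenced_locations(index_lines: Iterable[str]) -> Set[str]:
    """Collect the data locations mentioned in the given index lines."""
    refs = set()
    for line in index_lines:
        fields = line.rstrip("\n").split("\t")
        # Assumption: column 8 holds the location; directories have none.
        if len(fields) >= 8 and fields[7]:
            refs.add(fields[7])
    return refs


def unreferenced(data_files: Iterable[str],
                 index_lines: Iterable[str]) -> Set[str]:
    """Return data files that no remaining index line refers to."""
    return set(data_files) - referenced_locations(index_lines)


remaining_index = [
    "-rw-r--r--\troot\troot\t2017-01-19 08:34:01\t264\t/a.txt\tabc\t68e/24f/a.txt.xz",
    "drwxr-xr-x\troot\troot\t2017-01-19 08:44:48\t4\t/test",
]
on_disk = ["68e/24f/a.txt.xz", "33d/847/image.jpg"]
print(unreferenced(on_disk, remaining_index))
```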

Data files

The data directory contains copies of the actual files backed up. Smaller files (less than 256 KB in size) are stored into tar.xz archives under the data/files/<year>-<month>/ directory. Only new or modified files are stored into each archive, so the archive from the initial backup is usually much larger than the other archives. See below for an example archive listing:

-rw-r--r-- 1 root root 481280 Feb 14 02:07 files.2017-02-14-0207.tar.xz
-rw-r--r-- 1 root root    456 Feb 15 02:07 files.2017-02-15-0207.tar.xz
-rw-r--r-- 1 root root   3304 Feb 17 02:07 files.2017-02-17-0207.tar.xz
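
The naming scheme for these archives can be sketched as below. The timestamp format is inferred from the example file names, and reading "256 KB" as 256 * 1024 bytes is an assumption. The names SMALL_FILE_LIMIT, is_small_file and archive_location are hypothetical.

```python
from datetime import datetime

# Assumption: "256 KB" means 256 * 1024 bytes.
SMALL_FILE_LIMIT = 256 * 1024


def is_small_file(size_bytes: int) -> bool:
    """Small files go into the shared tar.xz archive for the backup run."""
    return size_bytes < SMALL_FILE_LIMIT


def archive_location(run_time: datetime) -> str:
    """Build the archive path (relative to data/) for a backup run."""
    month_dir = run_time.strftime("%Y-%m")
    stamp = run_time.strftime("%Y-%m-%d-%H%M")
    return f"files/{month_dir}/files.{stamp}.tar.xz"


print(archive_location(datetime(2017, 2, 14, 2, 7)))
```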

Larger files are stored in a data/<sha-part-1>/<sha-part-2>/ directory based on the sha256sum of their content. Because the directory is derived from the content, each modified version of a file is stored as a separate copy under a different directory.
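
The mapping from checksum to directory can be sketched as follows. Judging from the index excerpt (checksum 68e24f... stored under 68e/24f/), the two parts appear to be the first three and the next three hex digits of the checksum. That split is an inference from the example, and the function names are hypothetical.

```python
import hashlib


def dir_from_digest(digest: str) -> str:
    """Split a hex sha256 digest into <sha-part-1>/<sha-part-2>."""
    # Assumption: parts are hex digits 1-3 and 4-6, as in the index excerpt.
    return f"{digest[:3]}/{digest[3:6]}"


def large_file_dir(content: bytes) -> str:
    """Return the directory (relative to data/) for the given content."""
    return dir_from_digest(hashlib.sha256(content).hexdigest())


sha = "68e24fba182bb7272ba855b59be46f9fb59fb5e60757d4fd6c6c481e39f0e507"
print(dir_from_digest(sha))
```

Two versions of a file with different content produce different checksums, which is why each version is stored separately.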

Large files are normally also compressed, unless the file extension implies that the file is already compressed (e.g. .jpg or .zip).
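
The compression decision can be sketched as a simple extension check. The set below contains only the two extensions named above; the full list Mixtape uses is not given here. Matching case-insensitively and appending .xz (as in the directory example) are assumptions, and the names ALREADY_COMPRESSED and stored_name are hypothetical.

```python
from pathlib import PurePosixPath

# Only the extensions named in the text; the real list is likely longer.
ALREADY_COMPRESSED = {".jpg", ".zip"}


def stored_name(filename: str) -> str:
    """Return the name a large file would get in the data directory."""
    suffix = PurePosixPath(filename).suffix.lower()  # assumption: case-insensitive
    if suffix in ALREADY_COMPRESSED:
        return filename        # stored as-is
    return filename + ".xz"    # compressed copy


print(stored_name("image.jpg"))
print(stored_name("long.txt"))
```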