This document aims to describe the goals and constraints of the LBS
design.

========================================================================

OVERALL GOALS: Efficient creation and storage of multiple backup
snapshots of a filesystem tree. Logically, each snapshot is
self-contained and stores the state of all files at a single point in
time. However, it should be possible for snapshots to share the same
underlying storage, so that data duplicated in many snapshots need not
be stored multiple times. It should be possible to delete old
snapshots, and recover (most of) the storage associated with them. It
must be possible to delete old backups in any order; for example, it
must be possible to delete intermediate backups before long-term
backups. It should be possible to recover the files in a snapshot
without transferring significantly more data than that stored in the
files to be recovered.

CONSTRAINTS: The system should not rely upon a smart server at the
remote end where backups are stored. It should be possible to create
new backups using a single primitive: StoreFile, which stores a string
of bytes at the backup server using a specified filename. Thus, backups
can be run over any file transfer protocol, without requiring special
software to be installed on the storage server.

========================================================================

DESIGN APPROACHES

STORING INCREMENTAL BACKUPS

One simple approach is to simply store a copy of every file on the
remote end, and construct a listing which tells where each file in the
source ends up on the remote server. For subsequent backups, if a file
is unchanged, the listing can simply point to the location of the file
from the previous backup. Deleting backups is simple: delete the
listing file for a particular snapshot, then garbage collect all files
which are no longer referenced.

This approach does not handle partial changes to large files as
efficiently. If a file is changed at all, it needs to be transferred in
its entirety. One approach is to represent intra-file changes by
storing patches. The original file is kept, and a smaller file is
transferred that stores the differences between the original and the
new version. Some care is needed, however. A series of small changes
could accumulate over many snapshots. If each snapshot refers to the
original file, much data will be duplicated between the patches in
different snapshots. If each patch can refer to previous patches as
well, a long chain of patches can build up, which complicates removing
old backups to reclaim storage.

An alternative approach is to break files apart into smaller units
(blocks) and to represent files in a snapshot as the concatenation of
(possibly many) blocks. Small changes to files can be represented by
replacing a few of the blocks, but referring to most blocks used in the
old file directly. Some care is needed with this approach as
well--there is additional overhead needed to specify even the original
file, since the entire list of blocks must be specified. If the block
size is too small, this can lead to a large overhead, but if the block
size is too large, then sharing of file data may not be achieved. In
this scheme, data blocks do not depend on other data blocks, so chains
of dependencies do not arise as in the incremental patching scheme.
Each snapshot is independent, and so can easily be removed.
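
To make the block-based approach concrete, here is a minimal sketch in
Python. It assumes fixed-size blocks and a checksum-indexed store; both
are illustrative choices, not part of any specified format, and the
helper names are made up for this example.

    import hashlib

    BLOCK_SIZE = 1 << 20   # 1 MiB; the choice trades overhead vs. sharing

    def store_file(path, store):
        """Split a file into blocks, reusing any block already stored.

        'store' maps block checksums to block data and stands in for
        whatever remote storage is used. Returns the list of checksums
        which, concatenated in order, reconstruct the file.
        """
        blocks = []
        with open(path, 'rb') as f:
            while True:
                data = f.read(BLOCK_SIZE)
                if not data:
                    break
                digest = hashlib.sha1(data).hexdigest()
                if digest not in store:    # only new data is uploaded
                    store[digest] = data
                blocks.append(digest)
        return blocks

An in-place edit to a large file then changes only the checksums of the
affected blocks; all other blocks are found in the store and shared with
previous snapshots.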

One minor modification to this scheme is to permit the list of blocks to
specify that only a portion of a block should be used to reconstruct a
file; if, say, only the end of a block is changed, then the new backup
can refer to most of the old block, and use a new block for the small
changed part. Doing so does allow the possibility that a block might be
kept around even though only a portion of it is being used, leading to
wasted space.


DATA STORAGE

The simplest data storage format would place each file, patch, or block
in a separate file on the storage server. Doing so maximizes the
ability to reclaim storage when deleting old snapshots, and minimizes
the amount of extra data that must be transferred to recover a snapshot.
Any other format which combines data from multiple files/patches/blocks
together risks having needed data grouped with unwanted data.

However, there are reasons to consider grouping, since there is overhead
associated with storing many small files. In any transfer protocol
which is not pipelined, transferring many small files may be slower than
transferring the same quantity of data in larger files. Small files may
also lead to more wasted storage space due to internal fragmentation.
Grouping files together gives the chance for better compression, taking
advantage of inter-file similarity.

Grouping is even more important if the snapshot format breaks files
apart into blocks for storage, since the number of blocks could be far
larger than the number of files being backed up.

========================================================================

SELECTED DESIGN

At a high level, the selected design stores snapshots by breaking files
into blocks for storage, and does not use patches. These data blocks,
along with the metadata fragments (collectively, the blocks and metadata
are referred to as objects), are grouped together for storage purposes
(each storage group is called a segment).

TAR is chosen as the format for grouping objects together into segments
rather than inventing a new format. Doing so makes it easy to
manipulate the segments using other tools, if needed.

Data blocks for files are stored as-is. Metadata is stored in a text
format, to make it more transparent. (This should make debugging
easier, and the hope is that this will make understanding the format
simpler.)
                      Backup Format Description
                 for an LFS-Inspired Backup Solution

NOTE: This format specification is not yet complete. Right now the code
provides the best documentation of the format.

This document simply describes the snapshot format. It is described
from the point of view of a decompressor which wishes to restore the
files from a snapshot. It does not specify the exact behavior required
of the backup program writing the snapshot.

This document does not explain the rationale behind the format; for
that, see design.txt.

DATA CHECKSUMS
==============

In several places in the LBS format, a cryptographic checksum may be
used to allow data integrity to be verified. At the moment, only the
SHA-1 checksum is supported, but it is expected that other algorithms
will be supported in the future.

When a checksum is called for, the checksum is always stored in a text
format. The general format used is
    <algorithm>=<hexdigits>

<algorithm> identifies the checksum algorithm used, and allows new
algorithms to be added later. At the moment, the only permissible value
is "sha1", indicating a SHA-1 checksum.

<hexdigits> is a sequence of hexadecimal digits which encode the
checksum value. For sha1, <hexdigits> should be precisely 40 digits
long.

A sample checksum string is
    sha1=67049e7931ad7db37b5c794d6ad146c82e5f3187
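
As an illustration, producing and checking checksum strings of this form
might look as follows in Python (a sketch; the function names are made
up for this example):

    import hashlib

    def format_checksum(data):
        """Return a checksum string such as "sha1=67049e79...."""
        return "sha1=" + hashlib.sha1(data).hexdigest()

    def verify_checksum(data, checksum):
        """Check data against a checksum string; only sha1 is known."""
        algorithm, _, hexdigits = checksum.partition("=")
        if algorithm != "sha1":
            raise ValueError("unsupported checksum algorithm: " + algorithm)
        return hashlib.sha1(data).hexdigest() == hexdigits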


SEGMENTS & OBJECTS: STORAGE AND NAMING
======================================

An LBS snapshot consists, at its base, of a collection of /objects/:
binary blobs of data, much like a file. Higher layers interpret the
contents of objects in various ways, but the lowest layer is simply
concerned with storing and naming these objects.

An object is a sequence of bytes (octets) of arbitrary length. An
object may contain as few as zero bytes (though such objects are not
very useful). Object sizes are potentially unbounded, but it is
recommended that the maximum size of objects produced be on the order of
megabytes. Files of essentially unlimited size can be stored in an LBS
snapshot using objects of modest size, so this should not cause any real
restrictions.

For storage purposes, objects are grouped together into /segments/.
Segments use the TAR format; each object within a segment is stored as a
separate file. Segments are named using UUIDs (Universally Unique
Identifiers), which are 128-bit numbers. The textual form of a UUID is
a sequence of lowercase hexadecimal digits with hyphens inserted at
fixed points; an example UUID is
    a704eeae-97f2-4f30-91a4-d4473956366b
This segment could be stored in the filesystem as a file
    a704eeae-97f2-4f30-91a4-d4473956366b.tar
The UUID used to name a segment is assigned when the segment is created.

Filters can be layered on top of the segment storage to provide
compression, encryption, or other features. For example, the example
segment above might be stored as
    a704eeae-97f2-4f30-91a4-d4473956366b.tar.bz2
or
    a704eeae-97f2-4f30-91a4-d4473956366b.tar.gpg
if the file data had been filtered through bzip2 or gpg, respectively,
before storage. Filtering of segment data is outside the scope of this
format specification, however; it is assumed that if filtering is used,
the unfiltered data can be recovered when decompressing (yielding data
in the TAR format).

Objects within a segment are numbered sequentially. This sequence
number is then formatted as an 8-digit (zero-padded) hexadecimal
(lowercase) value. The fully qualified name of an object consists of
the segment name, followed by a slash ("/"), followed by the object
sequence number. So, for example
    a704eeae-97f2-4f30-91a4-d4473956366b/000001ad
names an object.

Within the segment TAR file, the filename used for each object is its
fully-qualified name. Thus, when extracted using the standard tar
utility, a segment will produce a directory with the same name as the
segment itself, and that directory will contain a set of
sequentially-numbered files each storing the contents of a single
object.

NOTE: When naming an object, the segment portion consists of the UUID
only. Any extensions appended to the segment when storing it as a file
in the filesystem (for example, .tar.bz2) are _not_ part of the name of
the object.

There are two additional components which may appear in an object name;
both are optional.

First, a checksum may be added to the object name to express an
integrity constraint: the referred-to data must match the checksum
given. A checksum is enclosed in parentheses and appended to the object
name:
    a704eeae-97f2-4f30-91a4-d4473956366b/000001ad(sha1=67049e7931ad7db37b5c794d6ad146c82e5f3187)

Second, an object may be /sliced/: a subset of the bytes actually
stored in the object may be selected to be returned. The slice syntax
is
    [<start>+<length>]
where <start> is the first byte to return (as a decimal offset) and
<length> specifies the number of bytes to return (again in decimal). It
is invalid for a slice to select a range of bytes that does not fall
within the original object. The slice specification should be appended
to an object name, for example:
    a704eeae-97f2-4f30-91a4-d4473956366b/000001ad[264+1000]
selects only bytes 264..1263 from the original object.

Both a checksum and a slice can be used. In this case, the checksum is
given first, followed by the slice. The checksum is computed over the
original object contents, before slicing.
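
A decompressor might parse these names along the following lines (a
sketch in Python; the regular expression and function name are made up
for this example):

    import re

    # Segment UUID and object sequence number, optionally followed by a
    # "(checksum)" constraint and then a "[start+length]" slice.
    OBJECT_NAME = re.compile(
        r"^(?P<segment>[0-9a-f-]{36})/(?P<object>[0-9a-f]{8})"
        r"(?:\((?P<checksum>[^)]*)\))?"
        r"(?:\[(?P<start>\d+)\+(?P<length>\d+)\])?$")

    def parse_object_name(name):
        """Split a name into (segment, object, checksum, slice)."""
        m = OBJECT_NAME.match(name)
        if m is None:
            raise ValueError("malformed object name: " + name)
        slice_ = None
        if m.group("start") is not None:
            slice_ = (int(m.group("start")), int(m.group("length")))
        return (m.group("segment"), m.group("object"),
                m.group("checksum"), slice_)

For the sliced example above, parse_object_name returns the segment
UUID, object number "000001ad", no checksum, and the slice (264, 1000).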


FILE METADATA LISTING
=====================

A snapshot stores two distinct types of data into the object store
described above: data and metadata. Data for a file may be stored as a
single object, or the data may be broken apart into blocks which are
stored as separate objects. The file /metadata/ log (which may be
spread across multiple objects) specifies the names of the files in a
snapshot, metadata about them such as ownership and timestamps, and
gives the list of objects that contain the data for each file.

The metadata log consists of a set of stanzas, each of which is
formatted somewhat like RFC 822 (email) headers. An example is:

    name: etc/fstab
    checksum: sha1=11bd6ec140e4ec3110a91e1dd0f02b63b701421f
    data: 2f46bce9-4554-4a60-a4a2-543637bd3989/000001f7
    group: 0 (root)
    mode: 0644
    mtime: 1177977313
    size: 867
    type: -
    user: 0 (root)

The meanings of all the fields are described later. A blank line
separates stanzas with information about different files. In addition
to regular stanzas, the metadata listing may contain a line containing
an object reference prefixed with "@". Such a line indicates that the
contents of the referenced object should be fetched and parsed as a
metadata listing at this point, prior to continuing to parse the current
object.
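
A sketch of a parser for this structure, in Python (the object_store
callable is an assumption standing in for whatever fetches an object's
contents as text):

    def read_metadata(object_store, ref):
        """Yield the stanzas of a metadata listing as dictionaries.

        Lines starting with "@" splice in another metadata object at
        that point; blank lines separate stanzas.
        """
        stanza = {}
        for line in object_store(ref).splitlines():
            if line.startswith("@"):        # indirect metadata object
                for s in read_metadata(object_store, line[1:]):
                    yield s
            elif line.strip() == "":        # blank line ends a stanza
                if stanza:
                    yield stanza
                    stanza = {}
            else:
                field, _, value = line.partition(":")
                stanza[field.strip()] = value.strip()
        if stanza:
            yield stanza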

Several common encodings are used for various fields. The encoding used
for each field is specified in the field listing that follows.
    encoded string: An arbitrary string (octet sequence), with bytes
        optionally escaped by replacing a byte with %xx, where "xx" is a
        hexadecimal representation of the byte replaced. For example,
        space can be replaced with "%20". This is the same escaping
        mechanism as used in URLs.
    integer: An integer, which may be written in decimal, octal, or
        hexadecimal. Strings starting with 0 are interpreted as octal,
        and those starting with 0x are interpreted as hexadecimal.
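
Both encodings decode easily; for example (a sketch, with hypothetical
helper names):

    from urllib.parse import unquote_to_bytes

    def decode_string(s):
        """Undo %xx escaping, recovering the original octet sequence."""
        return unquote_to_bytes(s)

    def decode_integer(s):
        """Parse an integer in decimal, octal (leading 0), or hex (0x)."""
        if s.startswith("0x"):
            return int(s, 16)
        if s.startswith("0") and len(s) > 1:
            return int(s, 8)
        return int(s, 10)

So decode_integer("0644") yields 420, the mode bits from the example
stanza above.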

Common fields (required in all stanzas):
    name [encoded string]: Full path of the file archived.
    user [special]: The user ID of the file, as an integer, optionally
        followed by a space and the corresponding username, as an
        escaped string enclosed in parentheses.
    group [special]: The group ID which owns the file. Encoding is the
        same as for the user field: an integer, with an optional name in
        parentheses following.
    mode [integer]: Unix mode bits for the file.
    type [special]: A single character which indicates the type of file.
        The type indicators are meant to be consistent with the
        characters used to indicate file type in a directory listing:
            -   regular file
            b   block device
            c   character device
            d   directory
            l   symlink
            p   pipe
            s   socket
    mtime [integer]: Modification time of the file.

Optional common fields:
    links [integer]: Number of hard links to this file, generally only
        reported if greater than 1.
    inode [string]: String specifying the inode number of this file when
        it was dumped. If "links" is greater than 1, then searching for
        other files that have an identical "inode" value can be used to
        determine which files should be hard-linked together when
        restoring. The inode field should be treated as an opaque
        string and compared for equality as such; an implementation may
        choose whatever representation is convenient. The format
        produced by the standard tool is <major>/<minor>/<inode> (where
        <major> and <minor> specify the device of the containing
        filesystem and <inode> is the inode number of the file).

Special fields used for regular files:
    checksum [string]: Checksum of the file contents.
    size [integer]: Size of the file, in bytes.
    data [reference list]: Whitespace-separated list of object
        references. The referenced data, when concatenated in the
        listed order, will reconstruct the file data. Any reference
        that begins with a "@" character is an indirect reference--the
        given object includes a whitespace-separated list of object
        references which should be parsed in the same manner as the data
        field (see the sketch after this list).
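
Reconstructing file contents from a "data" field is then a simple loop;
in Python (a sketch; object_store is again an assumed callable that
resolves an object reference, including any checksum and slice parts,
to bytes):

    def reconstruct_file(object_store, data_field):
        """Concatenate the objects named in a "data" field."""
        out = b""
        for ref in data_field.split():
            if ref.startswith("@"):    # indirect list of references
                inner = object_store(ref[1:]).decode("utf-8")
                out += reconstruct_file(object_store, inner)
            else:
                out += object_store(ref)
        return out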


SNAPSHOT DESCRIPTOR
===================

The snapshot descriptor is a small file which describes a single
snapshot. It is one of the few files which is not stored as an object
in the segment store. It is stored as a separate file, in plain text,
but in the same directory as segments are stored.

The name of the snapshot descriptor file is
    snapshot-<scheme>-<timestamp>.lbs
<scheme> is a descriptive text which can be used to distinguish several
logically distinct sets of snapshots (such as snapshots for two
different directory trees) that are being stored in the same location.
<timestamp> gives the date and time the snapshot was taken; the format
is %Y%m%dT%H%M%S (20070806T092239 means 2007-08-06 09:22:39).
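
Parsing such a filename is straightforward; for example, in Python (a
sketch; the scheme "home" below is purely hypothetical):

    import re
    from datetime import datetime

    DESCRIPTOR_NAME = re.compile(r"^snapshot-(.*)-(\d{8}T\d{6})\.lbs$")

    def parse_descriptor_name(filename):
        """Extract (scheme, timestamp) from a descriptor filename."""
        m = DESCRIPTOR_NAME.match(filename)
        if m is None:
            raise ValueError("not a snapshot descriptor: " + filename)
        return m.group(1), datetime.strptime(m.group(2), "%Y%m%dT%H%M%S")

    # parse_descriptor_name("snapshot-home-20070806T092239.lbs")
    #     => ("home", datetime(2007, 8, 6, 9, 22, 39))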

The contents of the descriptor are a set of RFC 822-style headers (much
like the metadata listing). The fields which are defined are:
    Format: The string "LBS Snapshot v0.2" which identifies this file as
        an LBS backup descriptor. The version number (v0.2) might
        change if there are changes to the format. It is expected that
        at some point, once the format is stabilized, the version
        identifier will be changed to v1.0.
    Producer: An informative string which identifies the program that
        produced the backup.
    Date: The date the snapshot was produced. This matches the
        timestamp encoded in the filename, but is written out in full.
        A timezone is given. For example: "2007-08-06 09:22:39 -0700".
    Scheme: The <scheme> field from the descriptor filename.
    Segments: A whitespace-separated list of segment names. Any segment
        which is referenced by this snapshot must be included in the
        list, since this list can be used in garbage-collecting old
        segments, determining which segments need to be downloaded to
        completely reconstruct a snapshot, etc.
    Root: A single object reference which points to the metadata
        listing for the snapshot.
    Checksums: A checksum file may be produced (with the same name as
        the snapshot descriptor file, but with extension .sha1sums
        instead of .lbs) containing SHA-1 checksums of all segments.
        This field contains a checksum of that file.
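
Putting the fields together, a descriptor might look like the following
(a hypothetical example; the segment UUID is the one used earlier in
this document, and all other values are invented):

    Format: LBS Snapshot v0.2
    Producer: LBS v0.2
    Date: 2007-08-06 09:22:39 -0700
    Scheme: home
    Segments: a704eeae-97f2-4f30-91a4-d4473956366b
    Root: a704eeae-97f2-4f30-91a4-d4473956366b/00000000
    Checksums: sha1=67049e7931ad7db37b5c794d6ad146c82e5f3187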
                LBS: An LFS-Inspired Backup Solution
                       Implementation Overview

HIGH-LEVEL OVERVIEW
===================

There are two different classes of data stored, typically in different
directories:

The SNAPSHOT directory contains the actual backup contents. It consists
of segment data (typically in compressed/encrypted form, one segment per
file) as well as various small per-snapshot files such as the snapshot
descriptor files (which name each snapshot and tell where to locate the
data for it) and checksum files (which list checksums of segments for
quick integrity checking). The snapshot directory may be stored on a
remote server. It is write-only, in the sense that data does not need
to be read from the snapshot directory to create a new snapshot, and
files in it are immutable once created (they may be deleted if they are
no longer needed, but file contents are never changed).

The LOCAL DATABASE contains indexes used during the backup process.
Files here keep track of what information is known to be stored in the
snapshot directory, so that new snapshots can appropriately re-use data.
The local database, as its name implies, should be stored somewhere
local, since random access (read and write) will be required during the
backup process. Unlike the snapshot directory, files here are not
immutable.

Only the data stored in the snapshot directory is required to restore a
snapshot. The local database does not need to be backed up (stored at
multiple separate locations, etc.). The contents of the local database
can be rebuilt (at least in theory) from data in the snapshot directory
and the local filesystem; it is expected that tools will eventually be
provided to do so.

The format of data in the snapshot directory is described in format.txt.
The format of data in the local database is more fluid and may evolve
over time. The current structure of the local database is described in
this document.


LOCAL DATABASE FORMAT
=====================

The local database directory currently contains two files:
localdb.sqlite and a statcache file. (Actually, two types of files. It
is possible to create snapshots using different schemes, and have them
share the same local database directory. In this case, there will still
be one localdb.sqlite file, but one statcache file for each backup
scheme.)

Each statcache file is a plain text file, with a format similar to the
file metadata listing used in the snapshot directory. The purpose of
the statcache file is to speed up the backup process: it makes it
possible to determine whether a file has changed since the previous
snapshot by comparing the results of a stat() system call with the data
in the statcache file, and, if the file is unchanged, it provides the
checksum and list of data blocks previously used to store the file. The
statcache file is rewritten each time a snapshot is taken, and can
safely be deleted (with the only major side effect being that the first
backups after doing so will progress much more slowly).
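
The change test itself is simple; a sketch in Python (the entry layout
is an assumption here, since the statcache format is not fully
specified):

    import os

    def file_unchanged(path, cached):
        """Compare a stat() result against a statcache entry.

        'cached' stands for a parsed statcache stanza recording the
        size, mtime, and inode seen at the previous snapshot; if all
        match, the cached checksum and block list can be reused
        without rereading the file.
        """
        st = os.lstat(path)
        return (cached is not None
                and st.st_size == cached["size"]
                and int(st.st_mtime) == cached["mtime"]
                and st.st_ino == cached["inode"])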

localdb.sqlite is an SQLite database file, which is used for indexing
objects stored in the snapshot directory and various other purposes.
The database schema is contained in the file schema.sql in the LBS
source. Among the data tracked by localdb.sqlite:

  - A list of segments stored in the snapshot directory. This might not
    include all segments (segments belonging to old snapshots might be
    removed), but for correctness all segments listed in the local
    database must exist in the snapshot directory.

  - A block index which tracks objects in the snapshot directory used to
    store file data. It is indexed by block checksum, and so can be
    used while generating a snapshot to determine if a just-read block
    of data is already stored in the snapshot directory, and if so how
    to name it (see the query sketch after this list).

  - A list of recent snapshots, together with a list of the objects from
    the block index they reference.
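
The block-index lookup mentioned above might be expressed as follows (a
sketch; the table and column names here are guesses for illustration,
and the authoritative schema is the one in schema.sql):

    import sqlite3

    def find_block(db, checksum):
        """Return the name of the object already storing a block with
        this checksum, or None if the block must be written anew."""
        cur = db.execute(
            "SELECT segment, object FROM block_index"
            "    WHERE checksum = ?", (checksum,))
        row = cur.fetchone()
        return None if row is None else "%s/%s" % (row[0], row[1])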

The localdb SQL database is central to data sharing and segment
cleaning. When creating a new snapshot, information about the new
snapshot and the blocks it uses (including any new ones) is written to
the database. Using the database, separate segment cleaning processes
can determine how much data in various segments is still live, and
determine which segments are the best candidates for cleaning. Cleaning
is performed by updating the database to mark objects in the cleaned
segments as unavailable for use in future snapshots; when the backup
process next runs, any files that would use these expired blocks instead
have a copy of the data written to a new segment.
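
Under the same hypothetical schema as above, the cleaning step reduces
to a single update (a sketch, not the actual implementation):

    def expire_segment(db, segment_uuid):
        """Mark every block in a segment as unavailable for reuse.

        The next backup run rewrites any still-live data from this
        segment into new segments, after which the old segment file
        can be removed from the snapshot directory.
        """
        db.execute("UPDATE block_index SET expired = 1"
                   "    WHERE segment = ?", (segment_uuid,))
        db.commit()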