Michael Vrable [Tue, 13 May 2008 16:41:41 +0000 (09:41 -0700)]
Put updated copyright statements in all source files.
These now reflect the fact that all code should be distributable under the
GPLv2.
Michael Vrable [Mon, 12 May 2008 19:57:49 +0000 (12:57 -0700)]
Report compressed size of data written in a backup as well as uncompressed.
When exiting, a summary of the size of all segments (grouped by type: data,
metadata, ...) is printed. Extend this so that both the uncompressed and
compressed sizes of the segments are printed; to do so, also keep track of
the compressed size of data.
Michael Vrable [Fri, 11 Apr 2008 01:17:48 +0000 (18:17 -0700)]
Update copyright dates in source files.
Michael Vrable [Fri, 11 Apr 2008 01:10:03 +0000 (18:10 -0700)]
Squeeze extra blank lines when dumping metadata logs.
In lbs-util, squeeze out extra blank lines in the output of the
read-metadata command. Extra blank lines may appear in the input,
particularly when delta-encoding metadata logs, but to produce
uniform-looking outputs, delete these extra blank lines.
Michael Vrable [Wed, 9 Apr 2008 21:54:06 +0000 (14:54 -0700)]
NEWS updates.
Michael Vrable [Wed, 9 Apr 2008 21:04:44 +0000 (14:04 -0700)]
Update arguments passed to upload script.
Fix the remote upload script implementation in lbs so that it matches the
conventions expected by the S3 sample script.
Michael Vrable [Wed, 9 Apr 2008 18:26:03 +0000 (11:26 -0700)]
Implement a simple backend script to store data to Amazon S3.
This currently doesn't quite use the interface expected by lbs. The
interfaces will be matched soon.
Michael Vrable [Thu, 3 Apr 2008 20:07:11 +0000 (13:07 -0700)]
Preliminary support for external file upload scripts.
This adds initial support for calling out to an external script to transfer
files to a backup server. Storage requirements on the client using this
are minimal: space for the local database and for spooling several files
for upload. Local temporary files are deleted as they are uploaded, and
the backup rate is throttled to the upload rate.
Michael Vrable [Wed, 2 Apr 2008 03:58:27 +0000 (20:58 -0700)]
Initial framework for direct transfer of backups to remote storage.
Add a layer of indirection in the writing of files to the backup store, and
create a background thread to handle the processing of files to be stored.
Right now this secondary thread does not do much, but will easily be able
to launch a helper script for transferring data to a remote server.
Files are processed by the background thread one at a time. Multiple files
can be queued up for processing, but the size of the queue is limited so
that the production of backup data will be throttled to the speed at which
the data can be transferred (to bound the temporary space needed for
storing files).
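A minimal sketch of the throttling described above, written in Python rather than the C++ of lbs itself. The names upload_worker, run_backup and QUEUE_LIMIT are hypothetical, and the "transfer" is a stand-in; the only point illustrated is that a bounded queue makes the producer block once the background thread falls behind.

```python
import queue
import threading

QUEUE_LIMIT = 2  # hypothetical bound on the number of files awaiting transfer


def upload_worker(pending, transferred):
    """Background thread: process queued files one at a time."""
    while True:
        path = pending.get()
        if path is None:           # sentinel: no more files will arrive
            break
        transferred.append(path)   # stand-in for the real transfer step


def run_backup(paths):
    """Queue each file for transfer and return the order in which they were handled."""
    pending = queue.Queue(maxsize=QUEUE_LIMIT)
    transferred = []
    worker = threading.Thread(target=upload_worker, args=(pending, transferred))
    worker.start()
    for path in paths:
        # put() blocks while the queue is full, so the production of backup
        # data is throttled to the rate at which files are transferred.
        pending.put(path)
    pending.put(None)
    worker.join()
    return transferred
```

Because the queue holds at most QUEUE_LIMIT entries, the temporary space needed is bounded by roughly that many files plus the one being transferred.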
Michael Vrable [Sat, 1 Mar 2008 00:08:35 +0000 (16:08 -0800)]
Make restoring from snapshots more efficient.
When restoring a snapshot, restore files in order roughly determined by how
they are stored in segments, instead of in pure lexicographic order. This
should ensure that, for the most part, each segment only has to be unpacked
once, instead of perhaps many times as could happen previously, and so
should make restoring more efficient.
This implementation loads all metadata into memory to determine the
ordering, and so restores are now much more memory-intensive than before.
It would be good to work on memory requirements later--either offer an
option to use the old behavior, or perhaps load some of the data into a
temporary database.
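As an illustration only, the reordering could look something like the following Python sketch. The data layout (a mapping from path to a list of (segment, object index) references) and the function name restore_order are assumptions made for the example, not the structures used by the restore code.

```python
def restore_order(files):
    """Order files by where their first block is stored, not by name.

    `files` maps a path to a list of (segment, object_index) block
    references; this layout is hypothetical.
    """
    def key(item):
        path, blocks = item
        # Files with no data blocks (for example, empty files) sort first.
        first = blocks[0] if blocks else ("", -1)
        # Fall back to the path so that the order is deterministic.
        return (first, path)

    return [path for path, _ in sorted(files.items(), key=key)]
```

Sorting on the first block keeps files from the same segment adjacent, so each segment is unpacked roughly once; holding the whole mapping in memory is what makes this approach memory-intensive.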
Michael Vrable [Thu, 28 Feb 2008 00:20:31 +0000 (16:20 -0800)]
Allow restores of just selected files/directories.
Previously, only a complete snapshot could be restored. This change to
lbs-util will allow just selected data to be restored.
Michael Vrable [Tue, 19 Feb 2008 22:14:57 +0000 (14:14 -0800)]
Documentation updates.
Michael Vrable [Tue, 19 Feb 2008 18:51:37 +0000 (10:51 -0800)]
Add GPLv2 license conditions.
Since some code is derived from GPL-covered software, allow the entire
program to be distributed under the terms of the GPL, version 2.
Michael Vrable [Thu, 14 Feb 2008 22:48:04 +0000 (14:48 -0800)]
Minor documentation updates.
Michael Vrable [Wed, 13 Feb 2008 22:27:52 +0000 (14:27 -0800)]
Do not attempt to clean the same segment multiple times.
Michael Vrable [Wed, 13 Feb 2008 01:05:33 +0000 (17:05 -0800)]
Slight tweaks to the local database to improve cleaning procedures.
In addition to marking objects in cleaned segments, mark the segment itself
as cleaned.
Michael Vrable [Thu, 17 Jan 2008 04:19:14 +0000 (20:19 -0800)]
Include snapshot intent value in the backup descriptor.
It wasn't included earlier, but could be useful to have when actually going
back to clean out old snapshots at a later point in time.
Michael Vrable [Tue, 15 Jan 2008 21:57:55 +0000 (13:57 -0800)]
Documentation updates.
Michael Vrable [Tue, 15 Jan 2008 18:48:30 +0000 (10:48 -0800)]
Fix to segment age calculation in local database.
It seems that in SQLite, max(x, NULL) yields NULL, not x. This was being
used to set the mtime of a segment to the maximum mtime of any object in
it, starting with an mtime of NULL. Fix the computation so it does the
right thing.
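The behavior can be reproduced with Python's sqlite3 module. The commit does not say how the computation was fixed; the coalesce() form below is one possible approach, and the objects table is a hypothetical stand-in for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The multi-argument scalar max() returns NULL if any argument is NULL.
scalar = conn.execute("SELECT max(5, NULL)").fetchone()[0]

# One possible fix: substitute a value for NULL before comparing.
fixed = conn.execute("SELECT max(5, coalesce(NULL, 5))").fetchone()[0]

# The aggregate max(), by contrast, ignores NULL rows.
conn.execute("CREATE TABLE objects (mtime REAL)")
conn.executemany("INSERT INTO objects VALUES (?)", [(None,), (3.0,), (7.0,)])
aggregate = conn.execute("SELECT max(mtime) FROM objects").fetchone()[0]
```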
Michael Vrable [Wed, 9 Jan 2008 22:26:07 +0000 (14:26 -0800)]
Extend tracking of used segments to cover metadata segments.
In the segments_used table in the local database, include segments that
contain metadata in addition to data segments. Additionally, slightly
extend the segment tracking code so that the modification time of segments
is written out.
The exact utilization of the metadata segments is not yet computed; for now
the utilization is listed as 1.0 even if it is actually less.
Michael Vrable [Wed, 9 Jan 2008 21:20:50 +0000 (13:20 -0800)]
Minor fix to segment cleaning.
Previously, all objects were marked to be rewritten, instead of merely
those in segments marked for cleaning. Properly handle this.
Michael Vrable [Tue, 8 Jan 2008 19:50:49 +0000 (11:50 -0800)]
Add a flag to force a full rewrite of the metadata log in a snapshot.
When --full-metadata is given, no pointers to old metadata will be written
out. This could be used periodically in backups (say, weekly) to prevent
long dependencies in the metadata logs, at least until better cleaning is
implemented.
Michael Vrable [Tue, 25 Dec 2007 04:00:42 +0000 (20:00 -0800)]
Add intent-based cleaning to lbs-util.
Allow the level of segment cleaning performed to be adjusted by specifying
the next type of backup to be performed. If the next backup is to be
longer-lived, then clean more aggressively.
Michael Vrable [Fri, 14 Dec 2007 19:51:41 +0000 (11:51 -0800)]
Fix a bug in computing the size of a segment that led to utilization > 1.0.
The old code for computing the size of a segment (to be stored in the
segments table) could leave off the last object to be written to the
segment. This could cause the computed segment utilization to be greater
than 1.0, which should be impossible. Fix the size calculation so that it
should always include all data written to the segment. As a bonus, this
also correctly computes the size of metadata-log segments, even though the
metadata objects don't appear in the block_index table (which was
previously used for computing the segment size, but is no longer).
Michael Vrable [Wed, 12 Dec 2007 19:36:50 +0000 (11:36 -0800)]
Fix a bug that caused blocks not to be properly re-used on checksum match.
Michael Vrable [Wed, 12 Dec 2007 18:42:37 +0000 (10:42 -0800)]
Fix uninitialized variable warning in sample restore program.
Michael Vrable [Wed, 12 Dec 2007 18:34:46 +0000 (10:34 -0800)]
Include sizes in references to blocks in each file's data list.
This optimization is aimed at large files that are composed of many
blocks--including the size of each block allows a restore program to
determine the offset at which each block begins in the output file (by
adding up the sizes of the previous blocks). This may allow for more
efficient restores, in which file data is filled in as blocks are
encountered, instead of having to find the blocks in the order they appear
in the data list.
A future change might be to only include the sizes when necessary--files
which are composed of a single object do not need a size, nor does the last
block of a large file. But for now, simply include the size on all
objects.
This is part of a recommended format change, but one that is both forward-
and backward-compatible.
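The offset computation mentioned above is a running sum. A small Python sketch (the function name block_offsets is hypothetical):

```python
def block_offsets(sizes):
    """Return the starting offset of each block, given block sizes in order."""
    offsets = []
    position = 0
    for size in sizes:
        # A block starts where the sum of all previous block sizes ends.
        offsets.append(position)
        position += size
    return offsets
```

With the offsets known, a restore program can seek to the right position and write each block as it is encountered, in whatever order the segments are unpacked.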
Michael Vrable [Wed, 12 Dec 2007 18:16:39 +0000 (10:16 -0800)]
Snapshot format change: extend the slice syntax with a length-only form.
Change slice format so that in addition to <start>+<length>, it is possible
to specify just <length>. This isn't needed (0+<length> could be used
instead), but looks more pleasing if lengths are specified more frequently
on objects. Also update the various tools to correctly parse the new
syntax.
This is part of the new v0.6 format.
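A sketch of how a tool might parse both slice forms, in Python. It assumes that start and length are written as decimal integers and handles only the text between the brackets; the name parse_slice is hypothetical.

```python
import re

# Either "<start>+<length>" or just "<length>".
SLICE_RE = re.compile(r"^(?:(\d+)\+)?(\d+)$")


def parse_slice(text):
    """Return (start, length) for a slice specification."""
    match = SLICE_RE.match(text)
    if match is None:
        raise ValueError("invalid slice: %r" % text)
    # The length-only form is shorthand for a slice starting at offset 0.
    start = int(match.group(1)) if match.group(1) is not None else 0
    return start, int(match.group(2))
```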
Michael Vrable [Wed, 12 Dec 2007 05:49:23 +0000 (21:49 -0800)]
When verifying a snapshot, check that the segment list is accurate.
This should help find bugs such as the one fixed in commit 1b39ce3ff11a.
Michael Vrable [Wed, 12 Dec 2007 01:49:44 +0000 (17:49 -0800)]
Add "intent" field to a snapshot.
This field is intended to indicate how long the backup might be kept or
what backup schedule the given snapshot is part of--for example 1 for a
daily backup, 7 for a weekly backup.
This might be used when performing segment cleaning or deleting old
snapshots, but for now the information is just stored in the local
database.
Michael Vrable [Fri, 7 Dec 2007 21:33:25 +0000 (13:33 -0800)]
Ensure that segments with reused metadata are listed in root descriptor.
Michael Vrable [Fri, 7 Dec 2007 21:01:30 +0000 (13:01 -0800)]
Add format support for efficient sparse file handling.
While making other format changes, also add in support for explicitly
representing regions of a file that are entirely zero, as can happen with
sparse files. These are represented with an object reference of the form
"zero[0+<length>]". Update the lbs tool to generate and parse these
references, and the utility code to also handle it.
The restore tools do not seek over zero regions when writing out a file, so
the file is not restored as a sparse file, but that support can easily be
added later with no change needed to the format.
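For illustration, a restore tool could recreate the holes by seeking instead of writing zeros, roughly as in this Python sketch. Only the "zero[0+<length>]" form quoted above is recognized; the object references "seg/1" and "seg/2" and the fetch callback are hypothetical.

```python
import io
import re
import tempfile

ZERO_RE = re.compile(r"^zero\[0\+(\d+)\]$")


def write_blocks(out, blocks, fetch):
    """Write blocks to a seekable file, skipping over all-zero regions.

    `fetch` maps an object reference to its bytes (hypothetical interface).
    """
    for ref in blocks:
        match = ZERO_RE.match(ref)
        if match:
            # Seek forward rather than writing zeros, leaving a hole.
            out.seek(int(match.group(1)), io.SEEK_CUR)
        else:
            out.write(fetch(ref))
    # Extend the file in case the last block was a zero region.
    out.truncate(out.tell())


objects = {"seg/1": b"ab", "seg/2": b"cd"}
with tempfile.TemporaryFile() as out:
    write_blocks(out, ["seg/1", "zero[0+4]", "seg/2", "zero[0+3]"],
                 objects.__getitem__)
    out.seek(0)
    restored = out.read()
```

Whether the skipped regions actually occupy no disk space depends on the filesystem.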
Michael Vrable [Fri, 7 Dec 2007 03:16:57 +0000 (19:16 -0800)]
Upgrades to utility code for new formats, and a few more database tweaks.
Update the sample restore script and the Python code to support the new
snapshot format and the new local database schema. While updating segment
cleaning, also slightly rearrange the database schema to better support it.
Michael Vrable [Thu, 6 Dec 2007 05:36:47 +0000 (21:36 -0800)]
Update the NEWS file with some information about format changes.
Michael Vrable [Thu, 6 Dec 2007 05:27:37 +0000 (21:27 -0800)]
Drop the obsolete snapshot_contents table from the local database.
Michael Vrable [Thu, 6 Dec 2007 05:25:39 +0000 (21:25 -0800)]
Provide a script for converting the local database to the v0.6 format.
Michael Vrable [Thu, 6 Dec 2007 04:14:08 +0000 (20:14 -0800)]
Modifications to the local database: create a summary segments_used table.
Make the local database more compact by only storing, for each snapshot, a
listing of the segments it uses and the fraction of each which is used,
instead of listing all objects referenced individually.
This commit only adds the new table; it doesn't yet delete the old table
(snapshot_contents).
Michael Vrable [Mon, 3 Dec 2007 19:10:48 +0000 (11:10 -0800)]
Flag "volatile" files when creating a snapshot.
If a file has changed very near to the time it was backed up (right now 30
seconds, though this could probably be decreased to only a few seconds),
mark the file as "volatile" and do not use the stat information to skip
that file on the next backup. This is to avoid a race condition where a
file's stat information is saved, the file is dumped, and then the file is
modified again. If this happens within the same second as the earlier
modifications, then mtime and ctime will not be updated (since they already
refer to the current second), and on a subsequent backup the file would not
be stored since it appears to be unchanged. However, if the file's mtime
and ctime are in the past, then this can't happen, so use this as a test
for when it is safe to skip apparently unchanged files.
The volatile flag only needs to go in the statcache, not the main metadata
log, but for the moment it is going in both.
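The test can be sketched in Python as follows. The 30-second window comes from the description above; the function names, the dictionary keys, and the exact comparison at the boundary are assumptions.

```python
VOLATILE_WINDOW = 30  # seconds, as stated in the commit message


def is_volatile(mtime, ctime, backup_time, window=VOLATILE_WINDOW):
    """True if the file changed within `window` seconds of being backed up."""
    return max(mtime, ctime) > backup_time - window


def can_skip(old, new_stat):
    """Decide whether an apparently unchanged file may be skipped.

    `old` is the saved entry (stat fields plus a "volatile" flag) and
    `new_stat` the current stat information; the keys are hypothetical.
    """
    if old.get("volatile"):
        # The saved stat information cannot be trusted for this file.
        return False
    return all(old[k] == new_stat[k] for k in ("mtime", "ctime", "size", "inode"))
```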
Michael Vrable [Thu, 29 Nov 2007 21:04:11 +0000 (13:04 -0800)]
Assorted minor code cleanups.
Michael Vrable [Wed, 28 Nov 2007 23:12:59 +0000 (15:12 -0800)]
Ensure the "name:" key shows up first in metadata output.
This isn't necessary, but is nice for readability.
Michael Vrable [Wed, 28 Nov 2007 22:22:39 +0000 (14:22 -0800)]
Partially revert metadata format changes in 6c94114148c4.
On second thought, the renaming of the "name:" field to "path:" isn't worth
the trouble of making the change since there isn't much benefit and
updating tools to deal with either format will be more complex. The other
changes can be left since they are smaller and easier to support.
Revert this now, before any releases are made with the change in effect.
Michael Vrable [Wed, 28 Nov 2007 22:18:29 +0000 (14:18 -0800)]
Drop the old statcache implementation.
The statcache is now replaced with the unified local metadata log, which is
used to aid in reusing unchanged parts of the metadata log in snapshots,
but additionally contains all the information needed to determine if a file
is unchanged.
Michael Vrable [Wed, 21 Nov 2007 23:02:54 +0000 (15:02 -0800)]
Initial implementation of metadata log sharing.
Allow metadata written to segments to be reused between snapshots. Keep
track of what metadata was written out on the client, and when identical
metadata would be written on a subsequent backup, instead emit a reference
to the old metadata.
This needs more testing and verification. There also needs to be a
mechanism for performing the equivalent of segment cleaning for metadata,
so that the metadata log does not become excessively fragmented over time.
Michael Vrable [Tue, 20 Nov 2007 03:01:52 +0000 (19:01 -0800)]
Bugfix for splitting the metadata log in the new metadata code.
Michael Vrable [Tue, 20 Nov 2007 00:27:51 +0000 (16:27 -0800)]
Write out new-style statcache data.
Write out statcache-style data from the metadata logging module of lbs.
This will eventually replace the old statcache implementation, but is not
complete. This new statcache data is not yet read in or used elsewhere.
In the new format, the data in the statcache file has the same format as
the data in the metadata log itself. Each stanza with file information is
prefixed with a @@reference line that gives a reference to the location of
the metadata. If the metadata has not changed, this will allow metadata
log data to be re-used between snapshots.
Michael Vrable [Mon, 19 Nov 2007 17:57:42 +0000 (09:57 -0800)]
Drop the use of indirect blocks for storing pointers to data.
Now store the entire list of blocks that contain each file's contents
inline in the metadata log, even when that list is large. Previously, the
list was split out into a separate object when it contained more than 8
entries. These indirect blocks may still be useful, but they also
complicate the metadata/statcache rewrite, so for the moment disable them.
They may be reintroduced later.
Michael Vrable [Fri, 16 Nov 2007 06:29:19 +0000 (22:29 -0800)]
Initial refactoring of metadata logging.
Move the writing of entries in the metadata log to a separate class in a
separate file (MetadataWriter in metadata.cc). This is the first step of
the changes, which moves the existing code but does not significantly
change it. It is preparation for more significant changes to metadata
writing.
Michael Vrable [Thu, 15 Nov 2007 23:38:59 +0000 (15:38 -0800)]
Changes to the metadata log format.
A small collection of changes to make the snapshot format a little more
rational (I hope). The intent is to also more or less merge the formats of
the metadata log and statcache. There will likely be other changes in the
future in the course of working on metadata scalability. The format is not
yet finalized!
WARNING: These changes will require new tools to read the generated backup
snapshots. The version number in the format has been updated. This
changeset does not include the necessary changes to the snapshot-parsing
code.
Michael Vrable [Wed, 14 Nov 2007 04:30:54 +0000 (20:30 -0800)]
NEWS updates for 0.5.1 release (minor changes only).
Michael Vrable [Thu, 8 Nov 2007 22:07:32 +0000 (14:07 -0800)]
Provide a sample tool, contrib/parity-gen, for RAID-like parity sets.
parity-gen will use the par2 command to generate redundant data files, to
be stored with backup segments, that will allow segments to be recovered even if
some of the data is lost or damaged. When given a directory, it will
incrementally update parity sets to reflect changes made to the directory
since the last run.
This tool is still under development, and shouldn't be completely trusted
yet.
Michael Vrable [Wed, 7 Nov 2007 22:25:37 +0000 (14:25 -0800)]
Check LBS format version when reading snapshots in the tools.
Have the lbs-util command check the version of a snapshot when it is read,
and signal an error if the version is newer than what is supported, so that
silent failures do not occur if the format is changed in the future.
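A sketch of such a check in Python. The format line "LBS Snapshot v<version>" and the supported version (0, 6) are assumptions made for the example, as is the function name check_format.

```python
import re

SUPPORTED = (0, 6)  # hypothetical newest format version understood by the tool


def check_format(format_line, supported=SUPPORTED):
    """Return the snapshot version as a tuple, or raise if it is too new."""
    match = re.match(r"^LBS Snapshot v(\d+(?:\.\d+)*)$", format_line)
    if match is None:
        raise ValueError("unrecognized snapshot format: %r" % format_line)
    version = tuple(int(part) for part in match.group(1).split("."))
    # Compare numerically, so that v0.10 is treated as newer than v0.6.
    if version > supported:
        raise ValueError("snapshot format %r is newer than supported" % format_line)
    return version
```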
Michael Vrable [Tue, 30 Oct 2007 20:43:29 +0000 (13:43 -0700)]
lbs-util: Add a command for dumping a flattened metadata log file.
Michael Vrable [Mon, 29 Oct 2007 18:43:59 +0000 (11:43 -0700)]
Minor documentation update.
Michael Vrable [Tue, 16 Oct 2007 21:00:57 +0000 (14:00 -0700)]
LBS v0.5 release.
Michael Vrable [Tue, 16 Oct 2007 20:47:34 +0000 (13:47 -0700)]
Fix mismatched new/delete calls.
Caught by Valgrind: memory was being allocated with new[] but freed with
delete. Use delete[] instead.
Michael Vrable [Tue, 2 Oct 2007 19:00:28 +0000 (12:00 -0700)]
Make extraction of segments in lbs.py even more quiet.
Michael Vrable [Tue, 2 Oct 2007 18:54:37 +0000 (11:54 -0700)]
Add a restore-snapshot command to lbs-util.
This is still in development, and not fully tested yet. This is meant to
mostly replace restore.pl, and will eventually have more features (it can
already restore without needing all segments unpacked first).
The restore.pl script won't go away, though, since it is still useful to
include as a small, self-contained bare-bones restore program.
Michael Vrable [Tue, 25 Sep 2007 22:04:51 +0000 (15:04 -0700)]
Add "prune database" command to lbs-util.
This acts like clean, but doesn't perform the step of marking old segments
as expired.
Michael Vrable [Thu, 20 Sep 2007 23:26:21 +0000 (16:26 -0700)]
File reorganization: move non-essential binaries to contrib/.
The top-level directory is now mostly for the main lbs binary, and the
lbs-util script. Other scripts, which are not necessary for backups (but
helpful) can be found in contrib.
Michael Vrable [Mon, 17 Sep 2007 04:25:15 +0000 (21:25 -0700)]
When cleaning the cache of objects, don't be so verbose.
Change the command for cleaning from "rm -rv", dropping the "-v".
Michael Vrable [Sat, 15 Sep 2007 05:04:20 +0000 (22:04 -0700)]
Enhance object-checksums command.
If no segments are specified, dump object checksums for all segments.
Additionally, prompt for a password (to decrypt segments) when needed.
Michael Vrable [Thu, 13 Sep 2007 20:18:19 +0000 (13:18 -0700)]
Move most documentation into a doc/ subdirectory.
Michael Vrable [Wed, 12 Sep 2007 19:26:29 +0000 (12:26 -0700)]
Remove cleandb.sql SQL script since this functionality is now in lbs-util.
Michael Vrable [Wed, 12 Sep 2007 18:01:07 +0000 (11:01 -0700)]
Rename lbs-util.py to lbs-util.
The Python implementation is now the standard utility program for working
with LBS archives. Drop the ".py" suffix. It replaces the old lbs-util
program, which was implemented in Perl.
Michael Vrable [Wed, 12 Sep 2007 17:59:39 +0000 (10:59 -0700)]
Delete the Perl LBS module and the Perl-based lbs-util script.
Michael Vrable [Tue, 11 Sep 2007 17:38:13 +0000 (10:38 -0700)]
Expanded Python module for accessing LBS snapshots, and lbs-util.py.
Expand the lbs Python module with code for reading from LBS archives. This
includes decoding segments and parsing snapshot descriptors and metadata
logs.
Expand the lbs-util.py tool so that it has most of the functionality of
lbs-util (Perl implementation). The Perl code should now be considered
deprecated, and will be removed at some point in the future.
The Python code still needs some cleaning up.
Michael Vrable [Sun, 9 Sep 2007 00:36:43 +0000 (17:36 -0700)]
Suppress error messages from Makefile when git-describe is not available.
Michael Vrable [Sun, 9 Sep 2007 00:32:32 +0000 (17:32 -0700)]
README spelling corrections.
Michael Vrable [Fri, 24 Aug 2007 18:03:51 +0000 (11:03 -0700)]
NEWS file updates in preparation for v0.4 release.
Michael Vrable [Fri, 24 Aug 2007 17:56:15 +0000 (10:56 -0700)]
Preview of a new Python-based management tool; includes segment cleaning.
This adds a Python-based lbs-util program which can perform automatic
segment cleaning. This hasn't been entirely worked out yet, so it may yet
be a little buggy, and the policies implemented can certainly be improved.
Expect future improvements in this area, and don't yet rely on it too
heavily.
Michael Vrable [Fri, 24 Aug 2007 17:24:09 +0000 (10:24 -0700)]
Documentation improvements.
Highlights are a README file with instructions for getting started, and a
description of some implementation details, starting with the purpose and
format of the local database.
Michael Vrable [Thu, 23 Aug 2007 18:23:28 +0000 (11:23 -0700)]
Place expired and repacked objects into segments based on database.
The local database can store an integer with each expired object that can
be used to group objects together based on age or other factors. Update
the lbs snapshot utility to place objects into segments based on this
value.
Michael Vrable [Tue, 21 Aug 2007 00:38:20 +0000 (17:38 -0700)]
Implement --signature-filter for signing backup snapshots.
If supplied, the argument to --signature-filter will be a program through
which the text for the root snapshot descriptor is filtered. This filter
should act much like "gpg --clearsign": produce a nearly-identical text
file, with perhaps a few lines at the start or end containing the
signature, but which can be treated as an ordinary segment descriptor by
ignoring this leading and trailing data.
Michael Vrable [Fri, 17 Aug 2007 19:38:03 +0000 (12:38 -0700)]
lbs-util now supports reading encrypted segments (with lbs-filter-gpg).
Change the default to encrypted segments rather than compressed with bzip2.
The default of the lbs program has not been changed; it is still necessary
to specify the correct options there to generate encrypted backups.
Michael Vrable [Fri, 17 Aug 2007 18:52:45 +0000 (11:52 -0700)]
Update NEWS file.
Michael Vrable [Fri, 17 Aug 2007 15:08:11 +0000 (08:08 -0700)]
Bug fix for the rewritten spawn_filter function.
We were accidentally using the wrong variable (filter_pid instead of pid)
when looking at the result of a fork call, with the result that both
processes thought they were the parent.
Michael Vrable [Fri, 17 Aug 2007 04:38:25 +0000 (21:38 -0700)]
Switch to stdio-based I/O for writing descriptor file.
This should be a little bit easier to hook into filtering code (we can open
the output file, spawn the filter, and then fdopen the resulting
descriptor), which will be used when implementing signing of the descriptor
files.
Michael Vrable [Thu, 16 Aug 2007 22:53:29 +0000 (15:53 -0700)]
Cleanup of the tar-store code.
- Make spawn_filter generic, not part of Tarfile, so that it can be used
  by other parts of the code. (Specifically, it should be used later
  when writing out a segment descriptor, so that the descriptor can be
  filtered through gpg for signing.)
- Drop internal_write_object. It had only one caller, so simply inline
  the code.
Michael Vrable [Thu, 16 Aug 2007 22:12:40 +0000 (15:12 -0700)]
lbs-filter-gpg should run gpg in batch mode.
Supply the --batch option to gpg so that it can run without a terminal.
Michael Vrable [Thu, 16 Aug 2007 21:19:13 +0000 (14:19 -0700)]
Write an example script for invoking gpg to encrypt backup segments.
This script acts as a filter with options for encrypting, decrypting, and
signing data. Signing uses --clearsign, and will eventually be appropriate
for signing the snapshot descriptor files.
Michael Vrable [Fri, 10 Aug 2007 18:18:49 +0000 (11:18 -0700)]
Include version number in usage message.
Michael Vrable [Fri, 10 Aug 2007 18:13:17 +0000 (11:13 -0700)]
Update NEWS for 0.3 release.
Michael Vrable [Fri, 10 Aug 2007 18:04:57 +0000 (11:04 -0700)]
Preview of a new lbs-util command for snapshot maintenance.
This introduces a new Perl module (LBS.pm) which is an interface for
reading LBS snapshots, and a small command which uses it (lbs-util). Few
commands are implemented yet, but more should follow.
Michael Vrable [Fri, 10 Aug 2007 17:59:33 +0000 (10:59 -0700)]
Include segment checksums as "Checksums" not "Checksum-File" in descriptor.
Avoid the use of a dash in a key name in the descriptor file, since it is
not well-tested with the various tools.
Michael Vrable [Fri, 10 Aug 2007 17:23:51 +0000 (10:23 -0700)]
Minor database schema fix.
When constructing the segment_info view, we can no longer do a natural join
on the block_index and segments tables, since the segments table recently
acquired a checksum column. Now explicitly join on the segmentid column
only, which is the desired behavior (and what a natural join used to do).
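The effect can be demonstrated with Python's sqlite3 module. The two tables below are reduced, hypothetical versions of the real schema, kept only to show that a natural join starts matching on checksum as soon as both tables have a column of that name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE segments (segmentid INTEGER PRIMARY KEY, segment TEXT,
                           checksum TEXT);
    CREATE TABLE block_index (blockid INTEGER PRIMARY KEY, segmentid INTEGER,
                              checksum TEXT, size INTEGER);
    INSERT INTO segments VALUES (1, 'seg-a', 'sha1=aaaa');
    INSERT INTO block_index VALUES (1, 1, 'sha1=1111', 100);
    INSERT INTO block_index VALUES (2, 1, 'sha1=2222', 200);
""")

# NATURAL JOIN matches on every shared column name: segmentid AND checksum.
natural = conn.execute(
    "SELECT count(*) FROM block_index NATURAL JOIN segments").fetchone()[0]

# Joining explicitly on segmentid restores the intended behavior.
explicit = conn.execute(
    "SELECT count(*) FROM block_index JOIN segments USING (segmentid)"
).fetchone()[0]
```

Here the natural join returns no rows, because no block checksum equals the segment checksum, while the explicit join returns both blocks.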
Michael Vrable [Fri, 10 Aug 2007 03:41:37 +0000 (20:41 -0700)]
NEWS updates.
Michael Vrable [Fri, 10 Aug 2007 03:36:51 +0000 (20:36 -0700)]
Write out a .sha1sums file with checksums for segments in this snapshot.
Some segments might be left out; we only write out lines for those segments
that we know a checksum for. These checksums are stored in the local
database (so we can find checksums for old segments), but entries might be
missing.
Michael Vrable [Fri, 10 Aug 2007 02:23:35 +0000 (19:23 -0700)]
Improve reporting of database errors.
Michael Vrable [Fri, 10 Aug 2007 02:23:20 +0000 (19:23 -0700)]
Compute checksums of segments and store them in the local database.
When a segment is fully written out, compute a checksum of the file
actually written (post-filtering). Store this in the local database so
that it will be possible to write out, at the end of a backup, a file
containing the checksums of all segments used for the snapshot (including
old ones not written out in this execution).
Michael Vrable [Fri, 10 Aug 2007 02:20:30 +0000 (19:20 -0700)]
Fix a double-close of a file descriptor.
When a file is dumped, the file descriptor is opened by scanfile(), so that
is the function which should close it (not dump_inode()).
Michael Vrable [Wed, 8 Aug 2007 23:41:06 +0000 (16:41 -0700)]
Extract the version number from NEWS if git-describe is not available.
This should allow building from tarballs (not the git version), and still
correctly fill in the version number (so it can be incorporated into
generated files).
If git-describe does not return a value, the first word from the NEWS file
is taken as the version number.
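The fallback logic, sketched in Python for illustration (the actual logic lives in the Makefile). The function names and the example NEWS contents used to exercise it are hypothetical.

```python
import subprocess


def version_from_news(news_text):
    """Return the first word of the NEWS file contents, or "" if empty."""
    words = news_text.split()
    return words[0] if words else ""


def get_version(news_text, describe=("git", "describe")):
    """Prefer the output of git describe; fall back to the NEWS file."""
    try:
        result = subprocess.run(list(describe), capture_output=True,
                                text=True, check=True)
        described = result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or this is not a git checkout (a tarball).
        described = ""
    return described or version_from_news(news_text)
```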
Michael Vrable [Wed, 8 Aug 2007 19:09:31 +0000 (12:09 -0700)]
Add a NEWS file summarizing changes in each release.
Michael Vrable [Wed, 8 Aug 2007 17:26:05 +0000 (10:26 -0700)]
Do not include link/inode information for directories.
Directories cannot be hard-linked, so do not bother to include a link count
and inode information for directories in a snapshot.
Michael Vrable [Wed, 8 Aug 2007 17:00:00 +0000 (10:00 -0700)]
Include link counts and inode numbers in metadata dumps.
When the link count on a file is greater than one, include the link count
and a representation of the inode number (actually, inode number and
device). This will provide all information which should be needed for
detecting hard links at restore time. The backup program itself does not
identify hard-linked files and treat them any differently--since storage
for identical files can be shared, storing a hard-linked file multiple
times should still be relatively efficient.
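A sketch of how a restore program might use this information, in Python. The entry layout (a "name" key, plus an "inode" key that is present only when the link count is greater than one) and the write_file callback are assumptions made for the example.

```python
import os


def restore_with_links(entries, write_file, link=os.link):
    """Restore entries, re-creating hard links for repeated inodes.

    Returns a list describing the action taken for each entry.
    """
    seen = {}      # inode identifier -> first path restored with it
    actions = []
    for entry in entries:
        key = entry.get("inode")
        if key is not None and key in seen:
            # Same device and inode as an earlier file: link, don't copy.
            link(seen[key], entry["name"])
            actions.append(("link", entry["name"], seen[key]))
        else:
            write_file(entry)
            if key is not None:
                seen[key] = entry["name"]
            actions.append(("write", entry["name"]))
    return actions
```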
Michael Vrable [Tue, 7 Aug 2007 20:57:02 +0000 (13:57 -0700)]
Drop dependence on libtar.
Implement everything we need to write TAR files ourselves; this isn't too
difficult since we are only ever writing regular files with fixed-length
filenames (so no need to worry about long file names). In fact, doing it
directly is really no more complicated than using libtar.
Michael Vrable [Mon, 6 Aug 2007 21:02:23 +0000 (14:02 -0700)]
Update the format documentation to describe the current backup format.
The old documentation referred to the old binary backup format, and was
incomplete at that. Rewrite it to discuss the current format, including a
discussion of segment/object storage, object references, the format of the
metadata listing, and the root backup descriptor.
The documentation can be improved, and some parts are certainly a bit
spotty, but this gives a good quick overview of the entire format.
Michael Vrable [Fri, 3 Aug 2007 05:14:37 +0000 (22:14 -0700)]
Print a help message if no paths are specified to back up.
Rather than defaulting to backing up the current directory, always require
that a path be specified. If no paths are specified, print the usage
message and exit.
Also, fix a crash that previously occurred when no paths were specified.
Michael Vrable [Thu, 2 Aug 2007 23:32:07 +0000 (16:32 -0700)]
Update .gitignore: explicitly ignore .o files.
Michael Vrable [Fri, 27 Jul 2007 18:58:17 +0000 (11:58 -0700)]
Provide a more detailed indication of file status when backing up.
Indicate whether a file is new, or whether we managed to find the old data
in other segments, or whether data is being copied because we are cleaning
segments, etc.
Michael Vrable [Fri, 27 Jul 2007 01:01:01 +0000 (18:01 -0700)]
URI-escape the '@' character.
Ensure that the '@' character is escaped in strings. This isn't necessary
now, but in the future this might be useful so that indirect references are
never ambiguous. (If a "@" appears, it's an indirect reference; if a
literal "@" is needed, it is escaped.)
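A sketch of the escaping in Python. The commit only states that '@' is now escaped; the rest of the escaped set here (the '%' character itself, control characters, spaces and non-ASCII bytes) is an assumption, as is the function name uri_escape.

```python
def uri_escape(text):
    """Percent-encode '@', '%' and bytes outside the printable ASCII range."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if byte <= 0x20 or byte >= 0x7F or ch in "%@":
            out.append("%%%02x" % byte)
        else:
            out.append(ch)
    return "".join(out)
```

With this in place, any '@' remaining in an encoded string can only be an indirect reference.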