bluesky.git
Michael Vrable [Tue, 31 Aug 2010 22:00:09 +0000 (15:00 -0700)]
Fix some resource leaks in journal replay.

Michael Vrable [Tue, 31 Aug 2010 21:06:53 +0000 (14:06 -0700)]
Implement basic full log replay.

This still needs some checking over for bugs and minor fixes.  It replays
the entire journal from the start to rebuild filesystem state.  Still
needed: partial journal replay, starting from a checkpoint in the cloud.

Michael Vrable [Tue, 31 Aug 2010 20:02:38 +0000 (13:02 -0700)]
Update CRC-32 implementation.

Invert the result of the CRC computation at the end.  This will catch extra
null bytes at the end of the buffer, but required updating the CRC
validation.
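
A minimal sketch of why the final inversion matters.  It uses Python's
zlib.crc32 rather than BlueSky's own CRC code, and assumes the checksum is
appended to the data in little-endian order.  raw_crc32 is a hypothetical
helper that undoes the inversion zlib applies at the end.

```python
import struct
import zlib

def raw_crc32(data: bytes) -> int:
    """CRC-32 register value without the final inversion (zlib inverts at the end)."""
    return zlib.crc32(data) ^ 0xFFFFFFFF

payload = b"journal entry"

# Without the final inversion: data followed by its checksum leaves the
# register at zero, and a trailing null byte keeps it at zero, so the extra
# byte goes unnoticed.
plain = payload + struct.pack("<I", raw_crc32(payload))
uninverted_ok = raw_crc32(plain) == 0
uninverted_padded_ok = raw_crc32(plain + b"\0") == 0

# With the final inversion: validation compares against a fixed non-zero
# residue, and a trailing null byte changes the result.
RESIDUE = 0x2144DF1C
inverted = payload + struct.pack("<I", zlib.crc32(payload))
inverted_ok = zlib.crc32(inverted) == RESIDUE
inverted_padded_ok = zlib.crc32(inverted + b"\0") == RESIDUE
```

Here the padded buffer still validates in the uninverted case but fails in
the inverted one, which is the behavior the commit message describes.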

Michael Vrable [Fri, 27 Aug 2010 23:16:17 +0000 (16:16 -0700)]
Add in some support for journal replay.

This isn't all functional yet, but making it fully functional will require
updating the data fields that are written to the journal first...

Michael Vrable [Thu, 26 Aug 2010 00:28:26 +0000 (17:28 -0700)]
Start work on log replay for filesystem recovery.

Right now this implements scanning of one journal segment with consistency
checking to find items that were written out.

Also fix the checksum calculation on log entries so that they will validate
properly (we want to compute the checksum so that on validation, computing
the checksum of the entire object results in a value of zero).
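
A rough sketch of scanning a segment with this kind of consistency check.
The entry layout (4-byte length, payload, 4-byte checksum) and the names
make_entry and scan_segment are assumptions for illustration, not the
BlueSky log format.  The checksum is chosen so that checksumming the whole
entry gives zero.

```python
import struct
import zlib

def raw_crc32(data: bytes) -> int:
    # CRC-32 without the final inversion, so data + checksum validates to zero.
    return zlib.crc32(data) ^ 0xFFFFFFFF

def make_entry(payload: bytes) -> bytes:
    body = struct.pack("<I", len(payload)) + payload
    return body + struct.pack("<I", raw_crc32(body))

def scan_segment(segment: bytes) -> list[bytes]:
    """Return payloads of all leading entries that validate; stop at the first bad one."""
    found = []
    pos = 0
    while pos + 8 <= len(segment):
        (length,) = struct.unpack_from("<I", segment, pos)
        end = pos + 4 + length + 4
        # An entry is accepted only if it fits and its checksum comes out as zero.
        if end > len(segment) or raw_crc32(segment[pos:end]) != 0:
            break
        found.append(segment[pos + 4 : pos + 4 + length])
        pos = end
    return found

# Two complete entries followed by a partial (torn) write.
segment = make_entry(b"inode 1") + make_entry(b"block 7") + b"torn write..."
items = scan_segment(segment)
```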

Michael Vrable [Wed, 25 Aug 2010 20:24:34 +0000 (13:24 -0700)]
Add an inode map data structure to track the location of inodes in logs.

Michael Vrable [Tue, 24 Aug 2010 23:53:00 +0000 (16:53 -0700)]
Update logic for flushing data to cloud.

Do not force a commit of the most recent data, and instead just write out
whatever was last written to the journal.

This could be a win or a loss:
  - We do not need to force a sync of all data to the journal when we
    upload data to the cloud.
  - But, we may end up writing out old data, which we'll then need to
    overwrite a short time later.

Michael Vrable [Mon, 23 Aug 2010 17:21:37 +0000 (10:21 -0700)]
Make cache size run-time configurable.

Michael Vrable [Sun, 22 Aug 2010 05:42:35 +0000 (22:42 -0700)]
Implement new scheme for retaining needed journal segments.

Write full filesystem snapshots to the cloud, and keep track of the journal
position before the snapshot process starts.  When it finishes, the journal
segments before that mark can be reclaimed (if needed).

This could be improved but should at least be safe.
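
The retention scheme can be sketched roughly as follows.  The class and
method names are hypothetical, and journal segments are modeled as plain
sequence numbers rather than files.

```python
class JournalRetention:
    """Tracks which journal segments must be kept until a snapshot completes."""

    def __init__(self):
        self.segments = []         # sequence numbers of segments still on disk
        self.next_seq = 0
        self.snapshot_mark = None  # journal position recorded at snapshot start

    def open_segment(self):
        seq = self.next_seq
        self.segments.append(seq)
        self.next_seq += 1
        return seq

    def begin_snapshot(self):
        # Remember the segment being written when the snapshot starts.
        self.snapshot_mark = self.segments[-1]

    def finish_snapshot(self):
        # Segments before the mark are covered by the completed snapshot.
        reclaimable = [s for s in self.segments if s < self.snapshot_mark]
        self.segments = [s for s in self.segments if s >= self.snapshot_mark]
        self.snapshot_mark = None
        return reclaimable

journal = JournalRetention()
for _ in range(3):
    journal.open_segment()       # segments 0, 1, 2
journal.begin_snapshot()         # mark at segment 2
journal.open_segment()           # writes continue in segment 3
reclaimed = journal.finish_snapshot()
```

Anything written while the snapshot is in progress lands at or after the
mark, so it is kept; this is conservative, in line with the "should at
least be safe" remark.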

Michael Vrable [Sun, 22 Aug 2010 04:48:50 +0000 (21:48 -0700)]
Fix a longstanding(?) memory-leak bug when truncating a file.

Michael Vrable [Fri, 20 Aug 2010 20:51:31 +0000 (13:51 -0700)]
Back out dirty reference tracking, as the design was flawed.

Objects can be written to the journal but not to the cloud--for example, if
a data block is written to the journal but overwritten before the file is
flushed to the cloud.  This write-combining is good, but the old code for
tracking when a journal segment could be reclaimed couldn't handle this.

So, back out that dirty reference tracking code, in preparation for
replacing it with another approach.

Michael Vrable [Fri, 20 Aug 2010 00:16:53 +0000 (17:16 -0700)]
Make cloud storage more robust.

  - Do not consider data committed until we get a reply from the cloud.
  - Add retries on write and on read.
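
A small sketch of the two points above, assuming a hypothetical store_put
callable that raises IOError on failure.  The retry count and delay are
made-up parameters.

```python
import time

def put_with_retries(store_put, key, data, attempts=3, delay=0.0):
    """Return True once the store has acknowledged the write."""
    for attempt in range(1, attempts + 1):
        try:
            store_put(key, data)
            # Only after the call returns do we treat the data as committed.
            return True
        except IOError:
            if attempt == attempts:
                raise
            time.sleep(delay)

# A fake backend that fails twice before succeeding.
calls = []
def flaky_put(key, data):
    calls.append(key)
    if len(calls) < 3:
        raise IOError("simulated transient failure")

committed = put_with_retries(flaky_put, "segment-0", b"data")
```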

Michael Vrable [Thu, 19 Aug 2010 00:03:29 +0000 (17:03 -0700)]
Add a target size for the cache, and prune the cache when it gets larger.

Michael Vrable [Wed, 18 Aug 2010 20:45:40 +0000 (13:45 -0700)]
Track journal files which contain dirty data and which can be reclaimed.

Michael Vrable [Wed, 18 Aug 2010 01:46:58 +0000 (18:46 -0700)]
Implement a (dumb) cache garbage collector.

This is a proof of concept; it doesn't delete journal files and deletes
cache files nearly as soon as they are unused, so it needs better
algorithms for choosing when to delete files.  But it does seem to work.

Michael Vrable [Tue, 17 Aug 2010 22:23:40 +0000 (15:23 -0700)]
Improve journal/cloud cache locking and add access time tracking.

Michael Vrable [Tue, 17 Aug 2010 18:02:56 +0000 (11:02 -0700)]
Debugging/refcount cleanups.

Michael Vrable [Mon, 16 Aug 2010 22:13:53 +0000 (15:13 -0700)]
Minor bugfixes/tweaks.

Michael Vrable [Mon, 16 Aug 2010 19:04:03 +0000 (12:04 -0700)]
First attempt at supporting reading data back from cloud log segments.

There are still some bugs, hacks, race conditions, etc., but this seems to
be doing mostly the right thing and so is a good start.

Michael Vrable [Sat, 14 Aug 2010 22:47:40 +0000 (15:47 -0700)]
Serialized inode data should be dropped from caches, too.

Michael Vrable [Thu, 12 Aug 2010 05:16:35 +0000 (22:16 -0700)]
Attempt at limiting the rate at which memory is dirtied.

Michael Vrable [Wed, 11 Aug 2010 22:58:39 +0000 (15:58 -0700)]
Reference counting bugfix.

Michael Vrable [Wed, 11 Aug 2010 20:15:14 +0000 (13:15 -0700)]
Newly-created inodes should be marked as modified.

The NFS proxy code previously didn't do this, with the result that some
inodes (symlinks were first noticed, but the problem affected other areas
too) would not get entered into the appropriate LRU lists.

Michael Vrable [Wed, 11 Aug 2010 19:43:19 +0000 (12:43 -0700)]
More aggressively use memory-mapped data for cloud log items.

Replace a string with a memory-mapped version as soon as possible when the
item is written out.  The intent is that memory-mapped versions rely on the
kernel's memory management, and don't need to be written to swap like a
private copy would, and so should give better overall system memory
management.
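
The idea can be illustrated in Python, though the real code is not Python.
LogItem and write_out are hypothetical names; a real implementation would
presumably share one mapping per log file instead of creating one per item.

```python
import mmap
import tempfile

class LogItem:
    def __init__(self, data: bytes):
        self.data = data          # private in-memory copy (a "string")
        self.mapping = None

    def write_out(self, log_file, offset: int):
        # Write the private copy to the log file.
        log_file.seek(offset)
        log_file.write(self.data)
        log_file.flush()
        length = len(self.data)
        # Swap the private copy for a read-only view of the file contents.
        # File-backed pages can be dropped and re-read by the kernel instead
        # of being written to swap.
        self.mapping = mmap.mmap(log_file.fileno(), 0, access=mmap.ACCESS_READ)
        self.data = memoryview(self.mapping)[offset:offset + length]

log_file = tempfile.TemporaryFile()
item = LogItem(b"block contents")
item.write_out(log_file, 0)
```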

Michael Vrable [Wed, 11 Aug 2010 19:29:09 +0000 (12:29 -0700)]
Improve tracking of memory usage in BlueSky.

Most data, except for dirty data blocks not flushed out to the journal yet,
will be in the form of cloud log entries.  Create statistics counters to
track how many cloud log items are in each of several states (in memory
only, writeback, on disk, in cloud).
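
A sketch of such per-state counters.  The four states come from the text
above; the ItemState and LogItemStats names are assumptions.

```python
from collections import Counter
from enum import Enum

class ItemState(Enum):
    MEMORY_ONLY = "in memory only"
    WRITEBACK = "writeback"
    ON_DISK = "on disk"
    IN_CLOUD = "in cloud"

class LogItemStats:
    def __init__(self):
        self.counts = Counter()

    def add(self, state: ItemState):
        self.counts[state] += 1

    def move(self, old: ItemState, new: ItemState):
        # An item changing state leaves one bucket and enters another.
        self.counts[old] -= 1
        self.counts[new] += 1

stats = LogItemStats()
stats.add(ItemState.MEMORY_ONLY)
stats.add(ItemState.MEMORY_ONLY)
stats.move(ItemState.MEMORY_ONLY, ItemState.WRITEBACK)
```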

Michael Vrable [Tue, 10 Aug 2010 21:19:49 +0000 (14:19 -0700)]
More fixes to BlueSky cache management.

Michael Vrable [Tue, 10 Aug 2010 00:36:17 +0000 (17:36 -0700)]
Drop old code for flushing data to the cloud.

Michael Vrable [Tue, 10 Aug 2010 00:21:56 +0000 (17:21 -0700)]
Work to unify the cloud segment writing with other cache management.

Michael Vrable [Mon, 9 Aug 2010 23:00:57 +0000 (16:00 -0700)]
Split cloud log segments into modestly-sized chunks.

Michael Vrable [Fri, 6 Aug 2010 21:08:59 +0000 (14:08 -0700)]
Add a null storage implementation.

This simply discards all data written to it.  Useful for testing when all
the data remains in the local log, so that we never need to fetch data from
the storage implementation.

Michael Vrable [Thu, 5 Aug 2010 16:48:41 +0000 (09:48 -0700)]
Fix some memory leaks.

Michael Vrable [Thu, 5 Aug 2010 16:38:20 +0000 (09:38 -0700)]
Make links between cloud log entries direct.

Rather than giving the ID, provide a direct pointer to the object.

Michael Vrable [Wed, 4 Aug 2010 22:32:00 +0000 (15:32 -0700)]
Rework caching of data blocks to eliminate double-caching.

Previously data could be cached both as a cloud log item and as a string at
the inode level.  Now cache as a string for dirty data and a cloud log item
for clean data.

Michael Vrable [Wed, 4 Aug 2010 04:58:51 +0000 (21:58 -0700)]
Fix up reference counting for cloud log items.

Michael Vrable [Wed, 4 Aug 2010 04:21:20 +0000 (21:21 -0700)]
A few attempted bugfixes for log data lifetimes.

A much better fix will depend on reworking this code a bit more.

Michael Vrable [Tue, 3 Aug 2010 21:22:25 +0000 (14:22 -0700)]
Fix up reference counting for memory-mapped journal log segments.

Michael Vrable [Tue, 3 Aug 2010 21:10:21 +0000 (14:10 -0700)]
Improve the reading back of objects committed to the journal.

Implement a cache of memory-mapped log files so that when multiple objects
are requested we can re-use the mapping.  Make log files fixed sizes (call
ftruncate when opening the log file) so the entire thing can be memory
mapped at the start.
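
A minimal Python sketch of a mapping cache over fixed-size log files.
LOG_SIZE, the file naming, and the LogMapCache class are assumptions for
illustration; the actual code would call ftruncate and mmap from C.

```python
import mmap
import os
import tempfile

LOG_SIZE = 1 << 16  # hypothetical fixed size of each log file

class LogMapCache:
    def __init__(self, directory: str):
        self.directory = directory
        self.maps = {}  # log id -> mmap object, reused across requests

    def get(self, log_id: int) -> mmap.mmap:
        if log_id not in self.maps:
            path = os.path.join(self.directory, f"journal-{log_id:08d}")
            fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
            try:
                # Fix the file size up front so the whole log can be mapped.
                os.ftruncate(fd, LOG_SIZE)
                self.maps[log_id] = mmap.mmap(fd, LOG_SIZE)
            finally:
                os.close(fd)  # the mapping keeps its own reference to the file
        return self.maps[log_id]

    def read(self, log_id: int, offset: int, length: int) -> bytes:
        return self.get(log_id)[offset:offset + length]

log_dir = tempfile.mkdtemp()
cache = LogMapCache(log_dir)
first = cache.get(1)
first[0:5] = b"hello"
second = cache.get(1)          # served from the cache, no new mapping
data = cache.read(1, 0, 5)
```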

Michael Vrable [Tue, 3 Aug 2010 05:05:28 +0000 (22:05 -0700)]
More cache behavior tweaks.

Michael Vrable [Tue, 3 Aug 2010 03:48:21 +0000 (20:48 -0700)]
Preliminary support for dropping cached file data from memory.

Michael Vrable [Mon, 2 Aug 2010 22:50:37 +0000 (15:50 -0700)]
Work to allow mmap-ed log entries to be used for data blocks.

Michael Vrable [Fri, 30 Jul 2010 23:17:30 +0000 (16:17 -0700)]
Gradually converting code to use cloud logs for storing data.

Michael Vrable [Thu, 29 Jul 2010 22:43:10 +0000 (15:43 -0700)]
Dump cloud location of data items in debug output.

Michael Vrable [Thu, 29 Jul 2010 22:05:37 +0000 (15:05 -0700)]
(Mostly) merge local and cloud logging together.

Michael Vrable [Wed, 28 Jul 2010 19:05:05 +0000 (12:05 -0700)]
Preparatory work before implementing proper cloud writing.

Michael Vrable [Mon, 26 Jul 2010 03:19:13 +0000 (20:19 -0700)]
Some initial work on logging gathering data into cloud log segments.

This is still in progress, and needs to be better hooked in to cache
management as well as actually writing out a proper sequence of logs
instead of overwriting the same location each time.  But it should have the
basics of gathering up data for dirty inodes into a segment and writing it.

Michael Vrable [Thu, 22 Jul 2010 21:51:37 +0000 (14:51 -0700)]
Initial work on cloud log-structured storage.

Right now this is just the first work towards tracking what objects are
stored where (a log in the cloud, in local memory, on local disk, etc.).

Michael Vrable [Tue, 20 Jul 2010 18:54:06 +0000 (11:54 -0700)]
Code cleanup.

Michael Vrable [Mon, 19 Jul 2010 22:31:40 +0000 (15:31 -0700)]
Add checksumming to filesystem journal.

This will be used to check for consistency during log recovery.

Michael Vrable [Mon, 19 Jul 2010 19:46:30 +0000 (12:46 -0700)]
Allow batched log writes when writing dirty inodes.

Michael Vrable [Sun, 18 Jul 2010 04:35:54 +0000 (21:35 -0700)]
Basic filesystem journaling.

Infrastructure for writing log entries synchronously (though the log format
is not yet finished and isn't yet useful), and a partial hook into the
BlueSky filesystem.

Michael Vrable [Thu, 15 Jul 2010 22:54:52 +0000 (15:54 -0700)]
Add synchronous inode logging in the NFS server.

This is in preparation for adding inode logging for data durability.

Michael Vrable [Thu, 15 Jul 2010 22:54:05 +0000 (15:54 -0700)]
Barriers did not handle requests that finished too quickly.

Michael Vrable [Wed, 14 Jul 2010 23:32:08 +0000 (16:32 -0700)]
Commit a few log benchmark results.

Michael Vrable [Wed, 14 Jul 2010 22:25:51 +0000 (15:25 -0700)]
Bugfix.

Michael Vrable [Wed, 14 Jul 2010 22:23:17 +0000 (15:23 -0700)]
Update commit log benchmarks.

Michael Vrable [Wed, 14 Jul 2010 05:25:39 +0000 (22:25 -0700)]
Make the log benchmark configurable and make a parameter sweep script.

Michael Vrable [Wed, 14 Jul 2010 00:30:50 +0000 (17:30 -0700)]
A new microbenchmark tool to figure out what format to use for logs.

We want to log filesystem operations to disk so they are persistent across
proxy crashes, but should do so in a manner that is relatively high
performance...  Try to figure out what that should be.

Michael Vrable [Tue, 13 Jul 2010 00:26:21 +0000 (17:26 -0700)]
Attempt to batch together database writes for performance.

Doesn't seem to help all that much though...

Michael Vrable [Mon, 12 Jul 2010 19:29:15 +0000 (12:29 -0700)]
Switch to an explicit BDB operations queue instead of thread pool.

Create a dedicated thread for handling BDB operations, and pass gets/puts
to it via a queue.  This is in preparation for batching operations
together in transactions to see if we can improve performance that way.
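
The queue-plus-dedicated-thread structure might look roughly like this.  A
plain dict stands in for Berkeley DB here, and StoreThread is a
hypothetical name; no real BDB calls are shown.

```python
import queue
import threading

class StoreThread:
    def __init__(self):
        self.db = {}               # stand-in for the Berkeley DB handle
        self.ops = queue.Queue()   # pending get/put operations
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        # Only this thread touches the database, so operations are serialized
        # and could later be grouped into transactions.
        while True:
            op = self.ops.get()
            if op is None:
                break
            kind, key, value, reply = op
            if kind == "put":
                self.db[key] = value
                reply.put(None)
            else:
                reply.put(self.db.get(key))

    def put(self, key, value):
        reply = queue.Queue(1)
        self.ops.put(("put", key, value, reply))
        reply.get()                # wait until the database thread has applied it

    def get(self, key):
        reply = queue.Queue(1)
        self.ops.put(("get", key, None, reply))
        return reply.get()

    def close(self):
        self.ops.put(None)
        self.thread.join()

store = StoreThread()
store.put("inode-1", b"serialized inode")
value = store.get("inode-1")
missing = store.get("inode-2")
store.close()
```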

Michael Vrable [Wed, 7 Jul 2010 19:05:04 +0000 (12:05 -0700)]
Remove localstore.c; for now BDB work will be done in store-bdb.c.

Michael Vrable [Wed, 7 Jul 2010 19:03:40 +0000 (12:03 -0700)]
Some test work with using Berkeley DB for a local disk cache.

Implement this at first by simply making it a new storage backend.  Later
we will have to make it a separate layer so we can stack a Berkeley DB
cache/write log with a remote storage option.

Michael Vrable [Wed, 30 Jun 2010 23:59:14 +0000 (16:59 -0700)]
Some new format design notes.

Michael Vrable [Wed, 30 Jun 2010 19:25:14 +0000 (12:25 -0700)]
Add another S3 benchmark tool.

Michael Vrable [Wed, 30 Jun 2010 19:24:50 +0000 (12:24 -0700)]
More S3 benchmark work.

Michael Vrable [Fri, 18 Jun 2010 01:11:15 +0000 (18:11 -0700)]
Add an S3 test script for range requests.

Michael Vrable [Wed, 16 Jun 2010 22:26:29 +0000 (15:26 -0700)]
Add results from a simple test run of the multi:s3 backend.

Michael Vrable [Wed, 16 Jun 2010 20:50:29 +0000 (13:50 -0700)]
Testing S3 with more object sizes.

Michael Vrable [Tue, 15 Jun 2010 18:29:29 +0000 (11:29 -0700)]
More Amazon S3 test script updates.

Michael Vrable [Fri, 4 Jun 2010 21:33:54 +0000 (14:33 -0700)]
Fix a possible deadlock from synchronizing the superblock.

Michael Vrable [Fri, 4 Jun 2010 17:47:25 +0000 (10:47 -0700)]
Add a "multi" storage backend which doubles all GET requests.

This can be used to test whether the performance with S3 improves when
making parallel requests to reduce latency.
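
One way such a wrapper could work is sketched below.  Returning whichever
copy finishes first is an assumption; the commit message only says that
GET requests are doubled.  MultiStore and base_get are hypothetical names.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

class MultiStore:
    def __init__(self, base_get, copies=2):
        self.base_get = base_get   # GET function of the wrapped backend
        self.copies = copies
        self.pool = ThreadPoolExecutor(max_workers=8)

    def get(self, key):
        # Issue the same request several times in parallel and take the
        # first response to arrive.
        futures = [self.pool.submit(self.base_get, key) for _ in range(self.copies)]
        done, _pending = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()

# A fake backend that counts how many requests it receives.
request_count = 0
count_lock = threading.Lock()
def fake_get(key):
    global request_count
    with count_lock:
        request_count += 1
    return b"value of " + key.encode()

store = MultiStore(fake_get)
result = store.get("object-1")
store.pool.shutdown(wait=True)     # let the duplicate request finish
```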

Michael Vrable [Thu, 3 Jun 2010 22:25:31 +0000 (15:25 -0700)]
Updates to script for testing multiple parallel fetches.

Michael Vrable [Fri, 28 May 2010 19:08:59 +0000 (12:08 -0700)]
Testing of multiple requests in parallel.

Michael Vrable [Wed, 26 May 2010 21:17:41 +0000 (14:17 -0700)]
Add a few more test scripts.

Michael Vrable [Thu, 13 May 2010 20:32:24 +0000 (13:32 -0700)]
More trace analysis scripts.

Michael Vrable [Sat, 8 May 2010 02:12:51 +0000 (19:12 -0700)]
Improve analysis of S3 packet traces.

Michael Vrable [Thu, 6 May 2010 23:32:45 +0000 (16:32 -0700)]
Keep IP addresses with network measurements.

Michael Vrable [Wed, 5 May 2010 20:46:44 +0000 (13:46 -0700)]
More S3 performance evaluation scripts.

Michael Vrable [Wed, 5 May 2010 00:24:32 +0000 (17:24 -0700)]
Compile fix for directory renaming; use virtual host access method for S3.

Michael Vrable [Tue, 4 May 2010 23:53:14 +0000 (16:53 -0700)]
More cloud storage performance-measurement scripts.

Michael Vrable [Tue, 4 May 2010 19:22:51 +0000 (12:22 -0700)]
Directory reorganization.

Michael Vrable [Tue, 4 May 2010 19:19:42 +0000 (12:19 -0700)]
Work on more tools for automating cloud storage performance measurement.

Michael Vrable [Tue, 4 May 2010 04:46:50 +0000 (21:46 -0700)]
A few more Azure/analysis updates.

Michael Vrable [Mon, 3 May 2010 04:51:22 +0000 (21:51 -0700)]
Another update to the Azure test code.

Michael Vrable [Mon, 3 May 2010 02:10:02 +0000 (19:10 -0700)]
Start work on a test program to communicate with Windows Azure.

This could provide another backend for BlueSky, and it might be nice to at
least make a few benchmark measurements of Azure to complement the S3 ones.

Michael Vrable [Thu, 29 Apr 2010 23:07:21 +0000 (16:07 -0700)]
Simple tool for analyzing a dump of a single S3 connection.

Will try to extract information about any delays in the transfers.

Michael Vrable [Wed, 28 Apr 2010 22:06:50 +0000 (15:06 -0700)]
More playing with parsing of packet traces.

Extract window size values (and handle TCP window scaling).
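
For reference, a sketch of how a scaled window is computed under the TCP
window scale option (RFC 7323).  The function name and arguments are
hypothetical and not taken from the trace parser.

```python
def effective_window(raw_window: int, shift: int, scaling_negotiated: bool = True) -> int:
    """Receive window in bytes for a non-SYN segment.

    raw_window is the 16-bit window field from the TCP header; shift is the
    scale factor the sender advertised in its SYN.
    """
    if not scaling_negotiated:
        # Scaling applies only if both sides sent the option in their SYNs.
        return raw_window
    # Shift counts above 14 are treated as 14.
    return raw_window << min(shift, 14)

window = effective_window(1024, 7)
```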

Michael Vrable [Wed, 28 Apr 2010 21:30:14 +0000 (14:30 -0700)]
More updates to S3 pipelining test.

It seems there is a bug in S3 with many pipelined requests that we are
triggering in testing:
http://developer.amazonwebservices.com/connect/thread.jspa?messageID=39907

Michael Vrable [Wed, 28 Apr 2010 18:04:12 +0000 (11:04 -0700)]
Create a simple Python script for sending pipelined GET requests to S3.

S3 doesn't seem to respond properly to these yet.

Michael Vrable [Tue, 27 Apr 2010 21:13:45 +0000 (14:13 -0700)]
Print flow identification in TCP parsing output.

Also fix a division-by-zero bug in computing bandwidth if no data is
transferred.
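
A guard of roughly this shape avoids the problem.  It assumes the failure
came from a zero-length transfer interval when no data was seen; the
function and parameter names are hypothetical.

```python
def bandwidth(bytes_transferred: int, start_time: float, end_time: float) -> float:
    """Average bandwidth in bytes per second, or 0.0 if nothing was transferred."""
    duration = end_time - start_time
    if bytes_transferred == 0 or duration <= 0:
        # No data (or no measurable interval): report zero instead of dividing.
        return 0.0
    return bytes_transferred / duration

idle_flow = bandwidth(0, 10.0, 10.0)
busy_flow = bandwidth(4096, 10.0, 12.0)
```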

Michael Vrable [Mon, 26 Apr 2010 22:36:45 +0000 (15:36 -0700)]
Some initial results of running the NFS replay code.

Michael Vrable [Mon, 26 Apr 2010 05:38:27 +0000 (22:38 -0700)]
Fix a memory leak in the NFS-over-UDP code.

The TCP code still has a per-connection leak, but at least this should help
with the UDP per-request leak.

Michael Vrable [Sun, 25 Apr 2010 00:34:09 +0000 (17:34 -0700)]
Enable real-time trace replay instead of maximum-speed replay mode.

Michael Vrable [Sat, 24 Apr 2010 02:46:13 +0000 (19:46 -0700)]
Allow longer server names; fixes (temporarily) a buffer overflow.

Michael Vrable [Fri, 23 Apr 2010 20:30:25 +0000 (13:30 -0700)]
Code fixes for TBBT.

Michael Vrable [Fri, 23 Apr 2010 19:50:15 +0000 (12:50 -0700)]
Add two missing files from the TBBT import.

Michael Vrable [Fri, 23 Apr 2010 19:48:14 +0000 (12:48 -0700)]
Import TBBT (NFS trace replay).

Code downloaded from http://www.ecsl.cs.sunysb.edu/TBBT/.

Michael Vrable [Fri, 23 Apr 2010 19:43:57 +0000 (12:43 -0700)]
Merge git+ssh://root@c09-44.sysnet.ucsd.edu/scratch/bluesky

Michael Vrable [Wed, 21 Apr 2010 23:13:58 +0000 (16:13 -0700)]
Add pcap dump parser for extracting S3 performance measurements.

Michael Vrable [Wed, 21 Apr 2010 23:13:39 +0000 (16:13 -0700)]
More measurement results of S3 latency.

Michael Vrable [Tue, 20 Apr 2010 03:10:02 +0000 (20:10 -0700)]
Add new results.