Michael Vrable [Sun, 20 Feb 2011 04:28:46 +0000 (20:28 -0800)]
More reworking of microbenchmark handling of reads/writes and file layout.
Michael Vrable [Sun, 20 Feb 2011 03:30:49 +0000 (19:30 -0800)]
Change how mixed read/write workloads are done in mixedbench
Split reads and writes into separate threads, instead of mixing operations
within a single thread. This does not allow arbitrary mixes any longer,
since the ratios are quantized based on the number of total threads.
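To illustrate the quantization mentioned above, here is a minimal sketch (not the mixedbench code itself) of dividing a fixed thread count between readers and writers. The names thread_split_t and split_threads are hypothetical, and rounding to the nearest whole thread is an assumption about how a requested ratio might be mapped.

```c
/* Hypothetical helper: divide `total` threads between readers and writers.
 * The achievable read ratio is readers/total, so only multiples of
 * 1/total can be represented. */
typedef struct {
    int readers;
    int writers;
} thread_split_t;

thread_split_t split_threads(int total, double read_fraction)
{
    thread_split_t split = { 0, 0 };
    if (total <= 0)
        return split;

    /* Clamp the requested fraction to [0, 1]. */
    if (read_fraction < 0.0)
        read_fraction = 0.0;
    if (read_fraction > 1.0)
        read_fraction = 1.0;

    /* Round to the nearest whole number of reader threads. */
    split.readers = (int)(read_fraction * total + 0.5);
    split.writers = total - split.readers;
    return split;
}
```

With 8 threads, a requested 30% read mix becomes 2 readers and 6 writers, that is, an effective 25% read mix.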
Michael Vrable [Mon, 14 Feb 2011 23:06:03 +0000 (15:06 -0800)]
Commit script to parse microbenchmark results.
Michael Vrable [Mon, 14 Feb 2011 19:15:36 +0000 (11:15 -0800)]
Delete old benchmark execution script. It is obsolete.
Michael Vrable [Mon, 14 Feb 2011 07:41:10 +0000 (23:41 -0800)]
Update benchmark script to run a few other cases.
Michael Vrable [Mon, 14 Feb 2011 05:57:38 +0000 (21:57 -0800)]
Add a "native" target to run the kernel NFS server.
Michael Vrable [Mon, 14 Feb 2011 05:39:56 +0000 (21:39 -0800)]
Fix standard deviation estimation.
Properly reset statistics counters between intervals.
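As a rough sketch of per-interval statistics (the actual benchmark code is not shown in this log), the counters below accumulate a sum and a sum of squares, and must be reset at each interval boundary so one interval's samples do not leak into the next. The names interval_stats_t, stats_reset, stats_add, and stats_variance are hypothetical. The function returns the sample variance; the standard deviation is its square root.

```c
/* Hypothetical per-interval accumulator. */
typedef struct {
    double sum;     /* sum of samples in this interval */
    double sum_sq;  /* sum of squared samples in this interval */
    long count;     /* number of samples in this interval */
} interval_stats_t;

/* Clear all counters; call at the start of every reporting interval. */
void stats_reset(interval_stats_t *s)
{
    s->sum = 0.0;
    s->sum_sq = 0.0;
    s->count = 0;
}

void stats_add(interval_stats_t *s, double x)
{
    s->sum += x;
    s->sum_sq += x * x;
    s->count++;
}

/* Unbiased sample variance (divides by n - 1); 0 if fewer than 2 samples.
 * Take the square root of the result for the standard deviation. */
double stats_variance(const interval_stats_t *s)
{
    if (s->count < 2)
        return 0.0;
    double n = (double)s->count;
    return (s->sum_sq - s->sum * s->sum / n) / (n - 1.0);
}
```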
Michael Vrable [Mon, 14 Feb 2011 05:35:36 +0000 (21:35 -0800)]
More benchmark changes.
Michael Vrable [Mon, 14 Feb 2011 05:30:35 +0000 (21:30 -0800)]
Benchmark fixups.
Michael Vrable [Mon, 14 Feb 2011 05:12:52 +0000 (21:12 -0800)]
Spread test files for mixedbench across multiple subdirectories.
Michael Vrable [Mon, 14 Feb 2011 03:42:38 +0000 (19:42 -0800)]
Reduce effective memory size on benchmark client.
Lock memory to reduce the amount available for the NFS file cache, so
that requests will go to the proxy.
Michael Vrable [Mon, 14 Feb 2011 03:01:14 +0000 (19:01 -0800)]
Split benchmark file system setup and execution into separate steps.
Michael Vrable [Sun, 13 Feb 2011 20:57:13 +0000 (12:57 -0800)]
Allow interval at which benchmark results are printed to be controlled.
Michael Vrable [Sun, 13 Feb 2011 20:34:00 +0000 (12:34 -0800)]
Work on automatically running benchmarks.
Michael Vrable [Thu, 10 Feb 2011 18:40:47 +0000 (10:40 -0800)]
Minor fix to benchmark running script.
root [Thu, 10 Feb 2011 00:49:49 +0000 (16:49 -0800)]
Add script to clean up files from a benchmarking run with nfsproxy.
Michael Vrable [Thu, 10 Feb 2011 00:31:51 +0000 (16:31 -0800)]
Add script to delete state from S3 for benchmarking cleanup.
root [Thu, 10 Feb 2011 00:27:53 +0000 (16:27 -0800)]
Add script for stopping a running nfsproxy
Michael Vrable [Wed, 9 Feb 2011 23:26:54 +0000 (15:26 -0800)]
Add script to run a simple microbenchmark.
root [Wed, 9 Feb 2011 23:23:36 +0000 (15:23 -0800)]
Add setup script for starting the proxy for benchmarking.
Michael Vrable [Mon, 7 Feb 2011 18:02:04 +0000 (10:02 -0800)]
A few more updates to the mixed read/write microbenchmark.
Michael Vrable [Mon, 7 Feb 2011 15:22:44 +0000 (07:22 -0800)]
Changes to output formatting of mixedbench.
Michael Vrable [Mon, 7 Feb 2011 13:58:01 +0000 (05:58 -0800)]
Add a new more general-purpose microbenchmark tool in C.
Michael Vrable [Thu, 3 Feb 2011 00:07:58 +0000 (16:07 -0800)]
Fix a potential race between creating and destroying mmapped strings
Potentially, we might try to create a new reference to a memory-mapped
region while another thread unreferences and frees that region. Ensure
when freeing the mapping that there really are no mappings (double-check
after taking the lock).
Michael Vrable [Wed, 2 Feb 2011 18:01:00 +0000 (10:01 -0800)]
Fix a memory leak.
Michael Vrable [Wed, 2 Feb 2011 17:30:47 +0000 (09:30 -0800)]
Fix use-after-free in network address lookups.
Michael Vrable [Fri, 28 Jan 2011 23:48:46 +0000 (15:48 -0800)]
Update simplestore to use a standard port number.
Michael Vrable [Fri, 28 Jan 2011 23:36:35 +0000 (15:36 -0800)]
Add simplestore server to build.
Michael Vrable [Mon, 24 Jan 2011 01:12:01 +0000 (17:12 -0800)]
Add a connection pool to the simpleserver client
Re-use TCP connections across requests instead of closing the connection
each time.
Michael Vrable [Mon, 24 Jan 2011 00:52:52 +0000 (16:52 -0800)]
Start work on a new cleaner backend
Begin adding support for the simplestore backend to the cleaner code. This
is not yet tested and so probably needs a bit more work.
Michael Vrable [Fri, 21 Jan 2011 06:12:53 +0000 (22:12 -0800)]
Rename filestore -> simplestore
This is more consistent with the naming in the rest of the BlueSky code.
Michael Vrable [Fri, 21 Jan 2011 01:55:31 +0000 (17:55 -0800)]
Enable range requests in the simple storage backend
Michael Vrable [Fri, 21 Jan 2011 00:34:59 +0000 (16:34 -0800)]
Fix off-by-one error in the BlueSky simple storage backend.
One extra byte of data was being included into the data returned, leading
to data corruption. Fix this.
Michael Vrable [Tue, 18 Jan 2011 05:35:38 +0000 (21:35 -0800)]
Check in a new very simple client/server storage protocol implementation.
This is meant for local testing; the server writes data to a local disk
without synchronization so data isn't guaranteed to be durable but this
should provide better performance for benchmarking (where we care about the
BlueSky performance and not the storage server itself).
Michael Vrable [Fri, 10 Dec 2010 23:02:15 +0000 (15:02 -0800)]
Commit a basic but functional online cleaner implementation.
The cleaner needn't be aware of parallelism; the proxy will handle merging
of data.
Michael Vrable [Fri, 10 Dec 2010 19:58:25 +0000 (11:58 -0800)]
Fixes and debugging assertions.
Michael Vrable [Fri, 10 Dec 2010 17:12:50 +0000 (09:12 -0800)]
Fix a race condition in bluesky_mmap_unref.
It turned out that the old code had a race--between decrementing and
testing the reference count and acquiring the lock, another thread could
potentially acquire a reference, release the reference, and then unmap the
data. This still left the reference count as zero, so a second unmap was
attempted. Now, use mmap->addr to detect if the unmap needs to be done or
not which should eliminate this race.
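The pattern described above can be sketched roughly as follows. This is not the bluesky_mmap_unref code: region_t and its functions are hypothetical, malloc/free stand in for mmap/munmap, and the unmap_calls counter exists only to make the behavior observable. The point is that the teardown decision is made under the lock from the addr field (and a re-check of the count), not from the earlier unlocked decrement.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted mapping. */
typedef struct {
    pthread_mutex_t lock;
    atomic_int refcount;
    void *addr;        /* NULL whenever the region is not mapped */
    size_t len;
    int unmap_calls;   /* demonstration only: counts teardowns */
} region_t;

void region_init(region_t *r, size_t len)
{
    pthread_mutex_init(&r->lock, NULL);
    atomic_init(&r->refcount, 0);
    r->addr = NULL;
    r->len = len;
    r->unmap_calls = 0;
}

/* Take a reference, (re)creating the mapping if needed.
 * Allocation failure is not handled in this sketch. */
void region_ref(region_t *r)
{
    pthread_mutex_lock(&r->lock);
    if (r->addr == NULL)
        r->addr = malloc(r->len);   /* stand-in for mmap() */
    atomic_fetch_add(&r->refcount, 1);
    pthread_mutex_unlock(&r->lock);
}

/* Drop a reference. The count is decremented without the lock, so by the
 * time the lock is held another thread may have re-referenced the region
 * or already torn it down; both cases are re-checked under the lock. */
void region_unref(region_t *r)
{
    if (atomic_fetch_sub(&r->refcount, 1) != 1)
        return;                     /* other references remain */

    pthread_mutex_lock(&r->lock);
    if (atomic_load(&r->refcount) == 0 && r->addr != NULL) {
        free(r->addr);              /* stand-in for munmap() */
        r->addr = NULL;
        r->unmap_calls++;
    }
    pthread_mutex_unlock(&r->lock);
}
```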
Michael Vrable [Fri, 10 Dec 2010 02:26:15 +0000 (18:26 -0800)]
Mostly implement inode merging for the in-proxy cleaner component.
This doesn't yet ensure modifications are serialized out again later.
Michael Vrable [Thu, 9 Dec 2010 20:12:04 +0000 (12:12 -0800)]
Fix to the cleaner when writing out a new inode map.
Michael Vrable [Thu, 9 Dec 2010 06:32:17 +0000 (22:32 -0800)]
Add code in the proxy cleaner component to iterate over new inodes.
For the moment we actually iterate over all inodes in the cleaned
checkpoint, but later can restrict to just those that are needed. Actually
merging inode changes also still needs to be implemented.
Michael Vrable [Tue, 7 Dec 2010 05:40:30 +0000 (21:40 -0800)]
In-progress commit of online cleaner.
This adds some code to check the cleaner log when the proxy is launched,
but doesn't yet merge any data.
Michael Vrable [Mon, 6 Dec 2010 04:46:49 +0000 (20:46 -0800)]
Rework the checkpoint record format to include a version vector.
This will be used for keeping track of whether we have incorporated changes
made by the cleaner or not, and is a first step (along with having the
cleaner write to a separate set of logs) to making a functional online
cleaner.
Michael Vrable [Thu, 2 Dec 2010 19:39:56 +0000 (11:39 -0800)]
Try to avoid accessing profiling objects after they are freed.
Michael Vrable [Tue, 30 Nov 2010 04:59:14 +0000 (20:59 -0800)]
Write profile data if and only if requested.
Michael Vrable [Tue, 30 Nov 2010 04:57:27 +0000 (20:57 -0800)]
Allow profile results to be written to a file.
Michael Vrable [Mon, 29 Nov 2010 04:29:33 +0000 (20:29 -0800)]
Reduce debugging messages in non-verbose mode.
Michael Vrable [Mon, 29 Nov 2010 00:21:36 +0000 (16:21 -0800)]
More work on request profiling.
Michael Vrable [Tue, 23 Nov 2010 23:48:41 +0000 (15:48 -0800)]
Add locking, thread-ID tracking to profiling.
Michael Vrable [Tue, 23 Nov 2010 18:27:31 +0000 (10:27 -0800)]
Track requests that initiate inode fetches.
Michael Vrable [Tue, 23 Nov 2010 17:53:18 +0000 (09:53 -0800)]
Start to add request time profiling.
Allow a "profile" object to be created when a request is received, and
track the time at which various events occur when handling the request.
Michael Vrable [Thu, 18 Nov 2010 03:39:00 +0000 (19:39 -0800)]
Add a lookup_last method to the multi-store backend.
Michael Vrable [Wed, 17 Nov 2010 18:12:56 +0000 (10:12 -0800)]
Updates to the workload generator to perform a parameter sweep.
Michael Vrable [Fri, 12 Nov 2010 19:41:46 +0000 (11:41 -0800)]
Implement very basic grouped fetches of objects from the cloud.
Add an item prefetch call, which for now just collects together items that
have been prefetched and when one of them is actually fetched, a range
request is sent that will cover all the objects. While this doesn't improve
the time to fetch the first object (it will actually hurt), it does arrange
to have objects fetched in groups where possible which reduces the total
number of operations and the time to get subsequent objects.
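A minimal sketch of the grouping idea: given the byte extents of the items queued for prefetch within one log segment, compute the single range that covers all of them. The names item_extent_t and covering_range are hypothetical and the real prefetch bookkeeping is not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent of one log item inside a cloud log segment. */
typedef struct {
    size_t offset;
    size_t length;
} item_extent_t;

/* Compute the smallest byte range [*start, *start + *length) covering
 * every queued item. Returns false if there are no items. Bytes between
 * items are fetched too; that is the price of issuing one request. */
bool covering_range(const item_extent_t *items, size_t n,
                    size_t *start, size_t *length)
{
    if (n == 0)
        return false;

    size_t lo = items[0].offset;
    size_t hi = items[0].offset + items[0].length;
    for (size_t i = 1; i < n; i++) {
        size_t end = items[i].offset + items[i].length;
        if (items[i].offset < lo)
            lo = items[i].offset;
        if (end > hi)
            hi = end;
    }
    *start = lo;
    *length = hi - lo;
    return true;
}
```

The first fetch is slower because it pulls the whole covering range, but the later items in the group are then already local.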
Michael Vrable [Fri, 12 Nov 2010 03:12:47 +0000 (19:12 -0800)]
Add a prefetch method. At the moment it is a no-op.
Michael Vrable [Thu, 11 Nov 2010 23:34:42 +0000 (15:34 -0800)]
Move benchmark results elsewhere, out of the repository.
Michael Vrable [Tue, 9 Nov 2010 00:41:42 +0000 (16:41 -0800)]
Return failure code when an S3 put operation fails.
Previously, we were indicating success even when that wasn't true, leading
to errors later when we went to read data we hadn't written. Now, the
caller can retry the put operation if needed.
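Once failures are reported, a caller-side retry might look like the sketch below. The put_fn callback type, put_with_retry, and flaky_put are all hypothetical; they are not the BlueSky or S3 library interfaces, and a real caller would likely add a delay or timeout between attempts.

```c
/* Hypothetical put callback: returns 0 on success, nonzero on failure. */
typedef int (*put_fn)(void *ctx);

/* Call `put` until it succeeds or `max_attempts` tries have been made.
 * Returns the result of the last attempt (-1 if none were made). */
int put_with_retry(put_fn put, void *ctx, int max_attempts)
{
    int rc = -1;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = put(ctx);
        if (rc == 0)
            break;
    }
    return rc;
}

/* Demonstration stand-in for a store: fails while *ctx > 0, counting
 * down by one on each failure, then succeeds. */
int flaky_put(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return -1;
    }
    return 0;
}
```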
Michael Vrable [Thu, 4 Nov 2010 23:59:55 +0000 (16:59 -0700)]
Drop a verbose debugging message.
Michael Vrable [Thu, 4 Nov 2010 22:09:54 +0000 (15:09 -0700)]
Update disk cache usage tracking to handle sparse files.
Michael Vrable [Wed, 3 Nov 2010 23:38:03 +0000 (16:38 -0700)]
Bugfix in partial log fetching.
Do not bother mapping the file until the data is available, otherwise the
mapping may be deleted from under us while we are waiting.
Michael Vrable [Wed, 3 Nov 2010 21:35:07 +0000 (14:35 -0700)]
Implement fetching of cloud log items via range requests.
This still needs a bit more work for stability, checking for leaks, etc.,
but implements the basic functionality needed to selectively retrieve just
the needed byte ranges out of cloud log segments to download individual log
items.
Michael Vrable [Sun, 31 Oct 2010 21:53:05 +0000 (14:53 -0700)]
Convert rangeset implementation from a hashtable to GSequence.
This gives log(N) performance, but allows us to easily tell if a request
falls in the middle of an object and check for overlaps on inserts (though
that part isn't implemented yet).
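To show why an ordered structure helps, here is a small sketch that uses a sorted array with binary search as a stand-in for GSequence (it does not use GLib). All names are hypothetical. Lookups are O(log N) and succeed for offsets in the middle of a range; the insert also rejects overlaps, which goes beyond what this commit says is implemented. Unlike GSequence, insertion into the array is O(N).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RANGESET_MAX 64

typedef struct {
    size_t start;
    size_t length;
} range_t;

/* Ranges kept sorted by start and non-overlapping. */
typedef struct {
    range_t items[RANGESET_MAX];
    size_t count;
} rangeset_t;

/* Binary search: index of the first range whose start is > offset. */
static size_t rangeset_upper_bound(const rangeset_t *rs, size_t offset)
{
    size_t lo = 0, hi = rs->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rs->items[mid].start <= offset)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Return the range containing `offset` (anywhere inside it), or NULL. */
const range_t *rangeset_lookup(const rangeset_t *rs, size_t offset)
{
    size_t i = rangeset_upper_bound(rs, offset);
    if (i == 0)
        return NULL;
    const range_t *r = &rs->items[i - 1];
    return (offset - r->start < r->length) ? r : NULL;
}

/* Insert a range; fails if it is empty, the set is full, or it overlaps. */
bool rangeset_insert(rangeset_t *rs, size_t start, size_t length)
{
    if (length == 0 || rs->count == RANGESET_MAX)
        return false;

    size_t i = rangeset_upper_bound(rs, start);
    /* Predecessor starts at or before `start`: does it reach into us? */
    if (i > 0 && start - rs->items[i - 1].start < rs->items[i - 1].length)
        return false;
    /* Successor starts after `start`: do we reach into it? */
    if (i < rs->count && rs->items[i].start - start < length)
        return false;

    memmove(&rs->items[i + 1], &rs->items[i],
            (rs->count - i) * sizeof(range_t));
    rs->items[i].start = start;
    rs->items[i].length = length;
    rs->count++;
    return true;
}
```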
Michael Vrable [Sat, 30 Oct 2010 00:40:14 +0000 (17:40 -0700)]
Implement a rangeset data type and use it to track items in log segments.
The range set can be used to track which byte ranges in a file correspond
to valid objects, and so can be used in the cloud log code to check that
1) objects are actually available if partial log fetching is implemented,
and 2) that objects being fetched have actually been authenticated (if
needed).
This isn't fully working yet. The range sets should be changed from a hash
table to a sequence, since some lookups in the cloud log code are into the
middle of an object (to skip the header and remap the data), and that needs
to be handled properly.
Michael Vrable [Tue, 26 Oct 2010 16:27:56 +0000 (09:27 -0700)]
Add support for byterange requests in the storage layer.
Michael Vrable [Tue, 26 Oct 2010 14:38:36 +0000 (07:38 -0700)]
Rename private structures to remove leading underscores.
This is a cosmetic change, but should make debugging and looking up
structure definitions easier.
Michael Vrable [Mon, 25 Oct 2010 05:43:24 +0000 (22:43 -0700)]
Merge git+ssh://c09-44.sysnet.ucsd.edu/scratch/bluesky
Michael Vrable [Mon, 25 Oct 2010 05:15:41 +0000 (22:15 -0700)]
Updates to the workload generator and some results from runs.
Michael Vrable [Thu, 21 Oct 2010 21:28:56 +0000 (14:28 -0700)]
Remove debugging messages when decrypting cloud log segments.
Michael Vrable [Thu, 21 Oct 2010 20:07:18 +0000 (13:07 -0700)]
Fix Amazon S3 store_lookup_last implementation.
Amazon will return directory listings in chunks. Be sure to repeat list
calls until we find the actual end of the directory listing.
Michael Vrable [Thu, 21 Oct 2010 05:34:45 +0000 (22:34 -0700)]
Fix a memory leak resulting from the in-memory inode map.
Actually release the reference to the old version of an inode when creating
a new one.
Michael Vrable [Tue, 19 Oct 2010 21:14:50 +0000 (14:14 -0700)]
Add a more aggressive mode of cleaning up disk space.
If enough space cannot be deleted in the disk cache, drop nearly all clean
data from memory so that most disk cache files are unreferenced, then make
another pass to delete log/cache files.
Michael Vrable [Tue, 19 Oct 2010 20:43:58 +0000 (13:43 -0700)]
Move disk cache cleanup code to cache.c from log.c
Michael Vrable [Tue, 19 Oct 2010 02:01:21 +0000 (19:01 -0700)]
A few more minor fixes cleaning up the cloud log state counting.
Michael Vrable [Tue, 19 Oct 2010 01:05:50 +0000 (18:05 -0700)]
Fix up cloud log state counting.
Counts of cloud log items in various states were not being kept up to date,
so that the counts were often wrong. Fix a few cases to make the counts
more accurate.
Michael Vrable [Tue, 19 Oct 2010 00:22:15 +0000 (17:22 -0700)]
Ignore lockmem executable.
Michael Vrable [Mon, 18 Oct 2010 21:48:57 +0000 (14:48 -0700)]
The signature on log items should not extend to the pointer locations.
This allows the cleaner to rewrite the pointers without invalidating the
signature.
Michael Vrable [Mon, 18 Oct 2010 21:48:47 +0000 (14:48 -0700)]
Add timeout on cleaner S3 retry.
Michael Vrable [Mon, 18 Oct 2010 21:41:44 +0000 (14:41 -0700)]
Add retries to the S3 backend in the cleaner.
Michael Vrable [Mon, 18 Oct 2010 20:03:15 +0000 (13:03 -0700)]
Small utility to use free memory in a system.
For benchmarking, to take memory away from the page cache.
Michael Vrable [Mon, 18 Oct 2010 04:40:44 +0000 (21:40 -0700)]
Support encrypted log items in the cleaner.
Michael Vrable [Sun, 17 Oct 2010 23:18:23 +0000 (16:18 -0700)]
Delete dead code.
Michael Vrable [Sun, 17 Oct 2010 23:17:40 +0000 (16:17 -0700)]
When decrypting a log item also clear out the IV field.
Not really needed, but this way the IV field being zero should be
synonymous with an unencrypted log item.
Michael Vrable [Sun, 17 Oct 2010 20:40:20 +0000 (13:40 -0700)]
Add per-item encryption/authentication to the cloud log storage.
We should generate encrypted data and decrypt it again on read, but we
don't yet enforce only reading data which passes the integrity check.
Michael Vrable [Fri, 15 Oct 2010 18:01:45 +0000 (11:01 -0700)]
Allow S3 bucket used for BlueSky storage to be specified.
Michael Vrable [Mon, 11 Oct 2010 18:35:17 +0000 (11:35 -0700)]
Work on a simple workload generator for benchmarking.
Michael Vrable [Fri, 8 Oct 2010 23:57:35 +0000 (16:57 -0700)]
Start adding in selective encryption of cloud log items.
Not fully hooked in, but some of the logic for encryption is written now.
Michael Vrable [Mon, 27 Sep 2010 23:30:13 +0000 (16:30 -0700)]
Another logging fix.
Michael Vrable [Mon, 27 Sep 2010 20:28:57 +0000 (13:28 -0700)]
Fix for journal committing.
Previously, under load, we could sometimes report that journal items were
committed when they were not. Try to track the uncommitted state more
carefully now.
Michael Vrable [Mon, 27 Sep 2010 17:57:59 +0000 (10:57 -0700)]
Implement handling of unstable data in WRITE/COMMIT NFS procedures.
Previously, all NFS operations were synchronous; now we support asynchronous
commits of file writes which might improve performance.
Michael Vrable [Mon, 27 Sep 2010 05:55:19 +0000 (22:55 -0700)]
Updated microbenchmarking script.
Michael Vrable [Wed, 22 Sep 2010 20:52:28 +0000 (13:52 -0700)]
Starting work on scripts to automate benchmarking.
Michael Vrable [Wed, 22 Sep 2010 18:47:17 +0000 (11:47 -0700)]
Improve cleaner performance.
When reading an object in, seek to and read just the needed bytes instead
of the entire log segment. Improves performance significantly.
Michael Vrable [Mon, 20 Sep 2010 22:09:12 +0000 (15:09 -0700)]
Remove obsolete file.
Michael Vrable [Mon, 20 Sep 2010 15:56:01 +0000 (08:56 -0700)]
Use a thread pool for inode fetches, and remove some debugging output.
Michael Vrable [Mon, 20 Sep 2010 15:55:03 +0000 (08:55 -0700)]
Remove an extraneous mutex unlock.
I'm surprised that this didn't cause trouble earlier; it seems that
unlocking an unlocked mutex raises no errors (but under heavy load, when
the mutex is locked by another thread then unlocking it can cause trouble).
Michael Vrable [Mon, 20 Sep 2010 03:07:27 +0000 (20:07 -0700)]
More fixes for memory management.
This should allow memory and cache space to be reclaimed by not keeping
items pinned in memory, finally. Still needs a bit more testing.
Michael Vrable [Sun, 19 Sep 2010 22:22:02 +0000 (15:22 -0700)]
Work on reducing memory pinned by the inode map.
Michael Vrable [Sun, 19 Sep 2010 18:31:15 +0000 (11:31 -0700)]
Add cleaner option to rewrite and compact all inodes (but not all data).
Michael Vrable [Sun, 19 Sep 2010 04:22:15 +0000 (21:22 -0700)]
Allow cloudlog items to be unreferenced in the background.
This is to avoid certain deadlocks, when we don't care if resources are
reclaimed immediately or not.
Michael Vrable [Sun, 19 Sep 2010 00:44:07 +0000 (17:44 -0700)]
Do not hold references to all inode data in inode map.
The inode map should not hold full references to all inode objects and all
corresponding data, since that will lock all such data in memory or the
disk cache. Do some initial work towards just holding weak references to
the data so that it can be expired from the cache.
Michael Vrable [Wed, 15 Sep 2010 20:29:42 +0000 (13:29 -0700)]
Restart journal sequence numbering properly.
Handle even the case where some old journal files have been deleted.