backup
When implementing a backup system, we may need the client to ask the server whether it already has a certain block. There are several ways we can implement this in order to reduce the latency impact:
Option 1
Client sends block hash to server
Server replies Y/N
High latency impact
Option 2
Server sends client a list of all block hashes on server
Client is then free to stream in new blocks as fast as it can
No latency impact
Sensitive data release?
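Option 2 can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names `block_hash` and `blocks_to_upload` are hypothetical, SHA-256 is an assumed choice of hash, and the list of server hashes is assumed to have been fetched already in a single request.

```python
import hashlib


def block_hash(block: bytes) -> str:
    """Hex SHA-256 digest of a block (assumed hash; any collision-resistant hash would do)."""
    return hashlib.sha256(block).hexdigest()


def blocks_to_upload(local_blocks, server_hashes):
    """Yield (hash, block) pairs for blocks the server does not have yet.

    server_hashes is the list the server sent up front, so membership is
    checked locally and no per-block round trip is needed.
    """
    known = set(server_hashes)
    for block in local_blocks:
        h = block_hash(block)
        if h not in known:
            known.add(h)  # do not send the same new block twice in one run
            yield h, block
```

Because the check is a local set lookup, the client can stream the yielded blocks back to back. The cost is that the client learns every hash stored on the server, which is the "sensitive data release" concern raised above.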
This article covers the design of an ideal file-based backup solution.
DESIRED PROPERTIES
Untrusted server
Incremental forever
METHOD
Split file into blocks using rsync algorithm / deterministic reparse points
This makes it very likely that a 1-byte addition in the middle of the file won’t cause an entire reupload of the following sections
Use a block interval that is quite large (e.g. 100KB) to minimise size of metadata
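The splitting step can be sketched with content-defined boundaries. This is an illustration under stated assumptions, not rsync's actual checksum: it uses a simple polynomial rolling hash, and the constants `WINDOW`, `BASE`, `MOD` and the function name `split_blocks` are hypothetical. A block ends wherever the low `mask_bits` bits of the hash of the last `WINDOW` bytes are zero, so boundaries depend on nearby content rather than on absolute file offsets.

```python
WINDOW = 48                        # bytes covered by the rolling hash (assumed value)
BASE = 257                         # polynomial base (assumed value)
MOD = (1 << 61) - 1                # large prime modulus (assumed value)
OUT = pow(BASE, WINDOW - 1, MOD)   # weight of the byte that leaves the window


def split_blocks(data: bytes, mask_bits: int = 17,
                 min_size: int = 16 * 1024, max_size: int = 1024 * 1024) -> list:
    """Split data into blocks at content-defined boundaries.

    With mask_bits=17 a boundary candidate appears roughly every 128 KiB,
    on the order of the 100KB interval suggested above.
    """
    if min_size < WINDOW:
        raise ValueError("min_size must be at least WINDOW")
    mask = (1 << mask_bits) - 1
    blocks = []
    start = 0   # offset where the current block begins
    h = 0       # rolling hash of the last WINDOW bytes of the current block
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Window is full: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * OUT) % MOD
        h = (h * BASE + byte) % MOD
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            blocks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        blocks.append(data[start:])   # trailing partial block
    return blocks
```

Blocks that end before an edit are always unchanged, and blocks after it usually realign once both versions hit the same boundary, so only the blocks around the edit need to be uploaded again. A fixed-offset split, by contrast, would shift every later block.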
There are a few approaches to encrypted incremental backup, assuming the constraints are (1) we don’t trust the remote storage provider not to snoop through our data, so it must be encrypted client-side; and (2) bandwidth is expensive, so we want to minimise the data sent to the remote storage provider, which means only ever sending the changes to our data.
Take duplicity[1] for instance. In order to work with untrusted remote storage sites, all data is encrypted and the encryption key is never exposed to the remote storage provider.