Story of a File
This page outlines the steps taken to back up, verify, and retrieve a file. Together, these outlines sketch the "story of a file" in the flud backup network.
Backing up a File
For each file to be backed up:
- Create an encryption key and a storage key for the file, f
- The encryption key is the SHA256 hash of the file, H(f) = eK
- The storage key is the SHA256 hash of eK, H(eK) = H(H(f)) = sK
- Encrypt the file with eK
- Create local filesystem metadata
- filename, permissions, creation/modification/change times, ownership, extended attributes, etc.
- Encrypt file metadata with public key, Ku
- Encrypt file encryption key, eK, with public key, Ku, so that it, too, can be stored with metadata
- Erasure code the encrypted file
- split the file up into m blocks
- add k parity blocks
- Create a metadata key mK as the crc32 value of the plaintext filename
- (this allows us to align the blocks of files that have identical contents but different names with their respective metadata blocks)
- Erasure code file metadata (same m & k as for file)
- Each storage block is composed of one block of the file data and one block of the filesystem metadata
- See if the file has previously been stored (query DHT for sK)
- if previously stored, verify blocks.
- if all blocks verify correctly, done. Skip the remaining steps.
- if some blocks do not verify, replace them.
- Store the erasure-coded file
- each block is stored by its name, which is the CAS key for the block, H(b)
- each block can be stored on any node; the choice is up to the client (see Fairness)
- Create file metadata
- list of block names and the nodeIDs of where they are stored
- Store file metadata
- this is stored in the DHT layer, under sK
- Update local master metadata
- master metadata is simply a mapping of filenames to storage keys (sK), with timestamps
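The key derivation and block naming in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not flud's actual code: the function names `derive_keys`, `split_blocks`, and `block_names` are hypothetical, the block hash H(b) is assumed to be SHA256 like the other hashes on this page, and encryption and parity generation are left out entirely.

```python
import hashlib
import zlib


def derive_keys(data: bytes, filename: str):
    """Derive eK, sK, and mK as described in the backup steps."""
    ek = hashlib.sha256(data).digest()  # eK = H(f)
    sk = hashlib.sha256(ek).digest()    # sK = H(eK) = H(H(f))
    # mK = crc32 of the plaintext filename (UTF-8 encoding is an assumption)
    mk = zlib.crc32(filename.encode("utf-8")) & 0xFFFFFFFF
    return ek, sk, mk


def split_blocks(data: bytes, m: int):
    """Split data into m blocks; the k parity blocks are omitted in this sketch."""
    size = -(-len(data) // m)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(m)]


def block_names(blocks):
    """Name each block by its content hash H(b), i.e. its CAS key."""
    return [hashlib.sha256(b).hexdigest() for b in blocks]
```

Because eK and sK depend only on the file contents, two files with the same contents yield the same sK regardless of name, which is what makes the "previously stored" DHT query possible. Only mK changes with the filename.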
Retrieving a Backup File
For each file to retrieve:
- Query the DHT with sK (previously saved in master metadata), resulting in file metadata
- Retrieve blocks described in metadata until reconstruction (erasure decode) is successful
- Decrypt the file with eK (previously stored with the metadata, encrypted with Ku)
- Copy the file to its proper location in the filesystem, restore filesystem metadata
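The block retrieval step above might look something like the following sketch. It assumes that block names are SHA256 content hashes, so a fetched block can be checked against its own name, and that decoding can be attempted once m valid blocks are in hand. The `fetch` callable and the `collect_blocks` name are hypothetical stand-ins for the network layer, and the erasure decode itself is not shown.

```python
import hashlib


def collect_blocks(block_locations, fetch, needed):
    """Fetch blocks until `needed` of them have verified against their names.

    block_locations: list of (block_name, node_id) pairs from the file metadata.
    fetch: hypothetical callable (node_id, block_name) -> bytes or None.
    needed: number of blocks required before decoding is attempted
            (assumed here to be m, the number of data blocks).
    """
    good = {}
    for name, node_id in block_locations:
        data = fetch(node_id, name)
        # A block's name is its content hash, so corrupt data is detectable.
        if data is not None and hashlib.sha256(data).hexdigest() == name:
            good[name] = data
        if len(good) >= needed:
            return good
    raise IOError("only %d of %d required blocks retrieved" % (len(good), needed))
```

Missing or corrupt blocks are simply skipped, so the loop keeps trying other nodes until enough blocks survive or the list is exhausted.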
Verifying a Backup File
- Create an encryption key and a storage key for the file, f
- The encryption key is the SHA256 hash of the file, H(f) = eK
- The storage key is the SHA256 hash of eK, H(eK) = H(H(f)) = sK
- Create local file metadata
- filename, permissions, creation/modification/change times, ownership, extended attributes, etc.
- Encrypt file metadata with public key, Ku
- Encrypt the file with eK
- Erasure code the encrypted file
- split the file up into m blocks
- add k parity blocks
- Erasure code file metadata (same m & k as for file)
- See if the file has previously been stored (query DHT for sK)
- if previously stored, verify blocks.
- send challenge with random offset and length into each block (for both data and metadata)
- the response comes back as H(offset, length bytes) of the stored block
- if all blocks verify correctly, success.
- report failures for blocks that do not verify correctly
- note that steps 1-5 can be partially optimized away by creating many challenges and storing them for later use (generated either during the initial store operation or during a subsequent verify operation)
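The challenge/response exchange above can be sketched as follows. This reads "H(offset, length bytes)" as the SHA256 hash of the `length` bytes starting at `offset`, which is an interpretation rather than something this page spells out. The names `make_challenge`, `respond`, and `verify` are hypothetical, and blocks are assumed to be non-empty.

```python
import hashlib
import random


def make_challenge(block: bytes, rng=random):
    """Pick a random (offset, length) within the block and precompute the answer."""
    offset = rng.randrange(len(block))
    length = rng.randint(1, len(block) - offset)
    expected = hashlib.sha256(block[offset:offset + length]).hexdigest()
    return offset, length, expected


def respond(stored_block: bytes, offset: int, length: int) -> str:
    """What a storing node would return: the hash of the requested byte range."""
    return hashlib.sha256(stored_block[offset:offset + length]).hexdigest()


def verify(stored_block: bytes, challenge) -> bool:
    """Check a node's copy of a block against a previously created challenge."""
    offset, length, expected = challenge
    return respond(stored_block, offset, length) == expected
```

Generating a batch of challenges while the block is still available locally (for example, `[make_challenge(block) for _ in range(100)]`) lets later verifications run without re-encrypting and re-coding the file, which is the optimization the last step describes. A real client would likely want a cryptographically strong source of randomness such as `random.SystemRandom()` in place of the default generator.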