StoringCredentialsInFlud
From flud
- Note: the store-credentials-in-flud save scheme implies that nodes must not enforce convergent storage checks when storing data. But since almost all other data will be stored convergently, a node may be able to earmark such "special" storage operations and target them for attack. Is the encoding/splitting of the file enough to ensure that such an attack is difficult to mount? Should we use stronger encryption on such special cases? Any other protections?
- Also, the special case of the config file being stored under a non-convergent storage key implies that the storage layer can take [key, data] args, instead of just [data] args. This is fine (and actually has the advantage of allowing us to switch to a different cryptographic hash in the future without breaking backwards compatibility), but must be kept in mind as we design the DHT-like layer. It also implies that a node doing a store operation for convergent data (identical data that it has already stored) must still receive the complete data from the store request before deleting it (since there is already a copy stored). This introduces some inefficiencies in network bandwidth required, but not in storage. Perhaps we can solve the bandwidth efficiency problem with an rsync-like strategy for storage ops to an existing key.
- Proposal: Since a malicious node will always be able to detect config file storage operations anyway, we shouldn't be too concerned about having a special API for this operation, and perhaps a separate db. This allows us to continue to make the [naive?] collision-free assumption about keys & data for normal storage operations. It also optimizes config file verification operations, which need to occur more frequently than regular ops.
- Attack: a malicious node continues to store the config file and data for other nodes until it notices that a node has stopped doing its verification ops. After some period of silence, the malicious node assumes that the node in question has gone down and perhaps lost its data. The malicious node assumes that the node in question will want to do a restore, but the malicious node is more interested in freeing its own resources, so it deletes the config file data and any other data that it can associate with the node in question. When the node in question begins to do its restore, the system is robust enough to withstand a few such failures, but if multiple malicious nodes are all doing this independently... (This could occur non-maliciously, too; suppose a client is written which wants to purge "old" data, and assumes that data that has not had verify ops performed for some threshold of time should go...)
- Solution: proxy all ops with some probability. Onion routing could also be introduced. This makes it more difficult to perform the analysis required to associate data with a particular node (and, as a side benefit, also allows us to retire non-refreshed data after some [long] threshold of time -- would need to do generous estimates for how often, statistically, each block of data should get verified, and then make sure even outliers don't get junked very often).
- Problem: we must do nodeID accounting somewhere. Since data can move as nodes join or leave the flud network, it is problematic to do the accounting at the client. On the other hand, it is trivial to do accounting at the server -- each node just keeps track of where its store requests originate. But if we do onion routing, this is difficult. In fact, if we do onion routing to anonymize the origin of store requests, it is impossible. So, consider doing onion routing only for verify ops (and maybe challenge/groupID, etc). Each node in the onion chain will update its trust according to the results, so it hurts (quite badly) to fail at verify requests.
- Freenet's Keyword-signed-keys (actually, use Signed Subspace Keys to avoid hijacking) might be a good way to store data under a different effective content hash than the hash of what it actually contains... (http://zgp.org/pipermail/p2p-hackers/2002-July/000715.html). SSKs require a descriptive string and a PK pair. The string and the public key are hashed independently, combined together (concatenated or XORed) and then hashed again to get the file storage key. What we really want is a variant of this, where the user's key pair is encrypted with the top-secret passphrase, hashed and then stored. Perhaps a separate storage layer should be used for non-content-addressable storage... One weakness of having a separate layer as described is that hijacking can occur. A hijacker can just insert a new value into the non-content-addressable storage space under an existing key, which effectively causes the loss of the original value. The other weakness is that any node this data is stored on knows that the data stored there is "different", and thus perhaps highly valuable (if we trust the encryption, this doesn't matter, but...)
- Weakly-enforced content addressable storage: STORE requests succeed if no collision is detected. If a collision is detected, STORE responds by notifying requestor of the collision. Requestor performs a VERIFY op on the key. If it does not match, requestor does a STORECHECK op, which causes the recipient to do a content addressable check if a collision is detected. If the stored content does not match the key, it is expunged, and the new data is written over it.
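
The weakly-enforced STORE / VERIFY / STORECHECK exchange above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` class, `content_key`, `weak_store`, the return strings, and the choice of SHA-256 are all hypothetical and not part of flud. VERIFY is simplified to a whole-content digest comparison, and the check that the replacement data itself hashes to the key is an extra assumption that the notes do not spell out.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Convergent storage key: hash of the content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical in-memory storage node, for illustration only."""

    def __init__(self):
        self.blocks = {}  # key -> stored bytes

    def store(self, key: str, data: bytes) -> str:
        """STORE: succeeds if no collision, otherwise reports the collision."""
        if key in self.blocks:
            return "COLLISION"
        self.blocks[key] = data
        return "OK"

    def verify(self, key: str, expected_digest: str) -> bool:
        """VERIFY (simplified): does the stored content hash to the expected digest?"""
        stored = self.blocks.get(key)
        return stored is not None and content_key(stored) == expected_digest

    def storecheck(self, key: str, data: bytes) -> str:
        """STORECHECK: on collision, do the content-addressable check."""
        stored = self.blocks.get(key)
        if stored is not None and content_key(stored) == key:
            return "KEPT"  # existing content is legitimate for this key
        if content_key(data) != key:
            return "REJECTED"  # assumption: new data must itself match the key
        self.blocks[key] = data  # expunge the mismatched content, write the new data
        return "REPLACED"


def weak_store(node: Node, data: bytes) -> str:
    """Requestor side: STORE, then VERIFY on collision, then STORECHECK on mismatch."""
    key = content_key(data)
    if node.store(key, data) == "OK":
        return "OK"
    if node.verify(key, key):
        return "ALREADY_STORED"
    return node.storecheck(key, data)
```

In this sketch the expensive content-addressable check only runs when a requestor has already observed a mismatch, which is the point of enforcing the check weakly.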