LocalizedTrust

What is Trust?

In the flŭd network, 'trust' is a metric used to determine how reliable or cooperative another node is. Much as in social networks, individuals in the flŭd network prefer to interact with others who are trustworthy. Trust is central to the efficient and correct operation of the flŭd backup network: data stored on other nodes must be verifiable, and the nodes that store it must be accessible. Trust is also central to enforcing fairness and preventing free-loaders from consuming more resources than they provide. In flŭd, storage operations must be symmetric; a node that wishes to store 1M of data on another node must also provide 1M of local storage for that node.

Trust is measured directly in flŭd; there is no second-hand information traded among peers that can be exploited by colluding nodes or Sybil networks. This means that all information used by a node must be perceived directly by that node, through its own interactions with other nodes. By some stretch, it could be said that this fulfills the "embodiment" principle of Brooksian behavior-based systems, if we define "the body" to be the flŭd node and "the world" to be the flŭd network.

See the blog entry "How Comes this Unity?" for an explanation of trust's role in helping global intelligent behavior to emerge.

Identity

All trust systems must be built upon reliable identity systems. Otherwise, malicious individuals can 'steal' other nodes' identities and get away with free-loading or other harmful acts. In flŭd, it is cryptographically infeasible for an imposter to masquerade as another user. Identities are tied to 1024-bit RSA private keys through SHA-256 hashes. In order for an imposter to be successful, they must either find the correct collision in the SHA-256 nodeID and use that to reproduce the private/public key pair (which is cryptographically infeasible), or they must steal a copy of the private key.
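The binding between a key and a nodeID can be illustrated with a few lines of Python. This is only a sketch: the text does not say exactly which bytes are hashed, so hashing the serialized public key is an assumption, and the function names `node_id_from_public_key` and `claim_matches` are hypothetical.

```python
import hashlib


def node_id_from_public_key(public_key_bytes: bytes) -> str:
    """Derive a hex nodeID as the SHA-256 digest of the serialized public key.

    Assumption: the nodeID is the hash of the public key bytes; the exact
    serialization flŭd uses is not specified in this page.
    """
    return hashlib.sha256(public_key_bytes).hexdigest()


def claim_matches(node_id: str, public_key_bytes: bytes) -> bool:
    """Check that a presented public key really hashes to the claimed nodeID."""
    return node_id_from_public_key(public_key_bytes) == node_id
```

Under this assumption, a peer that presents a public key of its own choosing cannot claim an existing nodeID unless that key hashes to the same SHA-256 value.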

Ensuring Identity

All data storage, retrieval, and verification operations are performed only after the initiator has answered a challenge. The challenger generates the challenge by encrypting a small amount of random data with the initiator's public key. Only the holder of the corresponding private key can successfully decrypt this challenge and respond with the correct answer. In this way, individuals in the flŭd network can be certain that nodes are who they say they are, and can tie actions irrevocably to nodeIDs.
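The exchange described above can be sketched with textbook RSA on a deliberately tiny key. The key values (p = 61, q = 53), the function names, and the use of unpadded RSA are illustrative assumptions; a real node would use its 1024-bit key and a vetted cryptographic library rather than this toy arithmetic.

```python
import secrets

# Toy textbook-RSA key built from p=61, q=53. Far too small to be secure;
# it only makes the arithmetic of the exchange visible.
N, E, D = 3233, 17, 2753  # modulus, public exponent, private exponent


def make_challenge(pub_n: int, pub_e: int) -> tuple[int, int]:
    """Challenger: pick random data and encrypt it with the initiator's public key."""
    nonce = secrets.randbelow(pub_n - 2) + 2  # random value in 2..pub_n-1
    ciphertext = pow(nonce, pub_e, pub_n)
    return nonce, ciphertext


def answer_challenge(ciphertext: int, priv_d: int, pub_n: int) -> int:
    """Initiator: decrypt the challenge with the private key."""
    return pow(ciphertext, priv_d, pub_n)


def challenge_passed(expected_nonce: int, answer: int) -> bool:
    """Challenger: the operation proceeds only if the answer matches."""
    return answer == expected_nonce


nonce, challenge = make_challenge(N, E)
response = answer_challenge(challenge, D, N)
print(challenge_passed(nonce, response))
```

The challenger keeps the random value to itself and sends only the ciphertext, so a correct answer demonstrates possession of the private key.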

Trust Operations

When a node agrees to store data for another node, the sending node increases its trust in that node. The storer thus earns trust by performing this operation. Retrieval and verify operations likewise earn trust when they succeed, but trust decreases quickly when they fail.

Trust Database

Each node keeps a local database of all the nodes that it interacts with, assigning a trust level to each. Because the storage layer is separate from the metadata layer, a node can restrict its trust interactions to a relatively small subset of all nodes in the flŭd network. This means that the size of the database can be capped at a reasonable number (say, 500 entries), with nodes whose trust levels sink to a certain point being discarded (perhaps to a blacklist) to make room for nodes whose interactions are beneficial.
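A capped database of this kind might look like the following sketch. The class name `TrustDB`, the discard threshold, and the policy of refusing new peers while the table is full are assumptions made for illustration; only the 500-entry cap and the idea of discarding low-trust nodes to a blacklist come from the text.

```python
class TrustDB:
    """In-memory sketch of a capped local trust table (hypothetical design)."""

    def __init__(self, max_entries: int = 500, discard_at: float = -10.0):
        self.max_entries = max_entries
        self.discard_at = discard_at      # assumed threshold for discarding a node
        self.levels: dict[str, float] = {}  # nodeID -> trust level
        self.blacklist: set[str] = set()

    def adjust(self, node_id: str, delta: float) -> bool:
        """Apply a trust change; return True if the node is still tracked."""
        if node_id in self.blacklist:
            return False
        if node_id not in self.levels and len(self.levels) >= self.max_entries:
            return False  # table is full; assumed policy is to skip new peers
        level = self.levels.get(node_id, 0.0) + delta
        if level <= self.discard_at:
            # Trust sank too far: drop the entry and remember the node as bad.
            self.levels.pop(node_id, None)
            self.blacklist.add(node_id)
            return False
        self.levels[node_id] = level
        return True
```

Discarding a low-trust entry frees a slot, so a later call to `adjust` for a previously unknown peer can succeed.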

Scoring Trust

flŭd nodes have default values by which they increment or decrement trust when operations succeed or fail. There are maximum delta values for trust changes per day, as well as maximum and minimum values for accumulated trust. These values can be customized (by the user, the implementor, or self-tuning software). With the default values, trust is earned slowly and lost more quickly.
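The scoring rules above can be sketched as follows. All constants are placeholder values, not flŭd's actual defaults, and treating the daily maximum as a cap on the net change per day is an assumption about how the limit is applied.

```python
from dataclasses import dataclass

# Placeholder values for illustration only; not flŭd's real defaults.
GAIN_ON_SUCCESS = 1.0     # trust is earned slowly
LOSS_ON_FAILURE = -3.0    # and lost more quickly
MAX_DAILY_DELTA = 5.0     # assumed cap on net change per day, in either direction
MIN_TRUST, MAX_TRUST = -100.0, 100.0


@dataclass
class TrustScore:
    value: float = 0.0
    changed_today: float = 0.0  # net change already applied today


def apply_result(score: TrustScore, success: bool) -> float:
    """Update a peer's trust after one operation and return the new value."""
    delta = GAIN_ON_SUCCESS if success else LOSS_ON_FAILURE
    # Clamp the change so today's net movement stays within the daily cap.
    lowest = -MAX_DAILY_DELTA - score.changed_today
    highest = MAX_DAILY_DELTA - score.changed_today
    delta = max(lowest, min(highest, delta))
    # Clamp the accumulated value to its overall bounds.
    new_value = max(MIN_TRUST, min(MAX_TRUST, score.value + delta))
    score.changed_today += new_value - score.value
    score.value = new_value
    return score.value


def start_new_day(score: TrustScore) -> None:
    """Reset the daily budget; call once per day."""
    score.changed_today = 0.0
```

With these placeholder numbers, a peer that succeeds many times in one day gains at most 5 points, and a run of failures on a later day can remove at most 5.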

Along with the value of the trust rating itself, nodes maintain trust delta velocities. These velocities indicate how quickly a peer's values change, and allow a node to detect transient outages without punishing them too harshly.

For example, if a peer has accumulated trust over time but then experiences a one-day outage, we would like the node to detect this outage and decrement trust accordingly, but it would be to both nodes' disadvantage to decrement it mercilessly over a short period of time. When the velocity indicates that the peer's trust is decrementing rapidly, the peer falls out of favor as a choice for peer operations. After some period of time passes, the velocity will once again indicate that the peer can be chosen for operations.

Since all operations in flŭd involve choosing from among extra peers, these metrics allow a node to temporarily ignore faulty nodes.
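One way to picture the velocity mechanism is a decaying sum of recent trust changes. The text does not define how velocity is computed, so the decay factor, the `velocity_floor` threshold, and the names below are all assumptions; this is a sketch of the idea, not flŭd's algorithm.

```python
class PeerTrust:
    """Trust value plus a decaying 'velocity' of recent changes (hypothetical)."""

    def __init__(self) -> None:
        self.value = 0.0
        self.velocity = 0.0

    def record(self, delta: float) -> None:
        """Apply a trust change and add it to the recent-change velocity."""
        self.value += delta
        self.velocity += delta

    def tick(self, decay: float = 0.5) -> None:
        """Call once per time period so that older changes fade away."""
        self.velocity *= decay


def eligible_peers(peers: dict[str, PeerTrust], velocity_floor: float = -5.0) -> list[str]:
    """Return peers whose trust is not dropping rapidly, most trusted first."""
    ok = [pid for pid, p in peers.items() if p.velocity > velocity_floor]
    return sorted(ok, key=lambda pid: peers[pid].value, reverse=True)
```

In this sketch a peer with a long good history that suddenly fails several operations is skipped while its velocity is strongly negative, keeps most of its accumulated trust, and becomes selectable again after enough calls to `tick`.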

Storage Accounting

Both parties involved in storage operations (the sender/client and the receiver/server) must keep some accounting data, so that they can keep track of how much storage space they are exchanging. The client relies on metadata to account for storage on other nodes. The metadata is stored to the DHT, but local copies are also kept for efficient access. By walking this metadata in search of a specific nodeID, a client can know how much current data is stored on a particular server.
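The client-side walk can be sketched as below. The metadata layout used here (a mapping from file name to a list of `(nodeID, block size)` pairs) is a hypothetical stand-in; the real flŭd metadata format is not described in this page.

```python
# Hypothetical local metadata copy: file name -> list of (nodeID, block size in bytes).
Metadata = dict[str, list[tuple[str, int]]]


def bytes_stored_on(metadata: Metadata, node_id: str) -> int:
    """Walk the metadata and total the bytes this client has stored on node_id."""
    total = 0
    for blocks in metadata.values():
        for holder, size in blocks:
            if holder == node_id:
                total += size
    return total


local_metadata: Metadata = {
    "photos.tar": [("node-A", 4096), ("node-B", 4096)],
    "notes.txt": [("node-A", 512)],
}
print(bytes_stored_on(local_metadata, "node-A"))
```

Because the walk runs over the local copy of the metadata, the client can answer this question without querying the DHT.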

The server has no immediate way of associating a stored block of data with a particular node by consulting the metadata, so it instead uses a very simple accounting method. Each non-tarred block of data begins with a list of the nodes that have stored it. For files that are stored inside tarballs, the tarball itself (being named after the originating node) serves as the accounting method.

When blocks naturally expire {link to description of expiration, where data is purged after not being accessed for X hours}, they are simply removed. If a block of data is removed by a specific node via a DELETE {link} operation, that node's entry is removed from the list at the head of the file. If it is the last (or only) entry in the list, the file itself is removed. This is similar to ABS' signed blocks method {link to ABS paper}, but the cryptographic signature is unnecessary in our case, because of the challenge/response that precedes STORE {link}.
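The server-side bookkeeping for non-tarred blocks can be modelled in memory as follows. The text describes the list as living at the head of the stored file; keeping it in a dictionary, the class name `BlockStore`, and charging each listed node the full block size are simplifying assumptions for illustration.

```python
class BlockStore:
    """In-memory model of per-block owner lists (hypothetical, not flŭd's file format)."""

    def __init__(self) -> None:
        # block key -> (set of nodeIDs that stored it, block data)
        self.blocks: dict[str, tuple[set[str], bytes]] = {}

    def store(self, key: str, data: bytes, node_id: str) -> None:
        """Record that node_id stored this block, creating it if necessary."""
        owners, _ = self.blocks.setdefault(key, (set(), data))
        owners.add(node_id)

    def delete(self, key: str, node_id: str) -> bool:
        """Remove node_id from the block's list; drop the block if the list empties."""
        if key not in self.blocks:
            return False
        owners, _ = self.blocks[key]
        owners.discard(node_id)
        if not owners:
            del self.blocks[key]
        return True

    def usage(self, node_id: str) -> int:
        """Bytes attributed to node_id (assumption: each listed node is charged in full)."""
        return sum(len(data) for owners, data in self.blocks.values() if node_id in owners)
```

A block stored by two nodes therefore survives a DELETE from one of them and disappears only when the second node also deletes it.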