SimilarSystems
All of the following solutions contain clever and innovative components that have at least some focus on decentralized backup. The notes below focus on comparing/contrasting these offerings with flŭd backup.
AllMyData / HiveCache ( / Mnet / MojoNation )
AllMyData has recently garnered a bit of buzz about their Tahoe project. This is a nice improvement over what appears to be a meager amount of success with their first-generation backup product/service, which had a somewhat confused focus: you had to both share your computing resources and pay for services, and it was all centrally controlled by one company.
The first generation service:
- - AllMyData is centralized. You are required to have an account registered with AllMyData, Inc. Even though your backup data is spread out to many machines not controlled by AllMyData, if AllMyData goes out of business, your backup goes away because you won't be able to access your account. Likewise, the service has several 'single points of failure' (see forum posting) that can prevent service even while AllMyData is operating normally.
- - Closed-source.
- - 10-to-1 storage ratio. For every MB of data you want to store, you must provide 10x the space locally for others to use. This ratio seems higher than it needs to be. flud gives you a 1-to-1 storage ratio (currently, this means that it actually requires a little more than 2x, because every file you store has parity plus metadata).
- Note: some of the above info may be out of date. From AllMyData's current homepage, it isn't clear that there is a free (10-to-1) storage service anymore, nor is there any mention of their services being decentralized. It may be that they have converted into a pure centralized remote backup service (can anyone confirm? This forum posting seems to indicate that the company is de-emphasizing decentralization, and this one claims that no one's files are backed up anywhere but to AllMyData's servers).
The Tahoe project:
- +/- Open source (+), with the caveat that AllMyData might keep improvements closed for up to 12 months after releasing binaries (-).
- + Active development / community
- + Goal for complete decentralization, but
- - all participants must currently be completely connected, which is not scalable
- - centralized vdrive server
- - centralized introducer
- + plans to fix all of the above at some point
- + it does have an I-don't-need-allmydata.com mode of operation (private tahoe networks among trusted friends)
- - unclear how/if tahoe will support collaborative backup among untrusted peers
- + Smart folks working on neat technology
- -/+ emphasis on non-backup use cases (streaming data, alacrity, etc) could make tahoe less effective as a pure backup play (-), but also make the network more readily usable for applications in other domains (+) such as file sharing etc.
- +/- See this p2p-hackers thread: http://lists.zooko.com/pipermail/p2p-hackers/2007-May/001041.html
- - given recent changes in focus for AllMyData's 1st generation services (see above), it is unclear how a decentralized Tahoe will fit in with the company's current business model.
CleverSafe
Cleversafe routinely speaks to some of the core values that flud targets: extreme resiliency, geographic dispersal, and open source (cred). They initially positioned their technology as p2p, but have since moved away from that term, preferring "grid" (which is an accurate description). They one-up AllMyData by making a system in which multiple entities can provide service. One vital difference between Cleversafe and flud is that Cleversafe is a solution destined squarely for the datacenter (with all the associated weight that datacenters carry: bandwidth and maintenance costs, IT expertise, etc.), while flud eschews the datacenter entirely, preferring to live on the computers of individuals at the edge of the network.
- + CleverSafe is open source, and you can set up your own backup grid.
- - But to participate in CleverSafe's test backup grid, once again you need to create an account with CleverSafe. This is currently free, but there is no indication that it will remain that way permanently. If you want to set up your own grid, you'll need roughly a dozen computers, and enough time to configure the CleverSafe server software on each one. You'll also need the time to keep these machines running in the future (patches, updates, etc.).
- - update: it appears that the public grid has been mothballed for several months.
- - update: last open source release was October 2006 (releases had been appearing very often up until October -- maybe the open source bits have become mothballed, too?)
- - Non-p2p. The Cleversafe grid doesn't appear to be internet-scale, and requires that you set up one "director node" in your grid. The director node is responsible for keeping a list of files stored, as well as knowing how to connect to the other nodes. Although you can have 'backup director nodes' in case the main director fails, this centralized component is a design bottleneck.
- - If one 'Pillar' (of the 11 total Pillars) becomes unavailable, the grid ceases to function. Even though there is enough redundancy to recover data, storage will be impossible (see http://forums.cleversafe.org/viewtopic.php?p=78#78).
- - Requires homogeneity. All 11 servers must have the same capacity; the server with the least amount of storage determines the total amount of storage available.
- - Because of the above 3 points, CleverSafe is not an appropriate implementation for harnessing spare desktop/workstation storage (in CleverSafe, 'clients' and 'nodes' are separate computers. In flud, there is no distinction -- 'clients' /are/ 'nodes'). CleverSafe is more appropriate as a data-center storage solution, NOT an end-user solution (the amount of work required to get a CleverSafe grid installed is a testament to this -- take a look at the Getting Started Guide).
- - Usability. CleverSafe is an IT solution, designed for IT experts. flud backup is an end-user solution. No IT required.
- - Complexity. CleverSafe appears to be overly complex. Parts of the architecture are very, very nice, but others seem more complicated than they need to be. This is purely subjective, but results seem to support it -- the progress CleverSafe has made in almost 2 years, with a funded team of developers, is not encouraging (very primitive clients; the last release included such *new* features as limited Windows support, file delete, and more than one director).
- - No convergent storage. If N users back up their complete Windows XP systems, CleverSafe will store N different copies of all the system files. This is inefficient.
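The convergent-storage point can be illustrated with a small sketch. This is not flud's (or anyone's) actual implementation: the `ConvergentStore` class, its method names, and the sample file names are hypothetical, and a real system would also encrypt each block, typically with a key derived from the content itself. The sketch only shows the deduplication effect: blocks are keyed by a hash of their content, so identical files from N users occupy the space of one copy.

```python
import hashlib


class ConvergentStore:
    """Toy content-addressed store (hypothetical): identical content is kept once."""

    def __init__(self):
        self.blocks = {}   # content hash -> bytes (one copy per distinct content)
        self.catalog = {}  # (user, path) -> content hash

    def put(self, user, path, data):
        # The storage key depends only on the content, not on who stores it.
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)
        self.catalog[(user, path)] = key
        return key

    def get(self, user, path):
        return self.blocks[self.catalog[(user, path)]]

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


store = ConvergentStore()
system_file = b"identical system file contents" * 1000
private_file = b"alice's private notes"

# Three users back up the same system file; one user adds a private file.
for user in ("alice", "bob", "carol"):
    store.put(user, "system/shared.dll", system_file)
store.put("alice", "notes.txt", private_file)

naive_bytes = 3 * len(system_file) + len(private_file)
print("without convergent storage:", naive_bytes)
print("with convergent storage:   ", store.stored_bytes())
```

Every user can still retrieve the shared file through their own catalog entry, but the store holds only two distinct blocks.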
CrashPlan
- + allows you to back up to friends' computers
- + can do initial backup locally or (soon) to media that can be transported locally
- - have to manually manage where backups occur
- - uses simple replication instead of erasure coding = inefficient
- - no convergent storage (SIS)
- - correlated failures can result in complete data loss, e.g., hurricane etc.
- + seems to have a nice interface
- + social networking virality vibe
- - is 'decentralized' in only a loose sense -- manual management of backup peers
- - has "Crash" in its name?
- - closed source
- - security, blah blah
- +/- looks like a sensible interface (+) over nothing more than regular rsync (-)
Zoogmo
- - Similar to CrashPlan and Magic Mirror, requires the user to manually select 'partners' (isn't really decentralized p2p, but rather client-to-many servers).
- - Similar to the old AllMyData, requires a centralized account at zoogmo. If zoogmo disappears, the service disappears.
- + Fairly complete GUI
- + Unlimited backup
- - Similar to CrashPlan/Vembu/BitVault/Magic Mirror, it just uses simple replication to all 'partners', i.e., to get the same data protection as flud, you would need to consume more than 20x the amount of space on other computers.
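The space gap between simple replication and erasure coding can be sketched with a little arithmetic. The parameters below are illustrative assumptions, not flud's documented settings: a hypothetical k-of-n code with k = 20 and n = 40, in which any 20 of the 40 stored blocks rebuild a file, and each block or replica is assumed to live on a different peer.

```python
def replication_overhead(tolerated_losses: int) -> float:
    """Full copies needed so the data survives `tolerated_losses` lost peers."""
    return float(tolerated_losses + 1)


def erasure_overhead(k: int, n: int) -> float:
    """Storage expansion of a k-of-n code: n blocks stored, any k rebuild the file."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return n / k


K, N = 20, 40        # hypothetical coding parameters, chosen for illustration
losses = N - K       # a k-of-n code survives the loss of any n - k blocks

print("erasure coding overhead:", erasure_overhead(K, N))
print("replication overhead for the same number of losses:",
      replication_overhead(losses))
```

Under these assumed parameters, the erasure-coded file costs about 2x its size and survives 20 lost peers, while plain replication needs 21 full copies to survive the same number of losses.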
BitVault / LeanBackup / LeanOnMe
- - Proprietary and closed source
- + Uses JXTA P2P, but
- - doesn't appear to be internet-scale (must create and manage backup groups)
- - Creates multiple replicas of files, instead of using erasure codes (less space efficient)
- - not yet available (http://www.312inc.com/bitvault_network.html). A similar product, BitVault Online, which uses servers owned and operated by 312inc, is currently available
Vembu's Storegrid
- + cross platform
- + free version
- - free version is limited to a total of 3 computers
- - closed-source
- - no trust system; the user must manually select which computers in the grid to back up to
- - LAN-scale only (not internet scale).
- + many options and reports
- - complex user interface
Magic Mirror Backup
- + semi-cross-platform (java-based, doesn't work on Mac OS X)
- - uses simple replication instead of erasure coding
- - cannot schedule backups / continuous backup / etc. -- all operations are manual
- - doesn't do incremental backups
- - users have to manage which computers they will back up to.
- - non-free (free trial version and 1.0 will be free)
- - proprietary
- - stagnant - not much development or recent news, not much activity in user forums
Distributed Internet Backup System (DIBS)
- + open-source
- + erasure-codes (one of the first deployed distributed backup systems to attempt this)
- + has (had?) a decent sized user community
- + nicely documented
- - peer management is manual (DIBS is 'distributed', not 'decentralized'). Although much of this management has been abstracted away with the addition of a centralized peer-finder service, posting/accepting of contracts still appears to be a manual step, and it appears that the user must also keep track of trust/reliability among peers.