[flud-devel] private flud network and flud goals
Alen Peacock
alenlpeacock at gmail.com
Tue Sep 4 16:21:39 PDT 2007
Posted with permission of Stuart Langridge, who put some good thought
into what an ideal distributed backup system would look like (at
http://www.kryogenix.org/days/2006/04/12/distributed-backups-to-friends).
His description seems to align fairly closely with flud's goals:
On 9/4/07, Stuart Langridge wrote:
> Alen,
>
> I've just been pointed at flud (sorry, no breve;
> confusing charset issues between PuTTY and my ssh
> server). A while back I wrote up how I want a
> perfect distributed backup system to work, at
> http://www.kryogenix.org/days/2006/04/12/distributed-backups-to-friends
> and flud looks like that's
> it (ignoring that there's no public flud network
> yet). Am I right in thinking that flud does
> roughly what I was hoping for, or have I
> misunderstood?
> sil
Stuart,
No need to apologize about lack of breve -- I often leave it out myself :)
I think you'll find that flŭd, once finished, will do essentially what
you describe. I'm hoping to make another official release in the next
week or so, but if you downloaded a snapshot from svn right now, you
would be able to:
1. create a groupID that you could share with friends in order to
create a private flŭd network. [*1]
2. use the GUI to easily select files/directories for
inclusion/exclusion from backup, which will then be automatically
monitored for new files/changes and backed up to other flud nodes in
the background
3. use the GUI to easily select backed-up files/directories for
restoring from the flud network.
Files are encoded using LDPC, a technique which I believe is superior
to the Reed-Solomon encoding used in Parchive, especially for large
amounts of data. Current parameters require 2x remote space: each
file is split into 40 chunks, and any 20+ of them are enough to
recover it (these parameters might be adjusted in the future, or
could even be tweaked per user).
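To make the arithmetic concrete, here is a minimal sketch of what those parameters imply for a single file. It is not flud code: the function name and the assumption that every chunk holds 1/20 of the file are illustrative only, and since LDPC may need slightly more than 20 chunks in practice (the "20+" above), the loss tolerance shown is an upper bound.

```python
import math


def coding_plan(file_size, total_chunks=40, needed_chunks=20):
    """Estimate chunk size, remote space, and loss tolerance for one file.

    Hypothetical helper: assumes every chunk carries 1/needed_chunks of
    the file, which is what makes 40 chunks add up to 2x remote space.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    chunk_size = math.ceil(file_size / needed_chunks)
    remote_bytes = chunk_size * total_chunks
    return {
        "chunk_size": chunk_size,
        "remote_bytes": remote_bytes,
        "overhead": remote_bytes / file_size,
        # Upper bound only: LDPC decoding may need a few more than
        # needed_chunks, so the real tolerance can be slightly lower.
        "max_lost_chunks": total_chunks - needed_chunks,
    }


plan = coding_plan(100 * 1024 * 1024)  # a 100 MiB file
print(plan)
```

Under these assumptions a 100 MiB file becomes 40 chunks of 5 MiB each, and close to half of them can disappear before the file is unrecoverable.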
There is *no* centralized component in flud, and thus, no sign-up
procedure. flud uses a shared secret, unforgeable identities, and
challenge-response queries for nodes to a) prove their identities and
b) prove their group membership. If you want to set up a private flud
network among friends, the network will remain private as long as no
members expose the private groupID. This corresponds fairly closely
with the "group name" idea you mention -- as long as members know the
group name, they can participate in the group.
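The group-membership half of that idea can be sketched with a generic HMAC challenge-response. This is only an illustration of the general technique, assuming the groupID is used directly as the shared secret. The function names are made up, the real flud wire protocol is not described here, and the identity proof (part a) is not covered.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Verifier side: pick a fresh random nonce for each query."""
    return secrets.token_bytes(32)


def respond(group_id, challenge):
    """Prover side: show knowledge of group_id without sending it."""
    return hmac.new(group_id, challenge, hashlib.sha256).digest()


def verify(group_id, challenge, response):
    """Verifier side: recompute the expected answer and compare it in
    constant time."""
    expected = respond(group_id, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical secrets, for illustration only.
group_id = b"friends-only-group-secret"
challenge = make_challenge()

member_ok = verify(group_id, challenge, respond(group_id, challenge))
outsider_ok = verify(group_id, challenge, respond(b"wrong-guess", challenge))
print(member_ok, outsider_ok)
```

Because the nonce is fresh each time, a recorded response cannot be replayed later, and the groupID itself never crosses the network.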
flud uses symmetric storage relationships among peers to enforce
fairness. This corresponds to your statement "if you want to back up
N megabytes you have to offer 3N megabytes of space to the group."
[*2]
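One way to picture a symmetric storage relationship is a per-peer ledger that only accepts data from a peer who is holding a comparable amount in return. The class below is a hypothetical sketch, not flud's accounting code. The `slack` allowance in particular is an assumption, added so that two fresh peers can get started at all.

```python
from collections import defaultdict


class StorageLedger:
    """Track bytes exchanged with each peer (illustrative only)."""

    def __init__(self, slack=0):
        self.slack = slack                    # assumed bootstrap allowance
        self.held_for = defaultdict(int)      # bytes we store for a peer
        self.held_by = defaultdict(int)       # bytes a peer stores for us

    def record_held_by(self, peer, nbytes):
        """The peer accepted nbytes of our data."""
        self.held_by[peer] += nbytes

    def may_accept(self, peer, nbytes):
        """Accept only if the peer stays within what it stores for us."""
        return self.held_for[peer] + nbytes <= self.held_by[peer] + self.slack

    def accept(self, peer, nbytes):
        """Store nbytes for the peer if the relationship stays symmetric."""
        if not self.may_accept(peer, nbytes):
            return False
        self.held_for[peer] += nbytes
        return True


ledger = StorageLedger(slack=1000)
ledger.record_held_by("peer-a", 500)   # peer-a holds 500 bytes for us
first = ledger.accept("peer-a", 1400)  # 1400 <= 500 + 1000
second = ledger.accept("peer-a", 200)  # 1600 > 1500
print(first, second)
```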
Now, the caveats:
- currently, flud is not cross-platform. I've had to narrow focus to
Linux in order to try to make better progress. The good news is,
however, that the code is very portable, and there are actually only a
few bits that would need to be changed to work on Windows, and even
fewer that would need to be done for OSX -- it's more a matter of the
extra weight of testing and packaging for these platforms. But, it
will absolutely run on these at some point.
- currently, installing flud requires a compilation step and a couple
of other minor, annoying hoops to jump through. I may eliminate that in
the upcoming release. Long term, I definitely agree that setup of the
software needs to be super-extra-painfree, and this has been a goal
since day one. "No ten pages of options" -- amen! The intention is
that on first run, the software will already be configured to back up a
default set of files that most users would want to back up, and that
you can customize it from there. In other words, we're aiming at
install, run-once, and then forget about it.
- currently, you select files from the GUI and not the file manager. I
initially thought that making file manager plugins for flud on all the
platforms it runs on was the way to go. I've since become somewhat
convinced (and surprised) that average users don't find that as
intuitive as a specialized application. But there's no reason someone
couldn't add such interfaces if desired.
- currently, there is no differential backup for changed files. This
is a must-have, but it is currently low priority (because it can be
layered very nicely on top of the flud substrate). It will become
higher priority once we have proven flud "in-the-wild". BTW, I think
this will be quite easy to implement using rdiff (or somesuch),
without the existing flud substrate even needing to know about it. Do
note, however, that flud won't re-backup files that are currently
stored, and benefits from other efficiencies, such as convergent
storage (also referred to in the literature as single-instance-store).
- currently, the daemon processes (FludNode and FludScheduler) don't
start up automatically via initscript (or other system level service
creator), but this too should be trivial to add.
- currently, FludScheduler is very primitive. You can't configure
specific times to do backup, and it uses a dumb (stat) method for
determining what files have changed. It is a placeholder for a more
fully-featured scheduler.
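Two of the points above, the stat-based change check and single-instance storage, can be sketched together. This is a rough illustration and not the FludScheduler code: every name below is hypothetical, SHA-256 is an assumed choice of content hash, and the in-memory dict stands in for the remote network.

```python
import hashlib
import os


def snapshot(paths):
    """Map each readable path to the (size, mtime_ns) reported by stat."""
    result = {}
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            continue  # missing or unreadable files are simply skipped
        result[path] = (st.st_size, st.st_mtime_ns)
    return result


def changed_paths(previous, current):
    """Return paths that are new or whose stat signature differs."""
    return sorted(p for p, sig in current.items() if previous.get(p) != sig)


def content_key(data):
    """Derive the storage key from the content itself."""
    return hashlib.sha256(data).hexdigest()


class SingleInstanceStore:
    """Keep one copy per distinct content, whatever the file name."""

    def __init__(self):
        self.blobs = {}

    def put(self, data):
        """Store data once; return (key, True) if it was not present."""
        key = content_key(data)
        is_new = key not in self.blobs
        if is_new:
            self.blobs[key] = data
        return key, is_new


# Usage: compare two scans, then store only content not already held.
before = {"/home/a/notes.txt": (10, 1), "/home/a/todo.txt": (5, 1)}
after = {"/home/a/notes.txt": (12, 2), "/home/a/todo.txt": (5, 1),
         "/home/a/new.txt": (3, 2)}
to_backup = changed_paths(before, after)

store = SingleInstanceStore()
key1, new1 = store.put(b"same bytes")
key2, new2 = store.put(b"same bytes")  # identical content, stored once
print(to_backup, new1, new2)
```

A stat comparison is cheap but can miss edits that preserve both size and modification time, which is part of why it is only a placeholder.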
That's a longer response than I set out to write, but it was a good
exercise, and I enjoyed reading your thoughts on the topic.
flud is still very rough around the edges and is definitely
experimental software, but I hope to get it gradually polished into
something that looks very much like what you envisioned.
Alen
*1: currently requires editing a config file, as install currently
defaults to the global flud network (may make this an option to set
during first run of flud GUI in the future). There currently isn't a
lot of documentation on how to do this, but it is a 1-line change in
the config. See 'Node Identity', under
http://flud.org/wiki/Architecture .
*2: the current svn snapshot does not enforce this constraint at the
moment, in order to facilitate testing (this is the main reason I
haven't pushed to instantiate a public flud network yet). The
architecture fully supports it however, and the next release will
enable this by default.