[flud-devel] open/locked/db files (was: DHT Performance? Design?)

Alen Peacock alenlpeacock at gmail.com
Wed Oct 31 19:53:40 PDT 2007


Reply in multiple posts:


On Oct 31, 2007 12:32 AM, Bill Broadley <bill at broadley.org> wrote:
>
> Yes, but by my understanding is basically invisible to a python backup program
> calling opendir, readdir, open(file) and friends.  So if you end up writing
> a special program that can handle open files or recognizes the database and
> snapshots it why not have it dump to something the equivalent of maildir.
>
> In any case I didn't mean to argue against delta compression, just wanted
> to mention the pitfalls.
>
> > (I've seen some as big as 80GB).  Backing up this entire file each
> > time a single new email arrives is unworkable.
>
> Agreed.
>
> > I have several apps on
> > my Linux desktop that similarly update db files while they run
> > (luckily, none of them are as large as Outlook yet).
>
> The "right" way is to handle backups explicitly, like maybe select * from
> table where records>time_of_last_backup.  I've definitely seen issues from
> mysql and plone/zope when active and backed up as just a file.  I hope
> subversion is more backup friendly, but I've yet to check.
>
> Alas outlook brings up an ugly detail, afaik open files are mostly invisible
> under windows.  At least backuppc punts and says "run a util to dump it,
> then run backuppc".

  Dealing with open files (on both POSIX and Windows) is not a lot of
fun, and neither is handling locked files (on Windows).  Plugging into
MS's VSS seems even less appealing.  Writing custom handlers for
specific filetypes is less than ideal, as is providing special hooks
into every conceivable application that uses db or db-like files to
tell them to quiesce.

  Further complicating matters, a backup app that bugs the user to
manually intervene during a *background* backup operation is a bad
design from a usability standpoint.

  There are a couple of general techniques for trying to get a
consistent view of a file without knowing anything about the
application that writes it.  Right now, I'm most inclined to use one
of these dumb techniques, even though they are inefficient, not 100%
effective, and somewhat silly.
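
  To make "dumb technique" concrete, here is a rough sketch of one
possibility: copy the file, then check whether its size or mtime
changed while the copy was in progress, and retry if so.  This is only
an illustration and not necessarily one of the techniques meant above.
The names (copy_when_stable, attempts, delay) are made up for the
example and are not part of flud.  It shares the weaknesses already
mentioned: it rereads the whole file on every retry, and it can miss a
write that leaves both size and mtime unchanged.

```python
import os
import shutil
import time


def _fingerprint(path):
    """Return (size, mtime in ns) as a cheap change detector for path."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def copy_when_stable(src, dst, attempts=3, delay=0.5):
    """Copy src to dst, retrying if src appears to change during the copy.

    Hypothetical helper, not flud code. Returns True if a copy completed
    while the size and mtime of src stayed the same, otherwise False
    (in which case any partial dst is removed).
    """
    for attempt in range(attempts):
        before = _fingerprint(src)
        shutil.copyfile(src, dst)
        after = _fingerprint(src)
        if before == after:
            # No visible change while copying; treat the copy as consistent.
            return True
        if attempt < attempts - 1:
            # The writer was active; wait a bit before trying again.
            time.sleep(delay)
    # The file never held still long enough; discard the suspect copy.
    if os.path.exists(dst):
        os.remove(dst)
    return False
```

  A caller would treat a False return as "skip this file for now and
try again on the next backup pass" rather than prompting the user.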

  For now, I'm punting.  flud will just assume that all files can be
read and backed up.  At some point we'll need to fix this.  If someone
else figures out a clever way to deal with all this in the meantime,
all the better.

Alen


