[flud-devel] open/locked/db files (was: DHT Performance? Design?)
Alen Peacock
alenlpeacock at gmail.com
Wed Oct 31 19:53:40 PDT 2007
Reply in multiple posts:
On Oct 31, 2007 12:32 AM, Bill Broadley <bill at broadley.org> wrote:
>
> Yes, but by my understanding is basically invisible to a python backup program
> calling opendir, readdir, open(file) and friends. So if you end up writing
> a special program that can handle open files or recognizes the database and
> snapshots it why not have it dump to something the equivalent of maildir.
>
> In any case I didn't mean to argue against delta compression, just wanted
> to mention the pitfalls.
>
> > (I've seen some as big as 80GB). Backing up this entire file each
> > time a single new email arrives is unworkable.
>
> Agreed.
>
> > I have several apps on
> > my Linux desktop that similarly update db files while they run
> > (luckily, none of them are as large as Outlook yet).
>
> The "right" way is to handle backups explicitly, like maybe select * from
> table where records>time_of_last_backup. I've definitely seen issues from
> mysql and plone/zope when active and backed up as just a file. I hope
> subversion is more backup friendly, but I've yet to check.
>
> Alas outlook brings up an ugly detail, afaik open files are mostly invisible
> under windows. At least backuppc punts and says "run a util to dump it,
> then run backuppc".
Dealing with open files (on both POSIX and Windows) is not a lot of
fun, and neither is handling locked files (on Windows). Plugging into
MS's VSS seems even less appealing. Writing custom handlers for
specific filetypes is less than ideal, as is providing special hooks
into every conceivable application that uses db or db-like files to
tell them to quiesce.
Further complicating matters, a backup app that bugs the user to
manually intervene during a *background* backup operation is a bad
design from a usability standpoint.
There are a couple of general techniques for trying to get a
consistent view of a file without knowing anything about the
application that writes it. Right now, I'm most inclined to use one
of these dumb techniques, even though they are inefficient, not 100%
effective, and somewhat silly.
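To make "dumb technique" concrete, here is a minimal sketch of one
application-agnostic approach: stat the file, copy it, stat it again,
and retry if the size or mtime changed during the copy. This is an
assumed example of the general idea, not necessarily one of the
techniques meant above and not flud code; stable_copy and its
retries/delay parameters are hypothetical names.

```python
import os
import shutil
import time


def stable_copy(src, dst, retries=3, delay=0.5):
    """Copy src to dst, retrying if src appears to change mid-copy.

    Hypothetical helper for illustration only (not part of flud).
    Returns True if one copy completed with the size and mtime of src
    unchanged from start to finish, False if every attempt saw a change.
    """
    for _ in range(retries):
        before = os.stat(src)
        shutil.copyfile(src, dst)
        after = os.stat(src)
        unchanged = (before.st_size, before.st_mtime_ns) == (
            after.st_size,
            after.st_mtime_ns,
        )
        if unchanged:
            return True
        # The writer touched the file while we were reading; wait and retry.
        time.sleep(delay)
    return False
```

This is inefficient (a busy file gets copied several times) and not
100% effective (a write that leaves size and mtime unchanged goes
unnoticed, and a file that never sits still is never captured), which
is why it counts as a dumb technique.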
For now, I'm punting. flud will just assume that all files can be
read and backed up. At some point we'll need to fix this. If someone
else figures out a clever way to deal with all this in the meantime,
all the better.
Alen