[ZODB-Dev] ZEO client cache tempfile oddness

Tue Apr 10 16:24:09 EDT 2007

[Tim Peters]
>> The ZEO client cache is stored in a fixed-size disk file.  When a ZEO
>> client needs to create this file for the first time, it's trying to
>> ensure there's enough space on disk for it at the start, and reserve
>> that disk space then, rather than risk dying with a "no space left on
>> device" error umpteen hours later.  Alas, there is no portable way to
>> do so (the C standard says nothing about physical devices such as
>> disks).

[Paul Winkler]
> So it might work as intended on some systems but inspire false
> confidence on others. That's sad, but I guess we can't do much about
> it.

The full intent is almost met on Windows.  On other systems it at
least /informs/ the OS about the intended size of the file.  In any
case it does no harm.

[Paul]
>>> ...
>>> "saves" means. Does the behavior vary on different filesystems?

>> Yes.
>>
>> Sounds like it optimizes for sparse files.  Not all filesystems do.

> OK... I'm still wondering on which filesystems this code actually does
> guarantee sufficient space. Our experiments suggest that ext2, ext3,
> and reiserfs optimize for sparse files so there is no such guarantee.
> AFAICT from some quick googling and wikipediaing, the same is true for
> NTFS, XFS, JFS, ZFS. I suspect we've accounted for the majority of the
> production Zope installations in the world.  The only fs I found that
> has no sparse file support is HFS+.  Not sure about UFS (I found
> people claiming both no and yes).

NTFS has "optional" sparse-file support, meaning there is a way to
create sparse files under NTFS, but it's not the default.  It isn't
exposed at the C stdio level (creating a sparse file under NTFS
requires additional Windows-specific API calls), and Python builds on
C stdio.
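
For anyone wanting to repeat the kind of experiment Paul describes, here
is a minimal sketch of one way to check whether a file looks sparse.  It
assumes the platform reports st_blocks in os.stat() and that the unit is
512 bytes, as the Python docs describe; that field is not available on
Windows, so the helpers return None there.  The names allocated_bytes
and looks_sparse are made up for illustration and are not part of ZEO.

```python
import os


def allocated_bytes(path):
    """Return the bytes the filesystem reports as allocated, or None.

    Assumption: st_blocks counts 512-byte units.  Platforms that do not
    provide st_blocks (e.g. Windows) yield None.
    """
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)
    if blocks is None:
        return None
    return blocks * 512


def looks_sparse(path):
    """True if fewer bytes are allocated than the file's nominal size.

    Returns None when the platform cannot tell us.  Compression or
    delayed allocation can also make a file look sparse, so treat the
    answer as a hint, not as proof.
    """
    allocated = allocated_bytes(path)
    if allocated is None:
        return None
    return allocated < os.path.getsize(path)
```

A file created by seeking far ahead and writing one byte would be
expected to report True on a filesystem that optimizes for sparse files.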

In any case there's no benefit to sparse files in this context:  while
the ZEO cache file starts out "almost empty", every byte is eventually
used.  So it seems good to make whatever cheap efforts can be made up
front to minimize the chance that ZEO will die umpteen hours later
because the disk space its client cache file needs is no longer there.

>> If you care, the intent could probably be better served by changing
>> the code to write a junk byte at every (say) thousandth byte offset
>> throughout the file (at least one non-NUL byte per disk block).  Alas,
>> there would still be no guarantee that there's actually enough space
>> on the physical disk to store it.

> OK, then I don't see how we could get a real guarantee without
> actually writing junk to every byte, which might be just a little
> slower :)

Still no guarantee (e.g., there might be enough RAM to hold all the
bytes in I/O buffers, but not enough disk space remaining to
materialize them), but writing a non-NUL byte per disk block would be
as effective in practice as writing to every byte position.
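
A rough sketch of the byte-per-block idea, for illustration only.  The
function name touch_every_block and the default stride of 1000 (taken
from the "every thousandth byte" suggestion above) are assumptions, not
existing ZEO code.  The os.fsync() call asks the OS to push buffered
data to the device, which makes an out-of-space error more likely to
show up right away, but it is still no guarantee.

```python
import os


def touch_every_block(path, size, stride=1000):
    """Create `path` with length `size`, writing one non-NUL byte at
    every `stride`-th offset and at the final offset.

    With `stride` no larger than the disk block size, every block gets
    at least one non-NUL byte, so a sparse-file optimization has nothing
    left to skip.
    """
    if size < 1 or stride < 1:
        raise ValueError("size and stride must be positive")
    with open(path, "wb") as f:
        for offset in range(0, size, stride):
            f.seek(offset)
            f.write(b"x")
        # Make sure the last byte is written so EOF lands at `size`.
        f.seek(size - 1)
        f.write(b"x")
        f.flush()
        os.fsync(f.fileno())  # best effort: flush OS buffers to disk
    return os.path.getsize(path)
```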

> Short of that, since crystal ball technology is still not widely
> deployed outside the Python Secret Underground, we can't know if
> currently free space will still be available when we need it.

That's right.  On all platforms the current dance suffices to inform
the I/O system of how large a file ZEO needs (note that just seeking
to the max size does not suffice -- a byte needs to be written at the
end to set the EOF pointer).  On NTFS that's almost reliable (see
above); on other systems it may or may not be.
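
For reference, the dance amounts to something like the following.  This
is a minimal sketch, not the actual ZEO code; the function names are
hypothetical.  The second function is only there to show that seeking
without writing leaves the file empty.

```python
import os


def reserve_file(path, size):
    """Create `path` and set its length to `size` bytes by seeking to
    the last offset and writing a single byte there."""
    if size < 1:
        raise ValueError("size must be at least 1")
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"x")  # the write is what actually moves EOF
    return os.path.getsize(path)


def seek_without_write(path, size):
    """Seek to the last offset but write nothing; the file stays empty."""
    if size < 1:
        raise ValueError("size must be at least 1")
    with open(path, "wb") as f:
        f.seek(size - 1)
    return os.path.getsize(path)
```

Whether the space behind the resulting file is really reserved is still
up to the filesystem, as discussed above.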