[ZODB-Dev] Major refactoring of the ZEO ClientStorage Blob Cache

Jim Fulton jim at zope.com
Wed Dec 3 10:44:03 EST 2008


On Dec 3, 2008, at 1:50 AM, Christian Theune wrote:

> Hi,
>
> On Tue, 2008-12-02 at 12:03 -0500, Jim Fulton wrote:
>> ZEO has two modes for dealing with client blob data: shared and
>> non-shared.  In shared mode, a distributed file system is used to
>> share a blob directory with a ZEO server.  This requires management
>> of a distributed file system, in addition to the ZEO protocol.  Any
>> caching is provided by the distributed file system.
>>
>> In non-shared mode, blob data are downloaded to the ZEO client using
>> the ZEO protocol.  No distributed file system is needed and blob
>> files are cached locally.  Unfortunately, the current implementation
>> provides no facilities for managing the client cache.  There are no
>> provisions in the ZEO client software for removing unused blob files,
>> and the blob implementation makes almost no provision for blob file
>> removal.
>>
>> I'm working on refactoring ClientStorage's handling of non-shared
>> blob data.  I'm implementing a mechanism for periodically cleaning
>> out files that haven't been accessed in a while.  As part of this,
>> I'm going to radically change the layout of the ClientStorage's
>> non-shared blob directory.
>>
>> Currently, the bushy layout, with deeply nested directories, is used.
>> While I think this layout makes some sense on the server, I don't
>> think it makes much sense on the client.  Cleaning up unused blob
>> files is complicated by the need to clean up directories too.  I'm
>> going to go for a fairly flat layout.  There will be a small number
>> (997) of directories, and blob files will reside directly in these
>> directories.  (The directory will be chosen by taking the remainder
>> of dividing an oid by 997.)
>
> Any specific reason for this specific number?

It is prime, and ~1000 directories seems pretty manageable.  More  
importantly, I'm using a file lock per directory and I only allow one  
process/thread at a time to operate on a file in the directory.  I  
want a somewhat large number to try to avoid contention.
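
A minimal sketch of the directory choice and per-directory locking described above. It is not the ClientStorage code: the names blob_cache_dir and lock_for are made up for illustration, the oid is assumed to be an 8-byte big-endian string, and an in-process threading.Lock stands in for the per-directory file lock.

```python
import threading

N_DIRS = 997  # prime, roughly 1000 directories (from the message above)


def blob_cache_dir(oid: bytes, n_dirs: int = N_DIRS) -> str:
    """Return the name of the cache subdirectory for an oid.

    Hypothetical helper: interprets the oid as a big-endian integer
    and takes the remainder of dividing it by n_dirs.
    """
    return str(int.from_bytes(oid, "big") % n_dirs)


# Assumption: one in-process lock per directory, standing in for the
# per-directory file lock mentioned in the message.
_dir_locks = [threading.Lock() for _ in range(N_DIRS)]


def lock_for(oid: bytes) -> threading.Lock:
    """Return the lock guarding the directory that holds this oid."""
    return _dir_locks[int.from_bytes(oid, "big") % N_DIRS]


if __name__ == "__main__":
    oid = (1000).to_bytes(8, "big")
    with lock_for(oid):
        # Only one thread at a time works in this directory.
        print("oid 1000 lives in directory", blob_cache_dir(oid))
```

With 997 locks, two operations contend only when their oids leave the same remainder, which is the reason for wanting a somewhat large number of directories.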


>> It appears that modern operating systems can
>> handle large directories just fine.  I've created directories with 1
>> million files on Linux/Ext, Mac OS X/HFS+, and Windows XP/NTFS and
>> saw no degradation in performance as the number of files in a
>> directory increased.
>
> FTR: The reason for introducing the bushy layout is a restriction on
> the number of directory entries a directory can contain, which seems
> to be a different restriction than the one on the number of file
> entries a directory can contain. At least on ext3 I can't create more
> than 65k directories in a directory, while I can still create a lot
> more files in the same directory. Wikipedia has a generally good
> overview and comparison of file systems but doesn't cover the maximum
> number of directory entries per directory.

The ext limitation on the number of subdirectories arises from a limit
on the number of links to an inode.  Each subdirectory has a ".."
entry which adds a link to the containing directory.  I don't know if
there is a limit on the number of directory entries. If there is, it
is quite large. (I did a test of adding 10 million files to an ext
directory, although I got tired of waiting for it after it had gotten
up to a bit over 6 million.)
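
The ".." link effect can be observed with a short script. This is only a sketch: link_counts is a made-up helper name, and the numbers depend on the file system (some report a constant link count for directories), so no particular output is claimed.

```python
import os
import tempfile


def link_counts(n_subdirs):
    """Return the parent's st_nlink before and after creating subdirectories.

    On file systems that count links the traditional way, each new
    subdirectory's ".." entry adds one link to the parent.
    """
    with tempfile.TemporaryDirectory() as parent:
        before = os.stat(parent).st_nlink
        for i in range(n_subdirs):
            os.mkdir(os.path.join(parent, "sub%d" % i))
        after = os.stat(parent).st_nlink
    return before, after


if __name__ == "__main__":
    before, after = link_counts(5)
    print("st_nlink before:", before, "after:", after)
```

Creating plain files instead of subdirectories should leave the parent's link count unchanged, which is consistent with files not being subject to the same limit.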

>> I plan to have ClientStorage use the file layout mentioned above.
>> The ClientStorage constructor will fail if an older layout is found.
>> An alternative is to just log a warning and ignore the existing
>> directories, as the new directories will have non-overlapping names.
>>
>> I mention this both as a heads up and to see if anyone can point out
>> a problem with my approach.  I have a feeling that no one is using
>> non-shared client blob directories for anything important yet, so I
>> assume the change won't have much effect.
>
> I am. I'd prefer if you'd fail on the directory structure instead of
> mixing it with the new approach.

OK.

Jim

--
Jim Fulton
Zope Corporation



