[ZODB-Dev] Some interesting (to some:) numbers

Nitro nitro at dr-code.org
Tue May 11 07:59:17 EDT 2010


On 11.05.2010 at 13:47, Adam GROSZER <agroszer at gmail.com> wrote:

> Hello Jim,
>
> Tuesday, May 11, 2010, 1:37:19 PM, you wrote:
>
> JF> On Tue, May 11, 2010 at 7:13 AM, Adam GROSZER <agroszer at gmail.com> wrote:
>>> Hello Jim,
>>>
>>> Tuesday, May 11, 2010, 12:33:04 PM, you wrote:
>>>
>>> JF> On Tue, May 11, 2010 at 3:16 AM, Adam GROSZER <agroszer at gmail.com> wrote:
>>>>> Hello Jim,
>>>>>
>>>>> Monday, May 10, 2010, 1:27:00 PM, you wrote:
>>>>>
>>>>> JF> On Sun, May 9, 2010 at 4:59 PM, Roel Bruggink <roel at fourdigits.nl> wrote:
>>>>>>> That's really interesting! Did you notice any performance issues,
>>>>>>> or didn't you check that yet?
>>>>>
>>>>> JF> I didn't check performance. I just iterated over a file storage
>>>>> JF> file, checking compressed and uncompressed pickle sizes.
>>>>>
>>>>> I'd say some checksum is then also needed to detect bit failures that
>>>>> mess up the compressed data.
>>>
>>> JF> Why?
>>>
>>> I think the gzip algorithm compresses to a bit-stream, so if even one
>>> bit is wrong, the rest of the uncompressed data might be a total mess.
>>> If that one bit is relatively early in the stream it's fatal.
>>> Salvaging the data is not a joy either.
>>> I know at this level we should expect that the OS and any underlying
>>> infrastructure should provide error-free data or fail.
>>> Though I've seen some odd situations where a file was copied over a
>>> network without error, but the CRC check on it failed at the end :-O
>
> JF> How would a checksum help?  All it would do is tell you you're hosed.
> JF> It wouldn't make you any less hosed.
>
> Yes, but I would know why it's hosed.
> Otherwise I'd be expecting 2+2=4 and get 5 somewhere deep in the
> custom app that does some calculation.

You could have bit flips anywhere in the database, not just in the payload
parts. You'd have to checksum and test everything all the time. IMO it's
not worth the complexity and performance penalty, given today's redundant
storage options such as RAID, ZRS or zeoraid.
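
To make the trade-off concrete, here is a minimal sketch of what a
per-record checksum could look like. It is not ZODB's actual record format:
the names pack_record, unpack_record and flip_bit are hypothetical, and the
layout (a 4-byte CRC32 prefix followed by a zlib-compressed pickle) is an
assumption made only for illustration. Note that the zlib stream format
already carries an Adler-32 trailer of its own, so zlib.decompress will
usually raise an error on corrupted input anyway; the explicit CRC just
makes the check independent of the compressor.

```python
import pickle
import struct
import zlib


def pack_record(obj):
    """Pickle, compress, and prefix with a CRC32 of the compressed bytes.

    Hypothetical layout for illustration only, not ZODB's format.
    """
    payload = zlib.compress(pickle.dumps(obj, protocol=2))
    return struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF) + payload


def unpack_record(blob):
    """Verify the CRC32 prefix, then decompress and unpickle."""
    (expected,) = struct.unpack(">I", blob[:4])
    payload = blob[4:]
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("checksum mismatch: record is corrupt")
    return pickle.loads(zlib.decompress(payload))


def flip_bit(data, bit_index):
    """Return a copy of data with a single bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


record = pack_record({"balance": 4})
damaged = flip_bit(record, 50)  # one flipped bit inside the compressed payload

try:
    unpack_record(damaged)
    detected = False
except ValueError:
    detected = True  # the flip is reported, but the data is still lost
```

As Jim says, this only tells you the record is hosed. It costs a CRC
computation on every read and write, and it covers only the payload, not
the rest of the file.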

Btw, I don't think the current pickle payload format is protected against
bit flips either.
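
A small illustration of that, assuming pickle protocol 2 and a bare integer
as the payload (real ZODB records are larger and more structured): flipping
one bit in the pickled bytes turns a stored 4 into a 5, and loading it
succeeds without any error.

```python
import pickle

# With protocol 2 the integer 4 pickles to five bytes:
# PROTO 2, BININT1 <value>, STOP
original = pickle.dumps(4, protocol=2)

# Flip the lowest bit of the byte that holds the integer value (index 3).
corrupted = bytearray(original)
corrupted[3] ^= 0x01
corrupted = bytes(corrupted)

# Loads cleanly; nothing signals that the data changed.
value = pickle.loads(corrupted)  # value is now 5
```

So an uncompressed pickle can be corrupted silently, whereas a damaged
compressed stream is more likely to fail loudly on decompression.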

-Matthias

