[ZODB-Dev] Space used by IOBTrees

Guido van Rossum guido@python.org
Fri, 28 Feb 2003 08:52:14 -0500


> Another question: I had a closer look at the pickles themselves using
> pickletools. The PCDATA parts of the XML document were stored inside
> the tree as unicode strings. Inside the disassembled pickle
> they were "marked" as BINUNICODE. What encoding is used to pickle
> unicode strings (looks like utf-8 rather than UCS-2)?

RTSL.
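
For anyone who would rather check than read the source, here is a minimal
sketch using only the standard pickle and pickletools modules. It assumes a
current Python 3 and protocol 1; the sample string is made up for
illustration. BINUNICODE is followed by a 4-byte little-endian length and
then the UTF-8 encoded text.

```python
import pickle
import pickletools
import struct

# Hypothetical sample text with one non-ASCII character (e-acute).
text = "caf\u00e9"

# Protocol 1 is a binary protocol, so the string is written as BINUNICODE.
data = pickle.dumps(text, protocol=1)

# List the opcodes in the pickle together with their decoded arguments.
ops = [(op.name, arg) for op, arg, pos in pickletools.genops(data)]

# The 4 bytes after the opcode byte hold the payload length, little-endian.
(length,) = struct.unpack("<I", data[1:5])
payload = data[5:5 + length]

print(ops[0])                           # first opcode and its argument
print(payload == text.encode("utf-8"))  # payload is the UTF-8 encoding
```

The length counts encoded bytes, not characters, which is why it comes out
as 5 for a 4-character string here.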

> Another observation: it looks like the names of attributes only
> appear once at the start of the pickle and are referenced later
> somehow. So it would not matter whether nested data structures have
> long or short attribute names (looking at the complete size of the
> pickle, because they only appear once)...right?

Yes, but of course the names are still repeated in each pickle.  See
PEP 307 for a way to shorten them more.
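
A rough way to see both effects, sketched under some assumptions: plain
dicts stand in for instance state (an instance's state is pickled as its
__dict__ by default), the key name is invented for the example, and this is
current Python 3 with protocol 2. Within one pickle the key string is
written once and later referenced through the memo; pickled separately, as
happens with one record per persistent object, every pickle carries its own
copy.

```python
import pickle

# Hypothetical long "attribute name"; the same string object is reused as
# the key of every dict, as it would be for instances of one class.
KEY = "a_rather_long_attribute_name"
KEY_BYTES = KEY.encode("utf-8")

records = [{KEY: i} for i in range(10)]

# One pickle holding all ten dicts: the key is stored once, then fetched
# from the pickler's memo for the remaining nine.
combined = pickle.dumps(records, protocol=2)

# Ten independent pickles: no shared memo, so each one repeats the key.
separate = [pickle.dumps(r, protocol=2) for r in records]

count_combined = combined.count(KEY_BYTES)
count_separate = sum(p.count(KEY_BYTES) for p in separate)

print(count_combined, count_separate)
print(len(combined), sum(len(p) for p in separate))
```

The memo works on object identity, so the saving only applies when the
very same string object is reused, which is the normal case for attribute
names.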

--Guido van Rossum (home page: http://www.python.org/~guido/)