[ZODB-Dev] Sharing (persisted) strings between threads

Jim Fulton jim at zope.com
Wed Dec 8 07:28:14 EST 2010


On Wed, Dec 8, 2010 at 5:06 AM, Malthe Borch <mborch at gmail.com> wrote:
> Currently, when a thread loads a non-ghost into its object cache, it's
> straight from being unpickled. That means that if two threads load the
> exact same object, any (immutable) string contained in the object
> state will be allocated in duplicate (or in general, once per active
> thread).
>
> If instead, all unpickled strings were made canonical via a weak
> dictionary, there would be only one copy in memory, no matter the
> thread count, e.g.:
>
>  string = weak_string_map.setdefault(string, string)
>
> If the returned string was a different (canonical) copy, the duplicate
> would immediately be ready for garbage collection.
>
> This is a real win in memory savings. Using Plone, I experimented with
> the approach by using the Python pickle implementation and interning
> all byte strings (using ``intern``) directly in the unpickle routine
> to the same effect:
>
>    def load_binstring(self):
>        len = mloads('i' + self.read(4))
>        string = self.read(len)
>        interned = intern(string)    # (sic)
>        self.append(interned)
>
> With 20 active threads, each having rendered the Plone 4 front page,
> this approach reduced the memory usage by 70 MB.

Out of a total of what?

Note that if a process is CPU bound (as most dynamic Python apps
should be), then there is little or no benefit in having multiple
threads, due to the (damn) GIL.

If your app only renders pages based on data read from a ZODB, and
it's not CPU bound with a single thread, then your database config is
probably wrong.

> Note that unicode
> strings aren't internable (but the alternative technique of using a
> weak mapping should work fine).

Except that you can't create weakrefs to strings or unicode.
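
A minimal sketch of that limitation, written for Python 3 rather than the
Python 2 of this thread. The helper try_canonicalize and the sample values
are hypothetical; weak_string_map is assumed here to be a
weakref.WeakValueDictionary, which the quoted message does not specify.

```python
import weakref


def try_canonicalize(mapping, value):
    """Try the setdefault idiom from the quoted message.

    Returns (canonical_value, None) on success or (None, exception)
    if the mapping cannot hold a weak reference to the value.
    """
    try:
        return mapping.setdefault(value, value), None
    except TypeError as exc:
        return None, exc


# Assumption: the weak mapping holds its values weakly.
weak_string_map = weakref.WeakValueDictionary()

# Build the values at runtime so they are ordinary heap-allocated objects.
text = "".join(["portal_", "catalog"])
raw = b"".join([b"portal_", b"catalog"])

text_result, text_error = try_canonicalize(weak_string_map, text)
raw_result, raw_error = try_canonicalize(weak_string_map, raw)

# Both attempts fail: plain str and bytes objects reject weak references.
print(type(text_error).__name__, type(raw_error).__name__)
```

Nothing is stored in the mapping in either case, so the setdefault idiom
cannot be used on plain strings as written.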

Also, while interning is fine for an experiment, it's wasteful for
strings that are rarely needed.
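
For reference, a small sketch of what interning does, again in Python 3,
where the Python 2 builtin intern used in the quoted code lives at
sys.intern and accepts only str, not bytes. The build helper and the sample
value are hypothetical.

```python
import sys


def build(parts):
    """Create a fresh string at runtime, as unpickling would."""
    return "".join(parts)


# Two equal strings built separately are distinct objects in memory.
a = build(["Plone", " ", "site"])
b = build(["Plone", " ", "site"])
equal_but_distinct = (a == b) and (a is not b)

# Interning maps every equal string to one canonical object.
ia = sys.intern(a)
ib = sys.intern(b)
shared = ia is ib

print(equal_but_distinct, shared)
```

Every string passed through intern goes into the interpreter-wide table,
whether or not a second thread ever loads an equal value, which is the
overhead in question for rarely needed strings.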

Sharing immutable data between threads is very appealing
intellectually. I've certainly thought about it a lot. In practice,
I doubt the benefit will be worth the extra overhead (let alone the
effort :).

Jim

--
Jim Fulton

