[ZODB-Dev] URGENT: ZODB down - Important Software Application at CERN
Pedro Ferreira
jose.pedro.ferreira at cern.ch
Mon May 25 09:23:12 EDT 2009
Dear Andreas, Marius,
> This means that you're using ZEO, right? Have you tried to use strace
> to see what it's doing? Is it using any CPU time?
>
>
Yes, we're using ZEO.
It's doing a lot of lseek() and read() calls, i.e.:
read(6, "eq\7cdatetime\ndatetime\nq\10(U\n\7\326\t\r\f"..., 4096) = 4096
lseek(6, 3156590592, SEEK_SET) = 3156590592
lseek(6, 3156590592, SEEK_SET) = 3156590592
lseek(6, 3156590592, SEEK_SET) = 3156590592
lseek(6, 3156590592, SEEK_SET) = 3156590592
read(6, "\n_authorGenq9(U\10\0\0\0\0\3\v\375\367q:h\5tQU\t"..., 4096) = 4096
lseek(6, 3156594688, SEEK_SET) = 3156594688
lseek(6, 3156594688, SEEK_SET) = 3156594688
lseek(6, 3156594688, SEEK_SET) = 3156594688
lseek(6, 3156594688, SEEK_SET) = 3156594688
lseek(6, 3156594688, SEEK_SET) = 3156594688
lseek(6, 3156594688, SEEK_SET) = 3156594688
The allocated memory grows to ~200 MB, data.fs.index.index_tmp is
created, and then the process seems to die and restart (different PID).
CPU usage goes up to 100% for a significant time (~20 min), then slowly
drops (at which point some data seems to be written to index_tmp),
and then climbs back to 100% again. This repeats maybe a couple of times
before the process dies and starts all over again.
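As a quick sanity check on the strace output above, here is a small sketch of the offset arithmetic. The two offsets and the read size are copied from the trace; the interpretation (a sequential scan in 4096-byte blocks) is only an assumption and not something confirmed from the ZODB code.

```python
# Offsets copied from the lseek() calls in the strace output above.
offsets = [3156590592, 3156594688]
read_size = 4096  # size argument passed to the read() calls

# Distance between two consecutive seek targets.
step = offsets[1] - offsets[0]

# Whether the first offset falls on a read_size boundary.
aligned = (offsets[0] % read_size == 0)

# Position in the file, expressed in GiB.
gib = offsets[0] / float(1024 ** 3)

print("step between seeks: %d bytes" % step)
print("aligned to read size: %s" % aligned)
print("position in file: %.2f GiB" % gib)
```

The seek targets advance by exactly one read buffer and sit on a 4096-byte boundary, which would be consistent with the file being read front to back, roughly 2.9 GiB in at that point.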
>> We tried to restart the database, but the
>> script seems to hang, while trying to create the index:
>>
>> -rw-r--r-- 1 root root 6396734704 May 25 13:21 dataout.fs
>> -rw-r--r-- 1 root root 173 May 25 12:21 dataout.fs.index
>> -rw-r--r-- 1 root root 229755165 May 25 13:22 dataout.fs.index.index_tmp
>> -rw-r--r-- 1 root root 7 May 25 12:21 dataout.fs.lock
>> -rw-r--r-- 1 root root 70956364 May 25 13:21 dataout.fs.tmp
>>
>> We tried to do fsrecovery, but it says "0 bytes removed during
>> recovery", and the result ends up being the same. We tried it in
>> different machines, with no success. In one of them, after a while
>> trying to create the index, a Python exception was thrown, saying
>> "maximum recursion depth exceeded".
>>
>
> I'm not intimately familiar with the internals of ZODB. If it's doing
> object graph traversals recursively, and if your object graph is very
> deep, you may mitigate this by calling, e.g.
>
> sys.setrecursionlimit(2 * sys.getrecursionlimit())
>
OK, we'll give it a try. Thanks a lot!
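For reference, a minimal sketch of what raising the limit does, using a toy nested structure rather than ZODB itself. The helpers depth, build_chain and depth_with_limit are hypothetical names made up for this illustration, the depths are arbitrary, and the code is written for a current Python rather than the 2.4 interpreter we run in production.

```python
import sys

def depth(node):
    """Recursively measure the nesting depth of a chain of one-element lists."""
    if not node:
        return 0
    return 1 + depth(node[0])

def build_chain(n):
    """Build n nested one-element lists iteratively: [[[...[]...]]]."""
    node = []
    for _ in range(n):
        node = [node]
    return node

def depth_with_limit(node, limit):
    """Run depth() under a temporary recursion limit, then restore the old one."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return depth(node)
    except RuntimeError:
        # "maximum recursion depth exceeded" (RecursionError on Python 3.5+,
        # which is a subclass of RuntimeError).
        return None
    finally:
        sys.setrecursionlimit(old)

chain = build_chain(3000)                # illustrative depth, not a ZODB figure
low = depth_with_limit(chain, 1000)      # limit too small for this structure
high = depth_with_limit(chain, 10000)    # limit comfortably above the depth

print("with limit 1000: %r" % low)
print("with limit 10000: %r" % high)
```

A higher limit only helps if the recursion is finite; if the traversal is looping on corrupt data, the limit will be hit again, and a very large value risks exhausting the interpreter's C stack instead.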
We noticed there was a problem when a pack failed (yesterday, around
12:00 CET):
Traceback (most recent call last):
  File "/opt/python24/lib/python2.4/site-packages/MaKaC/tools/packDB.py", line 24, in ?
    DBMgr.getInstance().pack(days=1)
  File "/opt/python24/lib/python2.4/site-packages/MaKaC/common/db.py", line 135, in pack
    self._storage.pack(days=days)
  File "/opt/python24/lib/python2.4/site-packages/ZEO/ClientStorage.py", line 865, in pack
    return self._server.pack(t, wait)
  File "/opt/python24/lib/python2.4/site-packages/ZEO/ServerStub.py", line 161, in pack
    self.rpc.call('pack', t, wait)
  File "/opt/python24/lib/python2.4/site-packages/ZEO/zrpc/connection.py", line 536, in call
    raise inst # error raised by server
RuntimeError: maximum recursion depth exceeded
We were packing a 15 GB database (which normally shrinks to 6-7 GB after a pack).
So, we'll try increasing the recursion depth limit (maybe the process is
dying because it reaches the limit?).
Silly question: is there any way to access (read-only) the data without
generating the index?
Thanks, once again,
We really appreciate your help.
Regards,
Pedro
--
José Pedro Ferreira
(Software Developer, Indico Project)
IT-UDS-AVC
CERN
Geneva, Switzerland
Office 513-R-042
Tel. +41 22 76 77159