[Zope] Regular Expressions, External methods and my great big headache

Chris Muldrow muldrow@mac.com
Thu, 18 Oct 2001 13:32:21 -0400


Howdy! We've been using a rather lengthy external method to parse out 
pieces of some files on our site. It's been working well, but we just 
upgraded from Zope 2.3 to 2.4.1, and now I'm getting a "maximum 
recursion error" when the script runs. I've narrowed it down to a couple 
of regular expressions--I take them out, and the thing works fine. I 
suspect there's an error with my logic in the building of these re 
compiles, but I'm not sure. (And they've never failed before.) Anyone see 
the error of my ways? (Attached is the function that's failing. It's 
being fed news story-length chunks of text with html code inside. The 
first re is meant to pull out the contents of the body tag, the second 
is meant to grab the contents of any non-<p> tag.)
Thanks! - Chris M.

def getBody(raw):
	""" Generates the document source for the object by parsing out the body tag in the story and then
	stripping out all tags but the paragraph tags. """
	r=re.compile('<body>(.*?)</body>', re.DOTALL|re.IGNORECASE) # this is a problem re
	r2=re.compile('<(?P<tag>[^p][a-zA-Z1-9]*)>.*?</(?P=tag)>', re.DOTALL|re.IGNORECASE) # and this is a problem
	r3=re.compile('<briefhead>', re.IGNORECASE|re.DOTALL)
	r4=re.compile('</briefhead>', re.IGNORECASE|re.DOTALL)
	r5=re.compile('!--briefhead--', re.IGNORECASE|re.DOTALL)
	r6=re.compile('!--briefheadend--', re.IGNORECASE|re.DOTALL)
	raw=r3.sub('!--briefhead--', raw)
	raw=r4.sub('!--briefheadend--', raw)
	i=r.search(raw) # this is the line choking things
	if not i: # no body tag found, so there is nothing to strip
		return ''
	i=r2.sub(' ', i.group(1))
	i=r5.sub('<span class="briefhead">', i)
	i=r6.sub('</span>', i)
	if i:
		return i
	else:
		return ''
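
To make the intent of the two problem expressions concrete, here is a minimal sketch of what they are meant to do on a short, made-up input. The function name strip_non_p and the sample string are hypothetical and only for illustration; the two patterns are copied unchanged from getBody above. This shows the expected result on a small string only, and says nothing about how they behave on story-length text.

```python
import re

# Same two patterns as in getBody, copied unchanged.
body_re = re.compile('<body>(.*?)</body>', re.DOTALL | re.IGNORECASE)
tag_re = re.compile('<(?P<tag>[^p][a-zA-Z1-9]*)>.*?</(?P=tag)>',
                    re.DOTALL | re.IGNORECASE)


def strip_non_p(raw):
    """Hypothetical helper: return the body contents with every
    non-<p> element (tag plus contents) replaced by a space."""
    m = body_re.search(raw)
    if m is None:
        # No body tag at all, so there is nothing to return.
        return ''
    return tag_re.sub(' ', m.group(1))


# Made-up sample input, much shorter than a real story.
sample = ('<html><BODY><h1>Headline</h1>'
          '<p>First paragraph.</p><b>bold</b></BODY></html>')

print(repr(strip_non_p(sample)))
# prints ' <p>First paragraph.</p> '
```

Note that [^p] rejects any tag whose name starts with p, so something like <pre> would also be left in place.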