[Zope] easy regular expression for URL fixup

Thomas B. Passin tpassin@mitretek.org
Wed, 6 Mar 2002 15:16:21 -0500


[Ed Colmar]
>
> I'm working up a quick re to give me the folder above a webpage...  For
> instance:
>
> ###  I want: http://www.the.net/bigfolder/ ###
> import re
> url = "http://www.the.net/bigfolder/somepage.html"
> htmlfile = re.compile("/\w*\.html")
> htmlfile.match(href_url)
> if htmlfile:
>     folder_url = htmlfile.sub(href_url, "/")
>
>
> For some reason I cannot get my re to do this right...
>

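Double each backslash (or write the pattern as a raw string, r"..."),
otherwise Python may treat them as string escapes before the re module ever
sees them.  This isn't new behavior, though; it has been around for years.
Also, htmlfile.match returns a match object (or None if nothing matched), so
you really want something like: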

m = htmlfile.match(href_url)
if m:
    .....
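
Putting it together, here is a rough sketch of how the whole thing might
look.  Two things go beyond the advice above and are worth checking
against the re docs: match() only tries the pattern at the start of the
string, so search() is used instead, and sub() takes the replacement
first and the string second.  The trailing $ and the use of href_url for
the sample URL are my own assumptions, not something from the original
snippet.

```python
import re

# Sample URL from the question; the name href_url is an assumption,
# since the original snippet defines `url` but then uses `href_url`.
href_url = "http://www.the.net/bigfolder/somepage.html"

# Raw string, so the backslashes reach the re module untouched.
# The trailing $ (an assumption) ties the pattern to the end of the URL.
htmlfile = re.compile(r"/\w*\.html$")

# search() scans the whole string; match() only tries at position 0.
m = htmlfile.search(href_url)
if m:
    # sub() takes the replacement first, then the string to work on.
    folder_url = htmlfile.sub("/", href_url)
else:
    # No trailing .html page found: leave the URL alone.
    folder_url = href_url

print(folder_url)
```

With the sample URL this should leave folder_url pointing at the folder,
i.e. everything up to and including the last slash.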

Cheers,

Tom P