[Checkins] SVN: lovely.tal/trunk/ - always strip out script-tags

Juergen Kartnaller juergen at kartnaller.at
Tue Feb 3 03:06:16 EST 2009


Log message for revision 96018:
   - always strip out script-tags 
   - added "allow-scripts" for explicit enabling of script-tags
  

Changed:
  U   lovely.tal/trunk/CHANGES.txt
  U   lovely.tal/trunk/src/lovely/tal/README.txt
  U   lovely.tal/trunk/src/lovely/tal/sample.pt
  U   lovely.tal/trunk/src/lovely/tal/textformatter.py

-=-
Modified: lovely.tal/trunk/CHANGES.txt
===================================================================
--- lovely.tal/trunk/CHANGES.txt	2009-02-03 06:40:58 UTC (rev 96017)
+++ lovely.tal/trunk/CHANGES.txt	2009-02-03 08:06:16 UTC (rev 96018)
@@ -2,6 +2,12 @@
 lovely.tal
 ==========
 
+after
+=====
+
+ - always strip out script-tags 
+ - added "allow-scripts" for explicit enabling of script-tags
+
 2008/11/12 0.5.0a1:
 ===================
 

Modified: lovely.tal/trunk/src/lovely/tal/README.txt
===================================================================
--- lovely.tal/trunk/src/lovely/tal/README.txt	2009-02-03 06:40:58 UTC (rev 96017)
+++ lovely.tal/trunk/src/lovely/tal/README.txt	2009-02-03 08:06:16 UTC (rev 96018)
@@ -247,6 +247,78 @@
   'ein superlangerstring…'
 
 
+
+Option 'allow-scripts'
+======================
+
+The option 'allow-scripts' must be set explicitly to keep script tags in the output::
+
+  >>> context = Context({'allow-all':True, 'allow-scripts':True})
+  >>> html = """<p>this is html containing a 
+  ...             <script type="text/javascript">
+  ...                alert("i'm not allowed');
+  ...             </script> script.
+  ...           </p>"""
+  >>> print tf._doFormat(html, context)
+  <p>this is html containing a 
+              <script type="text/javascript">
+                 alert("i'm not allowed');
+              </script> script.
+            </p>
+
+
+If it is not set, all scripts are stripped out even when 'allow-all' is enabled::
+
+  >>> context = Context({'allow-all':True})
+  >>> html = """<p>this is html containing a 
+  ...             <script type="text/javascript">
+  ...                alert("i'm not allowed');
+  ...             </script> script.
+  ...           </p>"""
+  >>> print tf._doFormat(html, context)
+  <p>this is html containing a 
+             script.
+          </p>
+
+Uppercase tag names and extra whitespace inside the tags are handled as well::
+
+  >>> html = """<p>this is html containing a 
+  ...             <   SCRIPT
+  ...                type="text/javascript">
+  ...                alert("i'm not allowed');
+  ...             <  /SCRIPT > script.
+  ...           </p>"""
+  >>> print tf._doFormat(html, context)
+  <p>this is html containing a 
+               script.
+            </p>
+
+Escaped script tags are stripped too::
+
+  >>> html = """<p>this is html containing a 
+  ...             &lt;script type="text/javascript"&gt;
+  ...                alert("i'm not allowed');
+  ...             &lt;/SCRIPT&gt; script.
+  ...           </p>"""
+  >>> print tf._doFormat(html, context)
+  <p>this is html containing a 
+               script.
+            </p>
+
+
+Escaped script tags with extra whitespace and mixed case are stripped as well::
+
+  >>> html = """<p>this is html containing a 
+  ...             &lt;SCRIPT 
+  ...                type="text/javascript"&gt;
+  ...                alert("i'm not allowed');
+  ...             &lt; /SCRIPT &gt; script.
+  ...           </p>"""
+  >>> print tf._doFormat(html, context)
+  <p>this is html containing a 
+               script.
+            </p>
+
 Option 'urlparse'
 =================
 

Modified: lovely.tal/trunk/src/lovely/tal/sample.pt
===================================================================
--- lovely.tal/trunk/src/lovely/tal/sample.pt	2009-02-03 06:40:58 UTC (rev 96017)
+++ lovely.tal/trunk/src/lovely/tal/sample.pt	2009-02-03 08:06:16 UTC (rev 96018)
@@ -11,9 +11,12 @@
                    contain no html-tags, therefor the < and > are replaced 
                    by &lt;, &gt;
                    
-   option allow-all: allow all html-tags in the string
+   option allow-all: allow all html-tags in the string (excluding scripts)
    					 e. g. "allow-all: 'True'"
    
+   option allow-scripts: explicitly allow scripts in the string
+             e. g. "allow-scripts: python:True"
+   
    option break-string: force the string to break after a given number of characters
    						e.g. "break-string python:25" breaks the string after 
    						a sequence of 25 characters not containing a linebreak

Modified: lovely.tal/trunk/src/lovely/tal/textformatter.py
===================================================================
--- lovely.tal/trunk/src/lovely/tal/textformatter.py	2009-02-03 06:40:58 UTC (rev 96017)
+++ lovely.tal/trunk/src/lovely/tal/textformatter.py	2009-02-03 08:06:16 UTC (rev 96018)
@@ -17,6 +17,8 @@
 import re
 from zope.tales.expressions import PathExpr
 
+NO_SCRIPTS = re.compile(r'<\s*script[^>]+>(.*)<\s*\/script\s*>', re.I | re.DOTALL)
+NO_ESCAPED_SCRIPTS = re.compile(r'&lt;\s*script[^&]+.*&lt;\s*\/script\s*&gt;', re.I | re.DOTALL)
 
 class TextFormatter(PathExpr):
 
@@ -31,6 +33,9 @@
 
         allowAll = ('allow-all' in context.vars)
 
+        if 'allow-scripts' not in context.vars:
+            rendered = self._stripScripts(rendered, context)
+
         if 'clear-html' in context.vars:
             rendered = self._clearHTML(rendered, context)
 
@@ -160,6 +165,11 @@
             return rendered + attach
         return rendered
 
+    def _stripScripts(self, rendered, context):
+        rendered = re.sub(NO_SCRIPTS, '', rendered)
+        rendered = re.sub(NO_ESCAPED_SCRIPTS, '', rendered)
+        return rendered
+
     def _urlparse(self, rendered, context):
         #searches for urls coded with www. or http:
 
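
For readers who want to try the stripping step outside of Zope, here is a minimal standalone sketch in Python 3 syntax. It is not part of the commit. The first two patterns are copied from textformatter.py above (written as raw strings), and `strip_scripts` is a hypothetical stand-in for `_stripScripts`. `LAZY_NO_SCRIPTS` and `strip_scripts_lazy` are assumptions added only for comparison: they show how a non-greedy body and an optional attribute part would change the result. Neither variant is a complete HTML sanitizer.

```python
import re

# Patterns as they appear in the commit, written as raw strings.
NO_SCRIPTS = re.compile(
    r'<\s*script[^>]+>(.*)<\s*\/script\s*>', re.I | re.DOTALL)
NO_ESCAPED_SCRIPTS = re.compile(
    r'&lt;\s*script[^&]+.*&lt;\s*\/script\s*&gt;', re.I | re.DOTALL)


def strip_scripts(rendered):
    """Hypothetical standalone version of TextFormatter._stripScripts."""
    rendered = NO_SCRIPTS.sub('', rendered)
    rendered = NO_ESCAPED_SCRIPTS.sub('', rendered)
    return rendered


# Assumption, not in the commit: '[^>]*' also accepts a bare <script>
# tag and '.*?' stops at the first closing tag instead of the last one.
LAZY_NO_SCRIPTS = re.compile(
    r'<\s*script[^>]*>.*?<\s*/script\s*>', re.I | re.DOTALL)


def strip_scripts_lazy(rendered):
    """Comparison variant using the non-greedy pattern above."""
    return LAZY_NO_SCRIPTS.sub('', rendered)


two = ('<p><script type="a">x()</script>keep me'
       '<script type="a">y()</script></p>')
bare = '<p><script>x()</script></p>'

# The greedy '(.*)' spans from the first opening tag to the last closing
# tag, so the text between the two scripts is removed as well.
print(strip_scripts(two))        # <p></p>
# '[^>]+' needs at least one character after 'script', so a tag without
# attributes is left untouched by the committed pattern.
print(strip_scripts(bare))       # <p><script>x()</script></p>

print(strip_scripts_lazy(two))   # <p>keep me</p>
print(strip_scripts_lazy(bare))  # <p></p>
```

The doctests added to README.txt in this revision each contain a single script element with a type attribute, so they do not exercise either of the two cases shown here.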


