[Checkins] SVN: zdgbook/trunk/ Modify Object Publishing chapter according to Errata

Wed Feb 18 00:40:17 EST 2009

Log message for revision 96665:
  Modify Object Publishing chapter according to Errata
  

Changed:
  D   zdgbook/trunk/Errata.stx
  U   zdgbook/trunk/source/ObjectPublishing.rst

-=-
Deleted: zdgbook/trunk/Errata.stx
===================================================================

--- zdgbook/trunk/Errata.stx	2009-02-18 05:15:27 UTC (rev 96664)
+++ zdgbook/trunk/Errata.stx	2009-02-18 05:40:17 UTC (rev 96665)
@@ -1,104 +0,0 @@
-Changes to Zope Developers Guide, Chapter 2, Object Publishing to support
-Toby's Unicode changes.
-
-* Add the following before the section 'HTTP Responses' under 'Stringifying 
-the published object'
-
-Character Encodings for Responses
-
- If the published method returns an object of type 'string', a plain
- 8-bit character string, the publisher will use it directly as the body of the
- response. 
-
- Things are different if the published method returns a unicode string,
- because the publisher has to apply some character encoding. The published
- method can choose which character encoding it uses by setting a
- 'Content-Type' response header which includes a 'charset' property
- (setting response headers is explained later in this chapter). A
- common choice of character encoding is UTF-8. To cause the publisher
- to send unicode results as UTF-8 you need to set a
- 'Content-Type' header with the value 'text/html; charset=UTF-8'.
-
- If the 'Content-Type' header does not include a charset property (or if this
- header has not been set by the published method) then the publisher will
- choose a default character encoding. Today this default is ISO-8859-1
- (also known as Latin-1) for compatibility with old versions of Zope which
- did not include Unicode support. At some time in the future this default
- is likely to change to UTF-8.
-
-* Inside the section 'Argument Conversion' is a list of type conversion 
-marshalling tags. Insert the following definition of 'ustring' under 'string'
-
- ustring
-  Converts a variable to a Python unicode string.
-
-* and insert this definition at the bottom of the list
-
- ulines, utokens, utext
-  like lines, tokens, text, but using unicode strings instead of
-  plain strings.
-
-* Insert this section before 'Method Arguments'
-
-Character Encodings for Arguments
-
- The publisher needs to know what character encoding was used by the browser
- to encode form fields into the request. That depends on whether the form
- was submitted using GET or POST (which the publisher can work out for itself)
- and on the character encoding used by the page which contained the form
- (for which the publisher needs your help).
-
- In some cases you need to add a specification of the character encoding
- to each field's type converter. The full details of how this works are
- explained below, but most users do not need to deal with the full
- details:
-
- 1 If your pages all use the UTF-8 character encoding (or at least all the
-   pages that contain forms) the browsers will always use UTF-8 for
-   arguments. You need to add ':utf8' into all argument type converters. For
-   example:
-
-   <input type="text" name="name:utf8:ustring">
-   <input type="checkbox" name="numbers:list:int:utf8" value="1">
-   <input type="checkbox" name="numbers:list:int:utf8" value="1">
-
- 2 If your pages all use a character encoding which has ASCII as a subset
-   (such as Latin-1, UTF-8, etc) then you do not need to specify any
-   character encoding for boolean, int, long, float, and date types.
-   You can also omit the character encoding type converter from string,
-   tokens, lines, and text types if you only need to handle ASCII characters
-   in that form field.
-
-  Character Encodings for Arguments; The Full Story
-
-   If you are not in one of those two easy categories, you first need
-   to determine which character encoding will be used by the browser to
-   encode the arguments in submitted forms.
-
-   1. Forms submitted using GET, or using POST with 
-      "application/x-www-form-urlencoded" (the default) 
-
-      1. Page uses a Unicode encoding:
-         Forms are submitted using UTF-8, as required by RFC 2718 2.2.5.
-
-      2. Page uses another regional 8 bit encoding:
-         Forms are often submitted using the same encoding as the
-         page. If you choose to use such an encoding then you should
-         also verify how browsers behave.
-
-   2. Forms submitted using "multipart/form-data": 
-
-      According to HTML 4.01 (section 17.13.4) browsers should state which
-      character encoding they are using for each field in a Content-Type
-      header, however this is poorly supported. The current crop of
-      browsers appear to use the same encoding as the page containing
-      the form. 
-
-   Every field needs that character encoding name appended to its converter.
-   The tag parser insists that tags must only use alphanumeric characters
-   or an underscore, so you might need to use a short form of the
-   encoding name from the Python 'encodings' library package (such
-   as utf8 rather than UTF-8).
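
The rules in the errata text above can be illustrated with a minimal
Python 3 sketch. This is not Zope's publisher code: it only assumes the
behaviour the text describes (8-bit strings pass through, unicode strings
are encoded with the charset from 'Content-Type' or ISO-8859-1 by default,
and an encoding tag such as 'utf8' sits among the converter tags of a
field name). In Python 3 the "8-bit string" corresponds to bytes and the
"unicode string" to str. The names response_charset, encode_body,
split_field_name and KNOWN_CONVERTERS are hypothetical and chosen for
illustration only.

```python
import codecs
import re

# Default described in the text above; a real deployment may differ.
DEFAULT_CHARSET = "iso-8859-1"

# Converter tags named in the text; any other tag is treated as an encoding name.
KNOWN_CONVERTERS = {
    "string", "ustring", "required", "boolean", "int", "long", "float",
    "date", "list", "tokens", "lines", "text", "utokens", "ulines", "utext",
}

_CHARSET_RE = re.compile(r'charset\s*=\s*"?([A-Za-z0-9_.:-]+)"?', re.IGNORECASE)


def response_charset(content_type=None):
    """Return the charset named in a Content-Type value, or the default."""
    if content_type:
        match = _CHARSET_RE.search(content_type)
        if match:
            return match.group(1).lower()
    return DEFAULT_CHARSET


def encode_body(body, content_type=None):
    """Pass bytes through unchanged; encode text with the selected charset."""
    if isinstance(body, bytes):
        return body
    return body.encode(response_charset(content_type))


def split_field_name(name):
    """Split 'numbers:list:int:utf8' into (name, converters, encoding)."""
    base, *tags = name.split(":")
    converters, encoding = [], None
    for tag in tags:
        if tag in KNOWN_CONVERTERS:
            converters.append(tag)
        else:
            codecs.lookup(tag)  # raises LookupError if Python has no such codec
            encoding = tag
    return base, converters, encoding
```

With these definitions, split_field_name("name:utf8:ustring") yields
("name", ["ustring"], "utf8"), and encode_body applied to a str with no
charset in the header falls back to Latin-1, which is why non-Latin-1
text needs an explicit 'charset' in this sketch.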

Modified: zdgbook/trunk/source/ObjectPublishing.rst
===================================================================
--- zdgbook/trunk/source/ObjectPublishing.rst	2009-02-18 05:15:27 UTC (rev 96664)
+++ zdgbook/trunk/source/ObjectPublishing.rst	2009-02-18 05:40:17 UTC (rev 96665)
@@ -233,6 +233,29 @@
 After the response method has been determined and called, the
 publisher must interpret the results.
 
+Character Encodings for Responses
+=================================
+
+If the published method returns an object of type 'string', a plain
+8-bit character string, the publisher will use it directly as the
+body of the response.
+
+Things are different if the published method returns a unicode
+string, because the publisher has to apply some character
+encoding. The published method can choose which character encoding it
+uses by setting a 'Content-Type' response header which includes a
+'charset' property (setting response headers is explained later in
+this chapter). A common choice of character encoding is UTF-8. To
+cause the publisher to send unicode results as UTF-8 you need to set
+a 'Content-Type' header with the value 'text/html; charset=UTF-8'.
+
+If the 'Content-Type' header does not include a charset property (or
+if this header has not been set by the published method) then the
+publisher will choose a default character encoding. Today this
+default is ISO-8859-1 (also known as Latin-1) for compatibility with
+old versions of Zope which did not include Unicode support. At some
+time in the future this default is likely to change to UTF-8.
+
 HTTP Responses
 ==============
 
@@ -682,6 +705,8 @@
 
 - string -- Converts a variable to a Python string.
 
+- ustring -- Converts a variable to a Python unicode string.
+
 - required -- Raises an exception if the variable is not present or
   is an empty string.
 
@@ -709,6 +734,9 @@
   endings differently, so this converter makes sure the line endings
   are consistent, regardless of how they were encoded by the browser.
 
+- ulines, utokens, utext -- like lines, tokens, text, but using
+  unicode strings instead of plain strings.
+
 If the publisher cannot coerce a request variable into the type
 required by the type converter it will raise an error. This is useful
 for simple applications, but restricts your ability to tailor error
@@ -728,7 +756,73 @@
 In addition to these type converters, the publisher also supports
 method and record arguments.
 
+Character Encodings for Arguments
+---------------------------------
 
+The publisher needs to know what character encoding was used by the
+browser to encode form fields into the request. That depends on
+whether the form was submitted using GET or POST (which the publisher
+can work out for itself) and on the character encoding used by the
+page which contained the form (for which the publisher needs your
+help).
+
+In some cases you need to add a specification of the character
+encoding to each field's type converter. The full details of how this
+works are explained below, but most users do not need to deal
+with the full details:
+
+1. If your pages all use the UTF-8 character encoding (or at least
+   all the pages that contain forms) the browsers will always use
+   UTF-8 for arguments. You need to add ':utf8' into all argument
+   type converters. For example::
+
+     <input type="text" name="name:utf8:ustring">
+     <input type="checkbox" name="numbers:list:int:utf8" value="1">
+     <input type="checkbox" name="numbers:list:int:utf8" value="1">
+
+2. If your pages all use a character encoding which has ASCII as a
+   subset (such as Latin-1, UTF-8, etc) then you do not need to
+   specify any character encoding for boolean, int, long, float, and
+   date types.  You can also omit the character encoding type
+   converter from string, tokens, lines, and text types if you only
+   need to handle ASCII characters in that form field.
+
+Character Encodings for Arguments; The Full Story
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you are not in one of those two easy categories, you first need to
+determine which character encoding will be used by the browser to
+encode the arguments in submitted forms.
+
+1. Forms submitted using GET, or using POST with 
+   "application/x-www-form-urlencoded" (the default) 
+
+   1. Page uses a Unicode encoding: Forms are submitted using
+      UTF-8, as required by RFC 2718 2.2.5.
+
+   2. Page uses another regional 8 bit encoding: Forms are often
+      submitted using the same encoding as the page. If you choose to
+      use such an encoding then you should also verify how browsers
+      behave.
+
+2. Forms submitted using "multipart/form-data":
+
+   According to HTML 4.01 (section 17.13.4) browsers should state
+   which character encoding they are using for each field in a
+   Content-Type header; however, this is poorly supported. The current
+   crop of browsers appear to use the same encoding as the page
+   containing the form.
+
+   Every field needs that character encoding name appended to its
+   converter.  The tag parser insists that tags must only use
+   alphanumeric characters or an underscore, so you might need to
+   use a short form of the encoding name from the Python 'encodings'
+   library package (such as utf8 rather than UTF-8).
+
+
 Method Arguments
 ----------------