[Checkins] SVN: zdgbook/trunk/ Modify Object Publishing chapter according to Errata
Baiju M
baiju.m.mail at gmail.com
Wed Feb 18 00:40:17 EST 2009
Log message for revision 96665:
Modify Object Publishing chapter according to Errata
Changed:
D zdgbook/trunk/Errata.stx
U zdgbook/trunk/source/ObjectPublishing.rst
-=-
Deleted: zdgbook/trunk/Errata.stx
===================================================================
--- zdgbook/trunk/Errata.stx 2009-02-18 05:15:27 UTC (rev 96664)
+++ zdgbook/trunk/Errata.stx 2009-02-18 05:40:17 UTC (rev 96665)
@@ -1,104 +0,0 @@
-Changes to Zope Developers Guide, Chapter 2, Object Publishing to support
-Toby's Unicode changes.
-
-* Add the following before the section 'HTTP Responses' under 'Stringifying
-the published object'
-
-Character Encodings for Responses
-
- If the published method returns an object of type 'string', a plain
- 8-bit character string, the publisher will use it directly as the body of the
- response.
-
- Things are different if the published method returns a unicode string,
- because the publisher has to apply some character encoding. The published
- method can choose which character encoding it uses by setting a
- 'Content-Type' response header which includes a 'charset' property
- (setting response headers is explained later in this chapter). A
- common choice of character encoding is UTF-8. To cause the publisher
- to send unicode results as UTF-8 you need to set a
- 'Content-Type' header with the value 'text/html; charset=UTF-8'
-
- If the 'Content-Type' header does not include a charser property (or if this
- header has not been set by the published method) then the publisher will
- choose a default character encoding. Today this default is ISO-8859-1
- (also known as Latin-1) for compatability with old versions of Zope which
- did not include Unicode support. At some time in the future this default
- is likely to change to UTF-8.
-
-* Inside the section 'Argument Conversion' is a list of type conversion
-marshalling tags. Insert the following definition of 'ustring' under 'string'
-
- ustring
- Converts a variable to a Python unicode string.
-
-* and insert this definition at the bottom of the list
-
- ulines, utokens, utext
- like lines, tokens, text, but using unicode strings instead of
- plain strings.
-
-* Insert this section before 'Method Arguments'
-
-Character Encodings for Arguments
-
- The publisher needs to know what character encoding was used by the browser
- to encode form fields into the request. That depends on whether the form
- was submitted using GET or POST (which the publisher can work out for itself)
- and on the character encoding used by the page which contained the form
- (for which the publisher needs your help).
-
- In some cases you need to add a specification of the character encoding
- to each fields type converter. The full details of how this works are
- explained below, however most users do not need to deal with the full
- details:
-
- 1 If your pages all use the UTF-8 character encoding (or at least all the
- pages that contain forms) the browsers will always use UTF-8 for
- arguments. You need to add ':utf8' into all argument type converts. For
- example:
-
- <input type="text" name="name:utf8:ustring">
- <input type="checkbox" name="numbers:list:int:utf8" value="1">
- <input type="checkbox" name="numbers:list:int:utf8" value="1">
-
- % Anonymous User - Apr. 6, 2004 5:56 pm:
- 121
-
- 2 If your pages all use a character encoding which has ASCII as a subset
- (such as Latin-1, UTF-8, etc) then you do not need to specify any
- chatacter encoding for boolean, int, long, float, and date types.
- You can also omit the character encoding type converter from string,
- tokens, lines, and text types if you only need to handle ASCII characters
- in that form field.
-
- Character Encodings for Arguments; The Full Story
-
- If you are not in one of those two easy categories, you first need
- to determine which character encoding will be used by the browser to
- encode the arguments in submitted forms.
-
- 1. Forms submitted using GET, or using POST with
- "application/x-www-form-urlencoded" (the default)
-
- 1. Page uses an encoding of unicode:
- Forms are submitted using UTF8, as required by RFC 2718 2.2.5
-
- 2. Page uses another regional 8 bit encoding:
- Forms are often submitted using the same encoding as the
- page. If you choose to use such an encoding then you should
- also verify how browsers behave.
-
- 2. Forms submitted using "multipart/form-data":
-
- According to HTML 4.01 (section 17.13.4) browsers should state which
- character encoding they are using for each field in a Content-Type
- header, however this is poorly supported. The current crop of
- browsers appear to use the same encoding as the page containing
- the form.
-
- Every field needs that character encoding name appended to is converter.
- The tag parser insists that tags must only use alphanumberic characters
- or an underscore, so you might need to use a short form of the
- encoding name from the Python 'encodings' library package (such
- as utf8 rather than UTF-8).
Modified: zdgbook/trunk/source/ObjectPublishing.rst
===================================================================
--- zdgbook/trunk/source/ObjectPublishing.rst 2009-02-18 05:15:27 UTC (rev 96664)
+++ zdgbook/trunk/source/ObjectPublishing.rst 2009-02-18 05:40:17 UTC (rev 96665)
@@ -233,6 +233,29 @@
After the response method has been determined and called, the
publisher must interpret the results.
+Character Encodings for Responses
+=================================
+
+If the published method returns an object of type 'string', a plain
+8-bit character string, the publisher will use it directly as the
+body of the response.
+
+Things are different if the published method returns a unicode
+string, because the publisher has to apply some character
+encoding. The published method can choose which character encoding it
+uses by setting a 'Content-Type' response header which includes a
+'charset' property (setting response headers is explained later in
+this chapter). A common choice of character encoding is UTF-8. To
+cause the publisher to send unicode results as UTF-8 you need to set
+a 'Content-Type' header with the value 'text/html; charset=UTF-8'.
+
+If the 'Content-Type' header does not include a charset property (or
+if this header has not been set by the published method) then the
+publisher will choose a default character encoding. Today this
+default is ISO-8859-1 (also known as Latin-1) for compatibility with
+old versions of Zope which did not include Unicode support. At some
+time in the future this default is likely to change to UTF-8.
+
HTTP Responses
==============
@@ -682,6 +705,8 @@
- string -- Converts a variable to a Python string.
+- ustring -- Converts a variable to a Python unicode string.
+
- required -- Raises an exception if the variable is not present or
is an empty string.
@@ -709,6 +734,9 @@
endings differently, so this converter makes sure the line endings
are consistent, regardless of how they were encoded by the browser.
+- ulines, utokens, utext -- like lines, tokens, text, but using
+ unicode strings instead of plain strings.
+
If the publisher cannot coerce a request variable into the type
required by the type converter it will raise an error. This is useful
for simple applications, but restricts your ability to tailor error
@@ -728,7 +756,73 @@
In addition to these type converters, the publisher also supports
method and record arguments.
+Character Encodings for Arguments
+---------------------------------
+The publisher needs to know what character encoding was used by the
+browser to encode form fields into the request. That depends on
+whether the form was submitted using GET or POST (which the publisher
+can work out for itself) and on the character encoding used by the
+page which contained the form (for which the publisher needs your
+help).
+
+In some cases you need to add a specification of the character
+encoding to each field's type converter. The full details of how this
+works are explained below, but most users do not need to deal
+with the full details:
+
+1. If your pages all use the UTF-8 character encoding (or at least
+ all the pages that contain forms) the browsers will always use
+ UTF-8 for arguments. You need to add ':utf8' into all argument
+ type converters. For example:
+
+ <input type="text" name="name:utf8:ustring">
+ <input type="checkbox" name="numbers:list:int:utf8" value="1">
+ <input type="checkbox" name="numbers:list:int:utf8" value="2">
+
+2. If your pages all use a character encoding which has ASCII as a
+ subset (such as Latin-1, UTF-8, etc.) then you do not need to
+ specify any character encoding for boolean, int, long, float, and
+ date types. You can also omit the character encoding type
+ converter from string, tokens, lines, and text types if you only
+ need to handle ASCII characters in that form field.
+
+Character Encodings for Arguments: The Full Story
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you are not in one of those two easy categories, you first need to
+determine which character encoding will be used by the browser to
+encode the arguments in submitted forms.
+
+1. Forms submitted using GET, or using POST with
+ "application/x-www-form-urlencoded" (the default)
+
+ 1. Page uses a Unicode encoding: Forms are submitted using
+ UTF-8, as required by RFC 2718 2.2.5.
+
+ 2. Page uses another regional 8-bit encoding: Forms are often
+ submitted using the same encoding as the page. If you choose to
+ use such an encoding then you should also verify how browsers
+ behave.
+
+2. Forms submitted using "multipart/form-data":
+
+ According to HTML 4.01 (section 17.13.4) browsers should state
+ which character encoding they are using for each field in a
+ Content-Type header, but this is poorly supported. The current
+ crop of browsers appears to use the same encoding as the page
+ containing the form.
+
+ Every field needs that character encoding name appended to its
+ converter. The tag parser insists that tags must only use
+ alphanumeric characters or an underscore, so you might need to
+ use a short form of the encoding name from the Python 'encodings'
+ library package (such as utf8 rather than UTF-8).
+
+
Method Arguments
----------------
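
Note for readers of this checkin: the encoding rules added above can be
illustrated with a small standalone sketch. This is not the Zope
publisher's code. The function names (response_charset, encode_body,
split_field_name) and the CONVERTERS set are hypothetical, and the sketch
uses Python 3, where bytes and str stand in for the plain 'string' and
unicode types the chapter describes. It assumes only what the text states:
a charset in 'Content-Type' selects the response encoding, ISO-8859-1 is
the fallback, and an encoding name may appear among a field's converter
tags.

```python
import codecs
from email.message import Message

# Fallback named in the chapter text for responses without a charset.
DEFAULT_CHARSET = "iso-8859-1"

# Converter names mentioned in the chapter; an illustrative subset only.
CONVERTERS = {
    "string", "ustring", "int", "long", "float", "boolean", "date",
    "list", "lines", "tokens", "text", "ulines", "utokens", "utext",
    "required",
}


def response_charset(content_type=None):
    """Return the charset named in a Content-Type value, or the default."""
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is set.
    return msg.get_content_charset() or DEFAULT_CHARSET


def encode_body(body, content_type=None):
    """Pass byte strings through; encode text with the chosen charset."""
    if isinstance(body, bytes):
        return body
    return body.encode(response_charset(content_type))


def split_field_name(field_name):
    """Split 'name:tag:tag' into (name, encoding, converters).

    A tag that is not a known converter is treated as an encoding name
    and checked against Python's codec registry.
    """
    name, *tags = field_name.split(":")
    encoding = None
    converters = []
    for tag in tags:
        if tag in CONVERTERS:
            converters.append(tag)
        else:
            codecs.lookup(tag)  # raises LookupError for unknown names
            encoding = tag
    return name, encoding, converters
```

For example, split_field_name("name:utf8:ustring") separates the field
name, the 'utf8' encoding tag, and the 'ustring' converter, which matches
the form markup shown in the added section.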