[Zope3-Users] Re: UnicodeEncodeErrors from zope/app/maildir.py
r.ritz at biologie.hu-berlin.de
Wed Nov 29 09:50:25 EST 2006
Martijn Pieters wrote:
> On 11/29/06, Raphael Ritz
> <r.ritz at biologie.hu-berlin.de> wrote:
>> The mail's body or payload, on the other hand, can have
>> any encoding.
> Nonsense. From RFC 2822, section 2.1:
> At the most basic level, a message is a series of characters. A
> message that is conformant with this standard is comprised of
> characters with values in the range 1 through 127 and interpreted as
> US-ASCII characters [ASCII]. For brevity, this document sometimes
> refers to this range of characters as simply "US-ASCII characters".
> Note: This standard specifies that messages are made up of characters
> in the US-ASCII range of 1 through 127. There are other documents,
> specifically the MIME document series [RFC2045, RFC2046, RFC2047,
> RFC2048, RFC2049], that extend this standard to allow for values
> outside of that range. Discussion of those mechanisms is not within
> the scope of this standard.
> In other words, the MIME RFCs tell you how to encode your message in
> such a way that the message body, the characters that go across the
> wire, fall within the ASCII range. MIME uses a transfer encoding like
> quoted-printable to ensure this.
Looks like I am (we are?) confusing things here and admittedly I've been
too sloppy in my wording - sorry for that.
You are of course right with respect to what goes over the wire.
I was looking at this from a user's point of view, where the
MIME RFCs quoted above are what you are interested in in the end.
What I was trying to get at rather is the fact that encoding
is handled differently for header versus body but anyway ...
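To make the header-versus-body distinction concrete, here is a minimal sketch using the standard library's email package (Python 3 syntax; the sample text and the charset are my own picks for illustration, not anything from the OP's setup). The header gets an RFC 2047 encoded word, the body gets a Content-Transfer-Encoding, and in both cases what ends up on the wire is plain ASCII:

```python
from email.header import Header
from email.mime.text import MIMEText

# Sample text and charset are illustrative assumptions, not from the thread.
text = "Grüße aus Berlin"
charset = "iso-8859-1"

# Body: MIMEText records the charset and picks a transfer encoding for it.
msg = MIMEText(text, "plain", charset)

# Header: non-ASCII text is wrapped in an RFC 2047 encoded word.
msg["Subject"] = Header("Grüße", charset)

# Serialized form, i.e. what a mail transport would be handed.
wire = msg.as_string()
wire.encode("ascii")  # would raise UnicodeEncodeError if non-ASCII slipped through

# Reading it back: undo the transfer encoding, then decode with the charset.
decoded_body = msg.get_payload(decode=True).decode(charset)
```

Printing `wire` shows the two mechanisms side by side: an encoded word in the Subject line and a transfer-encoded body below the headers.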
After all, it's one purpose of Python's email module to hide
all this away from you and I still think that the OP's problem has
nothing to do with email per se but with mixing string types
as I tried to point to in my other posting in this thread.
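As a rough illustration of what I mean by mixing string types, here is a sketch in Python 3 syntax. I have not checked what zope/app/maildir.py actually does, so the two helper functions below are hypothetical; the first one only imitates the implicit ASCII conversion that Python 2 applies when unicode text reaches a byte-oriented file:

```python
import io


def write_naively(stream, message_text):
    # Hypothetical helper: imitates an implicit ASCII conversion of text.
    stream.write(message_text.encode("ascii"))


def write_explicitly(stream, message_text, charset="utf-8"):
    # Hypothetical helper: the caller decides on the encoding up front.
    stream.write(message_text.encode(charset))


body = "Sch\u00f6ne Gr\u00fc\u00dfe"  # text with umlauts

try:
    write_naively(io.BytesIO(), body)
    failed = False
except UnicodeEncodeError:
    # Non-ASCII characters cannot be converted implicitly.
    failed = True

buffer = io.BytesIO()
write_explicitly(buffer, body)  # succeeds because the encoding is stated
```

The point is only that the error comes from the text-to-bytes conversion, not from anything email-specific; encoding explicitly (or letting the email package build the message) avoids it.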
PS: ... I just sometimes get tired of broken email clients
sending messages in wrong, unspecified, or weird encodings, which makes
it really hard to process them properly, e.g., when piping emails
into Zope sites and creating content from them - ever tried this
with a German document containing umlauts sent by Outlook Express? :-(