[Zope3-dev] Re: [Zope3-Users] PsycopgDA problem

Dmitry Vasiliev lists at hlabs.spb.ru
Thu Mar 30 08:57:29 EST 2006


Stuart Bishop wrote:
> I think I know what might have been going on. The earlier patch neglected to
> change getEncoding() to always return UTF-8. So the code in zope.app.rdb
> that encodes the queries was encoding them to whatever getEncoding()
> happened to return rather than UTF8.

Ugh, I guess I've just found the real problem. I've used PostgreSQL 7.4.7 and 
the attached simple script for testing:

1. The client encoding works just fine for a win1251-encoded database:

$ psql -l
          List of databases
     Name     |  Owner   | Encoding
-------------+----------+-----------
  foo         | foo      | WIN

$ python pgtest.py
[('WIN',)]
('\xc0',)
[('unicode',)]
('\xd0\x90',)
('\xd0\x90',)

2. For a database with the default encoding (created by 'createdb -U foo foo') 
the client encoding doesn't matter. It seems that in this case PostgreSQL treats 
all characters as raw data:

$ psql -l
          List of databases
     Name     |  Owner   | Encoding
-------------+----------+-----------
  foo         | foo      | SQL_ASCII

$ python pgtest.py
[('SQL_ASCII',)]
('\xc0',)
[('unicode',)]
('\xc0',)
('\xd0\x90',)

I tried the adapter with "SET client_encoding" against a database with the 
default encoding and, of course, always got UnicodeDecodeError for 
win1251-encoded data.
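
The byte values in the output above can be reproduced without a database. 
What follows is only a minimal sketch, written in Python 3 syntax (the attached 
script is Python 2) and using nothing but the standard codecs. The helper name 
try_utf8 is made up for illustration and is not part of psycopg or zope.app.rdb:

```python
# Cyrillic capital letter A, the character used in the attached test script.
letter = "\N{CYRILLIC CAPITAL LETTER A}"

# The same character as raw bytes in the two encodings seen in the output.
as_cp1251 = letter.encode("cp1251")  # single byte
as_utf8 = letter.encode("utf-8")     # two bytes


def try_utf8(raw):
    """Decode raw bytes as UTF-8; return None if they are not valid UTF-8.

    Hypothetical helper, only here to show where the UnicodeDecodeError
    comes from.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
```

If the server hands back the cp1251 byte unchanged, as the SQL_ASCII database 
appears to do, decoding it as UTF-8 fails, while the UTF-8 bytes decode fine.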

So I conclude that the only portable way to deal with the encodings is an input 
form with a text (or choice) field.
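
As a rough sketch of what that could mean in practice (Python 3 syntax; the 
function name decode_row and its "encoding" parameter are assumptions for 
illustration, not the actual PsycopgDA or zope.app.rdb API), the adapter would 
decode result columns with whatever encoding the user entered in the form:

```python
def decode_row(row, encoding):
    """Decode every bytes column of a result row with a user-chosen encoding.

    Hypothetical helper: 'encoding' stands for the value taken from the
    text (or choice) field; non-bytes columns are passed through unchanged.
    """
    return tuple(
        col.decode(encoding) if isinstance(col, bytes) else col
        for col in row
    )
```

For example, decode_row((b"\xc0",), "cp1251") gives the Cyrillic letter for the 
SQL_ASCII database above, where the server reports nothing useful about the 
stored bytes.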

-- 
Dmitry Vasiliev (dima at hlabs.spb.ru)
     http://hlabs.spb.ru
-------------- next part --------------
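# Python 2 test script: insert the same character in two encodings and
# print the raw rows that come back.
import psycopg

connection = psycopg.connect("dbname=foo user=foo")
cursor = connection.cursor()

# Show the client encoding the connection starts with.
cursor.execute("SHOW client_encoding")
print cursor.fetchall()

cursor.execute("DELETE FROM foo")

# Insert the letter as cp1251 (win1251) bytes.
cursor.execute(u"insert into foo values ('\N{CYRILLIC CAPITAL LETTER A}')".encode('cp1251'))

cursor.execute("SELECT * FROM foo")
for row in cursor.fetchall():
    print row

# Switch the client encoding to UNICODE (UTF-8) and show it again.
cursor.execute("SET client_encoding TO UNICODE")
cursor.execute("SHOW client_encoding")
print cursor.fetchall()

# Insert the same letter as UTF-8 bytes.
cursor.execute(u"insert into foo values ('\N{CYRILLIC CAPITAL LETTER A}')".encode('utf-8'))

# Both rows are now read back under the UNICODE client encoding.
cursor.execute("SELECT * FROM foo")
for row in cursor.fetchall():
    print row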

