As per the attached message from nav-users, I've added SF#2014809 (UnicodeDecodeError in IP Device Info): http://sourceforge.net/support/tracker.php?aid=2014809
The problem stems from the fact that the integration between Django and Cheetah templates will mix unicode and str objects in NAV's Cheetah templates. Django will fill the template with unicode objects, while legacy NAV code will fill it with strings encoded as UTF-8.
When Cheetah attempts to join these objects into a single string, using the regular string join method, Python will try to decode the str objects into unicode objects using the ASCII codec. This fails miserably as soon as it hits a str object containing international characters.
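To make the failure concrete, here is a minimal sketch of what happens to one UTF-8 encoded value. The implicit_coerce function is a hypothetical stand-in for the coercion Python 2 performs on its own during the join; it is spelled out explicitly so the snippet behaves the same on any Python version.

```python
# A UTF-8 encoded byte string, as legacy NAV code would supply it.
legacy_value = u"Troms\xf8".encode("utf-8")


def implicit_coerce(byte_string):
    """Hypothetical stand-in for Python 2's implicit str -> unicode
    coercion, which uses the ASCII codec by default."""
    return byte_string.decode("ascii")


try:
    implicit_coerce(legacy_value)
    failed = False
except UnicodeDecodeError:
    # The two bytes that encode the non-ASCII character are outside ASCII.
    failed = True

# Decoding explicitly with the codec the data was written in works fine.
decoded = legacy_value.decode("utf-8")
```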
Our local quickfix on navdev seems to have been to place a call to sys.setdefaultencoding('utf-8') in sitecustomize.py, to make sure Python uses the utf-8 codec for these operations instead. This must be done in the sitecustomize module, because the site module will remove the function from the sys module's namespace before the actual Python program starts.
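For reference, the quickfix would look roughly like the sketch below. The hasattr guard is an addition for illustration, not necessarily what is on navdev; it only makes the file a harmless no-op on interpreters where the function is not available.

```python
# sitecustomize.py (sketch of the local quickfix)
import sys

# setdefaultencoding is only reachable while site/sitecustomize are being
# imported on Python 2. The guard is an assumption added for this sketch.
if hasattr(sys, "setdefaultencoding"):
    sys.setdefaultencoding("utf-8")
```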
We don't want to force users to add these lines to their local sitecustomize module, so we need to find a better fix for this. I propose two different solutions, and I've attached a patch for the first one.
My patch alters the _cheetah_render helper function in the nav.django.shortcuts module. This function takes a string/unicode object representing a fully rendered Django template and places this in a Cheetah template variable. My patch encodes the entire unicode object from Django as a UTF-8 string, and places this in the Cheetah template.
Another possible solution is to write a Cheetah filter (an example of this can be found under "Encoding with Unicode" on this page: http://wiki.cheetahtemplate.org/cheetah-recipes.html) which makes sure that all values in a Cheetah template are either unicode objects or utf-8 encoded strings.
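The core of such a filter could be a normalization function along these lines. This is only a sketch of the logic, choosing the "everything becomes unicode" variant; normalize_value is a hypothetical name, and hooking it into Cheetah's filter mechanism is left out here.

```python
# type(u"") is unicode on Python 2 and str on Python 3.
text_type = type(u"")


def normalize_value(val, encoding="utf-8"):
    """Hypothetical helper: return val as a unicode object.

    Byte strings are assumed to hold UTF-8 data, as legacy NAV code
    produces; anything else is converted with the text type.
    """
    if isinstance(val, bytes):
        return val.decode(encoding)
    if isinstance(val, text_type):
        return val
    return text_type(val)
```

With every placeholder value passed through something like this, the join would only ever see one kind of string.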
I've opted for the first fix, as it was smaller and quicker to implement, but if you other Django enthusiasts have comments, I would really like to hear them. I also wouldn't mind a comment from Stein Magnus, who wrote the Django/Cheetah integration in the first place :-)
On Thu, 10 Jul 2008 10:19:46 +0200 Morten Brekkevold morten.brekkevold@uninett.no wrote:
> I've opted for the first fix, as it was smaller and quicker to implement, but if you other Django enthusiasts have comments, I would really like to hear them. I also wouldn't mind a comment from Stein Magnus, who wrote the Django/Cheetah integration in the first place :-)
The first comment came offline, via Magnus, who found a problem with the patch during development on navdev. Apparently, Django doesn't always render a unicode object; sometimes it returns a str object. Calling the encode method of a str object with a 'utf-8' argument causes Python to decode the object as ASCII and then re-encode it as UTF-8, the former step failing when the str object already contains UTF-8 data.
Revised patch attached.
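The idea behind the revision can be sketched as follows. This is not the patch itself; to_utf8_bytes is a hypothetical name, and the sketch only shows the type check that avoids calling encode on data that is already a byte string.

```python
# type(u"") is unicode on Python 2 and str on Python 3.
text_type = type(u"")


def to_utf8_bytes(rendered):
    """Hypothetical helper: encode unicode output as UTF-8, and pass
    byte strings through untouched (assumed to be UTF-8 already)."""
    if isinstance(rendered, text_type):
        return rendered.encode("utf-8")
    return rendered
```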
On Thu, 10 Jul 2008 13:17:14 +0200 Morten Brekkevold morten.brekkevold@uninett.no wrote:
> the encode method of a str object with a 'utf-8' argument causes Python to decode the object as ASCII and then re-encode it as UTF-8, the former step failing when the str object already contains UTF-8 data.
> Revised patch attached.
Well, it appears there are no comments, so I've committed the patch to default and series/3.4.x.