python打印unicode字符问题_在python中打印unicode字符串，无论环境如何-优快云博客

本文提供了一个通用解决方案，用于从Python脚本中跨平台地打印Unicode字符串。该方案确保了在Python 2.7和3.x版本下都能正确显示Unicode字符，并且能够适应不同终端设置及环境变量。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

I'm trying to find a generic solution to print unicode strings from a python script.

The requirements are that it must run in both python 2.7 and 3.x, on any platform, and with any terminal settings and environment variables (e.g. LANG=C or LANG=en_US.UTF-8).

The python print function automatically tries to encode to the terminal encoding when printing, but if the terminal encoding is ascii it fails.

For example, the following works when the environment "LANG=enUS.UTF-8":

x = u'\xea'

print(x)

But it fails in python 2.7 when "LANG=C":

UnicodeEncodeError: 'ascii' codec can't encode character u'\xea' in position 0: ordinal not in range(128)

The following works regardless of the LANG setting, but would not properly show unicode characters if the terminal was using a different unicode encoding:

print(x.encode('utf-8'))

The desired behavior would be to always show unicode in the terminal if it is possible and show some encoding if the terminal does not support unicode. For example, the output would be UTF-8 encoded if the terminal only supported ascii. Basically, the goal is to do the same thing as the python print function when it works, but in the cases where the print function fails, use some default encoding.

解决方案

You can handle the LANG=C case by telling sys.stdout to default to UTF-8 in cases when it would otherwise default to ASCII.

import sys, codecs

if sys.stdout.encoding is None or sys.stdout.encoding == 'ANSI_X3.4-1968':

utf8_writer = codecs.getwriter('UTF-8')

if sys.version_info.major < 3:

sys.stdout = utf8_writer(sys.stdout, errors='replace')

else:

sys.stdout = utf8_writer(sys.stdout.buffer, errors='replace')

print(u'\N{snowman}')

The above snippet fulfills your requirements: it works in Python 2.7 and 3.4, and it doesn't break when LANG is in a non-UTF-8 setting such as C.

It is not a new technique, but it's surprisingly hard to find in the documentation. As presented above, it actually respects non-UTF-8 settings such as ISO 8859-*. It only defaults to UTF-8 if Python would have bogusly defaulted to ASCII, breaking the application.