Don't Underestimate L10N & I18N

Latest recommended article published on 2024-12-14 08:40:45

License: CC 4.0 BY-SA

Tags: #solaris #object #string #reference #windows #exception

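This article describes the concrete steps and technical difficulties of introducing Unicode into a real-time distributed client/server system that has to run in a non-English environment. It covers the key points: converting Oracle column types, using the OCI interface, supporting wide character sets, and adapting the CORBA communication layer.

The technique described here is a compromise. Oracle's own technical documentation describes a pure UNICODE solution, but that is outside the scope of this article. In a later installment I will show how to apply Oracle's pure UNICODE solution to a real system.

Thanks to the IA students, and to 歌神, James, 园园...

Technology is not everything, but without technology you can do nothing.

-- Searoc Jiang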

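"If you haven't spent at least a month working on the same program (working 16 hours a day, dreaming about it during the remaining 8 hours of restless sleep, working several nights straight through trying to eliminate that 'one last bug' from the program), then you haven't really written a complicated computer program. And you may not have the sense that there is something exhilarating about programming."

-- Edward Yourdon, as quoted in Code Complete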

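I wonder whether every programmer has felt this. I have been building software systems for six years, and looking back, this kind of "exhilaration" does not come often; at least it has been rarer in recent years. Fortunately, this month I got to taste once more the excitement and sense of achievement of a young engineer cracking a hard problem.

Recently I worked on the Internationalization and Localization (L10N&I18N) of an MRT project. The owner wanted a software system that had never supported anything but English to be deployed in Taiwan. Having done outsourcing work in China before, I never had much fondness for software localization, mainly because it offers little technical challenge. In addition, the system was ultimately headed for Kaohsiung, and my contempt for the corruption scandal there was another reason. So I was not particularly interested in this project.

Still, a few technical points are worth summarizing, and I learned a great deal while working through them.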

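First, the overall architecture: it is a C/S real-time, multi-node distributed system, used mainly in the metro industry.

OS: Windows 2000 (client), Solaris 9 (server)
Database: Oracle 9i
Dev tools: Visual C++ 6.0 with SP6; Platform SDK; g++
3rd-party SDKs: omniORB 4.0.5 (CORBA); ACE 5.5; boost 1.31; OCI 9.2; ... (there are many more, not all listed)

Anyone with experience can see at a glance that this is a cross-platform system that uses CORBA for communication between subsystems and between processes. Implementing L10N&I18N meant reworking both the database and the code. We did run into some hard problems along the way, described one by one below.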

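1 Oracle

Fortunately our company has an Oracle expert, so we hit no major difficulties here. We only changed the data type of the affected table columns from varchar2 to nvarchar2. The database itself was otherwise left as it was, which keeps down the extra storage that converting all the data to Unicode would have required.

Note, however, that the Oracle client environment variable must be set to English (because the original database was English to begin with). The variable name is written in upper case because environment variable names are case-sensitive on Solaris.

Windows:

set NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1

Solaris:

export NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1

Besides this parameter, the operating system's regional settings must be set to the required language environment. On Windows this is easy (Regional Options). On Solaris it has to be configured in the profile, e.g.

export LANG=zh_TW.BIG5

export LC_ALL=zh_TW.BIG5

In fact, the only change here is that the stored column type became nvarchar2; nothing else changed. My feeling is that keeping the original varchar2 type would also work, but I have not tested it; anyone interested can try it.

See: 《影响 ORACLE 汉字显示的字符集问题》 (Character set problems that affect the display of Chinese characters in Oracle) http://www.trainlinux.com/d/2002-05-18/5131.html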
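One quick way to check that the LANG / LC_ALL settings described above actually took effect is to ask the C runtime which character encoding the active locale uses. The following is only a minimal sketch for a POSIX system; the helper name codesetForLocale is hypothetical and not part of the original project.

```cpp
#include <clocale>
#include <string>
#include <langinfo.h>   // nl_langinfo, CODESET (POSIX)

// Hypothetical helper: switches the process to the named locale and returns
// the name of the character encoding that locale uses. Returns an empty
// string if the locale cannot be selected (for example, not installed).
// Pass "" to pick up LANG / LC_ALL from the environment.
std::string codesetForLocale(const char* localeName)
{
    if (std::setlocale(LC_ALL, localeName) == 0)
    {
        return std::string();
    }
    const char* codeset = nl_langinfo(CODESET);
    return codeset ? std::string(codeset) : std::string();
}
```

Calling codesetForLocale("") in a shell where LANG=zh_TW.BIG5 has been exported should report a Big5 code set if that locale is installed; the exact spelling of the name varies between platforms.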

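2 OCI 9.2

OCI 9.2 is the C programming interface to the database provided by Oracle; there are separate builds for Windows and Solaris. Oracle reportedly also has a C++ interface, but it is said to be unstable and very buggy.

Our program still retrieves data the way it already did. The code is as follows:

2.1 Open EnvNls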

#ifdef _UNICODE

long status = OCIEnvNlsCreate((OCIEnv **)&m_env,
                              (ub4)OCI_DEFAULT,
                              (void *)0,   // context passed to the memory callbacks
                              0,           // memory allocation function
                              0,           // memory re-allocation function
                              0,           // memory free function
                              (size_t)0,   // size of extra memory for the environment
                              (void **)0,  // ptr to extra memory
                              0,           /* Metadata and SQL CHAR character set */
                              0);          /* SQL NCHAR character set; (ub2)OCI_UTF16ID for a pure UNICODE setup */

// Because the current database is not a pure UNICODE database, we do not use OCI_UTF16ID to access data.
throwOnError( status, (OCIError*)NULL, _T("Failed to create language environment for database connection") );

#else

long status = OCIEnvCreate( &m_env,
                            OCI_THREADED | OCI_OBJECT,
                            0,    // context passed to the memory callbacks
                            0,    // memory allocation function
                            0,    // memory re-allocation function
                            0,    // memory free function
                            0,    // size of extra memory for the environment
                            0 );  // ptr to extra memory

throwOnError( status, (OCIError*)NULL, _T("Failed to create environment for database connection") );

#endif

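OCI_UTF16ID is deliberately not used here, so that both reads and writes go through as plain bytes. Because the rest of the code has been changed to support Unicode, string arguments must be converted to narrow characters before they are passed to OCI functions. Most OCI functions pass strings as bytes.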

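3 CODE

3.1 Redefining data types

The existing data types in the program have to be replaced with definitions that support Unicode, and this is where most of the work goes, because existing code must be modified in bulk. Writing a small file-processing program of your own greatly reduces the tedium.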
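As an illustration of such a file-processing tool, here is a minimal sketch of its core step: rewriting one source line so that narrow-only names become the t-prefixed names defined in this section. It assumes a C++11 compiler for std::regex (the tool runs on a developer machine, not under VC++ 6.0). The function name migrateLine and the short rule table are hypothetical, and wrapping string literals in _T() is deliberately left out.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: replaces narrow-only identifiers in one line of source
// with their t-prefixed equivalents. The table is an illustrative subset.
std::string migrateLine(const std::string& line)
{
    static const struct { const char* pattern; const char* replacement; } kRules[] = {
        { "\\bstd::string\\b",        "tstring" },
        { "\\bstd::ostringstream\\b", "tostringstream" },
        { "\\bstrlen\\b",             "tstrlen" },
        { "\\bstrcmp\\b",             "tstrcmp" },
        { "\\bstrcpy\\b",             "tstrcpy" },
    };

    std::string result = line;
    for (std::size_t i = 0; i < sizeof(kRules) / sizeof(kRules[0]); ++i)
    {
        // The \b word boundaries keep already-migrated names such as tstrlen
        // from being matched a second time.
        result = std::regex_replace(result, std::regex(kRules[i].pattern),
                                    kRules[i].replacement);
    }
    return result;
}
```

A driver would read each source file line by line, pass every line through migrateLine, and write the result back. Reviewing the diff afterwards is still advisable, since a purely textual replacement cannot tell a comment from code.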

The redefinitions are as follows:

#include <string>
#include <wchar.h>
#include <iostream>
#include <fstream>
#include <sstream>
#include <stdio.h>
#include <stdarg.h>

#if defined(WIN32)

#include <TCHAR.h>

// Imbue a stream with a locale that carries the project's NullCodecvt facet.
// _Addfac and _Getcat are internals of the VC++ standard library.
#define IMBUE_NULL_CODECVT( outputFile ) \
{ \
    NullCodecvt* pNullCodecvt = new NullCodecvt ; \
    std::locale loc = std::locale::classic() ; \
    loc._Addfac( pNullCodecvt , NullCodecvt::id, NullCodecvt::_Getcat() ) ; \
    (outputFile).imbue(loc) ; \
}

#else // Solaris

#if defined(_UNICODE)

// Solaris has no definition of _T(), so map _T("asd") => L"asd" for gcc.
#define __T(x) L ## x
#define _T(x) __T(x)

#include <widec.h>
#include <string.h>

#else

#define _T(x) x

#endif

#endif // WIN32

#ifdef _UNICODE

typedef wchar_t TUchar;

typedef wchar_t Tchar;

typedef std::wstring tstring;

typedef std::wfstream tfstream;

typedef std::wostringstream tostringstream;

typedef std::wistringstream tistringstream;

typedef std::wostream tostream;

typedef std::wstringstream tstringstream;

typedef std::wifstream tifstream;

typedef std::wofstream tofstream;

#define tcout std::wcout

#define tcerr std::wcerr

#define tendl L"\r\n"

#define _tcserror TAConvertUTF::wcserror

#define _tCORBAstring_var CORBA::WString_var

#define _tCORBAstring_member CORBA::WString_member

#define _tCORBAstring_out CORBA::WString_out

#define tcin std::wcin

#define tstrcmp wcscmp

#define tstricmp _wcsicmp

#define tstrncmp wcsncmp

#define tstrncpy wcsncpy

#define tstrcpy wcscpy

#if defined( WIN32 )

#define tsprintf swprintf

#else //Solaris

void s_swprintf(wchar_t * str, const wchar_t * format, ...);

int s_swnprintf(wchar_t * str, int size, const wchar_t * format, ...);

int s_swnprintf_list(wchar_t * str, int size, const wchar_t * format, va_list& args);

#define tsprintf s_swprintf

#define twnsprintf s_swnprintf

#endif

unsigned long toLong(const wchar_t* pstr);

int toInt(const wchar_t* pstr);

#define tatol toLong

#define tatoi toInt

#define tprintf wprintf

#define tfprintf fwprintf

#define tstrtod wcstod

#define tfilebuf wfilebuf

#define tstrlwr _wcslwr

#define tstrtoul wcstoul

#define tgetenv _wgetenv

#define tstrlen wcslen

#define tstrstr wcsstr

#define tstrtok wcstok

#define tstrdup _wcsdup

#define tsscanf swscanf

#define tstrcat wcscat

#define tltoa _ltow

#define tmain wmain

#define tultoa _ultow

#define tstrrchr wcsrchr

#define tstrupr _wcsupr

#define tstrftime wcsftime

#define _tCORBAAnyfrom_string CORBA::Any::from_wstring

#define _tCORBAAnyto_string CORBA::Any::to_wstring

#else //non-unicode

int toInt(const char* pstr);

long toLong(const char* pstr);

typedef unsigned char TUchar;

typedef char Tchar;

typedef std::string tstring;

typedef std::fstream tfstream;

typedef std::ostringstream tostringstream;

typedef std::istringstream tistringstream;

typedef std::ostream tostream;

typedef std::stringstream tstringstream;

typedef std::ifstream tifstream;

typedef std::ofstream tofstream;

#define tcout std::cout

#define tcerr std::cerr

#define tendl std::endl

#define _tcserror strerror

#define _tCORBAstring_var CORBA::String_var

#define _tCORBAstring_member CORBA::String_member

#define _tCORBAstring_out CORBA::String_out

#define tcin std::cin

#define tstrcmp strcmp

#define tstricmp stricmp

#define tstrncmp strncmp

#define tstrncpy strncpy

#define tstrcpy strcpy

#define tfprintf fprintf

#define tsprintf sprintf

#define tprintf printf

#define tatol atol

#define tatoi atoi

#define tstrtod strtod

#define tfilebuf filebuf

#define tstrlwr strlwr

#define tstrtoul strtoul

#define tgetenv getenv

#define tstrlen strlen

#define tstrstr strstr

#define tstrtok strtok

#define tstrdup strdup

#define tsscanf sscanf

#define tstrcat strcat

#define tltoa _ltoa

#define tmain main

#define tultoa _ultoa

#define tstrrchr strrchr

#define tstrupr _strupr

#define tstrftime strftime

#define _tCORBAAnyfrom_string CORBA::Any::from_string

#define _tCORBAAnyto_string CORBA::Any::to_string

#endif

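Some functions take different parameters on Windows and on Solaris, so to stay compatible with the existing code we wrote our own Solaris implementations that support wide characters (wchar_t) for wnsprintf and sprintf; these are s_swnprintf and s_swprintf above. In addition, the wide-character counterparts of atol and atoi are simply not available with g++ on Solaris, which is why toLong and toInt are declared.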
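The implementations of these Solaris helpers are not shown in this article. As a rough sketch of what such replacements can look like using only standard C library calls, the functions below are hypothetical stand-ins (wideToLong, wideToInt and boundedSwprintf are invented names, not the project's own toLong, toInt and s_swnprintf).

```cpp
#include <cstdarg>
#include <cstddef>
#include <cwchar>

// Wide-character stand-in for atol: parses a base-10 integer and ignores
// any trailing text. A null pointer yields 0.
long wideToLong(const wchar_t* pstr)
{
    return pstr ? std::wcstol(pstr, 0, 10) : 0L;
}

// Wide-character stand-in for atoi.
int wideToInt(const wchar_t* pstr)
{
    return static_cast<int>(wideToLong(pstr));
}

// Bounded wide sprintf. 'size' is the capacity of 'str' in wchar_t units,
// including the terminator. Returns the number of characters written, or a
// negative value if the output did not fit or formatting failed.
int boundedSwprintf(wchar_t* str, int size, const wchar_t* format, ...)
{
    if (str == 0 || size <= 0)
    {
        return -1;
    }
    va_list args;
    va_start(args, format);
    int written = std::vswprintf(str, static_cast<std::size_t>(size), format, args);
    va_end(args);
    if (written < 0)
    {
        str[0] = L'\0';   // leave a valid empty string behind on failure
    }
    return written;
}
```

Note that in the standard wide printf family a wide string argument is written with %ls; plain %s means a narrow string there, which differs from the Microsoft CRT convention.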

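3.2 Character conversion

Whether the program runs on Windows or on Solaris, strings fetched directly from the database are narrow data, such as char and std::string, and what these types hold is the actual Chinese text in an encoding such as BIG5 or GB2312. To be clear, BIG5 and GB2312 characters are not UNICODE characters; for details see:

http://www.unicode.org/

http://www.cnblogs.com/waterflier

As section 2.1 explained, data obtained directly from the database is byte data, so to work with the operating system for display and processing it has to be converted to UNICODE, i.e. char => wchar_t, std::string => std::wstring and so on. Incidentally, on Windows, if you put BIG5 or GB2312 data into wchar_t or std::wstring without converting it and then display it directly in the GUI, you get garbage; whereas BIG5 or GB2312 held in char and std::string displays correctly. This is because the operating system handles double-byte character sets automatically: the narrow (ANSI) APIs interpret char strings in the system code page and convert them to Unicode internally, while the wide APIs assume the data is already Unicode. Is Solaris the same? I have not tested that yet and will come back to it after a few experiments; in any case, our system has no GUI on Solaris. So for strings passed between processes, we require that Solaris and Windows exchange data as Unicode.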
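To make the difference between the byte encodings and Unicode concrete, the sketch below holds one and the same character in three forms and prints the raw bytes. The helper hexBytes and the constant names are hypothetical, added only for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a byte string as space-separated upper-case hex, e.g. "A4 A4".
std::string hexBytes(const std::string& bytes)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        unsigned value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof(buf), "%02X", value);
        if (i != 0)
        {
            out += ' ';
        }
        out += buf;
    }
    return out;
}

// The Chinese character U+4E2D in three representations.
const std::string kBig5("\xA4\xA4");     // Big5: two bytes
const std::string kGb2312("\xD6\xD0");   // GB2312: two different bytes
const wchar_t     kUnicode = 0x4E2D;     // Unicode code point in one wchar_t
```

The same character is A4 A4 in Big5, D6 D0 in GB2312 and 0x4E2D in Unicode, so copying the narrow bytes into wchar_t elements one by one cannot produce the right wide character; a real conversion is needed.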

With the principle covered, here is the conversion code.

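3.2.1 Windows

// Converts a wide (UTF-16) string to the OEM code page.
// The caller owns the returned buffer and must release it with delete[].
char * convert (const wchar_t *wstr)
{
    // Short circuit null pointer case
    if (wstr == 0)
        return 0;

    // First call: query the required buffer size in bytes, terminator included.
    int len = ::WideCharToMultiByte (CP_OEMCP, 0, wstr, -1, 0, 0, 0, 0);

    char *str = new char[len];

    // Second call: perform the conversion into the new buffer.
    ::WideCharToMultiByte (CP_OEMCP, 0, wstr, -1, str, len, 0, 0);

    return str;
}

// Converts an OEM code page string to a wide (UTF-16) string.
// The caller owns the returned buffer and must release it with delete[].
wchar_t * convert (const char *str)
{
    // Short circuit null pointer case
    if (str == 0)
        return 0;

    // First call: query the required buffer size in wchar_t units, terminator included.
    int len = ::MultiByteToWideChar (CP_OEMCP, 0, str, -1, 0, 0);

    wchar_t *wstr = new wchar_t[len];

    // Second call: perform the conversion into the new buffer.
    ::MultiByteToWideChar (CP_OEMCP, 0, str, -1, wstr, len);

    return wstr;
}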

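3.2.2 Solaris

On Solaris you have to find the iconv name of the Unicode encoding that corresponds to what Windows uses before the Unicode characters converted on Solaris display correctly. I had to write a test program to discover that it is UTF-32BE (presumably because wchar_t on Solaris is four bytes and SPARC is big-endian). Faint!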

#include <iconv.h>
#include <string.h>
#include <string>

using std::string;
using std::wstring;

// BIG5 is for Taiwan; for other languages, this should be changed.
// DB_CODE_SET selects the database language through conditional compilation.
// BIG5 and GB2312 must themselves be defined as distinct integer constants
// for the comparisons below to work.
#if defined (DB_CODE_SET)
#if DB_CODE_SET == BIG5
const static char * LOCAL_PAGE_SET = "BIG5";
#elif DB_CODE_SET == GB2312
const static char * LOCAL_PAGE_SET = "GB2312";
#else
#error unsupported DB_CODE_SET, must be updated for this platform!
#endif
#else
#error must define the database language code set name
#endif

// unicode code page name working with Windows OS.
const static char * UNICODE_PAGE_SET = "UTF-32BE";

// in_len and out_max are sizes in bytes. Returns -1 on failure.
int wchar2char(/*in*/const wchar_t* in,int in_len,/*out*/char* out,int out_max)
{
    // iconv advances the pointers and decrements the sizes it is given,
    // so work on local copies that have exactly the types it expects.
    const char* src = (const char*) in;
    char* dst = out;
    size_t src_left = (size_t) in_len;
    size_t dst_left = (size_t) out_max;

    // tocode_page, fromcode_page
    iconv_t env = iconv_open(LOCAL_PAGE_SET, UNICODE_PAGE_SET);
    if (env == (iconv_t) -1)
        return -1;

    size_t result = iconv(env, &src, &src_left, &dst, &dst_left);
    iconv_close(env);
    return (result == (size_t) -1) ? -1 : (int) result;
}

int wstring2string(/*in*/const wstring& in,/*out*/string& out)
{
    int len = in.length() + 1;
    int result;

    // up to 3 bytes per wide character, zero-filled so the result is terminated
    char* pBuffer = new char[len*3];
    memset(pBuffer,0,len*3);

    result = wchar2char(in.c_str(),in.length() * sizeof(wchar_t),pBuffer,len*3);
    //printf("wstring2string result is %d,errno is %s\n",result,strerror(errno));

    if(result >= 0)
    {
        out = pBuffer;
    }
    else
    {
        out = "";
    }

    delete[] pBuffer;
    return result;
}

// in_len and out_max are sizes in bytes. Returns -1 on failure.
int char2wchar(/*in*/const char* in,int in_len, /*out*/wchar_t* out,int out_max)
{
    const char* src = in;
    char* dst = (char*) out;
    size_t src_left = (size_t) in_len;
    size_t dst_left = (size_t) out_max;

    // tocode_page, fromcode_page
    iconv_t env = iconv_open(UNICODE_PAGE_SET, LOCAL_PAGE_SET);
    if (env == (iconv_t) -1)
        return -1;

    size_t result = iconv(env, &src, &src_left, &dst, &dst_left);
    iconv_close(env);
    return (result == (size_t) -1) ? -1 : (int) result;
}

int string2wstring(/*in*/const string& in,/*out*/wstring& out)
{
    int len = in.length() + 1;
    int result;

    // one wide character per input byte is always enough, zero-filled so the result is terminated
    wchar_t* pBuffer = new wchar_t[len];
    memset(pBuffer,0,len*sizeof(wchar_t));

    result = char2wchar(in.c_str(),in.length(),pBuffer,len*sizeof(wchar_t));
    //printf("string2wstring result is %d,errno is %s\n",result,strerror(errno));

    if(result >= 0)
    {
        out = pBuffer;
    }
    else
    {
        out.clear();
    }

    delete[] pBuffer;
    return result;
}

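Note: if you use the functions above, it is best to wrap the calls in try{} ... catch() in the calling function (for example, new throws std::bad_alloc when an allocation fails).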
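A variant of the same conversion that reports failures through exceptions and cannot leak the iconv descriptor might look like the sketch below. It is not the project's code: IconvHandle and bytesToWide are hypothetical names, the source encoding is passed in as a parameter, and the target name "WCHAR_T" is an assumption that holds for glibc and GNU libiconv; on other systems the name that matches wchar_t (UTF-32BE in the Solaris case above) would have to be substituted.

```cpp
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>
#include <vector>

// Small RAII wrapper so the descriptor is closed even if an exception is thrown.
class IconvHandle
{
public:
    IconvHandle(const char* toCode, const char* fromCode)
        : m_cd(iconv_open(toCode, fromCode))
    {
        if (m_cd == (iconv_t)-1)
        {
            throw std::runtime_error("iconv_open failed: unsupported conversion");
        }
    }
    ~IconvHandle() { iconv_close(m_cd); }
    iconv_t get() const { return m_cd; }

private:
    IconvHandle(const IconvHandle&);             // non-copyable
    IconvHandle& operator=(const IconvHandle&);
    iconv_t m_cd;
};

// Decodes 'bytes' (encoded as 'fromCode', e.g. "BIG5") into a std::wstring.
// Throws std::runtime_error if the conversion is unsupported or the input is invalid.
std::wstring bytesToWide(const std::string& bytes, const char* fromCode)
{
    IconvHandle cd("WCHAR_T", fromCode);

    // For the multi-byte encodings discussed here, one wchar_t per input
    // byte is always enough room.
    std::vector<wchar_t> buffer(bytes.size() + 1);

    // This is the char** form of the iconv signature; headers that declare
    // the input as const char** need the cast adjusted.
    char* src = const_cast<char*>(bytes.data());
    std::size_t srcLeft = bytes.size();
    char* dst = reinterpret_cast<char*>(&buffer[0]);
    std::size_t dstLeft = buffer.size() * sizeof(wchar_t);

    if (iconv(cd.get(), &src, &srcLeft, &dst, &dstLeft) == (std::size_t)-1)
    {
        throw std::runtime_error("iconv failed: invalid or incomplete input");
    }

    // Whatever space was not consumed tells us how many wide characters were produced.
    std::size_t produced = buffer.size() - dstLeft / sizeof(wchar_t);
    return std::wstring(&buffer[0], produced);
}
```

With this shape the caller needs no cleanup code of its own: a database string would be decoded with something like bytesToWide(row, "BIG5") inside the try block the note above recommends.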

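3.3 CORBA

I have developed COM+ and DCOM applications before. Compared with CORBA they are somewhat more flexible, but heavier to work with, and they do not support non-Windows platforms. CORBA has its own shortcomings too, and I happened to run into one, a fatal problem in the CORBA standard itself: poor compatibility for wchar and wstring, even though the free software from the various middleware vendors does contain code supporting these data types. The CORBA middleware used here is omniORB 4.0.1.

The following emails make clear what I am talking about.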

eMail 1:

> In our real application, I defined an interface which has a function

> to pass the wstring, but the client throws an exception

> "BAD_PARAM_WCharTCSNotKnown" when it called the server.

There are different ways that a wstring can be encoded for communication between ORBs. This exception is raised by the client ORB when it cannot determine what encoding is expected by the server ORB. Usually this happens when your object reference did not come from the server ORB (eg if you used a corbaloc: URL).

So if you want to pass a wstring to an object, then get your object reference from a factory object or name service.

Regards,

Luke.

**********************************************************************************************

eMail2:

On Friday 28 July, JiangWei wrote:

> server nativeCharCodeSet: UTF-8

> client nativeCharCodeSet: UTF-8

> str = ... # UTF-8 string

> o = orb.string_to_object('corbaloc::..............')

> object = o._narrow(...)

> object.echo_string(str) #throw omniORB.CORBA.DATA_CONVERSION:

> CORBA.DATA_CONVERSION(omniORB.DATA_CONVERSION_BadInput,CORBA.COMPLETED

> _NO)

> object2 = orb.string_to_object('IOR:......................')

> object2.echo_string(str) #OK

This is due to the way codesets are handled in CORBA. It is not a bug in omniORB. Servers publish the codesets they understand in IORs. Based on that, clients decide which codesets they can use to communicate with the server. A corbaloc URI is equivalent to an IOR with no codeset information in it, so the CORBA spec requires omniORB to treat it as though the only codeset it supports is ISO 8859-1. omniORB therefore tries to convert your UTF-8 string to ISO 8859-1, and fails with a DATA_CONVERSION exception when it encounters a character it cannot convert.

I'm afraid you can't use codesets other than ISO 8859-1 with corbaloc URIs. It's not an omniORB issue but a CORBA specification issue.

Cheers,

Duncan .

-- Duncan Grisby --

-- duncan@grisby.org --

-- http://www.grisby.org --

3.3.1 Solution 1 – provide a separate interface to resolve references for wchar parameters and their functions

IDL:

interface IWideIORProvider

{

// Returns an object reference containing the TAG_CODE_SETS TaggedComponent.

Object getWideIOR();

};

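IWideIORProvider works around the problem that an object created on the server side cannot be handed to the client in usable form through corbaloc alone. The client reaches this interface through the corbaloc mechanism, obtains from it the real object reference (which carries the code set information), and can then access operations that take wide-character parameters.

Server: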

class WideIORProvider : public virtual POA_TA_Base_Core::IWideIORProvider

{

public:

WideIORProvider();

virtual ~WideIORProvider();

void setWideIOR(/*takes*/ CORBA::Object_ptr object);

// Implementation of IWideIORProvider

::CORBA::Object_ptr getWideIOR();

private:

CORBA::Object_var m_object;

private:

//virtual void* _ptrToInterface(const char*)(return (void*)0;);

//virtual const char* _mostDerivedRepoId()(return 0;);

};

int main(int argc, char* argv[])

{

try

{

const char* GIOP_MAX_MSG_SIZE = "104857600";

const char* options[][2] =

{ { "endPoint", "giop:tcp::6868" },

{ "clientCallTimeOutPeriod", "200000" },

{ "serverCallTimeOutPeriod", "200000" },

{ "giopMaxMsgSize", GIOP_MAX_MSG_SIZE },

{ "nativeWCharCodeSet", "UTF-16" },

{ "nativeCharCodeSet", "UTF-8" },

{ 0, 0 } };

// Initialize orb

CORBA::ORB_var orb = CORBA::ORB_init(argc, argv, "omniORB4", options);

omniORB::setLogFunction(writeOmniToDbgLog);

omniORB::traceLevel = 10;

omniORB::traceInvocations = true;

// Get a reference to the omniINSPOA, so that the objects can be reached through a corbaloc object key.

CORBA::Object_var obj = orb->resolve_initial_references("omniINSPOA");

PortableServer::POA_var poa = PortableServer::POA::_narrow(obj);

PortableServer::POAManager_var pmgr = poa->the_POAManager();

pmgr->activate();

Foo_impl foo_impl;

foo_impl.m_name = "MyFoo";

PortableServer::ObjectId_var foo_oid = PortableServer::string_to_ObjectId("MyFoo");

poa->activate_object_with_id(foo_oid, &foo_impl);

TA_Base_Core::WideIORProvider iorProvider;

iorProvider.setWideIOR(foo_impl._this());

PortableServer::ObjectId_var provider_oid = PortableServer::string_to_ObjectId("MyFooWrapper");

poa->activate_object_with_id(provider_oid, &iorProvider);

// Accept requests

orb->run();

}

catch (const CORBA::SystemException& e)

{

const char* buf = e.NP_minorString();

std::cout << buf << std::endl;

}

omniORB::setLogFunction(NULL);

return 0;

}

Client:

A template helper is used, as follows:

// Convenient helper to obtain an object reference of type T_ptr given a
// corbaloc url that accesses an IWideIORProvider implementation.
// 'url' can be a corbaloc url of the form corbaloc::server:port/objectkey.
// If the url can't be resolved, T::_nil() will be returned, or an exception
// will be propagated to the caller.
template <class T>
/*gives*/ typename T::_ptr_type gGetWideIOR(const std::string& url)
{
    CORBA::Object_var obj = CorbaUtil::getOrb()->string_to_object(url.c_str());
    if (CORBA::is_nil(obj))
    {
        return T::_nil();
    }

    IWideIORProvider_var iorProvider = IWideIORProvider::_narrow(obj);
    if (CORBA::is_nil(iorProvider))
    {
        return T::_nil();
    }

    // The reference handed out by the server carries the code set information.
    CORBA::Object_var wideObj = iorProvider->getWideIOR();
    typename T::_ptr_type wideT = T::_narrow(wideObj);
    return wideT; // the caller owns the returned reference
}

int main(int argc, char* argv[])
{
    try
    {
        const char* GIOP_MAX_MSG_SIZE = "104857600";
        const char* options[][2] =
            { { "clientCallTimeOutPeriod", "200000" },
              { "serverCallTimeOutPeriod", "200000" },
              { "giopMaxMsgSize", GIOP_MAX_MSG_SIZE },
              { "nativeWCharCodeSet", "UTF-16" },
              { "nativeCharCodeSet", "UTF-8" },
              { 0, 0 } };

        // Initialize orb
        CorbaUtil::getOrb() = CORBA::ORB_init(argc, argv, "omniORB4", options);
        omniORB::setLogFunction(writeOmniToDbgLog);
        omniORB::traceLevel = 10;
        omniORB::traceInvocations = true;

        {
            // Test the use of a wide object reference
            std::string strCorbaLoc = "corbaloc::occ-com-1:6868/MyFooWrapper";
            // A _var takes ownership of the reference returned by gGetWideIOR.
            My_Space::Foo_var foo = TA_Base_Core::gGetWideIOR<My_Space::Foo>(strCorbaLoc);
            if (!CORBA::is_nil(foo))
            {
                foo->setNameNarrow("I am narrow foo");
                foo->setNameWide(L"I am wide foo");
            }
        }

        {
            // Test the use of a narrow object reference (ie. contact the Foo object directly)
            std::string strCorbaLoc = "corbaloc::occ-com-1:6868/MyFoo";
            CORBA::Object_var obj = CorbaUtil::getOrb()->string_to_object(strCorbaLoc.c_str());
            My_Space::Foo_var foo = My_Space::Foo::_narrow(obj);
            if (!CORBA::is_nil(foo))
            {
                foo->setNameNarrow("I am narrow foo");
                // This call should FAIL with a BAD_PARAM_WCharTCSNotKnown exception
                foo->setNameWide(L"I am wide foo");
            }
        }
    }
    catch (const CORBA::SystemException& e)
    {
        const char* buf = e.NP_minorString();
        if (buf != 0)
        {
            std::cout << buf << std::endl;
        }
    }

    omniORB::setLogFunction(NULL);
    return 0;
}

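Drawback:

This gets around the corbaloc problem, but every time a CORBA call is made, the IOR has to be fetched from the server again so that the client can obtain the object reference anew. The network cost is relatively high, and in the end it hurts system performance. The academically minded tend to like this approach, though.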

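3.3.2 Solution 2 – pass the wstring as a sequence

Since the goal is to find a way around the problem, there is naturally more than one. Why not treat the wstring variable as a sequence and pass that as the parameter? (A sequence is essentially an array.)

e.g. IDL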

interface Foo

{

typedef unsigned long widechar;

typedef sequence<widechar> WideString;

void setName(in WideString strName);

void setSession(inout WideString sessionID);

WideString modifySession(inout WideString sessionID);

};

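Of course, once WideString is introduced, a few helper functions are needed to convert to and from std::wstring so that the data stays convenient to work with.

They are: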

WideString_var CorbaUtil::toCorbaWideString(const std::wstring & wstringValue)
{
    //LOG_GENERIC(SourceInfo, TA_Base_Core::DebugUtil::DebugInfo, _T("CorbaUtil::toCorbaWideString(), return WideString is [%s]"), wstringValue.c_str());

    WideString_var wideString = new WideString;
    wideString->length(wstringValue.length());

    for (unsigned long i = 0; i < wstringValue.length(); i++)
    {
        // each wchar_t is widened to the 32-bit sequence element
        wideString[i] = wstringValue.at(i);
    }

    return wideString;
}

WideString_var CorbaUtil::toCorbaWideString(const TA_Base_Core::Tchar * wstringValue)
{
    std::wstring strValue(wstringValue);
    return toCorbaWideString(strValue);
}

std::wstring CorbaUtil::fromCorbaWideString(const WideString & wideString)
{
    std::wstring wideStr;
    CORBA::ULong iSize = wideString.length();
    wideStr.reserve(iSize);

    for (CORBA::ULong i = 0; i < iSize; i++)
    {
        // each element holds one wchar_t value widened to 32 bits; cast it back
        wideStr.push_back(static_cast<wchar_t>(wideString[i]));
    }

    return wideStr;
}

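Note that these conversions are clearly fine on a single OS, but do they work across platforms? As long as the character data passed between the operating systems is Unicode, there is no problem. A wchar_t on Windows is two bytes and holds a UTF-16 code unit, while a wchar_t on Solaris is four bytes. The sequence element is a 32-bit unsigned long, so a value from either platform fits without loss, and the characters covered by BIG5 and GB2312 each fit in a single two-byte unit, so the same value means the same character on both sides. The four-byte wchar_t on Solaris loses nothing either, since some encodings need three or four bytes per character. This is also why character set conversion was introduced first, in section 3.2.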
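For text that may contain characters outside the two-byte range, the element-by-element copy would put different values on the wire depending on the platform (two surrogate units from Windows, one value from Solaris). The sketch below shows one way to make the sequence carry whole Unicode code points regardless of the size of wchar_t. It is an illustration only: CodePoints is a std::vector stand-in for the generated WideString class, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the IDL type sequence<unsigned long> (32-bit elements).
typedef std::vector<std::uint32_t> CodePoints;

// wstring -> code points. Where wchar_t is 16 bits, a surrogate pair is
// combined into a single code point.
CodePoints toCodePoints(const std::wstring& text)
{
    CodePoints out;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        std::uint32_t unit = static_cast<std::uint32_t>(text[i]);
        if (sizeof(wchar_t) == 2 && unit >= 0xD800 && unit <= 0xDBFF && i + 1 < text.size())
        {
            std::uint32_t low = static_cast<std::uint32_t>(text[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF)
            {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                ++i;   // the low surrogate has been consumed
            }
        }
        out.push_back(unit);
    }
    return out;
}

// code points -> wstring. Where wchar_t is 16 bits, code points above
// 0xFFFF are split back into a surrogate pair.
std::wstring fromCodePoints(const CodePoints& points)
{
    std::wstring out;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        std::uint32_t cp = points[i];
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF)
        {
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        }
        else
        {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

For BIG5 and GB2312 text this behaves exactly like the plain element copy; the extra branches only matter for characters above 0xFFFF.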
