Unicode and ANSI Strings

南浦秋叶

License: CC 4.0 BY-SA

Category: c/c++

Article link: https://blog.youkuaiyun.com/zhengtao1989/article/details/8209578

Over the past few days of WinCE programming I kept running into conversions between character sets, so here is a summary.

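A Unicode string on Windows is a wide-character string (wchar_t, encoded as UTF-16): each code unit is two bytes. An "ANSI" string is a char string in the system's active code page (CP_ACP). When that code page is a Single-Byte Character Set (SBCS), every character is one byte; plain ASCII text is the simplest case. To cover characters that do not fit in one byte, Microsoft also uses Double-Byte Character Sets (DBCS), in which a character takes one or two bytes. Whether the code page is SBCS or DBCS, there are API functions for converting its strings to Unicode. "MultiByte" usually refers to DBCS but also covers the single-byte case: in a DBCS string, English text is one byte per character and Chinese text is two bytes per character, so both are simply called MultiByte.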
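
To make the one-or-two-bytes rule concrete, here is a minimal sketch that counts characters rather than bytes in a DBCS string. It assumes code page 936 (GBK), where a byte in the range 0x81 to 0xFE starts a two-byte character. The helper name dbcs_char_count is hypothetical; real Windows code would normally ask IsDBCSLeadByte instead of hard-coding the range.

```c
#include <stddef.h>

/* Hypothetical helper: count characters (not bytes) in a null-terminated
 * DBCS string. Assumes code page 936 (GBK), where a byte in 0x81..0xFE
 * is a lead byte that starts a two-byte character. */
size_t dbcs_char_count(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    while (*p != '\0') {
        if (*p >= 0x81 && *p <= 0xFE && p[1] != '\0') {
            p += 2;   /* lead byte plus trail byte: one character */
        } else {
            p += 1;   /* single-byte character, e.g. ASCII */
        }
        count++;
    }
    return count;
}
```

For the four bytes "A\xD6\xD0" "B" (the letter A, the GBK encoding of 中, the letter B), strlen reports 4 while dbcs_char_count reports 3.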
Two approaches are commonly used for the conversion:

The first approach calls the API functions that Microsoft provides, mainly:

MultiByteToWideChar

WideCharToMultiByte

These functions are declared in Stringapiset.h (include Windows.h to get them).

#include <windows.h>
//-------------------------------------------------------------------------------------
//Description:
// This function maps a character string to a wide-character (Unicode) string
//
//Parameters:
// lpcszStr: [in] Pointer to the character string to be converted
// lpwszStr: [out] Pointer to a buffer that receives the translated string.
// dwSize: [in] Size of the buffer, in wide characters
//
//Return Values:
// TRUE: Succeed
// FALSE: Failed
//
//Example:
// AnsiToUnicode(szA,szW,sizeof(szW)/sizeof(szW[0]));
//---------------------------------------------------------------------------------------
BOOL AnsiToUnicode(LPCSTR lpcszStr, LPWSTR lpwszStr, DWORD dwSize)
{
	// Get the required size, in wide characters, of the buffer that
	// receives the Unicode string (including the terminating null).
	int nMinSize = MultiByteToWideChar(CP_ACP, 0, lpcszStr, -1, NULL, 0);
	if (nMinSize <= 0 || dwSize < (DWORD)nMinSize)
	{
		// The size query failed or the caller's buffer is too small.
		return FALSE;
	}
	// Convert from the ANSI code page to Unicode and report any failure.
	return MultiByteToWideChar(CP_ACP, 0, lpcszStr, -1, lpwszStr, nMinSize) != 0;
}

//-------------------------------------------------------------------------------------
//Description:
// This function maps a wide-character string to a new character string
//
//Parameters:
// lpcwszStr: [in] Pointer to the wide-character string to be converted
// lpszStr: [out] Pointer to a buffer that receives the translated string.
// dwSize: [in] Size of the buffer, in bytes
//
//Return Values:
// TRUE: Succeed
// FALSE: Failed
//
//Example:
// UnicodeToAnsi(szW,szA,sizeof(szA)/sizeof(szA[0]));
//---------------------------------------------------------------------------------------
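BOOL UnicodeToAnsi(LPCWSTR lpcwszStr, LPSTR lpszStr, DWORD dwSize)
{
	// Get the required size, in bytes, of the buffer that receives the
	// ANSI string (including the terminating null).
	int nMinSize = WideCharToMultiByte(CP_ACP, 0, lpcwszStr, -1, NULL, 0, NULL, NULL);
	if (nMinSize <= 0 || dwSize < (DWORD)nMinSize)
	{
		// The size query failed or the caller's buffer is too small.
		return FALSE;
	}
	// dwFlags is a DWORD, so pass 0 rather than NULL.
	return WideCharToMultiByte(CP_ACP, 0, lpcwszStr, -1, lpszStr, nMinSize, NULL, NULL) != 0;
}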

The second approach calls C Run-time Library functions:

size_t wcstombs(
   char *mbstr,
   const wchar_t *wcstr,
   size_t count 
);

size_t mbstowcs(
   wchar_t *wcstr,
   const char *mbstr,
   size_t count 
);

These functions are declared in <stdlib.h>.
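
As a rough illustration of how the count parameter and the return value fit together, the sketch below sizes the destination before converting. It relies on mbstowcs reporting the required length when the destination is NULL, which POSIX and Microsoft's CRT both document. The helper name mb_to_wide_alloc is hypothetical, and the conversion follows whatever C locale is currently set.

```c
#include <stdlib.h>

/* Hypothetical helper: convert a multibyte string to a newly allocated
 * wide string using the current C locale. Returns NULL on failure;
 * the caller releases the result with free(). */
wchar_t *mb_to_wide_alloc(const char *mbstr)
{
    size_t len;
    wchar_t *wcstr;

    if (mbstr == NULL)
        return NULL;

    /* With a NULL destination, mbstowcs only reports the number of wide
     * characters needed (terminator not included). */
    len = mbstowcs(NULL, mbstr, 0);
    if (len == (size_t)-1)
        return NULL;              /* invalid multibyte sequence */

    wcstr = (wchar_t *)malloc((len + 1) * sizeof(wchar_t));
    if (wcstr == NULL)
        return NULL;

    /* len + 1 leaves room for the terminating null. */
    if (mbstowcs(wcstr, mbstr, len + 1) == (size_t)-1) {
        free(wcstr);
        return NULL;
    }
    return wcstr;
}
```

Allocating from the reported length avoids guessing a buffer size, at the cost of scanning the input twice.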

Sample code:
// crt_wcstombs.c
// compile with: /W3
// This example demonstrates the use
// of wcstombs, which converts a string
// of wide characters to a string of 
// multibyte characters.

#include <stdlib.h>
#include <stdio.h>

#define BUFFER_SIZE 100

int main( void )
{
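    size_t  count;
    char    *pMBBuffer = (char *)malloc( BUFFER_SIZE );
    const wchar_t *pWCBuffer = L"Hello, world.";

    if( pMBBuffer == NULL )
        return 1;

    printf("Convert wide-character string:\n" );

    // Leave room for the terminator: wcstombs does not write one
    // when the output fills all `count` bytes.
    count = wcstombs(pMBBuffer, pWCBuffer, BUFFER_SIZE - 1 ); // C4996
    // Note: wcstombs is deprecated; consider using wcstombs_s instead
    if( count == (size_t)-1 )
    {
        free(pMBBuffer);
        return 1;   // a wide character could not be converted
    }
    pMBBuffer[count] = '\0';
    printf("   Characters converted: %u\n",
            (unsigned)count );
    printf("    Multibyte character: %s\n\n",
           pMBBuffer );

    free(pMBBuffer);
    return 0;
}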
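
One pitfall with wcstombs is that it stops without writing a terminator when the destination runs out of room. The following is a minimal sketch of a wrapper that refuses to convert unless the whole string and its terminator fit. The name wide_to_mb_checked is hypothetical, and the NULL-destination length query is the behavior documented by POSIX and Microsoft's CRT.

```c
#include <stdlib.h>

/* Hypothetical helper: convert a wide string into a caller-supplied
 * buffer using the current C locale. Returns 0 on success and -1 if a
 * character cannot be converted or the buffer is too small. On failure
 * the buffer holds an empty string. */
int wide_to_mb_checked(char *mbstr, size_t size, const wchar_t *wcstr)
{
    size_t need;

    if (mbstr == NULL || wcstr == NULL || size == 0)
        return -1;
    mbstr[0] = '\0';

    /* Ask for the byte count first (terminator not included). */
    need = wcstombs(NULL, wcstr, 0);
    if (need == (size_t)-1 || need >= size)
        return -1;

    /* Text plus terminator fits, so wcstombs writes the terminator. */
    if (wcstombs(mbstr, wcstr, size) == (size_t)-1) {
        mbstr[0] = '\0';
        return -1;
    }
    return 0;
}
```

Failing outright instead of truncating also avoids cutting a double-byte character in half, which a plain byte limit can do.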

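The WinCE API supports only the Unicode character set, so keep character-set conversion in mind at all times.
Also, so that the same source can be built for both ANSI and Unicode targets, prefer the generic data types and the functions that operate on them.
With Microsoft's <tchar.h>, TCHAR serves as the character type and the _tcs-prefixed functions operate on it. To build for Unicode, add these macro definitions before the #include lines:
#define _UNICODE // C run-time library (tchar.h)
#define UNICODE // Windows API headers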

#include <tchar.h>
#include <wchar.h>
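
The generic-text idea can be pictured with a small stand-in that builds without <tchar.h>. This is only a sketch of the mechanism: demo_tchar, DEMO_T, demo_tcslen and DEMO_UNICODE are hypothetical names that mimic TCHAR, _T, _tcslen and _UNICODE, and the real header is more elaborate.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for TCHAR, _T() and _tcslen. Define
 * DEMO_UNICODE before this point to select the wide-character forms. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_T(s)    L##s
#define demo_tcslen  wcslen
#else
typedef char demo_tchar;
#define DEMO_T(s)    s
#define demo_tcslen  strlen
#endif

/* Written once against the generic names; the macro decides which
 * character type and library function are actually used. */
size_t demo_name_length(const demo_tchar *name)
{
    return demo_tcslen(name);
}
```

A call such as demo_name_length(DEMO_T("WinCE")) compiles unchanged in either mode, which is the property the TCHAR types and _tcs functions are meant to provide.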