<Boost> Regular expressions with boost::regex

Article link: https://blog.youkuaiyun.com/Meta_Cpp/article/details/42146559

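This article introduces the regular expression facilities of the Boost library, including how to build and use them, with sample code. It covers the functions regex_match, regex_search and regex_replace, and shows how to use regex_iterator and regex_token_iterator.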

1. Building Boost.Regex

Before the regex library in Boost can be used, it has to be built, as follows:

C:\Users\Administrator>cd "C:\Program Files\boost_1_57_0"
C:\Program Files\boost_1_57_0>bootstrap
C:\Program Files\boost_1_57_0>.\b2

2. Using regex

regex is short for "regular expression". By default, Boost.Regex uses Perl-style regular expression syntax.

Usage notes:

1. Create a regex object:

#include <boost/regex.hpp>
boost::regex reg("(.*)");

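2. regex_match

This function tests whether a regular expression matches an entire string, which makes it widely useful for validating input. See the test code at the end for a usage example.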
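
As a minimal sketch of whole-string validation, the helper below (the name is_simple_select is hypothetical) reuses the SELECT pattern from the sample code at the end of this article. It is written against std::regex from C++11 as a stand-in, on the assumption that Boost may not be installed; std::regex_match has the same name and argument order as boost::regex_match, so switching to <boost/regex.hpp> and the boost:: namespace should behave the same for this pattern.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true only when the WHOLE string has the form
// "select <letters> from <letters>". A match inside a longer string is rejected.
bool is_simple_select(const std::string& text)
{
    static const std::regex pattern("select ([a-zA-Z]*) from ([a-zA-Z]*)");
    return std::regex_match(text, pattern);
}
```

Calling is_simple_select("select me from dest") yields true, while "my select me from dest oh baby" yields false because of the extra text around the match.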

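3. regex_search

Any discussion of this function has to mention boost::match_results. When regex_search performs a search, it reports the matched sub-expressions through an object of type match_results.

match_results mainly wraps an object of type std::vector<sub_match<…> >. The sub_match class derives from std::pair, whose two iterators mark the start and end of the matched text, and it records the result of each match.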
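
The sketch below shows how the pieces of a match_results object can be read after a successful search. The struct SelectParts and the function find_select are hypothetical names for this illustration, and std::regex is used as a stand-in for Boost.Regex (std::smatch corresponds to boost::smatch and offers the same operator[], prefix() and suffix() members).

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct SelectParts
{
    bool found;
    std::string column;  // text captured by the first group
    std::string table;   // text captured by the second group
    std::string before;  // text preceding the whole match
    std::string after;   // text following the whole match
};

SelectParts find_select(const std::string& text)
{
    static const std::regex pattern("select ([a-zA-Z]*) from ([a-zA-Z]*)");
    std::smatch what;  // match_results over std::string iterators
    SelectParts parts{false, "", "", "", ""};
    if (std::regex_search(text, what, pattern))
    {
        parts.found  = true;
        parts.column = what[1].str();  // what[0] would be the whole match
        parts.table  = what[2].str();
        parts.before = what.prefix().str();
        parts.after  = what.suffix().str();
    }
    return parts;
}
```

For the input "my select me from dest oh baby", the fields come out as column "me", table "dest", before "my " and after " oh baby". The fields are only filled in when the search succeeds, since the sub-matches are not meaningful otherwise.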

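4. regex_replace

This function formats each substring matched by the regular expression according to the given format string fmt. Note that it does not modify the original string; it only returns the formatted result. See the test code at the end for a usage example.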
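
To illustrate that the input string is left untouched, here is a small sketch using the same pattern and format string as the sample code at the end. The function name drop_u is hypothetical, and std::regex_replace is used as a stand-in for boost::regex_replace; both accept $1-style group references in the format string.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of `text` in which every "colour"
// (in any letter case) is respelled without the "u".
std::string drop_u(const std::string& text)
{
    static const std::regex pattern("(Colo)(u)(r)",
                                    std::regex::ECMAScript | std::regex::icase);
    // "$1$3" keeps groups 1 and 3 and drops group 2 (the "u").
    return std::regex_replace(text, pattern, "$1$3");
}
```

Passing "Colour, colours, color, colourize" returns "Color, colors, color, colorize", and the argument itself still holds the original spelling afterwards.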

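5. regex_iterator

Calling regex_search repeatedly lets us process every matching substring. The Regex library offers a more elegant way, however: regex_iterator. Constructing a regex_iterator from a string and a regular expression creates a match_results object that holds the match information, and the overloaded ++ operator then advances through all the matches.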
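
A rough sketch of this iteration, written as an explicit loop so that each match_results object can be inspected: the function name all_numbers is hypothetical, and std::sregex_iterator stands in for boost::sregex_iterator, which is constructed and advanced the same way.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of digits found in `text`.
std::vector<std::string> all_numbers(const std::string& text)
{
    static const std::regex pattern("(\\d+),?");
    std::vector<std::string> numbers;
    std::sregex_iterator it(text.begin(), text.end(), pattern);
    const std::sregex_iterator end;  // default-constructed: end of sequence
    for (; it != end; ++it)
        numbers.push_back((*it)[1].str());  // group 1: the digits without the comma
    return numbers;
}
```

For "1,2,3,4,5,6,7,85,ad2348(,hj" this collects nine entries, from "1" through "85" and finally "2348".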

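6. regex_token_iterator

Similar to regex_iterator, the Regex library also provides regex_token_iterator, which can enumerate the substrings that do not match the regular expression. As in the design of the STL, it is implemented as an iterator adapter. This makes splitting a string easy.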
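
String splitting with a token iterator can be sketched as follows. The function name split is hypothetical, and std::sregex_token_iterator is used as a stand-in for boost::sregex_token_iterator; in both, passing -1 as the sub-match index selects the text between matches rather than the matches themselves.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every match of `delimiter_pattern`.
std::vector<std::string> split(const std::string& text,
                               const std::string& delimiter_pattern)
{
    const std::regex delimiter(delimiter_pattern);
    // Index -1: iterate over the pieces that lie BETWEEN the delimiter matches.
    std::sregex_token_iterator it(text.begin(), text.end(), delimiter, -1);
    const std::sregex_token_iterator end;
    return std::vector<std::string>(it, end);
}
```

For example, split("a,b;c", "[,;]") gives the three pieces "a", "b" and "c". The regex object has to stay alive for as long as the iterator is in use, which is why it is a named local here rather than a temporary.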

Sample code follows:

#include <boost/lambda/lambda.hpp>
#include <boost/regex.hpp>
#include <iostream>
#include <iterator>
#include <algorithm>

using std::cout;
using std::endl;

class regex_callback
{
public:
	template <typename T>
	void operator()(const T& what) 
	{
		std::cout << what << std::endl;
	}
};


void BoostRegex()
{
	using namespace boost::lambda;
	//////////////////////////////////////////////////////////////////////////
	// regex 
	boost::regex reg("select ([a-zA-Z]*) from ([a-zA-Z]*)");
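	cout << "Status: " << reg.empty() << endl;				// Whether the regex is empty (holds no valid expression): 0 means it is usable
	cout << "Mark count: " << reg.mark_count() << endl;		// Number of groups in the regex: number of parenthesis pairs + 1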

	//////////////////////////////////////////////////////////////////////////
	// Full match
	boost::regex reg1("select ([a-zA-Z]*) from ([a-zA-Z]*)");
	boost::cmatch match1;
	std::string str1 = "select me from dest";
	bool bRet = boost::regex_match(str1, reg1);	// Only test whether it matches; do not keep the result
	cout << (bRet ? "match" : "no match") << endl;

	bRet = boost::regex_match(str1.c_str(), match1, reg1 ); // Test for a match and keep the result
	std::for_each(match1.begin(), match1.end(), /*std::cout << _1 << " "*/regex_callback());
	cout << "-----------------------------" << endl;

	//////////////////////////////////////////////////////////////////////////
	// Partial match (search)
	boost::cmatch match2;
	std::string str2 = "my select me from dest oh baby";
	bRet = boost::regex_search(str2.c_str(), match2, reg1);
	if (bRet)	// prefix(), suffix() and the sub-matches are only meaningful after a successful search
	{
		cout << match2.prefix() << endl;	// Text before the matched part
		cout << match2.suffix() << endl;	// Text after the matched part
		std::for_each(match2.begin(), match2.end(), /*std::cout << _1 << " "*/regex_callback());
	}
	cout << "-----------------------------" << endl;

	//////////////////////////////////////////////////////////////////////////
	// Replace
	boost::regex reg3("(Colo)(u)(r)", boost::regex::icase | boost::regex::perl); // Case-insensitive
	std::string str3 = "Colour, colours, color, colourize";
	std::string sRslt = boost::regex_replace(str3, reg3, "$1$3");		// Of the three groups (Colo)(u)(r), keep only the first and the third
	cout << sRslt << endl;
	cout << "-----------------------------" << endl;

	//////////////////////////////////////////////////////////////////////////
	// regex_iterator
	boost::regex reg4("(\\d+),?");
	std::string str4 = "1,2,3,4,5,6,7,85,ad2348(,hj";
	boost::sregex_iterator it(str4.begin(), str4.end(), reg4);
	boost::sregex_iterator itend;
	std::for_each(it, itend, cout << _1 << " ");
	cout << "\n-----------------------------" << endl;

	//////////////////////////////////////////////////////////////////////////
	// regex_token_iterator: split a string
	boost::regex reg5("/");
	std::string str5 = "Split/Vulue/Teather/Neusoft/Write/By/Lanwei";
	boost::sregex_token_iterator tit(str5.begin(), str5.end(), reg5, -1);
	boost::sregex_token_iterator titend;
	while (tit != titend)
	{
		cout << *tit << " ";
		tit++;
	}
	cout << "\n-----------------------------" << endl;
}

int main()
{
	BoostRegex();
	return 0;
}

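The regular expressions used above:

"select ([a-zA-Z]*) from ([a-zA-Z]*)": matches an SQL query statement. ([a-zA-Z]*) matches zero or more letters. For example, in "select me from dest" the first "([a-zA-Z]*)" matches "me" and the second matches "dest".

"(\d+),?": finds the numbers in the string, each optionally followed by a ",". (\d+) matches one or more digits. The "?" here makes the preceding "," optional (zero or one occurrence); it does not mean non-greedy matching, which would require a "?" placed directly after a quantifier, as in "\d+?".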

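The two roles of "?" in a pattern can be compared with a small sketch. The helper first_match is a hypothetical name, and std::regex is used as a stand-in for Boost.Regex; both are assumed to treat "?" after an atom as "optional" and "?" after a quantifier as "non-greedy" for these simple patterns.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns the first match of `pattern` in `text`,
// or an empty string when nothing matches.
std::string first_match(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch what;
    return std::regex_search(text, what, re) ? what[0].str() : std::string();
}
```

With the text "85,ad", the pattern "(\\d+),?" matches "85," (comma present, so it is consumed), and on "85ad" it matches "85" (comma absent, still a match). By contrast, "\\d+?" on "85,ad" matches only "8", because the non-greedy quantifier stops after the first digit.
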
The output is as follows: