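Baidu's written test, to my surprise, asked for a simple regular-expression matcher written in C++. Libraries like regex, chrono and random are ones I usually look up only when I need them, because there is quite a lot to remember (well, not that much, really). So I might as well take some notes.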
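1. regex_search looks for a match anywhere in the input; the whole string does not have to match.
2. regex_match requires the whole string to match the pattern.
3. std::sregex_iterator walks over every match in a string. Incrementing it moves to the next matched substring, and dereferencing it yields a std::smatch.
4. The result filled in by regex_search is an smatch (for std::string input) or a cmatch (for C-style strings). Taking smatch as the example: smatch::str() returns the matched text, smatch::prefix() the text before the match, and smatch::suffix() the text after it. By searching again in the suffix we can visit every match in turn. This only shows that successive matches do not overlap; it says nothing about greediness, and quantifiers in the default ECMAScript grammar are greedy unless followed by ?. For a given smatch, smatch::size() is 1 plus the number of marked subexpressions: smatch[0] is the whole match, and smatch[1] through smatch[size-1] are the subexpressions.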
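Points 1, 2 and 4 can be checked with a minimal sketch like the one below. The helper names is_full_match, has_match and first_match are made up for illustration and are not part of the standard library; the patterns use the default ECMAScript grammar.

```cpp
#include <regex>
#include <string>

// regex_match succeeds only if the pattern covers the entire input.
bool is_full_match(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// regex_search succeeds if the pattern matches anywhere in the input.
bool has_match(const std::string& text, const std::string& pattern) {
    return std::regex_search(text, std::regex(pattern));
}

// Returns the text of the first match of `pattern` in `text`, or "" if none.
std::string first_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}
```

With these helpers, is_full_match("abc123", "\\d+") is false while has_match("abc123", "\\d+") is true. first_match("<a><b>", "<.+>") returns "<a><b>" because + is greedy, and the lazy form "<.+?>" returns "<a>".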
Using std::sregex_iterator to run regex_search repeatedly:

    // Needs <iostream>, <iterator>, <regex> and <string>
    const std::string s = "Quick brown fox.";
    std::regex words_regex("[^\\s]+"); // one or more non-whitespace characters
    auto words_begin = std::sregex_iterator(s.begin(), s.end(), words_regex);
    auto words_end = std::sregex_iterator(); // default-constructed: end-of-sequence iterator
    std::cout << "Found "
              << std::distance(words_begin, words_end) // number of matches
              << " words:\n";
    for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
        std::smatch match = *i; // each smatch (or cmatch) describes one match
        std::string match_str = match.str();
        std::cout << match_str << '\n';
    }
    std::string log(R"(
        Speed: 366
        Mass: 35
        Speed: 378
        Mass: 32
        Speed: 400
        Mass: 30)");
    std::regex r(R"(Speed:\s+\d+)"); // \s+ accepts the spaces or tabs after the colon
    std::smatch sm;
    while (std::regex_search(log, sm, r)) {
        std::cout << sm.str() << '\n';
        log = sm.suffix().str(); // keep searching in the text after this match
    }

    // C-style string demo
    std::cmatch cm;
    if (std::regex_search("this is a test", cm, std::regex("test")))
        std::cout << "\nFound " << cm[0] << " at position " << cm.prefix().length();
    std::string lines[] = {"Roses are #ff0000",
                           "violets are #0000ff",
                           "all of my base are belong to you"};
    std::regex color_regex("#([a-f0-9]{2})"
                           "([a-f0-9]{2})"
                           "([a-f0-9]{2})");

    // Simple check: does the line contain a match at all?
    for (const auto& line : lines) {
        std::cout << line << ": " << std::boolalpha
                  << std::regex_search(line, color_regex) << '\n';
    }
    std::cout << '\n';

    // Show the contents of each marked subexpression in every match
    std::smatch color_match;
    for (const auto& line : lines) {
        if (std::regex_search(line, color_match, color_regex)) {
            std::cout << "matches for '" << line << "'\n";
            std::cout << "Prefix: '" << color_match.prefix() << "'\n";
            for (size_t i = 0; i < color_match.size(); ++i)
                std::cout << i << ": " << color_match[i] << '\n';
            std::cout << "Suffix: '" << color_match.suffix() << "'\n\n";
        }
    }
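    std::string target("192.168.1.144:8080");
    // \\. matches a literal dot; an unescaped . would match any character
    std::regex re("(\\d{1,3}\\.)(\\d{1,3}\\.)(\\d{1,3}\\.)\\d+");
    std::smatch sm;
    std::cout << sm.size() << '\n'; // 0: nothing has been matched yet
    std::regex_search(target, sm, re);
    std::cout << sm.size() << '\n'; // 4: the whole IP address plus 3 subexpressions, accessed via operator[]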
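As a follow-up to the IP example, here is a rough sketch of pulling the host and port out of the same kind of string with regex_match and capture groups. parse_endpoint is a hypothetical helper of my own, not a library function, and it only checks the shape of the input: an octet such as 999 would still be accepted.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: splits "a.b.c.d:port" into host and port.
// Returns false (leaving host and port untouched) if the whole string
// does not have that shape. Octet ranges are NOT validated.
bool parse_endpoint(const std::string& text, std::string& host, int& port) {
    // Group 1: four dot-separated runs of 1-3 digits; group 2: 1-5 digit port.
    // (?:...) is a non-capturing group, so it does not add to smatch::size().
    static const std::regex re(R"((\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5}))");
    std::smatch m;
    if (!std::regex_match(text, m, re)) return false;
    host = m[1].str();            // first marked subexpression
    port = std::stoi(m[2].str()); // second marked subexpression
    return true;
}
```

regex_match is used instead of regex_search so that trailing junk after the port makes the call fail rather than being silently ignored.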