Regular expression handling in glibc

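This article looks at whether the glibc regular expression function regexec is thread-safe and gives a usage example. It also stresses the importance of releasing a regex_t with regfree, and shows the internal layout of the regex_t structure.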


(1) Is regexec thread-safe?

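This function is part of glibc. I searched around to find out whether it is thread-safe, but nobody seems to have said.

With library functions like this it pays to be careful and find out how they behave before relying on them.

So the only option is to look for the answer in the libc source.

The implementation is as follows: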

int
regexec (preg, string, nmatch, pmatch, eflags)
    const regex_t *__restrict preg;
    const char *__restrict string;
    size_t nmatch;
    regmatch_t pmatch[];
    int eflags;
{
  reg_errcode_t err;
  int start, length;
  re_dfa_t *dfa = (re_dfa_t *)preg->buffer;

  if (eflags & ~(REG_NOTBOL | REG_NOTEOL | REG_STARTEND))
    return REG_BADPAT;

  if (eflags & REG_STARTEND)
    {
      start = pmatch[0].rm_so;
      length = pmatch[0].rm_eo;
    }
  else
    {
      start = 0;
      length = strlen (string);
    }

  __libc_lock_lock (dfa->lock);
  if (preg->no_sub)
    err = re_search_internal (preg, string, length, start, length - start,
         length, 0, NULL, eflags);
  else
    err = re_search_internal (preg, string, length, start, length - start,
         length, nmatch, pmatch, eflags);
  __libc_lock_unlock (dfa->lock);
  return err != REG_NOERROR;
}

 

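It is indeed thread-safe: the call to re_search_internal is wrapped in __libc_lock_lock (dfa->lock) and __libc_lock_unlock (dfa->lock), so concurrent regexec calls on the same compiled regex_t are serialized.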
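To make the conclusion concrete, here is a minimal sketch of several threads calling regexec on one shared, already compiled regex_t. The helper name CountMatchesConcurrently and the one-thread-per-input layout are assumptions made for illustration; they are not from glibc or from the original example.

```cpp
#include <regex.h>
#include <atomic>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: compiles `pattern` once, then starts one thread per
// input, all of which call regexec on the same regex_t.
// Returns the number of inputs that matched, or -1 if regcomp failed.
int CountMatchesConcurrently(const std::string& pattern,
                             const std::vector<std::string>& inputs)
{
  regex_t reg;
  if (regcomp(&reg, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
    return -1;  // compilation failed, so there is nothing to free

  std::atomic<int> matched(0);
  std::vector<std::thread> workers;
  for (std::size_t i = 0; i < inputs.size(); ++i)
  {
    workers.emplace_back([&reg, &matched, &inputs, i] {
      // REG_NOSUB was used, so no match offsets are requested here.
      if (regexec(&reg, inputs[i].c_str(), 0, nullptr, 0) == 0)
        ++matched;
    });
  }
  for (std::thread& t : workers)
    t.join();

  regfree(&reg);  // only after every thread has finished using reg
  return matched.load();
}
```

Because the lock in the source above serializes the searches, sharing one regex_t this way is safe but the threads do not search in parallel. If that matters, compiling a separate regex_t per thread is one alternative. Note also that regfree must not run while another thread is still inside regexec.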

 

(2) Remember to release the regex_t with regfree

 

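A look at how regex_t is implemented makes the reason clear: it contains a buffer that holds the compiled pattern. A regex_t that is never passed to regfree leaks that memory.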

 

typedef struct re_pattern_buffer regex_t;

 

struct re_pattern_buffer
{
/* [[[begin pattern_buffer]]] */
  /* Space that holds the compiled pattern.  It is declared as
     `unsigned char *' because its elements are
     sometimes used as array indexes.  */
  unsigned char *buffer;

  /* Number of bytes to which `buffer' points.  */
  unsigned long int allocated;

  /* Number of bytes actually used in `buffer'.  */
  unsigned long int used;

  /* Syntax setting with which the pattern was compiled.  */
  reg_syntax_t syntax;

  /* Pointer to a fastmap, if any, otherwise zero.  re_search uses
     the fastmap, if there is one, to skip over impossible
     starting points for matches.  */
  char *fastmap;

  /* Either a translate table to apply to all characters before
     comparing them, or zero for no translation.  The translation
     is applied to a pattern when it is compiled and to a string
     when it is matched.  */
  RE_TRANSLATE_TYPE translate;

  /* Number of subexpressions found by the compiler.  */
  size_t re_nsub;

  /* Zero if this pattern cannot match the empty string, one else.
     Well, in truth it's used only in `re_search_2', to see
     whether or not we should use the fastmap, so we don't set
     this absolutely perfectly; see `re_compile_fastmap' (the
     `duplicate' case).  */
  unsigned can_be_null : 1;

  /* If REGS_UNALLOCATED, allocate space in the `regs' structure
       for `max (RE_NREGS, re_nsub + 1)' groups.
     If REGS_REALLOCATE, reallocate space if necessary.
     If REGS_FIXED, use what's there.  */
#define REGS_UNALLOCATED 0
#define REGS_REALLOCATE 1
#define REGS_FIXED 2
  unsigned regs_allocated : 2;

  /* Set to zero when `regex_compile' compiles a pattern; set to one
     by `re_compile_fastmap' if it updates the fastmap.  */
  unsigned fastmap_accurate : 1;

  /* If set, `re_match_2' does not return information about
     subexpressions.  */
  unsigned no_sub : 1;

  /* If set, a beginning-of-line anchor doesn't match at the
     beginning of the string.  */
  unsigned not_bol : 1;

  /* Similarly for an end-of-line anchor.  */
  unsigned not_eol : 1;

  /* If true, an anchor at a newline matches.  */
  unsigned newline_anchor : 1;

/* [[[end pattern_buffer]]] */
};
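Since the structure owns heap memory, one way to avoid forgetting regfree in C++ is to tie it to scope. The following is a minimal sketch under that assumption. ScopedRegex is a hypothetical wrapper written for this article, not something glibc provides.

```cpp
#include <regex.h>
#include <string>

// Hypothetical RAII wrapper (not part of glibc): regcomp runs in the
// constructor and regfree runs in the destructor, but only when the
// compilation succeeded.
class ScopedRegex
{
public:
  explicit ScopedRegex(const std::string& pattern, int cflags = REG_EXTENDED)
    : m_err(regcomp(&m_reg, pattern.c_str(), cflags)) {}

  ~ScopedRegex()
  {
    if (m_err == 0)
      regfree(&m_reg);
  }

  // Copying would lead to regfree being called twice on the same buffer.
  ScopedRegex(const ScopedRegex&) = delete;
  ScopedRegex& operator=(const ScopedRegex&) = delete;

  // True if the pattern compiled.
  bool Ok() const { return m_err == 0; }

  // True if the pattern matches somewhere in `text`.
  bool Search(const std::string& text) const
  {
    return m_err == 0 && regexec(&m_reg, text.c_str(), 0, nullptr, 0) == 0;
  }

private:
  regex_t m_reg;  // declared before m_err so it exists when regcomp runs
  int m_err;      // return value of regcomp, 0 on success
};
```

With this, a function can return early from any branch and the compiled pattern is still released when the ScopedRegex object goes out of scope.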

 

(3) Usage example:

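// Needs <regex.h>, <cstdio>, <iostream> and <string>;
// this is the body of a function that returns bool.
string strRegex = "....";  // the regular expression
string strValue = "...";   // the string to match against

regex_t reg;
if (regcomp(&reg, strRegex.c_str(), REG_EXTENDED) != 0)
{
  cout << "invalid regular expression" << endl;
  return false;  // regcomp failed, so reg is not passed to regfree
}

regmatch_t stRegMatch;
bool bFullMatch = false;
int nRet = regexec(&reg, strValue.c_str(), 1, &stRegMatch, 0);
if (nRet == 0)
{
  // rm_so is the offset where the match starts in the target string,
  // rm_eo is the offset just past the end of the match.
  // The result is true only if the match covers the whole string.
  bFullMatch = (stRegMatch.rm_so == 0
                && stRegMatch.rm_eo == (regoff_t)strValue.length());
}
else if (nRet != REG_NOMATCH)
{
  char ebuf[128];
  regerror(nRet, &reg, ebuf, sizeof(ebuf));
  fprintf(stderr, "regexec error: %s\n", ebuf);
}

regfree(&reg);  // release the compiled pattern on every path
return bFullMatch;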
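The example above asks for a single regmatch_t, which only describes the whole match. As a sketch of how the re_nsub field from the structure in section (2) can be used, the following sizes the pmatch array to also receive every parenthesized subexpression. The helper name CaptureGroups and its return convention are assumptions for illustration.

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole match followed by the text of each
// parenthesized subexpression. Returns an empty vector if the pattern does
// not compile or does not match.
std::vector<std::string> CaptureGroups(const std::string& pattern,
                                       const std::string& text)
{
  std::vector<std::string> groups;
  regex_t reg;
  if (regcomp(&reg, pattern.c_str(), REG_EXTENDED) != 0)
    return groups;

  // regcomp sets re_nsub to the number of subexpressions in the pattern;
  // slot 0 of the array is reserved for the whole match.
  std::vector<regmatch_t> matches(reg.re_nsub + 1);
  if (regexec(&reg, text.c_str(), matches.size(), matches.data(), 0) == 0)
  {
    for (const regmatch_t& m : matches)
    {
      if (m.rm_so < 0)
        groups.push_back("");  // this subexpression did not participate
      else
        groups.push_back(text.substr(m.rm_so, m.rm_eo - m.rm_so));
    }
  }

  regfree(&reg);
  return groups;
}
```

For instance, calling CaptureGroups("([0-9]+)-([0-9]+)", "range 10-25 end") yields the three strings "10-25", "10" and "25".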
