Using LPeg

bosswanghai

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/bosswanghai/article/details/53859735

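This article introduces LPeg, a pattern-matching library for Lua. It covers LPeg's role as a regular-expression-style library, how rules are constructed, functions such as `lpeg.match`, `lpeg.P`, and `lpeg.B`, and how to build patterns, define grammar rules, and use captures to implement recursive patterns and extract matched content.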

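Introduction

LPeg is a pattern-matching library for Lua. It fills the role that a regular-expression library plays elsewhere, but it is based on Parsing Expression Grammars (PEGs).

Reference: the official LPeg documentation.

Rules

LPeg lets you create patterns (rules) and combine them into larger ones.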

Operator	Description
lpeg.P(string)	Matches string literally
lpeg.P(n)	Matches exactly n characters
lpeg.S(string)	Matches any character in string (Set)
lpeg.R("xy")	Matches any character between x and y (Range)
patt^n	Matches at least n repetitions of patt
patt^-n	Matches at most n repetitions of patt
patt1 * patt2	Matches patt1 followed by patt2
patt1 + patt2	Matches patt1 or patt2 (ordered choice)
patt1 - patt2	Matches patt1 if patt2 does not match
-patt	Equivalent to ("" - patt)
`#patt`	Matches patt but consumes no input
lpeg.B(patt)	Matches patt behind the current position, consuming no input

LPeg also offers the re module, which implements patterns following a regular-expression style (e.g., [0-9]+). (This module is 260 lines of Lua code, and of course it uses LPeg to parse regular expressions and translate them to regular LPeg patterns.)
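As a minimal sketch of the re module mentioned above, assuming it is installed alongside LPeg and loadable as "re", the following matches a run of digits. The subject strings are made up for illustration.

```lua
local re = require "re"

-- re.match is anchored at the start of the subject, like lpeg.match.
-- With no captures it returns the position just after the match.
local after = re.match("2021abc", "[0-9]+")

-- re.find searches the subject and returns the start and end
-- indices of the first occurrence.
local first, last = re.find("abc 2021", "[0-9]+")

print(after)        -- 5
print(first, last)  -- 5   8
```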

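Usage

Functions

lpeg.match (pattern, subject [, init])

pattern: the pattern to match.

subject: the string to match against.

init: optional; the position in subject where matching starts (default 1).

Returns: the index of the first character after the match, or the captured values if the pattern contains captures; nil if the match fails. The match is anchored at init; LPeg does not search the subject for the pattern.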
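The return values of lpeg.match can be illustrated with a short sketch. The variable names and subject strings here are illustrative only.

```lua
local lpeg = require "lpeg"

local abc = lpeg.P("abc")

-- Success: returns the index of the first character after the match.
local pos_ok = lpeg.match(abc, "abcdef")

-- The match is anchored at position 1, so this fails and returns nil.
local pos_fail = lpeg.match(abc, "xabc")

-- init moves the starting position to index 2.
local pos_init = lpeg.match(abc, "xabc", 2)

-- With a capture, the captured value is returned instead of a position.
local text = lpeg.match(lpeg.C(abc), "abcdef")

print(pos_ok, pos_fail, pos_init, text)  -- 4   nil   5   abc
```

Patterns also expose match as a method, so abc:match("abcdef") is equivalent to lpeg.match(abc, "abcdef").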

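lpeg.type (value)

Returns the string "pattern" if value is a pattern; otherwise returns nil.

lpeg.version ()

Returns a string with the running version of LPeg. (In LPeg 1.1, lpeg.version is a plain string rather than a function.)

lpeg.setmaxstack (max)

Sets the maximum size of the backtrack stack that LPeg uses to track calls and choices. The default limit is 400.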
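A short sketch of these three utility functions follows. It assumes that lpeg.version may be either a function (older releases) or a string (newer releases) and handles both; the stack limit of 1000 is an arbitrary illustrative value.

```lua
local lpeg = require "lpeg"

local kind_pattern = lpeg.type(lpeg.P("a"))  -- "pattern"
local kind_string  = lpeg.type("a")          -- nil: a string is not a pattern

-- Assumption: older releases expose version as a function,
-- newer ones as a string; accept either form.
local version = lpeg.version
if type(version) == "function" then
  version = version()
end

-- Raise the backtrack stack limit for deeply nested input.
lpeg.setmaxstack(1000)

print(kind_pattern, kind_string, version)
```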

Constructing patterns

lpeg.P (value)

Converts the given value into a proper pattern. The result depends on the type of value, according to the following rules:

If the argument is a pattern, it is returned unmodified.

If the argument is a string, it is translated to a pattern that matches the string literally.

If the argument is a non-negative number n, the result is a pattern that matches exactly n characters.

If the argument is a negative number -n, the result is a pattern that succeeds only if the input string has less than n characters left: lpeg.P(-n) is equivalent to -lpeg.P(n) (see the unary minus operation).

If the argument is a boolean, the result is a pattern that always succeeds or always fails (according to the boolean value), without consuming any input.

If the argument is a table, it is interpreted as a grammar (see Grammars).

If the argument is a function, returns a pattern equivalent to a match-time capture over the empty string.
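The conversion rules above can be checked with a small sketch; the subject strings are illustrative.

```lua
local lpeg = require "lpeg"
local P = lpeg.P

local literal   = P("ab"):match("abc")   -- string: literal match
local two_chars = P(2):match("abc")      -- number n: exactly n characters
local too_short = P(2):match("a")        -- fails: only one character left
local at_end    = P(-1):match("")        -- negative: fewer than 1 character left
local not_end   = P(-1):match("a")       -- fails: one character is left
local always    = P(true):match("abc")   -- succeeds without consuming input
local never     = P(false):match("abc")  -- always fails

print(literal, two_chars, too_short)  -- 3   3   nil
print(at_end, not_end)                -- 1   nil
print(always, never)                  -- 1   nil
```

The idiom P(-1) is commonly appended to a pattern to require that the whole subject was consumed.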

lpeg.B(patt)

Returns a pattern that matches only if the input string at the current position is preceded by patt. Pattern patt must match only strings with some fixed length, and it cannot contain captures.

Like the and predicate, this pattern never consumes any input, independently of success or failure.
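A minimal sketch of lpeg.B, assuming an LPeg release that provides it: the pattern below consumes one character, then checks that the character just consumed was a digit, then matches "x". The names are illustrative.

```lua
local lpeg = require "lpeg"
local P, R, B = lpeg.P, lpeg.R, lpeg.B

local digit = R("09")

-- B(digit) looks one character behind the current position
-- and consumes nothing itself.
local x_after_digit = P(1) * B(digit) * P("x")

local ok       = x_after_digit:match("1x")  -- "x" is preceded by a digit
local rejected = x_after_digit:match("ax")  -- "x" is preceded by a letter
local at_start = B(P("a")):match("a")       -- nothing precedes position 1

print(ok, rejected, at_start)  -- 3   nil   nil
```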

lpeg.R ({range})

Returns a pattern that matches any single character belonging to one of the given ranges. Each range is a string xy of length 2, representing all characters with code between the codes of x and y (both inclusive).

As an example, the pattern lpeg.R("09") matches any digit, and lpeg.R("az", "AZ") matches any ASCII letter.

lpeg.S (string)

Returns a pattern that matches any single character that appears in the given string. (The S stands for Set.)

As an example, the pattern lpeg.S("+-*/") matches any arithmetic operator.

Note that, if s is a character (that is, a string of length 1), then lpeg.P(s) is equivalent to lpeg.S(s) which is equivalent to lpeg.R(s..s). Note also that both lpeg.S("") and lpeg.R() are patterns that always fail.
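The range and set constructors can be tried out as follows; this is a small sketch with illustrative subjects.

```lua
local lpeg = require "lpeg"

local digits   = lpeg.R("09")^1        -- one or more digits
local letter   = lpeg.R("az", "AZ")    -- any ASCII letter
local operator = lpeg.S("+-*/")        -- any arithmetic operator

local after_digits = digits:match("123abc")
local after_letter = letter:match("Q")
local after_op     = operator:match("*")

-- An empty set and an empty list of ranges never match.
local empty_set   = lpeg.S(""):match("a")
local empty_range = lpeg.R():match("a")

print(after_digits, after_letter, after_op)  -- 4   2   2
print(empty_set, empty_range)                -- nil   nil
```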

lpeg.V (v)

This operation creates a non-terminal (a variable) for a grammar. The created non-terminal refers to the rule indexed by v in the enclosing grammar.

lpeg.locale ([table])

Returns a table with patterns for matching some character classes according to the current locale. The table has fields named alnum, alpha, cntrl, digit, graph, lower, print, punct, space, upper, and xdigit, each one containing a correspondent pattern. Each pattern matches any single character that belongs to its class.

If called with an argument table, then it creates those fields inside the given table and returns that table.
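Both calling forms of lpeg.locale can be sketched as below. Passing the lpeg table itself, as in lpeg.locale(lpeg), is what makes names such as lpeg.alpha and lpeg.digit available. The table name mine is illustrative.

```lua
local lpeg = require "lpeg"

-- Without an argument, a fresh table of character classes is returned.
local classes = lpeg.locale()
local word_end = (classes.alpha^1):match("abc123")

-- With a table argument, the fields are added to that table,
-- and the same table is returned.
local mine = {}
local returned = lpeg.locale(mine)
local number_end = (mine.digit^1):match("42x")

print(word_end, number_end, returned == mine)  -- 4   3   true
```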

#patt

-patt
patt1 + patt2
patt1 - patt2
patt1 * patt2
patt^n
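Each operator listed above can be exercised in one line. This is a sketch with illustrative subjects; note that ^ binds more tightly than * and +, so parentheses are used around each pattern before calling match.

```lua
local lpeg = require "lpeg"
local P = lpeg.P

local at_least_one = (P("a")^1):match("aaab")            -- 1 or more "a"
local at_most_two  = (P("a")^-2):match("aaa")            -- at most 2 "a"
local sequence     = (P("a") * P("b")):match("ab")       -- "a" then "b"
local choice       = (P("a") + P("b")):match("b")        -- "a" or else "b"
local until_x      = ((P(1) - P("x"))^0):match("abxcd")  -- stop before "x"
local lookahead    = (#P("a")):match("abc")              -- consumes nothing
local end_of_input = (-P(1)):match("")                   -- no character left

print(at_least_one, at_most_two, sequence, choice)  -- 4   3   3   2
print(until_x, lookahead, end_of_input)             -- 3   1   1
```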

Grammars

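Why grammars exist: with ordinary Lua variables, patterns can only be built incrementally, each new pattern using previously defined ones, which rules out recursive patterns.
To define recursive patterns, we need grammars.

Example: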

equalcount = lpeg.P{
  "S";   -- initial rule name
  S = "a" * lpeg.V"B" + "b" * lpeg.V"A" + "",
  A = "a" * lpeg.V"S" + "b" * lpeg.V"A" * lpeg.V"A",
  B = "b" * lpeg.V"S" + "a" * lpeg.V"B" * lpeg.V"B",
} * -1

Each entry in the table represents one rule. The first element names the initial rule.
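To see the grammar in action, the sketch below repeats the equalcount definition so that it runs on its own, then adds a second recursive grammar for balanced parentheses in the style of the LPeg manual. The test strings are illustrative.

```lua
local lpeg = require "lpeg"
local V = lpeg.V

-- Strings with the same number of "a"s and "b"s.
local equalcount = lpeg.P{
  "S",   -- initial rule name
  S = "a" * V"B" + "b" * V"A" + "",
  A = "a" * V"S" + "b" * V"A" * V"A",
  B = "b" * V"S" + "a" * V"B" * V"B",
} * -1

-- Balanced parentheses: with no rule name given, entry 1 is the
-- initial rule, and V(1) refers back to it recursively.
local balanced = lpeg.P{
  "(" * ((1 - lpeg.S("()")) + V(1))^0 * ")"
}

local equal_ok   = equalcount:match("abab")
local equal_bad  = equalcount:match("aab")
local nested_ok  = balanced:match("(a(b)c)")
local nested_bad = balanced:match("(a(b")

print(equal_ok, equal_bad)    -- 5   nil
print(nested_ok, nested_bad)  -- 8   nil
```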

Captures

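A plain match returns only the position just after the matched text. To get the matched content itself, use captures.

Using captures and capture groups

lpeg.C: simple capture; returns the substring matched by the pattern
lpeg.Ct: table capture; collects all captures made by the pattern into a table
lpeg.Cg: group capture; groups the values of several lpeg.C captures. In the example that follows, combining the lpeg.C captures directly, without Cg, gives the same result
/ is also a capture and needs no lpeg.C; it passes the matched substring to a function or looks it up in a table
lpeg.Cs: substitution capture, similar to string.gsub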

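local lpeg = require "lpeg"
lpeg.locale(lpeg)  -- adds lpeg.alpha, lpeg.digit, etc. to the lpeg table

local name = lpeg.C(lpeg.alpha^1)    -- one or more letters
local colon = lpeg.P(":")
local kind = lpeg.C(lpeg.digit^-1)   -- at most one digit; named "kind" to avoid shadowing the built-in type
local value = lpeg.C(lpeg.P(1)^1)    -- everything that remains
local C = lpeg.Cg(name * colon * kind * colon * value)

local str = "name:1:value"

print(C:match(str))

Result: name 1 value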
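The / and lpeg.Cs captures from the list above can be sketched as follows. The lookup table and the replacement text are made up for illustration.

```lua
local lpeg = require "lpeg"
local P, R, Cs = lpeg.P, lpeg.R, lpeg.Cs

-- patt / function: the matched text is passed to the function.
local number = R("09")^1 / tonumber
local n = number:match("42")

-- patt / table: the matched text is used as a key into the table.
local lookup = R("az")^1 / { one = 1, two = 2 }
local two = lookup:match("two")

-- Cs: every "hello" is replaced by "bye"; other characters are kept.
local swap = Cs((P("hello") / "bye" + 1)^0)
local replaced = swap:match("hello world, hello Lua")

print(n, two)     -- 42   2
print(replaced)   -- bye world, bye Lua
```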

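Using a capture
- define the capture pattern
- run the match

Using a capture group
- define the capture patterns
- define the capture group pattern
- run the match

Using Ct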

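local lpeg = require "lpeg"
lpeg.locale(lpeg)  -- adds lpeg.alpha to the lpeg table

-- Ct collects every capture made by the inner pattern into one table
local name = lpeg.Ct(lpeg.C(lpeg.alpha^1))

local str = "name:1:value"

local t = name:match(str)   -- t[1] is "name"
for k, v in pairs(t) do
    print(k, " : ", v)
end

-- match returns the table itself, so this prints a table address
print(name:match(str))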
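A slightly larger sketch shows why Ct is useful: it can gather all fields of a colon-separated record, either positionally or under names given through lpeg.Cg. The field names name, kind, and value are illustrative choices.

```lua
local lpeg = require "lpeg"
local P, C, Cg, Ct = lpeg.P, lpeg.C, lpeg.Cg, lpeg.Ct

-- A field is any run of characters other than ":".
local field = C((1 - P(":"))^0)

-- Positional: every field capture lands in the table in order.
local list = Ct(field * (":" * field)^0)
local parts = list:match("name:1:value")

-- Named: Cg(patt, key) stores the captured value under that key.
local record = Ct(Cg(field, "name") * ":" * Cg(field, "kind") * ":" * Cg(field, "value"))
local r = record:match("name:1:value")

print(#parts, parts[1], parts[2], parts[3])  -- 3   name   1   value
print(r.name, r.kind, r.value)               -- name   1   value
```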