JavaScript Regular Expressions, Part 4: Greedy Matching, Non-Greedy Matching, and Anchors

Originally published 2025-10-13 20:22:30

License: CC 4.0 BY-SA

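Abstract
In regular expressions, greedy and non-greedy modes determine how quantifiers match, while anchors pin a match to a position. Quantifiers (*, +, {n,m}, and so on) are greedy by default: they match as much as possible and backtrack only as far as needed. Appending ? (as in *?, +?) switches a quantifier to non-greedy mode, which prefers the shortest match. Anchors match positions rather than characters: ^ matches the start, $ matches the end, and \b matches a word boundary (the junction between a word character and a non-word character, or between a word character and the start or end of the string).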

1. Greedy mode

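The quantifiers introduced in the previous article work in greedy mode by default, so they are also called greedy quantifiers. The example below is meant to match "hello" and "hi" separately, but because of greedy matching the result is "hello",she said "hi", which also satisfies the pattern. After the opening quote, .+ does not stop at the closing quote of "hello"; it keeps consuming characters to the end of the string. The engine then backtracks one character at a time until the final " in the pattern can match, which happens at the closing quote of "hi", and returns that result.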

let str = 'he said "hello",she said "hi".'
let reg = /".+"/g
console.log(str.match(reg))//[ '"hello",she said "hi"' ]

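2. Non-greedy mode

Non-greedy mode, also called lazy mode, is the opposite of greedy mode: the preceding element is repeated as few times as possible, and the shortest match that still lets the rest of the pattern succeed is returned. To enable it, add a question mark after the quantifier. Quantifiers in lazy mode are also called lazy quantifiers.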

let str = 'he said "hello",she said "hi".'
let reg = /".+?"/g
console.log(str.match(reg))//[ '"hello"', '"hi"' ]
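
The two modes can be compared side by side on the same string. The sketch below reuses the sentence from the examples above; the variable names are illustrative, and the third pattern, a negated character class, is a common alternative that this article does not cover, included only for comparison.

```javascript
// Illustrative names only; the sample sentence is the one used above.
const text = 'he said "hello",she said "hi".';

// Greedy: .+ runs to the end of the string, then backtracks to the last quote.
const greedy = text.match(/".+"/g);
// Lazy: .+? expands one character at a time and stops at the nearest quote.
const lazy = text.match(/".+?"/g);
// Alternative (not from the article): [^"]+ can never cross a quote.
const negated = text.match(/"[^"]+"/g);

console.log(greedy);  // [ '"hello",she said "hi"' ]
console.log(lazy);    // [ '"hello"', '"hi"' ]
console.log(negated); // [ '"hello"', '"hi"' ]
```

With the g flag, match returns only the matched substrings, which is why no index or input properties appear in the output.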

Greedy quantifier	Non-greedy quantifier
*	*?
+	+?
?	??
{n}	{n}?
{n,}	{n,}?
{n,m}	{n,m}?
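
To see how each pair in the table behaves, the following sketch runs both forms against the same input. The sample string 'aaaa' and the variable names are assumptions chosen for illustration.

```javascript
// Hypothetical sample input: four consecutive 'a' characters.
const sample = 'aaaa';

// Each entry pairs a greedy quantifier with its lazy counterpart.
const pairs = [
  [/a*/, /a*?/],
  [/a+/, /a+?/],
  [/a?/, /a??/],
  [/a{2}/, /a{2}?/],
  [/a{2,}/, /a{2,}?/],
  [/a{2,3}/, /a{2,3}?/],
];

// Collect the text matched by each form (index 0 of the match result).
const results = pairs.map(([greedy, lazy]) => [
  sample.match(greedy)[0],
  sample.match(lazy)[0],
]);

results.forEach(([g, l], i) => {
  console.log(`${pairs[i][0]} -> '${g}'   ${pairs[i][1]} -> '${l}'`);
});
// /a*/ -> 'aaaa'   /a*?/ -> ''
// /a+/ -> 'aaaa'   /a+?/ -> 'a'
// /a?/ -> 'a'   /a??/ -> ''
// /a{2}/ -> 'aa'   /a{2}?/ -> 'aa'
// /a{2,}/ -> 'aaaa'   /a{2,}?/ -> 'aa'
// /a{2,3}/ -> 'aaa'   /a{2,3}?/ -> 'aa'
```

Note that {n} and {n}? match the same text, because an exact count leaves nothing to shorten.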

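3. Anchors

An anchor is a special token in a regular expression that matches no characters itself; it matches a position, such as the start or end of the string.

^ (caret): matches the position at the start of the string; it is placed before the characters to match.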

let str = 'hello hello'
let reg = /^hello/g
console.log(str.match(reg))//[ 'hello' ] only the hello at the start of the string matches

reg = /hello/g
console.log(str.match(reg))//[ 'hello', 'hello' ]

$ (dollar sign): matches the position at the end of the string; it is placed after the characters to match.

let str = 'hello hello'
let reg = /hello$/g
console.log(str.match(reg))//[ 'hello' ] only the hello at the end of the string matches
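
Combining ^ and $ is a common way to require that the whole string fits a pattern. The sketch below shows that, plus the effect of the m (multiline) flag, which this article does not otherwise cover: with m, ^ and $ also match at line breaks. The helper name isHello and the sample strings are assumptions for illustration.

```javascript
// Hypothetical helper: true only when the entire string is exactly "hello".
const isHello = (s) => /^hello$/.test(s);

console.log(isHello('hello'));       // true
console.log(isHello('hello hello')); // false

// Without the m flag, ^ matches only at the very start of the string.
const lines = 'hello world\nhello again';
const withoutM = lines.match(/^hello/g);
// With the m flag, ^ also matches right after each line break.
const withM = lines.match(/^hello/gm);

console.log(withoutM); // [ 'hello' ]
console.log(withM);    // [ 'hello', 'hello' ]
```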

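\b: matches a word boundary, that is, the position where a word starts or ends. Unlike ^ and $, which match only the start and end of the string, \b can also match word boundaries inside the string. The rules for where a word boundary occurs are listed below; a word character is any character that \w matches, i.e. [a-zA-Z0-9_].

Before the first character of the string, if that character is a word character.
After the last character of the string, if that character is a word character.
Between two adjacent characters in the string, if one is a word character and the other is not.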

 |_hello|
 |hello|,|world|!

let str = '_hello'
let reg = /\bhello\b/g
console.log(str.match(reg))//null
reg = /\b_hello\b/g
console.log(str.match(reg))//[ '_hello' ]
str = 'hello,world!'
reg = /\b.{5}\b/g
console.log(str.match(reg))//[ 'hello', 'world' ]
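
One way to check the boundary rules is to insert a visible marker at every position where \b matches. The sketch below does that with replace; the helper name markBoundaries and the sentence used for the whole-word search are assumptions for illustration.

```javascript
// Hypothetical helper: puts "|" at every word-boundary position.
const markBoundaries = (s) => s.replace(/\b/g, '|');

console.log(markBoundaries('_hello'));       // |_hello|
console.log(markBoundaries('hello,world!')); // |hello|,|world|!

// \b on both sides finds "cat" only as a whole word, not inside
// "concat" or "category".
const wholeWords = 'cat concat category cat.'.match(/\bcat\b/g);
console.log(wholeWords); // [ 'cat', 'cat' ]
```

Because _ counts as a word character, there is no boundary between _ and h in '_hello', which is why /\bhello\b/ finds nothing there.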

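4. Summary

Greedy and non-greedy modes control how quantifiers match, and anchors fix where a match may occur. Quantifiers (*, +, {n,m}, and so on) are greedy by default: they take the longest possible match and backtrack only as far as needed. Adding ? (as in *?, +?) makes a quantifier non-greedy, so it prefers the shortest match. Anchors match positions, not characters: ^ matches the start, $ matches the end, and \b matches a word boundary (the junction between a word character and a non-word character, or between a word character and the start or end of the string).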

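Answers to the previous exercises

//Exercise 1
/*
An e-commerce project needs code that extracts the "category ID" parameter (the parameter name is always categoryId) from the URL of a product list page. The rules are:
1. The URL may have any valid prefix (such as http://example.com/list or https://shop.cn/goods);
2. The categoryId parameter appears only in the query string of the URL (after the ?), and its value consists only of digits (at least one);
3. The parameter may appear at the start, in the middle, or at the end of the query string (e.g. ?categoryId=123, ?page=2&categoryId=45, ?sort=price&categoryId=6). The match should have the form "categoryId=xx".
*/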
let test ="http://example.com/list?categoryId=123"
let reg=/categoryId=\d+/
console.log(test.match(reg))//'categoryId=123'
test ="https://shop.cn/goods?page=2&categoryId=45&sort=price"
console.log(test.match(reg))//'categoryId=45'
test ="https://m.shop.com?sort=new&categoryId=6"
console.log(test.match(reg))//'categoryId=6'
test ="http://example.com/list?cid=123"
console.log(test.match(reg))//null
test ="https://shop.cn/goods?categoryId=abc"
console.log(test.match(reg))//null
test ="https://m.shop.com?sort=new"
console.log(test.match(reg))//null
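
The answer above looks for categoryId= anywhere in the URL. As a stricter variation, and not the article's own solution, the sketch below reads the query string with the standard URL API and then uses the ^ and $ anchors from this article to require that the whole value is digits. The function name extractCategoryId is hypothetical.

```javascript
// Hypothetical helper: the name extractCategoryId is not from the article.
function extractCategoryId(url) {
  // searchParams only looks at the query string (the part after "?").
  const value = new URL(url).searchParams.get('categoryId');
  // ^ and $ force the whole value to be digits, so "123abc" is rejected.
  if (value === null || !/^\d+$/.test(value)) return null;
  return `categoryId=${value}`;
}

console.log(extractCategoryId('https://shop.cn/goods?page=2&categoryId=45&sort=price')); // categoryId=45
console.log(extractCategoryId('https://shop.cn/goods?categoryId=abc')); // null
console.log(extractCategoryId('http://example.com/list?cid=123')); // null
```

Whether a value such as 123abc should be rejected or partially matched depends on how the exercise is read; the unanchored /categoryId=\d+/ would return categoryId=123 for it.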

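//Exercise 2
/*
A back-office management system needs code that checks whether the path hierarchy of an admin URL is valid. The rules are: 1. The only allowed protocol is https (the URL must include https://); 2. The domain is fixed as admin.company.com (no other domain is allowed); 3. The path must strictly follow the structure "/admin/[module name]/[action name]", where the module name consists only of lowercase letters and _ (at least 2 characters, e.g. user, goods_manage) and the action name consists only of lowercase letters (at least 1 character, e.g. list, edit). Extra levels are not allowed (e.g. /admin/user/list/detail is invalid).
*/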
let test="https://admin.company.com/admin/user/list"
let reg = /^https:\/\/admin\.company\.com\/admin\/[a-z_]{2,}\/[a-z]+$/
console.log(test.match(reg))//'https://admin.company.com/admin/user/list'
test="https://admin.company.com/admin/goods_manage/edit"
console.log(test.match(reg))//'https://admin.company.com/admin/goods_manage/edit'
test="https://admin.company.com/admin/order/pay"
console.log(test.match(reg))//'https://admin.company.com/admin/order/pay'
test="http://admin.company.com/admin/user/list"
console.log(test.match(reg))//null
test="https://admin.other.com/admin/user/list"
console.log(test.match(reg))//null
test="https://admin.company.com/admin/user_list"
console.log(test.match(reg))//null
test="https://admin.company.com/admin/user/list/detail"
console.log(test.match(reg))//null
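
Since this exercise is a yes/no check, the same pattern can be wrapped in a boolean helper with test. The sketch below also shows why the anchors matter: without ^ and $, a URL with an extra path level would slip through. The names isValidAdminUrl and unanchored are hypothetical.

```javascript
// Same pattern as the answer above, anchored at both ends.
const adminUrlPattern = /^https:\/\/admin\.company\.com\/admin\/[a-z_]{2,}\/[a-z]+$/;
// Hypothetical helper: returns true or false instead of a match array.
const isValidAdminUrl = (url) => adminUrlPattern.test(url);

console.log(isValidAdminUrl('https://admin.company.com/admin/goods_manage/edit')); // true
console.log(isValidAdminUrl('https://admin.company.com/admin/user/list/detail'));  // false

// For comparison only: the same pattern without ^ and $ matches a
// valid-looking prefix and wrongly accepts the extra /detail level.
const unanchored = /https:\/\/admin\.company\.com\/admin\/[a-z_]{2,}\/[a-z]+/;
console.log(unanchored.test('https://admin.company.com/admin/user/list/detail')); // true
```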