ScriptCat Script Manager: Special Domain Matching Issues Explained

scriptcat: ScriptCat, a browser extension that runs user scripts. Project address: https://gitcode.com/gh_mirrors/sc/scriptcat

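Have you ever run into this: a carefully written user script refuses to run on certain sites, and although the domain looks like it matches, ScriptCat never activates it? Or a script unexpectedly runs on unrelated sites and causes side effects? Problems like these usually come from misunderstanding how ScriptCat matches domains.

This article walks through the special domain matching behavior of the ScriptCat script manager: how the matching modes differ, where the common pitfalls are, and which practices keep a script running on exactly the sites you intend.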

Overview of ScriptCat's Domain Matching

ScriptCat supports three main URL matching modes, each with its own syntax and behavior:

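Core matching engine

ScriptCat uses a matching engine based on the Chrome extension Match Pattern specification, with its own extensions and optimizations on top:

// Core rule type definitions
export const enum RuleType {
  MATCH_INCLUDE = 1,    // match pattern, include (@match)
  MATCH_EXCLUDE = 2,    // match pattern, exclude
  GLOB_INCLUDE = 3,     // glob pattern, include (@include)
  GLOB_EXCLUDE = 4,     // glob pattern, exclude (@exclude)
  REGEX_INCLUDE = 5,    // regular expression, include
  REGEX_EXCLUDE = 6,    // regular expression, exclude
}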

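The Three Matching Modes in Detail

1. Match Pattern (`@match`)

Match Pattern is the standard matching mode of Chrome extensions. Its syntax is the strictest of the three, and also the most precise.

Basic syntax

<scheme>://<host>/<path>

Special characters

Character	Meaning	Example	Matches
`*`	Wildcard	`https://*.example.com/*`	All subdomains of example.com
`?`	Not a wildcard in `@match`; single-character wildcard in `@include` globs only	`@include https://example.com/???.html`	File names of exactly three characters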

Supported schemes

// Supported schemes
@match https://example.com/*      // HTTPS
@match http://example.com/*       // HTTP
@match *://example.com/*          // any scheme
@match file:///path/to/file.html  // local files
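To make the `<scheme>://<host>/<path>` structure concrete, here is a minimal sketch that turns a match pattern into a regular expression. It follows a simplified reading of the Chrome match-pattern rules (where a `*` scheme stands for http or https) and is not ScriptCat's actual implementation; the function names `escapeRegExp` and `matchPatternToRegExp` are hypothetical.

```javascript
// Illustrative only: a simplified reading of the Chrome match-pattern rules,
// not ScriptCat's real matching engine.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function matchPatternToRegExp(pattern) {
  // Split the pattern into scheme, host and path.
  const parts = /^(\*|https?|file):\/\/([^/]*)(\/.*)$/.exec(pattern);
  if (parts === null) {
    throw new Error(`Invalid match pattern: ${pattern}`);
  }
  const [, scheme, host, path] = parts;

  // Under Chrome's rules a "*" scheme stands for http or https.
  const schemeRe = scheme === "*" ? "https?" : scheme;

  let hostRe;
  if (host === "*") {
    hostRe = "[^/]+";
  } else if (host.startsWith("*.")) {
    // "*.example.com" covers example.com and its subdomains.
    hostRe = "(?:[^/]+\\.)?" + escapeRegExp(host.slice(2));
  } else {
    hostRe = escapeRegExp(host); // empty for file:/// patterns
  }

  // In the path, "*" matches any run of characters; the rest is literal.
  const pathRe = path.split("*").map(escapeRegExp).join(".*");

  return new RegExp(`^${schemeRe}://${hostRe}${pathRe}$`);
}

const rule = matchPatternToRegExp("*://example.com/*");
console.log(rule.test("https://example.com/index.html"));
console.log(rule.test("ftp://example.com/index.html"));
```

Compiling each pattern once and reusing the resulting `RegExp` keeps the per-page check cheap, which is one reason pattern syntaxes like this are usually translated up front.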

2. Glob Pattern (`@include`/`@exclude`)

Glob patterns are more flexible: they support wildcards and the magic TLD.

Basic syntax

// Wildcard examples
@include *://*.google.com/search*
@exclude *://*.google.com/search?q=test*

// Magic TLD example
@include https://example.tld/*

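Magic TLD handling

ScriptCat automatically rewrites .tld as .??*/ so that the rule matches a range of top-level domains:

// Rule as written
@include https://example.tld/

// Rule as actually processed
@include https://example.??*/

This matches:

https://example.com/
https://example.org/
https://example.co.uk/
and every other top-level domain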
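The rewrite described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `.tld` to `.??*` substitution is taken from the text, while the helper names `expandMagicTld` and `globToRegExp`, and the glob semantics (`*` for any run of characters, `?` for exactly one), are assumptions rather than ScriptCat's real code.

```javascript
// Illustrative only: hypothetical helpers, not ScriptCat's implementation.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumed glob semantics: "*" = any run of characters, "?" = one character.
function globToRegExp(glob) {
  let source = "";
  for (const ch of glob) {
    if (ch === "*") source += ".*";
    else if (ch === "?") source += ".";
    else source += escapeRegExp(ch);
  }
  return new RegExp(`^${source}$`);
}

// Replace the ".tld" that ends the host with ".??*", as the text describes.
function expandMagicTld(glob) {
  return glob.replace(/\.tld(?=\/|$)/, ".??*");
}

const expanded = expandMagicTld("https://example.tld/");
const rule = globToRegExp(expanded);

console.log(expanded);
console.log(rule.test("https://example.co.uk/"));
console.log(rule.test("https://example.test.website.com/"));
```

Because `*` in this sketch also matches dots, the expanded rule accepts hosts with several labels after `example.`, which is worth keeping in mind when deciding whether the magic TLD is precise enough for a script.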

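3. Regex Pattern (regular expressions)

Regular expressions offer the most powerful matching, but they need to be used with care.

Basic syntax

// Regex match example
@include /^https?:\/\/(www\.)?example\.com\/.*$/

// Regex with a flag
@include /^https:\/\/example\.com\/api\/.+/i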
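As a rough illustration of how a `/pattern/flags` rule can be told apart from a glob and compiled, consider the sketch below. The function name `parseRegexRule` is hypothetical, and the parsing approach is an assumption for illustration, not a description of ScriptCat's parser.

```javascript
// Illustrative only: a hypothetical helper, not ScriptCat's parser.
// Returns a RegExp for values written as /pattern/flags, otherwise null.
function parseRegexRule(value) {
  const parts = /^\/(.+)\/([a-z]*)$/.exec(value);
  if (parts === null) {
    return null; // not a regex rule; treat it as a glob instead
  }
  // new RegExp throws a SyntaxError if the pattern or flags are invalid.
  return new RegExp(parts[1], parts[2]);
}

const apiRule = parseRegexRule(String.raw`/^https:\/\/example\.com\/api\/.+/i`);
console.log(apiRule.flags);
console.log(apiRule.test("https://example.com/api/users"));
console.log(parseRegexRule("*://example.com/*"));
```

Note that the `i` flag makes the whole expression case-insensitive, including the path, so it is best added only when that is really intended.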

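Special Domain Matching Issues Explained

1. Port number handling

ScriptCat has one special behavior when handling port numbers: it ignores them automatically.

// All of the following rules are equivalent
@match https://example.com:443/*
@match https://example.com:80/*
@match https://example.com:*/*
@match https://example.com/*

Problem scenario: if your script should run only on a site served from a specific port, ignoring the port leads to unexpected matches.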

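Solution: use a regular expression to match the port exactly. Pick the site's actual non-default port, because browsers omit a scheme's default port (443 for HTTPS, 80 for HTTP) from the page URL, so a rule that requires :443 never matches.

// Match port 8443 exactly
@include /^https:\/\/example\.com:8443\/.*$/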
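One caveat when matching ports with a regular expression is that browsers drop a scheme's default port from the URL they expose to scripts. The sketch below uses the standard `URL` class to show this; the port 8443 and the `portRule` expression are only examples.

```javascript
// The standard URL parser removes a scheme's default port (443 for https).
const withDefaultPort = new URL("https://example.com:443/page");
const withCustomPort = new URL("https://example.com:8443/page");

console.log(withDefaultPort.href); // https://example.com/page
console.log(withCustomPort.href);  // https://example.com:8443/page

// A regex rule that requires an explicit port only ever sees non-default ports.
const portRule = /^https:\/\/example\.com:8443\/.*$/;
console.log(portRule.test(withCustomPort.href));  // true
console.log(portRule.test(withDefaultPort.href)); // false
```

In practice this means a port-specific regex is useful for development servers and other non-default ports, while the default port is already implied by the scheme.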

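2. Matching local files

The file:// scheme has its own matching rules:

// Correct usage
@match file:///Users/username/test.html
@match file:///C:/path/to/file.html

// Incorrect usage (the rule never matches)
@match file://localhost/Users/username/test.html

Special restriction: the file:// scheme does not support wildcard hosts; you must give a concrete path.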
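A likely reason the `file://localhost/...` form cannot match is that the standard URL parser normalizes a `localhost` host on file URLs to an empty host, so the page URL a script manager sees always has the `file:///` shape. The sketch below demonstrates the normalization with the standard `URL` class; whether ScriptCat relies on exactly this behavior is an assumption.

```javascript
// For file URLs the standard URL parser replaces a "localhost" host with an empty host.
const viaLocalhost = new URL("file://localhost/Users/username/test.html");
const plain = new URL("file:///C:/path/to/file.html");

console.log(viaLocalhost.href); // file:///Users/username/test.html
console.log(viaLocalhost.host); // "" (empty string)
console.log(plain.href);        // file:///C:/path/to/file.html
```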

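3. The query string trap

URLs that contain a query string have one important edge case:

@match https://example.com/path?

// This matches:
// https://example.com/path?
// https://example.com/path?param=value
// But it does not match: https://example.com/path

Rationale: ScriptCat treats a lone ? as a valid representation of an empty query string.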

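4. Misconceptions about subdomain wildcards

// Common misconception: that this matches only one level of subdomain
@match *://*.example.com/*

// Under the Chrome match-pattern rules that @match builds on, a leading "*." covers the bare domain and subdomains at any depth
// Matches: example.com, sub.example.com, deep.sub.example.com
// Does not match: notexample.com

Recommended approach: one pattern already covers every level, so an extra line such as *://*.*.example.com/* is redundant. To restrict matching to exactly one subdomain level, use a regular expression instead:

// Match exactly one subdomain level
@include /^https?:\/\/[^.\/]+\.example\.com\/.*$/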

5. Edge cases in path matching

Pay particular attention to how the asterisk behaves in the path:

@match https://example.com/path*

// Matches:
// https://example.com/path
// https://example.com/path/
// https://example.com/path123
// https://example.com/path/other

// Does not match:
// https://example.com/otherpath

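Common Pitfalls and Solutions

Pitfall 1: Over-matching

Problem: an overly permissive pattern makes the script run on unrelated sites.

// Dangerous: matches every site
@match *://*/*

// Safer: restrict both the domain and the path
@match https://specific-site.com/specific-path/*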

Pitfall 2: Scheme mismatch

Problem: HTTP and HTTPS have to be handled separately.

// Matches HTTPS only
@match https://example.com/*

// Matches both HTTP and HTTPS
@match *://example.com/*

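Pitfall 3: Regular expression performance

Problem: complex regular expressions slow down script loading.

// Slow: one complex regular expression
@include /^https?:\/\/(?:www\.)?example\.com\/(?:path1|path2|path3)\/.+$/

// Faster: one @match line per path
// (these cover example.com only; add www.example.com lines if that host is needed)
@match *://example.com/path1/*
@match *://example.com/path2/*
@match *://example.com/path3/*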

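Pitfall 4: Misusing the magic TLD

Problem: the .tld magic value can match domains you did not expect.

@include https://example.tld/

// May unexpectedly match:
// https://example.com/
// https://example.org/
// https://example.test.website.com/  // watch out for this one!

Solution: name the domains explicitly, or use a more precise pattern.

// List the common top-level domains explicitly
@include https://example.com/*
@include https://example.org/*
@include https://example.net/*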

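Best Practices

1. Choosing a matching mode

Scenario	Recommended mode	Example
Exact domain matching	`@match`	`@match https://example.com/*`
Flexible wildcard matching	`@include`	`@include *://*.example.com/*`
Complex pattern matching	Regular expression	`@include /^https:\/\/api\.example\.com\/v\d+\/.*$/`
Excluding specific pages	`@exclude`	`@exclude *://example.com/admin/*`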

2. Combining several patterns

// Match the main domain
@match https://example.com/*

// Include all subdomains
@include *://*.example.com/*

// Exclude the admin area
@exclude *://*.example.com/admin/*

// Exclude the API documentation
@exclude *://*.example.com/api/docs/*
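The way such a combination is evaluated can be sketched as follows, assuming the convention common to userscript managers that an exclude rule overrides any include rule. The helpers `globToRegExp` and `shouldRun` are hypothetical, every rule is treated as a plain `*` glob for brevity, and none of this is ScriptCat's actual code.

```javascript
// Illustrative only: hypothetical helpers with simplified glob semantics.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Only "*" is treated as a wildcard in this sketch.
function globToRegExp(glob) {
  return new RegExp("^" + glob.split("*").map(escapeRegExp).join(".*") + "$");
}

// Assumed precedence: any matching exclude rule wins over the include rules.
function shouldRun(url, rules) {
  const hits = (patterns) => patterns.some((p) => globToRegExp(p).test(url));
  if (hits(rules.exclude)) {
    return false;
  }
  return hits(rules.include);
}

const rules = {
  include: ["https://example.com/*", "*://*.example.com/*"],
  exclude: ["*://*.example.com/admin/*", "*://*.example.com/api/docs/*"],
};

console.log(shouldRun("https://shop.example.com/cart", rules));
console.log(shouldRun("https://shop.example.com/admin/users", rules));
console.log(shouldRun("https://example.com/admin/users", rules));
```

With these plain glob semantics the last URL is not excluded, because `*.example.com` in the sketch requires a literal dot before `example.com`. If the admin area of the bare domain should be excluded as well, adding an explicit `@exclude https://example.com/admin/*` line removes any doubt.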

3. Performance recommendations

// Avoid overusing regular expressions
// ❌ Slow
@include /^https?:\/\/(?:www\.)?example\.com\/(?:path1|path2|path3)\/.+$/

// ✅ Fast
@match *://example.com/path1/*
@match *://example.com/path2/*
@match *://example.com/path3/*

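4. Maintainability recommendations

// Use comments to state the intent of each rule
// Match every page of the main site except the admin area
@match https://example.com/*
@exclude https://example.com/admin/*

// Match user-related pages on all subdomains
@include *://*.example.com/user/*
@include *://*.example.com/profile/*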

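Debugging and Testing Tips

1. Testing matches in the console

You can check a pattern against a URL in the browser console before putting it into a script:

// Sample test inputs
const testUrl = "https://example.com/path?param=value";
const pattern = "*://example.com/path*";

// Run these through ScriptCat's internal matching function to test them
// Note: this requires the appropriate permissions and context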
2. Pattern verification table

Build a table of test cases to verify your patterns:

Test URL	Pattern	Expected result
`https://example.com/`	`*://example.com/*`	✅ Match
`https://sub.example.com/`	`*://example.com/*`	❌ No match
`https://example.com/path`	`*://example.com/path*`	✅ Match
`https://example.com/other`	`*://example.com/path*`	❌ No match
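A table like this is easy to turn into an automated check. The sketch below is only an approximation: it uses a simplified glob matcher in which `*` is the only wildcard, the helper names `globToRegExp` and `runCases` are hypothetical, and the results reflect this simplified matcher rather than ScriptCat's engine.

```javascript
// Illustrative only: a simplified matcher for checking a table of test cases.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Only "*" is treated as a wildcard in this sketch.
function globToRegExp(glob) {
  return new RegExp("^" + glob.split("*").map(escapeRegExp).join(".*") + "$");
}

const cases = [
  { url: "https://example.com/", pattern: "*://example.com/*", expected: true },
  { url: "https://sub.example.com/", pattern: "*://example.com/*", expected: false },
  { url: "https://example.com/path", pattern: "*://example.com/path*", expected: true },
  { url: "https://example.com/other", pattern: "*://example.com/path*", expected: false },
];

// Returns the cases whose actual result differs from the expected one.
function runCases(testCases) {
  return testCases.filter(
    (c) => globToRegExp(c.pattern).test(c.url) !== c.expected
  );
}

const failures = runCases(cases);
console.log(failures.length === 0 ? "all cases pass" : failures);
```

Keeping the cases in a plain array makes it cheap to add a new row whenever a script misfires on a URL you did not anticipate.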

3. Troubleshooting flow for common problems

[Mermaid flowchart: troubleshooting flow for common matching problems]

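4. Live debugging tips

While developing a script, the following techniques help with live debugging:

// Log debug information at the top of the script
console.log('ScriptCat match info:', {
    url: window.location.href,
    matched: true, // replace with a real match check
    timestamp: new Date().toISOString()
});

// Or use a more advanced approach
if (typeof GM_info !== 'undefined') {
    console.log('GM info:', GM_info);
}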

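Summary

ScriptCat's domain matching is powerful, but it also has a number of subtle pitfalls and special behaviors. After working through this article, you should be able to:

Understand how the three matching modes differ and when each one applies
Recognize common matching problems and know how to fix them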


Authoring statement: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
