pgsql 筛选中文字符正则,PostgreSQL:正则表达式转义函数

在PostgreSQL中,为避免插入重复名称,一个过程会在现有名称后附加带计数的括号。当前的查询存在潜在问题,可能会因不正确的括号格式或包含正则字符的名称而引发错误。问题集中在如何正确使用正则表达式进行筛选,并转义可能存在的正则字符。解决方案提出了一种新的查询方法,通过`regexp_replace`和`substring`函数结合,来获取序列中的下一个数字并确保正则匹配的准确性。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

To forgo reading the entire problem, my basic question is:

Is there a function in PostgreSQL to escape regular expression characters in a string?

I've probed the documentation (but perhaps not thoroughly enough) but was unable to find such a function.

Here is the full problem:

In a PostgreSQL database, I have a column with unique names in it. I also have a process which periodically inserts names into this field, and, to prevent duplicates, if it needs to enter a name that already exists, it appends a space and parentheses with a count to the end.

i.e. Name, Name (1), Name (2), Name (3), etc.

As it stands, I use the following code to find the next number to add in the series (written in plpgsql):

var_name_id := 1;

SELECT CAST(substring(a.name from E'\\((\\d+)\\)$') AS int)

INTO var_last_name_id

FROM my_table.names a

WHERE a.name LIKE var_name || ' (%)'

ORDER BY CAST(substring(a.name from E'\\((\\d+)\\)$') AS int) DESC

LIMIT 1;

IF var_last_name_id IS NOT NULL THEN

var_name_id = var_last_name_id + 1;

END IF;

var_new_name := var_name || ' (' || var_name_id || ')';

("var_name" contains the name I'm trying to insert)

This works for now, but the problem lies in the WHERE statement:

WHERE a.name LIKE var_name || ' (%)'

This check doesn't verify that the % in question is a number, and it doesn't account for multiple parentheses, as in something like "Name ((1))", and if either case existed a cast exception would be thrown.

The WHERE statement really needs to be something more like:

WHERE a.r1_name ~* var_name || E' \\(\\d+\\)'

Now we come to the real problem, which is that "var_name" could contain regular expression characters.

So, my question as stated above is: Is there a function in PostgreSQL that escapes regular expression characters in a string, so I could do something like:

WHERE a.r1_name ~* regex_escape(var_name) || E' \\(\\d+\\)'

Any suggestions are much appreciated, including a possible reworking of my duplicate name solution.

解决方案

how about trying something like this, substituting var_name for my hard-coded 'John Bernard':

create table my_table(name text primary key);

insert into my_table(name) values ('John Bernard'),

('John Bernard (1)'),

('John Bernard (2)'),

('John Bernard (3)');

select max(regexp_replace(substring(name, 13), ' |\(|\)', '', 'g')::integer+1)

from my_table

where substring(name, 1, 12)='John Bernard'

and substring(name, 13)~'^ \([1-9][0-9]*\)$';

max

-----

4

(1 row)

one caveat: I am assuming single-user access to the database while this process is running (and so are you in your approach). If that is not the case then the max(n)+1 approach will not be a good one.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值