MongoDB regular expression with indexed field

本文探讨了在MongoDB中使用索引与正则表达式查询时的性能差异原因。通过对比带索引与不带索引的情况,解释了为何有时带索引查询可能比不带索引查询慢,并提出了提高查询效率的方法。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

http://stackoverflow.com/questions/8107102/mongodb-regular-expression-with-indexed-field

Q:

I was creating my first app using MongoDB. Created index for a field, and tried a find query with $regex param, launched in a shell

> db.foo.find({A:{$regex:'BLABLA!25500[0-9]'}}).explain()
{
        "cursor" : "BtreeCursor A_1 multi",
        "nscanned" : 500001,
        "nscannedObjects" : 10,
        "n" : 10,
        "millis" : 956,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
                "A" : [
                        [
                                "",
                                {

                                }
                        ],
                        [
                                /BLABLA!25500[0-9]/,
                                /BLABLA!25500[0-9]/
                        ]
                ]
        }
}

It's very strange, because when i'm launching the same query, but with no index in collection, the performance is much better.

> db.foo.find({A:{$regex:'BLABLA!25500[0-9]'}}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 500002,
        "nscannedObjects" : 500002,
        "n" : 10,
        "millis" : 531,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

Obviously, searching a field with index without regex is working much faster(i.e. searching document with constant field) , but i'm really interested in reason of such behavior.

A:

The reason for the performance differential here is likely that, with the index enabled, your query must traverse the index (load into memory), then load the matching documents to be returned into memory also. Since you are not using the prefix query all values in the index will be scanned and tested against the regular expression. Not very efficient.

When you remove the index you are just doing a table scan and matching the regex there - essentially you simplified things from the first one slightly.

You might be able to make the indexed version quicker if it were a covered index query, it would also likely be faster if this were a compound index and you needed to combine it with the criteria for another field.

When you use a prefix query, it's not that it only uses an index then, but you use the index efficiently, which is key, and hence you see the real performance gains.



评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值