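I recently wanted to learn about MongoDB, so I picked up MongoDB: The Definitive Guide. Later I wanted to see how other people approach it, so I signed up for an online course on the official site. The passage below comes from the "Inequalities On Strings" segment in lesson two of the session that started last week. I found it interesting and typed it out by hand as a note to myself.
In short, MongoDB orders strings lexicographically by their UTF-8 bytes, and $gt (greater than) and $lt (less than) can be used in queries on strings. The original text: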
Right now MongoDB has exactly zero knowledge of locales. In effect, the comparisons that we perform for $ less than, $ greater than and so forth are going to sort according to the total ordering of the UTF-8 code units. That is to say, according to a lexicographic sorting of the bytes in the UTF-8 representation of the string. This happens to be correct only in the POSIX or C locales. That is to say, MongoDB compares and sorts things in an ASCII-betically correct fashion. If it so happens that there's a locale for which sorting things in UTF-8 byte order happens to be correct, then MongoDB happens to agree with that as well. A future release of MongoDB is likely to have better support for locale-aware sorting, collation, and so forth.
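To make "lexicographic sorting of the bytes in the UTF-8 representation" concrete, here is a minimal sketch in plain JavaScript for Node.js. It does not call MongoDB. It only reproduces the byte-order comparison the passage describes. The helper name compareUtf8Bytes and the sample names are my own assumptions, not from the course.

```javascript
// Hypothetical helper: compare two strings by the bytes of their UTF-8 encoding.
// Buffer.compare returns -1, 0, or 1, which is exactly what Array.prototype.sort expects.
function compareUtf8Bytes(a, b) {
  return Buffer.compare(Buffer.from(a, "utf8"), Buffer.from(b, "utf8"));
}

// Made-up sample data. "\u00c9mile" is "Émile" with a precomposed É (UTF-8 bytes 0xC3 0x89).
const names = ["bob", "\u00c9mile", "Dave", "alice", "Zed", "Charlie"];

// Sort a copy so the original array is left untouched.
const byteOrder = [...names].sort(compareUtf8Bytes);

console.log(byteOrder);
// [ 'Charlie', 'Dave', 'Zed', 'alice', 'bob', 'Émile' ]
```

All uppercase ASCII letters (0x41 to 0x5A) come before all lowercase ones (0x61 to 0x7A), and the accented É lands last because its first UTF-8 byte is 0xC3. That is the "ASCII-betical" order from the quote, and it is not what a dictionary in most languages would do.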
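Here is an example from the course. It lists all documents whose name field sorts before "D" (in other words, names whose first letter comes before D):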
> db.people.find( { name: { $lt: "D" } } );
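As a rough check of which names such a query would pick up, the sketch below applies the same byte-order rule to an in-memory array in Node.js. It only imitates the comparison described in the quote and is not a MongoDB call. The sample documents and the helper ltUtf8 are assumptions of mine.

```javascript
// Hypothetical sample documents standing in for the people collection.
const people = [
  { name: "Alice" },
  { name: "Bob" },
  { name: "Charlie" },
  { name: "D" },
  { name: "Dave" },
  { name: "alice" },
  { name: "42nd Street" },
];

// Hypothetical helper: true when a sorts strictly before b in UTF-8 byte order.
function ltUtf8(a, b) {
  return Buffer.compare(Buffer.from(a, "utf8"), Buffer.from(b, "utf8")) < 0;
}

// Imitates { name: { $lt: "D" } } under byte-order comparison.
const matches = people
  .filter((doc) => ltUtf8(doc.name, "D"))
  .map((doc) => doc.name);

console.log(matches);
// [ 'Alice', 'Bob', 'Charlie', '42nd Street' ]
```

Two things stand out. "alice" is left out because lowercase "a" (0x61) is greater than "D" (0x44), so "first letter before D" really means "before D in byte order". And "42nd Street" is included because digits (0x30 to 0x39) sort before all letters.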