Optimizing a MySQL query to use a spatial index when selecting all points within a polygon

Latest recommended article published 2024-06-18 10:54:17

洋溢最棒

Article link: https://blog.youkuaiyun.com/weixin_32880357/article/details/113148676

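This article examines a performance bottleneck in MySQL spatial-index queries against large polygons and proposes three possible solutions: using LIMIT to cap the result set, forcing the use of a specific spatial index, and splitting the large polygon into smaller polygons.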

First, I admit that my experience with spatial functions is minimal. I have a MySQL table with 20 fields and 23549187 records containing geographical data. One of the fields, `point`, is of the POINT data type and has a spatial index on it. I have a query that selects all points within a polygon, which looks like this:

select * from `table_name` where ST_CONTAINS(ST_GEOMFROMTEXT('POLYGON((151.186 -23.497,151.207 -23.505,151.178 -23.496,151.174 -23.49800000000001,151.176 -23.496,151.179 -23.49500000000002,151.186 -23.497))'), `point`)

This works well when the polygon is small. However, when the polygon gets massive, execution becomes really slow; the slowest query so far ran for 15 minutes. Adding the index helped bring it down to 15 minutes from what would otherwise have been close to an hour. Is there anything I can do to improve this further?

This query will be run by a PHP script that runs as a daemon, and I am worried that these slow queries will bring the MySQL server down.

All suggestions to make it better are welcome. Thanks.

EDIT:

show create table;

CREATE TABLE `table_name` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `lat` float(12,6) DEFAULT NULL,
  `long` float(12,6) DEFAULT NULL,
  `point` point NOT NULL,
  PRIMARY KEY (`id`),
  KEY `lat` (`lat`,`long`),
  SPATIAL KEY `sp_index` (`point`)
) ENGINE=MyISAM AUTO_INCREMENT=47222773 DEFAULT CHARSET=utf8mb4

There are few more fields that I am not supposed to disclose it here however the filter won

Explain sql output for the slow query:

+----+-------------+------------+------+---------------+------+---------+------+----------+-------------+

+----+-------------+------------+------+---------------+------+---------+------+----------+-------------+

+----+-------------+------------+------+---------------+------+---------+------+----------+-------------+

Explain sql output for query with smaller polygons,

+----+-------------+------------+-------+---------------+----------+---------+------+------+-------------+

+----+-------------+------------+-------+---------------+----------+---------+------+------+-------------+

+----+-------------+------------+-------+---------------+----------+---------+------+------+-------------+

Looks like the biggest polygon does not use the index.

Solution

MySQL uses R-Trees for indexing spatial data. Like B-Tree indexes, these are optimal for queries targeting a small subset of the table's rows. As your bounding polygon gets larger, the number of possible matches increases and, at some point, the optimizer decides it is more efficient to switch to a full table scan. That appears to be the scenario here, and I see three options:

First, try adding a LIMIT to your query. Normally, MySQL ignores the index if the optimizer concludes that fewer I/O seeks would occur in a full table scan. But, with B-Tree indexes at least, MySQL will short-circuit that logic and always perform the B-Tree dive when LIMIT is present. I hypothesize that R-Trees have a similar short-circuit.

Second, and similar in spirit to the first, try forcing MySQL to use the index. This tells MySQL to treat the table scan as more expensive than the optimizer would otherwise estimate. Understand that the optimizer only has heuristics and doesn't really know how "expensive" things are beyond what its internal statistics conclude. We humans have intuition, which sometimes - sometimes - knows better.

select * from `table_name` force index (`sp_index`) where ST_CONTAINS(ST_GEOMFROMTEXT('POLYGON((151.186 -23.497,151.207 -23.505,151.178 -23.496,151.174 -23.49800000000001,151.176 -23.496,151.179 -23.49500000000002,151.186 -23.497))'), `point`)

Finally, if those don't work, then what you need to do is break up your bounding polygon into smaller polygons. For example, if your bounding polygon is a square 500km per side, break it up into 4 squares 250km on each side, or 16 squares 125km per side, etc. Then UNION all of these together. The index will be used on each one, and the cumulative result may be faster. (Note it's important to UNION them together: MySQL cannot apply multiple range scans on a spatial query.)
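
To make the third option concrete, here is a minimal sketch in Python that splits a rectangular bounding area into an n x n grid and builds one UNION query from the tiles. The helper names (`split_bbox`, `tile_wkt`, `build_union_query`) and the sample coordinates are hypothetical, and the `%s` placeholder style assumes a MySQL driver such as PyMySQL or mysqlclient. It only covers the rectangular case described above; an irregular polygon would need its tiles clipped to the original shape.

```python
def split_bbox(min_x, min_y, max_x, max_y, n):
    """Split an axis-aligned box into an n x n grid of (x0, y0, x1, y1) tiles."""
    dx = (max_x - min_x) / n
    dy = (max_y - min_y) / n
    tiles = []
    for i in range(n):
        for j in range(n):
            x0 = min_x + i * dx
            y0 = min_y + j * dy
            # Snap the last row/column to the outer bounds to avoid float drift.
            x1 = max_x if i == n - 1 else min_x + (i + 1) * dx
            y1 = max_y if j == n - 1 else min_y + (j + 1) * dy
            tiles.append((x0, y0, x1, y1))
    return tiles


def tile_wkt(x0, y0, x1, y1):
    """Return a closed WKT polygon ring for one rectangular tile."""
    return (f"POLYGON(({x0} {y0},{x1} {y0},{x1} {y1},"
            f"{x0} {y1},{x0} {y0}))")


def build_union_query(table, column, tiles):
    """Build one SELECT per tile, joined with UNION.

    Returns (sql, params). The WKT strings travel as query parameters;
    only the (trusted) table and column identifiers are interpolated.
    """
    select = (f"SELECT * FROM `{table}` "
              f"WHERE ST_CONTAINS(ST_GEOMFROMTEXT(%s), `{column}`)")
    sql = "\nUNION\n".join(f"({select})" for _ in tiles)
    params = [tile_wkt(*t) for t in tiles]
    return sql, params


# Hypothetical 1-degree box split into 4 tiles.
tiles = split_bbox(151.0, -24.0, 152.0, -23.0, 2)
sql, params = build_union_query("table_name", "point", tiles)
```

The resulting `sql` and `params` can be passed to a cursor's `execute(sql, params)`. One caveat to verify against your data: ST_CONTAINS does not count a point lying exactly on a polygon's boundary as contained, so points sitting precisely on an internal tile edge may be missed by every tile.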