PostgreSQL IN & = ANY (VALUES | ARRAY)
2014-10-16
A simple but non-obvious one-line change (ANY(ARRAY[...]) to ANY(VALUES(...))) in a (bad) PostgreSQL query cuts query time from 20 s to 0.2 s. Starting with low-level metrics, we make our way to your best friend: EXPLAIN ANALYZE. The amount of time invested will pay off a hundred times over. The Postgres community is your second best friend.
Monitoring slow Postgres queries with Postgres
Earlier this week the performance of one of our (many) databases was plagued by a few pathologically large primary-key queries on a smallish table (10 GB, 15 million rows) used to feed our graph editor.
In 99.9% of accounts these queries would be zippy. In a few cases where the number of tags used to annotate metrics is large, these queries would take up to 20 seconds. That much time spent in a database meant that someone was waiting in front of their browser for the graph editor to respond. This is obviously not the best experience for the unlucky 0.1%.
Here is the surprisingly simple culprit:
SELECT c.key,
       c.x_key,
       c.tags,
       x.name
  FROM context c
  JOIN x
    ON c.x_key = x.key
 WHERE c.key = ANY (ARRAY[15368196 /* , 11,000 other keys */])
   AND c.x_key = 1
   AND c.tags @> ARRAY[E'blah'];
Table X contains a few thousand rows; Table C contains 15 million rows. Both "key" columns are primary keys with proper indexing. Pretty straightforward: it's a simple primary-key lookup. Where things get interesting is when you increase the number of keys. At 11,000 keys we get the following plan, by prefixing EXPLAIN (ANALYZE, BUFFERS) to the query:
Nested Loop  (cost=6923.33..11770.59 rows=1 width=362) (actual time=17128.188..22109.283 rows=10858 loops=1)
  Buffers: shared hit=83494
  ->  Bitmap Heap Scan on context c  (cost=6923.33..11762.31 rows=1 width=329) (actual time=17128.121..22031.783 rows=10858 loops=1)
        Recheck Cond: ((tags @> '{blah}'::text[]) AND (x_key = 1))
        Filter: (key = ANY ('{15368196,(a lot more keys here)}'::integer[]))
        Buffers: shared hit=50919
        ->  BitmapAnd  (cost=6923.33..6923.33 rows=269 width=0) (actual time=132.910..132.910 rows=0 loops=1)
              Buffers: shared hit=1342
              ->  Bitmap Index Scan on context_tags_idx  (cost=0.00..1149.61 rows=15891 width=0) (actual time=64.614..64.614 rows=264777 loops=1)
                    Index Cond: (tags @> '{blah}'::text[])
                    Buffers: shared hit=401
              ->  Bitmap Index Scan on context_x_id_source_type_id_idx  (cost=0.00..5773.47 rows=268667 width=0) (actual time=54.648..54.648 rows=267659 loops=1)
                    Index Cond: (x_id = 1)
                    Buffers: shared hit=941
  ->  Index Scan using x_pkey on x  (cost=0.00..8.27 rows=1 width=37) (actual time=0.003..0.004 rows=1 loops=10858)
        Index Cond: (x.key = 1)
        Buffers: shared hit=32575
Total runtime: 22117.417 ms
As you can see at the bottom of the plan, the query took 22 seconds to execute. These 22 seconds showed up on our graphs as pure CPU execution, split 90/10 between Postgres and the OS, with very little disk I/O.
At the lowest level these queries looked like spikes of CPU utilization. CPU graphs are seldom useful, but in this case they confirmed a crucial point: the database was not waiting for the disks to read data. It was busy sorting, hashing and comparing rows.
The second interesting metric, which tracks the spikes very closely, is the number of rows "fetched" by Postgres (in this case not returned, just looked at and discarded).
Clearly something is regularly and methodically going through a lot of rows: our query.
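If you don't have a metrics agent collecting this for you, Postgres exposes the underlying counters itself. A minimal sketch, using the standard pg_stat_database view (these are cumulative counters, so sample the query twice and diff the values to see a rate):

-- Cumulative row counters for the current database. A fetch rate far
-- above the number of rows your queries actually deliver is the
-- "looked at and discarded" pattern described above.
SELECT datname,
       tup_returned,  -- rows read by scans
       tup_fetched    -- rows fetched via indexes
  FROM pg_stat_database
 WHERE datname = current_database();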
The Postgres Performance Problem: Bitmap Heap Scan
The rows_fetched metric is consistent with the following part of the plan:
Buffers: shared hit=83494
  ->  Bitmap Heap Scan on context c  (cost=6923.33..11762.31 rows=1 width=329) (actual time=17128.121..22031.783 rows=10858 loops=1)
        Recheck Cond: ((tags @> '{blah}'::text[]) AND (x_key = 1))
        Filter: (key = ANY ('{15368196,(a lot more keys here)}'::integer[]))
        Buffers: shared hit=50919
Postgres reads Table C using a Bitmap Heap Scan. When the number of keys to check stays small, it can efficiently use the index to build the bitmap in memory. If the bitmap gets too large, the query optimizer changes the way it looks up the data. In our case there is a large number of keys to check, so the planner uses a coarser, more approximate method to retrieve the candidate rows and then checks each row individually for a match on x_key and tags. All this "loading in memory" and "checking individual rows" takes time (the Recheck Cond in the plan).
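If you want to see what the planner would have done without the bitmap machinery, you can switch it off for a single session and compare plans. A sketch for experimentation only, using the standard enable_bitmapscan planner setting:

-- Session-local experiment: discourage bitmap scans, then re-run
-- EXPLAIN on the slow query and compare the plan shape and timings.
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.key, c.x_key, c.tags, x.name
  FROM context c
  JOIN x ON c.x_key = x.key
 WHERE c.key = ANY (ARRAY[15368196 /* , 11,000 other keys */])
   AND c.x_key = 1
   AND c.tags @> ARRAY[E'blah'];
RESET enable_bitmapscan;  -- restore the session default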
Luckily for us, the table is 30% loaded in RAM, so it is not as bad as retrieving the rows from disk, but it still has a very noticeable impact on performance. Remember that the query is quite simple: it's a primary-key lookup, so there aren't many obvious ways to fix it without dramatically re-architecting the database or the application. We turned to the community for help via the pgsql-performance mailing list.
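As an aside, the "30% loaded in RAM" figure can be checked from inside Postgres with the pg_buffercache contrib extension. A rough sketch, assuming the extension is available and the default 8 kB block size; note it only sees Postgres shared buffers, not the OS page cache:

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Share of the "context" table currently sitting in shared buffers.
SELECT round(100.0 * count(*) * 8192 / pg_relation_size('context'), 1)
       AS pct_in_shared_buffers
  FROM pg_buffercache
 WHERE relfilenode = pg_relation_filenode('context')
   AND reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database());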
The solution
This is yet another reason why we love open source and its culture of helping users. Tom Lane, one of the most prolific open-source authors around, suggested trying this instead:
SELECT c.key,
       c.x_key,
       c.tags,
       x.name
  FROM context c
  JOIN x
    ON c.x_key = x.key
 WHERE c.key = ANY (VALUES (15368196) /* , 11,000 other keys */)
   AND c.x_key = 1
   AND c.tags @> ARRAY[E'blah'];
Can you spot the difference? ARRAY has been changed to VALUES.
The query optimizer is fooled by the use of ARRAY[...] to list all the primary keys to look up. VALUES (...), (...) lets the optimizer make full use of the primary key index instead. It is literally a one-line change, which makes no semantic difference.
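One way to see why this helps: the VALUES list behaves like a small inline table, so the rewritten predicate is roughly what you would get from the explicit join below (not exactly identical if the key list contains duplicates; the second key here is made up for illustration):

-- Roughly equivalent shape: an inline table of keys driving one
-- primary-key probe per key, which is what the new plan's
-- Values Scan -> HashAggregate -> Index Scan pipeline does.
SELECT c.key, c.x_key, c.tags, x.name
  FROM (VALUES (15368196), (15368197)) AS v(key)  -- 15368197 is made up
  JOIN context c ON c.key = v.key
  JOIN x ON c.x_key = x.key
 WHERE c.x_key = 1
   AND c.tags @> ARRAY[E'blah'];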
Here is the plan of the new query. The two major differences: the keys are now scanned as an inline table and de-duplicated (the HashAggregate over a Values Scan), and the total runtime on the last line drops from 22,117 ms to 263 ms.
Nested Loop  (cost=168.22..2116.29 rows=148 width=362) (actual time=22.134..256.531 rows=10858 loops=1)
  Buffers: shared hit=44967
  ->  Index Scan using x_pkey on x  (cost=0.00..8.27 rows=1 width=37) (actual time=0.071..0.073 rows=1 loops=1)
        Index Cond: (id = 1)
        Buffers: shared hit=4
  ->  Nested Loop  (cost=168.22..2106.54 rows=148 width=329) (actual time=22.060..242.406 rows=10858 loops=1)
        Buffers: shared hit=44963
        ->  HashAggregate  (cost=168.22..170.22 rows=200 width=4) (actual time=21.529..32.820 rows=11215 loops=1)
              ->  Values Scan on "*VALUES*"  (cost=0.00..140.19 rows=11215 width=4) (actual time=0.005..9.527 rows=11215 loops=1)
        ->  Index Scan using context_pkey on context c  (cost=0.00..9.67 rows=1 width=329) (actual time=0.015..0.016 rows=1 loops=11215)
              Index Cond: (c.key = "*VALUES*".column1)
              Filter: ((c.tags @> '{blah}'::text[]) AND (c.x_id = 1))
              Buffers: shared hit=44963
Total runtime: 263.639 ms
From 22,000ms to 200ms. That’s a 100x speedup for a simple one-line change.