Обсуждение: Low perfomance SUM and Group by large databse
Hi,
--
Sergio Roberto Charpinel Jr.
I'm getting low performance on SUM and GROUP BY queries.
How can I improve my database to perform such queries.
Here is my table schema:
=> \d acct_2010_25
Tabela "public.acct_2010_25"
Coluna | Tipo | Modificadores
----------------+-----------------------------+------------------------------------------------------------------------
ip_src | inet | not null default '0.0.0.0'::inet
ip_dst | inet | not null default '0.0.0.0'::inet
as_src | bigint | not null default 0
as_dst | bigint | not null default 0
port_src | integer | not null default 0
port_dst | integer | not null default 0
tcp_flags | smallint | not null default 0
ip_proto | smallint | not null default 0
packets | integer | not null
flows | integer | not null default 0
bytes | bigint | not null
stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00 BC'::timestamp without time zone
stamp_updated | timestamp without time zone |
Índices:
"acct_2010_25_pk" PRIMARY KEY, btree (stamp_inserted, ip_src, ip_dst, port_src, port_dst, ip_proto)
"ibytes_acct_2010_25" btree (bytes)
Here is my one query example (could add pk to flow and packet fields):
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual time=276981.107..276981.133 rows=50 loops=1)
-> Sort (cost=3998662.81..4001046.07 rows=953305 width=50) (actual time=276981.105..276981.107 rows=50 loops=1)
Sort Key: sum(bytes)
-> GroupAggregate (cost=3499863.27..3754872.33 rows=953305 width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1)
-> Sort (cost=3499863.27..3523695.89 rows=9533049 width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1)
Sort Key: ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto
-> Seq Scan on acct_2010_25 (cost=0.00..352648.10 rows=9533049 width=50) (actual time=0.038..50860.391 rows=9494165 loops=1)
Filter: ((stamp_inserted >= '2010-06-20 10:10:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-21 10:10:00'::timestamp without time zone))
Total runtime: 278791.661 ms
(9 registros)
Another one just summing bytes (still low):
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto LIMIT 50 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual time=106261.359..106261.451 rows=50 loops=1)
-> GroupAggregate (cost=3395202.50..3602225.48 rows=974226 width=42) (actual time=106261.357..106261.435 rows=50 loops=1)
-> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42) (actual time=106261.107..106261.169 rows=176 loops=1)
Sort Key: ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto
-> Seq Scan on acct_2010_25 (cost=0.00..367529.72 rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 loops=1)
Filter: ((stamp_inserted >= '2010-06-20 10:10:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-21 10:10:00'::timestamp without time zone))
Total runtime: 109911.882 ms
(7 registros)
The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM.
I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just increased checkpoint_segments to 50).
What can I change to increase performance?
Thanks in advance.
Cheers.
Sergio Roberto Charpinel Jr.
On 21/06/10 22:42, Sergio Charpinel Jr. wrote: > Hi, > > I'm getting low performance on SUM and GROUP BY queries. > How can I improve my database to perform such queries. > -> Sort (cost=3499863.27..3523695.89 rows=9533049 > width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1) > Sort Key: ip_src, port_src, ip_dst, port_dst, > tcp_flags, ip_proto > -> Seq Scan on acct_2010_25 (cost=0.00..352648.10 > rows=9533049 width=50) (actual time=0.038..50860.391 rows=9494165 loops=1) > Filter: ((stamp_inserted >= '2010-06-20 > 10:10:00'::timestamp without time zone) AND (stamp_inserted < > '2010-06-21 10:10:00'::timestamp without time zone)) Provide an index on at least (ip_src,port_src,ip_dst,port_dst). If you frequently do other queries that only want some of that information you could create several individual indexes for those columns instead, as Pg will combine them for a query, but that is much less efficient than an index across all four columns. CREATE INDEX ip_peers_idx ON acct_2010_25(ip_src,port_src,ip_dst_port_dst); Every index added costs you insert/update/delete speed, so try to find the smallest/simplest index that gives you acceptable performance. > Another one just summing bytes (still low): > > => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, > ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE > "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 > 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto > LIMIT 50 OFFSET 0; Same deal. You have no suitable index, so Pg has to do a sequential scan of the table. Since you appear to query on stamp_inserted a lot, you should index it. -- Craig Ringer
On 22/06/10 00:42, Sergio Charpinel Jr. wrote: > Hi, > [snip] > > => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, > tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM > "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND > "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, > port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET 0; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual > time=276981.107..276981.133 rows=50 loops=1) > -> Sort (cost=3998662.81..4001046.07 rows=953305 width=50) > (actual time=276981.105..276981.107 rows=50 loops=1) > Sort Key: sum(bytes) > -> GroupAggregate (cost=3499863.27..3754872.33 rows=953305 > width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1) > -> Sort (cost=3499863.27..3523695.89 rows=9533049 > width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1) > Sort Key: ip_src, port_src, ip_dst, port_dst, > tcp_flags, ip_proto You are having to sort and aggregate a large number of rows before you can get the top 50. That's 9 million rows in this case, width 50 = 400MB+ sort. That's going to be slow as you are going to have to sort it on disk unless you bump up sort mem to 500Mb (bad idea). So unless you have really fast storage for temporary tables it's going to take a while. About 2.5 minutes you are experiencing at the moment is probably not too bad. I'm sure improvements have been made in the area since 8.1 and if you are able to upgrade to 8.4 which is also offered by Centos5 now, you might get benefit there. I can't remember the specific benefits, but I believe sorting speed has improved, your explain analyze will also give you more information about what's going on with disk/memory sorting. > -> Seq Scan on acct_2010_25 > (cost=0.00..352648.10 rows=9533049 width=50) (actual > time=0.038..50860.391 rows=9494165 loops=1) > Filter: ((stamp_inserted >= '2010-06-20 > 10:10:00'::timestamp without time zone) AND (stamp_inserted < > '2010-06-21 10:10:00'::timestamp without time zone)) > Total runtime: 278791.661 ms > (9 registros) > > Another one just summing bytes (still low): > > => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, > tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE > "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 > 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, > ip_proto LIMIT 50 OFFSET 0; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual > time=106261.359..106261.451 rows=50 loops=1) > -> GroupAggregate (cost=3395202.50..3602225.48 rows=974226 > width=42) (actual time=106261.357..106261.435 rows=50 loops=1) > -> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42) > (actual time=106261.107..106261.169 rows=176 loops=1) > Sort Key: ip_src, port_src, ip_dst, port_dst, > tcp_flags, ip_proto > -> Seq Scan on acct_2010_25 (cost=0.00..367529.72 > rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 loops=1) > Filter: ((stamp_inserted >= '2010-06-20 > 10:10:00'::timestamp without time zone) AND (stamp_inserted < > '2010-06-21 10:10:00'::timestamp without time zone)) > Total runtime: 109911.882 ms > (7 registros) > > > The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM. > I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just > increased checkpoint_segments to 50). Checkpoint segments won't help you as the number of segments is about writing to the database and how fast that can happen. > > What can I change to increase performance? Increasing sort-memory (work_mem) will give you speed benefits even though you are going to disk. I don't know how much spare memory you have, but trying other values between 8MB and 128MB may be useful just for the specific query runs. If you can afford 512Mb for each of the two sorts, go for that, but it's dangerous as mentioned due to the risk of using more RAM than you have. work_mem allocates that amount of memory per sort. If you are running these queries all the time, a summary table the produces there reports on a regular basis, maybe daily or even hourly would be useful. Basically the large amount of information that needs to be processed and sorted is what's taking all the time here. Regards Russell
Craig, Russel,
I appreciate your help.
Thanks.
--
Sergio Roberto Charpinel Jr.
2010/6/22 Russell Smith <mr-russ@pws.com.au>
On 22/06/10 00:42, Sergio Charpinel Jr. wrote:[snip]
> Hi,
>>You are having to sort and aggregate a large number of rows before you
> => explain analyze SELECT ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM
> "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND
> "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst,
> port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET 0;
>
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual
> time=276981.107..276981.133 rows=50 loops=1)
> -> Sort (cost=3998662.81..4001046.07 rows=953305 width=50)
> (actual time=276981.105..276981.107 rows=50 loops=1)
> Sort Key: sum(bytes)
> -> GroupAggregate (cost=3499863.27..3754872.33 rows=953305
> width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1)
> -> Sort (cost=3499863.27..3523695.89 rows=9533049
> width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1)
> Sort Key: ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto
can get the top 50. That's 9 million rows in this case, width 50 =
400MB+ sort. That's going to be slow as you are going to have to sort
it on disk unless you bump up sort mem to 500Mb (bad idea). So unless
you have really fast storage for temporary tables it's going to take a
while. About 2.5 minutes you are experiencing at the moment is probably
not too bad.
I'm sure improvements have been made in the area since 8.1 and if you
are able to upgrade to 8.4 which is also offered by Centos5 now, you
might get benefit there. I can't remember the specific benefits, but I
believe sorting speed has improved, your explain analyze will also give
you more information about what's going on with disk/memory sorting.Checkpoint segments won't help you as the number of segments is about
> -> Seq Scan on acct_2010_25
> (cost=0.00..352648.10 rows=9533049 width=50) (actual
> time=0.038..50860.391 rows=9494165 loops=1)
> Filter: ((stamp_inserted >= '2010-06-20
> 10:10:00'::timestamp without time zone) AND (stamp_inserted <
> '2010-06-21 10:10:00'::timestamp without time zone))
> Total runtime: 278791.661 ms
> (9 registros)
>
> Another one just summing bytes (still low):
>
> => explain analyze SELECT ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE
> "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21
> 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags,
> ip_proto LIMIT 50 OFFSET 0;
>
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual
> time=106261.359..106261.451 rows=50 loops=1)
> -> GroupAggregate (cost=3395202.50..3602225.48 rows=974226
> width=42) (actual time=106261.357..106261.435 rows=50 loops=1)
> -> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42)
> (actual time=106261.107..106261.169 rows=176 loops=1)
> Sort Key: ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto
> -> Seq Scan on acct_2010_25 (cost=0.00..367529.72
> rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 loops=1)
> Filter: ((stamp_inserted >= '2010-06-20
> 10:10:00'::timestamp without time zone) AND (stamp_inserted <
> '2010-06-21 10:10:00'::timestamp without time zone))
> Total runtime: 109911.882 ms
> (7 registros)
>
>
> The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM.
> I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just
> increased checkpoint_segments to 50).
writing to the database and how fast that can happen.Increasing sort-memory (work_mem) will give you speed benefits even
>
> What can I change to increase performance?
though you are going to disk. I don't know how much spare memory you
have, but trying other values between 8MB and 128MB may be useful just
for the specific query runs. If you can afford 512Mb for each of the
two sorts, go for that, but it's dangerous as mentioned due to the risk
of using more RAM than you have. work_mem allocates that amount of
memory per sort.
If you are running these queries all the time, a summary table the
produces there reports on a regular basis, maybe daily or even hourly
would be useful. Basically the large amount of information that needs
to be processed and sorted is what's taking all the time here.
Regards
Russell
--
Sergio Roberto Charpinel Jr.
Hi,
--
Sergio Roberto Charpinel Jr.
One more question about two specifics query behavior: If I add "AND (ip_dst = x.x.x.x)", it uses another plan and take a much more time. In both of them, I'm using WHERE clause. Why this behavior?
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto, bytes, packets, flows FROM "acct_2010_26" WHERE "stamp_inserted">='2010-06-28 09:07' AND "stamp_inserted"<'2010-06-29 08:07' AND (ip_dst = '8.8.8.8') ORDER BY bytes DESC LIMIT 50 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=496332.56..496332.69 rows=50 width=50) (actual time=125390.523..125390.540 rows=50 loops=1)
-> Sort (cost=496332.56..496351.35 rows=7517 width=50) (actual time=125390.520..125390.525 rows=50 loops=1)
Sort Key: bytes
-> Index Scan using acct_2010_26_pk on acct_2010_26 (cost=0.00..495848.62 rows=7517 width=50) (actual time=0.589..125385.680 rows=1011 loops=1)
Index Cond: ((stamp_inserted >= '2010-06-28 09:07:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-29 08:07:00'::timestamp without time zone) AND (ip_dst = '8.8.8.8'::inet))
Total runtime: 125390.711 ms
(6 registros)
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto, bytes, packets, flows FROM "acct_2010_26" WHERE "stamp_inserted">='2010-06-28 09:07' AND "stamp_inserted"<'2010-06-29 08:07' ORDER BY bytes DESC LIMIT 50 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..268.25 rows=50 width=50) (actual time=0.150..70.780 rows=50 loops=1)
-> Index Scan Backward using ibytes_acct_2010_26 on acct_2010_26 (cost=0.00..133240575.70 rows=24835384 width=50) (actual time=0.149..70.762 rows=50 loops=1)
Filter: ((stamp_inserted >= '2010-06-28 09:07:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-29 08:07:00'::timestamp without time zone))
Total runtime: 70.830 ms
(4 registros)
Thanks in advance.
2010/6/23 Sergio Charpinel Jr. <sergiocharpinel@gmail.com>
Craig, Russel,I appreciate your help.Thanks.2010/6/22 Russell Smith <mr-russ@pws.com.au>On 22/06/10 00:42, Sergio Charpinel Jr. wrote:[snip]
> Hi,
>>You are having to sort and aggregate a large number of rows before you
> => explain analyze SELECT ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM
> "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND
> "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst,
> port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET 0;
>
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual
> time=276981.107..276981.133 rows=50 loops=1)
> -> Sort (cost=3998662.81..4001046.07 rows=953305 width=50)
> (actual time=276981.105..276981.107 rows=50 loops=1)
> Sort Key: sum(bytes)
> -> GroupAggregate (cost=3499863.27..3754872.33 rows=953305
> width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1)
> -> Sort (cost=3499863.27..3523695.89 rows=9533049
> width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1)
> Sort Key: ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto
can get the top 50. That's 9 million rows in this case, width 50 =
400MB+ sort. That's going to be slow as you are going to have to sort
it on disk unless you bump up sort mem to 500Mb (bad idea). So unless
you have really fast storage for temporary tables it's going to take a
while. About 2.5 minutes you are experiencing at the moment is probably
not too bad.
I'm sure improvements have been made in the area since 8.1 and if you
are able to upgrade to 8.4 which is also offered by Centos5 now, you
might get benefit there. I can't remember the specific benefits, but I
believe sorting speed has improved, your explain analyze will also give
you more information about what's going on with disk/memory sorting.Checkpoint segments won't help you as the number of segments is about
> -> Seq Scan on acct_2010_25
> (cost=0.00..352648.10 rows=9533049 width=50) (actual
> time=0.038..50860.391 rows=9494165 loops=1)
> Filter: ((stamp_inserted >= '2010-06-20
> 10:10:00'::timestamp without time zone) AND (stamp_inserted <
> '2010-06-21 10:10:00'::timestamp without time zone))
> Total runtime: 278791.661 ms
> (9 registros)
>
> Another one just summing bytes (still low):
>
> => explain analyze SELECT ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE
> "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21
> 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags,
> ip_proto LIMIT 50 OFFSET 0;
>
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual
> time=106261.359..106261.451 rows=50 loops=1)
> -> GroupAggregate (cost=3395202.50..3602225.48 rows=974226
> width=42) (actual time=106261.357..106261.435 rows=50 loops=1)
> -> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42)
> (actual time=106261.107..106261.169 rows=176 loops=1)
> Sort Key: ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto
> -> Seq Scan on acct_2010_25 (cost=0.00..367529.72
> rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 loops=1)
> Filter: ((stamp_inserted >= '2010-06-20
> 10:10:00'::timestamp without time zone) AND (stamp_inserted <
> '2010-06-21 10:10:00'::timestamp without time zone))
> Total runtime: 109911.882 ms
> (7 registros)
>
>
> The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM.
> I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just
> increased checkpoint_segments to 50).
writing to the database and how fast that can happen.Increasing sort-memory (work_mem) will give you speed benefits even
>
> What can I change to increase performance?
though you are going to disk. I don't know how much spare memory you
have, but trying other values between 8MB and 128MB may be useful just
for the specific query runs. If you can afford 512Mb for each of the
two sorts, go for that, but it's dangerous as mentioned due to the risk
of using more RAM than you have. work_mem allocates that amount of
memory per sort.
If you are running these queries all the time, a summary table the
produces there reports on a regular basis, maybe daily or even hourly
would be useful. Basically the large amount of information that needs
to be processed and sorted is what's taking all the time here.
Regards
Russell
--
Sergio Roberto Charpinel Jr.
--
Sergio Roberto Charpinel Jr.
On Tue, Jun 29, 2010 at 7:59 AM, Sergio Charpinel Jr. <sergiocharpinel@gmail.com> wrote: > One more question about two specifics query behavior: If I add "AND (ip_dst > = x.x.x.x)", it uses another plan and take a much more time. In both of > them, I'm using WHERE clause. Why this behavior? With either query, the planner is choosing to scan backward through the acct_2010_26_pk index to get the rows in descending order by the "bytes" column. It keeps scanning until it finds 50 rows that match the WHERE clause. With just the critieria on stamp_inserted, matches are pretty common, so it doesn't have to scan very far before finding 50 suitable rows. But when you add the ip_dst = 'x.x.x.x' criterion, suddenly a much smaller percentage of the rows match and so it has to read much further into the index before it finds 50 that do. A second index on just the ip_dst column might help a lot - then it could consider index-scanning for the matching rows and sorting them afterwards. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company