Slow index scan on B-Tree index over timestamp field

Поиск
Список
Период
Сортировка
От Caio Casimiro
Тема Slow index scan on B-Tree index over timestamp field
Дата
Msg-id CAK42QYGWHKk+5HZAW4+z1+A2xMYUd1KmwsXp9OVtSud7ztA=4g@mail.gmail.com
обсуждение исходный текст
Ответы Re: Slow index scan on B-Tree index over timestamp field  (Kevin Grittner <kgrittn@ymail.com>)
Re: Slow index scan on B-Tree index over timestamp field  (Jeff Janes <jeff.janes@gmail.com>)
Re: Slow index scan on B-Tree index over timestamp field  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-performance
Hello all,

I have one query running at ~ 7 seconds and I would like to know if it's possible to make it run faster, once this query runs lots of time in my experiment.

Basically the query return the topics of tweets published by users that the user N follows and that are published between D1 and D2.

Query:

SELECT tt.tweet_id, tt.topic, tt.topic_value
            FROM tweet_topic AS tt LEFT JOIN tweet AS t ON tt.tweet_id = t.id
            WHERE creation_time BETWEEN 'D1' AND 'D2' AND user_id in
            (SELECT followed_id FROM relationship WHERE follower_id = N) ORDER BY tt.tweet_id;

Explain (Analyze, Buffers):

 Sort  (cost=3950701.24..3950708.22 rows=2793 width=20) (actual time=24062.951..24064.475 rows=1640 loops=1)
   Sort Key: tt.tweet_id
   Sort Method: quicksort  Memory: 97kB
   Buffers: shared hit=2390 read=32778
   I/O Timings: read=15118.402
   ->  Nested Loop  (cost=247.58..3950541.38 rows=2793 width=20) (actual time=532.578..24057.319 rows=1640 loops=1)
         Buffers: shared hit=2387 read=32778
         I/O Timings: read=15118.402
         ->  Hash Semi Join  (cost=229.62..73239.03 rows=1361 width=8) (actual time=391.768..15132.889 rows=597 loops=1)
               Hash Cond: (t.user_id = relationship.followed_id)
               Buffers: shared hit=539 read=31862
               I/O Timings: read=6265.279
               ->  Index Scan using tweet_creation_time_index on tweet t  (cost=0.57..68869.39 rows=1472441 width=16) (actual time=82.752..11418.043 rows=175
9645 loops=1)
                     Index Cond: ((creation_time >= '2013-05-05 00:00:00-03'::timestamp with time zone) AND (creation_time <= '2013-05-06 00:00:00-03'::times
tamp with time zone))
                     Buffers: shared hit=534 read=31859
                     I/O Timings: read=6193.764
               ->  Hash  (cost=227.12..227.12 rows=154 width=8) (actual time=72.175..72.175 rows=106 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 3kB
                     Buffers: shared hit=5 read=3
                     I/O Timings: read=71.515
                     ->  Index Only Scan using relationship_id on relationship  (cost=0.42..227.12 rows=154 width=8) (actual time=59.395..71.972 rows=106 loo
ps=1)
                           Index Cond: (follower_id = 335093362)
                           Heap Fetches: 0
                           Buffers: shared hit=5 read=3
                           I/O Timings: read=71.515
         ->  Bitmap Heap Scan on tweet_topic tt  (cost=17.96..2841.63 rows=723 width=20) (actual time=14.909..14.917 rows=3 loops=597)
               Recheck Cond: (tweet_id = t.id)
               Buffers: shared hit=1848 read=916
               I/O Timings: read=8853.123
               ->  Bitmap Index Scan on tweet_topic_pk  (cost=0.00..17.78 rows=723 width=0) (actual time=9.793..9.793 rows=3 loops=597)
                     Index Cond: (tweet_id = t.id)
                     Buffers: shared hit=1764 read=631
                     I/O Timings: read=5811.532
 Total runtime: 24066.145 ms
(34 rows)



Table structure:

                                     Table "public.tweet"
    Column        |           Type                     | Modifiers | Storage  | Stats target | Description 
-----------------------+--------------------------------------+--------------+-------------+------------------+-----------------
 id                   | bigint                               | not null    | plain    |              | 
 content           | text                                 |                | extended |              | 
 creation_time  | timestamp with time zone |                | plain    |              | 
 user_id           | bigint                               |                | plain    |              | 
 retweeted       | bigint                               |                | plain    |              | 
 retweet_count | integer                             |                | plain    |              | 
Indexes:
    "tweet_plk" PRIMARY KEY, btree (id) CLUSTER
    "tweet_creation_time_index" btree (creation_time)
    "tweet_id_index" hash (id)
    "tweet_ios_index" btree (id, user_id, creation_time)
    "tweet_retweeted_idx" hash (retweeted)
    "tweet_user_id_creation_time_index" btree (creation_time, user_id)
    "tweet_user_id_index" hash (user_id)

System Information:
OS: Slackware 14.0
Postgresql Version: 9.3 Beta2

postgresql.conf Settings:

work_mem = 128MB
shared_buffers = 1GB
maintenance_work_mem = 1536MB
fsync = off
synchronous_commit = off
effective_cache_size = 2GB

Additional information:

All tables in this database are read only tables. I haven't post the details about other tables to not let the email big, as it seems the problem is with the 'tweet' table.

Any help would be much appreciated.
Best regards,
Caio Casimiro.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Hot Standby performance issue
Следующее
От: Kevin Grittner
Дата:
Сообщение: Re: Slow index scan on B-Tree index over timestamp field