Обсуждение: Using BOOL in indexes

Поиск
Список
Период
Сортировка

Using BOOL in indexes

От
Andrew McMillan
Дата:
Hi,

I'm trying to convert an application from MS SQL / ASP / IIS to
PostgreSQL / PHP / Apache.  I am having trouble getting efficient
queries on one of my main tables, which tends to have some fairly large
records in it.  Currently there are around 20000 records, and it looks
like they average around 500 bytes from the VACUUM ANALYZE statistics
below.

I don't really want any query on this table to return more than about 20
records, so it seems to me that indexed access should be the answer, but
I am having some problems with indexes containing BOOLEAN types.

I can't see any reason why BOOL shouldn't work in an index, and in other
systems I have commonly used them as the first component of an index,
which is what I want to do here.

Also, I can't see why the estimator should see a difference between
"WHERE head1" and "WHERE head1=TRUE".

Any help appreciated,                Andrew.


newsroom=# \d story               Table "story" Attribute   |   Type    |     Modifier      
--------------+-----------+-------------------story_id     | integer   | not nullauthor       | integer   | written
| timestamp | released     | timestamp | withdrawn    | timestamp | sent         | timestamp | wcount       | integer
|default 0chunk_count  | integer   | head1        | boolean   | default 'f'::boolheadpriority | integer   | default
999internal    | boolean   | default 'f'::boolislive       | boolean   | default 'f'::boolstory_type   | char(4)   |
title       | text      | precis       | text      | 
 
Indices: story_oid_skey,        story_pkey,        story_sk1,        story_sk2,        story_sk4

newsroom=# \d story_sk4  Index "story_sk4"Attribute |   Type    
-----------+-----------head1     | booleanwritten   | timestamp
btree

newsroom=# explain SELECT DISTINCT story.story_id, written, released,
title, precis, author, head1 FROM story WHERE head1 ORDER BY written
DESC LIMIT 15;
NOTICE:  QUERY PLAN:

Unique  (cost=2623.87..2868.99 rows=1401 width=49) ->  Sort  (cost=2623.87..2623.87 rows=14007 width=49)       ->  Seq
Scanon story  (cost=0.00..1421.57 rows=14007 width=49)
 

EXPLAIN

newsroom=# set enable_seqscan to 'off';
SET VARIABLE

newsroom=# explain SELECT DISTINCT story.story_id, written, released,
title, precis, author, head1 FROM story WHERE head1 ORDER BY written
DESC LIMIT 15;
NOTICE:  QUERY PLAN:

Unique  (cost=100002623.87..100002868.99 rows=1401 width=49) ->  Sort  (cost=100002623.87..100002623.87 rows=14007
width=49)      ->  Seq Scan on story  (cost=100000000.00..100001421.57
 
rows=14007 width=49)

EXPLAIN

newsroom=# explain SELECT DISTINCT story.story_id, written, released,
title, precis, author FROM story WHERE head1=TRUE LIMIT 15;
NOTICE:  QUERY PLAN:

Unique  (cost=8846.22..9056.33 rows=1401 width=48) ->  Sort  (cost=8846.22..8846.22 rows=14007 width=48)       ->
IndexScan using story_sk4 on story  (cost=0.00..7645.97
 
rows=14007 width=48)

EXPLAIN

newsroom=# vacuum verbose analyze story;
NOTICE:  --Relation story--
NOTICE:  Pages 1238: Changed 0, reaped 0, Empty 0, New 0; Tup 18357: Vac
0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 84, MaxLen 3115; Re-using:
Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.16s/1.90u sec.
NOTICE:  Index story_oid_skey: Pages 39; Tuples 18357. CPU 0.00s/0.07u
sec.
NOTICE:  Index story_sk4: Pages 94; Tuples 18357. CPU 0.01s/0.08u sec.
NOTICE:  Index story_sk2: Pages 51; Tuples 18357. CPU 0.01s/0.08u sec.
NOTICE:  Index story_sk1: Pages 70; Tuples 18357. CPU 0.02s/0.06u sec.
NOTICE:  Index story_pkey: Pages 59; Tuples 18357. CPU 0.02s/0.06u sec.
VACUUM


-- 
_____________________________________________________________________           Andrew McMillan, e-mail:
Andrew@cat-it.co.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267


RE: Using BOOL in indexes

От
"Hiroshi Inoue"
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> Behalf Of Andrew McMillan
> 
> Hi,
> 
> I'm trying to convert an application from MS SQL / ASP / IIS to
> PostgreSQL / PHP / Apache.  I am having trouble getting efficient
> queries on one of my main tables, which tends to have some fairly large
> records in it.  Currently there are around 20000 records, and it looks
> like they average around 500 bytes from the VACUUM ANALYZE statistics
> below.
> 
> I don't really want any query on this table to return more than about 20
> records, so it seems to me that indexed access should be the answer, but
> I am having some problems with indexes containing BOOLEAN types.
> 
> I can't see any reason why BOOL shouldn't work in an index, and in other
> systems I have commonly used them as the first component of an index,
> which is what I want to do here.
>
> Also, I can't see why the estimator should see a difference between
> "WHERE head1" and "WHERE head1=TRUE".
> 
> 
> newsroom=# explain SELECT DISTINCT story.story_id, written, released,
> title, precis, author, head1 FROM story WHERE head1 ORDER BY written

Please add head1 to ORDER BY clause i.e. ORDER BY head1,written.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



RE: Using BOOL in indexes

От
"Hiroshi Inoue"
Дата:
> -----Original Message-----
> From: Hiroshi Inoue [mailto:Inoue@tpf.co.jp]
> > -----Original Message-----
> > From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> > Behalf Of Andrew McMillan
> >
> > Hi,
> >
> > I'm trying to convert an application from MS SQL / ASP / IIS to
> > PostgreSQL / PHP / Apache.  I am having trouble getting efficient
> > queries on one of my main tables, which tends to have some fairly large
> > records in it.  Currently there are around 20000 records, and it looks
> > like they average around 500 bytes from the VACUUM ANALYZE statistics
> > below.
> >
> > I don't really want any query on this table to return more than about 20
> > records, so it seems to me that indexed access should be the answer, but
> > I am having some problems with indexes containing BOOLEAN types.
> >
> > I can't see any reason why BOOL shouldn't work in an index, and in other
> > systems I have commonly used them as the first component of an index,
> > which is what I want to do here.
> >
> > Also, I can't see why the estimator should see a difference between
> > "WHERE head1" and "WHERE head1=TRUE".
> >
> >
> > newsroom=# explain SELECT DISTINCT story.story_id, written, released,
> > title, precis, author, head1 FROM story WHERE head1 ORDER BY written
>
> Please add head1 to ORDER BY clause i.e. ORDER BY head1,written.
>

Sorry,it wouldn't help unless there's an index e.g. on (head1,written,
story_id, released, title, precis, author).
However isn't (story_id) a primary key ?
If so,couldn't you change your query as follows ?

SELECT story.story_id, written, released, title, precis, author, head1
FROM story WHERE head1=TRUE ORDER BY head1, written DESC
LIMIT 15.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: Using BOOL in indexes

От
Andrew McMillan
Дата:
Hiroshi Inoue wrote:
> Hiroshi Inoue wrote:
> > Andrew McMillan wrote:
> > >
> > > Hi,
> > >
> > > I'm trying to convert an application from MS SQL / ASP / IIS to
> > > PostgreSQL / PHP / Apache.  I am having trouble getting efficient
> > > queries on one of my main tables, which tends to have some fairly large
> > > records in it.  Currently there are around 20000 records, and it looks
> > > like they average around 500 bytes from the VACUUM ANALYZE statistics
> > > below.
> > >
> > > I don't really want any query on this table to return more than about 20
> > > records, so it seems to me that indexed access should be the answer, but
> > > I am having some problems with indexes containing BOOLEAN types.
> > >
> > > I can't see any reason why BOOL shouldn't work in an index, and in other
> > > systems I have commonly used them as the first component of an index,
> > > which is what I want to do here.
> > >
> > > Also, I can't see why the estimator should see a difference between
> > > "WHERE head1" and "WHERE head1=TRUE".
> > >
> > >
> > > newsroom=# explain SELECT DISTINCT story.story_id, written, released,
> > > title, precis, author, head1 FROM story WHERE head1 ORDER BY written
> >
> > Please add head1 to ORDER BY clause i.e. ORDER BY head1,written.
> >
> 
> Sorry,it wouldn't help unless there's an index e.g. on (head1,written,
> story_id, released, title, precis, author).
> However isn't (story_id) a primary key ?
> If so,couldn't you change your query as follows ?
> 
> SELECT story.story_id, written, released, title, precis, author, head1
> FROM story WHERE head1=TRUE ORDER BY head1, written DESC
> LIMIT 15.

Thanks Hiroshi,

I already have such an index, but as you can see below, it is still not
used:

newsroom=# explain SELECT story.story_id, written, released, title,
precis, author, head1 FROM story WHERE head1=TRUE ORDER BY head1,
written DESC LIMIT 15;
NOTICE:  QUERY PLAN:

Sort  (cost=2669.76..2669.76 rows=14007 width=49) ->  Seq Scan on story  (cost=0.00..1467.46 rows=14007 width=49)

EXPLAIN
newsroom=# \d story_sk4  Index "story_sk4"Attribute |   Type    
-----------+-----------head1     | booleanwritten   | timestamp
btree

Regards,                Andrew.

-- 
_____________________________________________________________________           Andrew McMillan, e-mail:
Andrew@cat-it.co.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267


RE: Using BOOL in indexes

От
"Hiroshi Inoue"
Дата:
> -----Original Message-----
> From: andrew@socrates.catalyst.net.nz
> 
> Hiroshi Inoue wrote:
> > Hiroshi Inoue wrote:
> > > Andrew McMillan wrote:
> > > >
> > > > Hi,
> > > >
> > 
> > Sorry,it wouldn't help unless there's an index e.g. on (head1,written,
> > story_id, released, title, precis, author).
> > However isn't (story_id) a primary key ?
> > If so,couldn't you change your query as follows ?
> > 
> > SELECT story.story_id, written, released, title, precis, author, head1
> > FROM story WHERE head1=TRUE ORDER BY head1, written DESC
> > LIMIT 15.
> 
> Thanks Hiroshi,
> 
> I already have such an index, but as you can see below, it is still not
> used:
> 
> newsroom=# explain SELECT story.story_id, written, released, title,
> precis, author, head1 FROM story WHERE head1=TRUE ORDER BY head1,
> written DESC LIMIT 15;

Oops,please add DESC to head1 also i.e ORDER BY head1 DESC,written DESC.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp