Обсуждение: Effect of large text field data on queries

Поиск
Список
Период
Сортировка

Effect of large text field data on queries

От
Glen Parker
Дата:
I am having some query problems on a table with large text fields.

The table contains 6.7M rows.  It is vacuumed every night, and since the
last vacuum, rows have been inserted only, never updated or deleted.

There are many large text field values in one text field, some in excess
of 6MB.

I have another table with 13.8M rows, several times as many columns, but
no large text values.

I am testing select count() queries on both tables, counting on an
integer field only.  The first table takes almost 300 seconds to
complete, while the second table takes about 90 seconds.

Why would very large text values effect the speed of a seq scan that
does not actually evaluate those values?

Oh, PG 8.1.3 on FC 5, 64-bit.

TIA
-Glen Parker

Re: Effect of large text field data on queries

От
Tom Lane
Дата:
Glen Parker <glenebob@nwlink.com> writes:
> Why would very large text values effect the speed of a seq scan that
> does not actually evaluate those values?

More bytes to scan over?  Have you checked the physical table sizes?

I'd actually suppose it's the smaller values (up to a few hundred bytes)
that impact this the most.  Really wide fields would be pushed
out-of-line.

            regards, tom lane

Re: Effect of large text field data on queries

От
Glen Parker
Дата:
Tom Lane wrote:
> Glen Parker <glenebob@nwlink.com> writes:
>> Why would very large text values effect the speed of a seq scan that
>> does not actually evaluate those values?
>
> I'd actually suppose it's the smaller values (up to a few hundred bytes)
> that impact this the most.  Really wide fields would be pushed
> out-of-line.

Meaning that the portion of the text value stored in the companion TOAST
table would be ignored for this type of query, correct?  That's why I'm
concerned.

How much of a TOAST'd field is actually stored in the main heap table?
Is there a way to configure that amount?


-Glen

Re: Effect of large text field data on queries

От
Alan Hodgson
Дата:
On Thursday 26 October 2006 12:51, Glen Parker <glenebob@nwlink.com> wrote:
> Why would very large text values effect the speed of a seq scan that
> does not actually evaluate those values?

Seq scan reads the whole table.  The limiting factor is the size of the
table on disk.

--
"If a nation expects to be ignorant and free, in a state of civilization,
it expects what never was and never will be." -- Thomas Jefferson


Re: Effect of large text field data on queries

От
Tom Lane
Дата:
Glen Parker <glenebob@nwlink.com> writes:
> How much of a TOAST'd field is actually stored in the main heap table?
> Is there a way to configure that amount?

A pushed-out-of-line value is replaced by a 20-byte pointer structure.
There's no such thing as partially out-of-line.  See
http://www.postgresql.org/docs/8.1/static/storage-toast.html

            regards, tom lane

Re: Effect of large text field data on queries

От
Glen Parker
Дата:
Tom Lane wrote:
> Glen Parker <glenebob@nwlink.com> writes:
>> How much of a TOAST'd field is actually stored in the main heap table?
>> Is there a way to configure that amount?
>
> A pushed-out-of-line value is replaced by a 20-byte pointer structure.
> There's no such thing as partially out-of-line.  See
> http://www.postgresql.org/docs/8.1/static/storage-toast.html

Ah, thanks.  I think possibly changing the storage on this field may
help some (I will gladly trade storage space for speed).  How can I
determine the default storage type for a given TOASTable data type (text
in this case)?

-Glen

Re: Effect of large text field data on queries

От
Tom Lane
Дата:
Glen Parker <glenebob@nwlink.com> writes:
> How can I determine the default storage type for a given TOASTable
> data type (text in this case)?

Look in pg_type ... but they mostly default to "extended".

            regards, tom lane