Обсуждение: Question regarding indices

Поиск
Список
Период
Сортировка

Question regarding indices

От
"Steve"
Дата:
Hello List,

I have a small question about the order of values in a query. Assume I have a table with the following fields: uid INT,
dataBIGINT, hits INT
 

And an unique index on (uid, data). I use libpq C API to query data from the table. The query is something like this:
SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

Would the speed of the query be influenced if I would sort the data? I can imagine that just querying a bunch of bigint
wouldnot make a big difference but what about several thousand of values? Would sorting them and sending the SQL query
withordered data influence the speed of the query?
 


// Steve
-- 
GRATIS: Spider-Man 1-3 sowie 300 weitere Videos!
Jetzt freischalten! http://portal.gmx.net/de/go/maxdome


Re: Question regarding indices

От
Tom Lane
Дата:
"Steve" <steeeeeveee@gmx.net> writes:
> I have a small question about the order of values in a query. Assume I have a table with the following fields:
>   uid INT,
>   data BIGINT,
>   hits INT

> And an unique index on (uid, data). I use libpq C API to query data from the table. The query is something like
this:
> SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

> Would the speed of the query be influenced if I would sort the data? I can imagine that just querying a bunch of
bigintwould not make a big difference but what about several thousand of values? Would sorting them and sending the SQL
querywith ordered data influence the speed of the query?
 

It's unlikely to make enough difference to be worth the trouble.
        regards, tom lane


Re: Question regarding indices

От
"Steve"
Дата:
-------- Original-Nachricht --------
> Datum: Sat, 11 Sep 2010 10:05:18 -0400
> Von: Michael Gould <mgould@intermodalsoftwaresolutions.net>
> An: Steve <steeeeeveee@gmx.net>
> Betreff: Re: [SQL] Question regarding indices

> Steve,
> 
Hello Michael,


> If I remember correctly the sort only works on the final result set and so
> doing a sort before the select isn't going to help
> 
I think you got me wrong. It's not that I want the RESULT to be sorted. I only asked if sorting the values in the query
wouldhave an effect on the speed of the query.
 

Or to ask it the other way around: Is an index in PostgreSQL sorted or not? If it is sorted and the PostgreSQL parser
islooking for entries in the index then having the values pre-sorted would probably be beneficial. Right?
 


// Steve

> Sent from Samsung mobile
> 
> Steve <steeeeeveee@gmx.net> wrote:
> 
> >Hello List,
> >
> >I have a small question about the order of values in a query. Assume I
> have a table with the following fields:
> >  uid INT,
> >  data BIGINT,
> >  hits INT
> >
> >And an unique index on (uid, data). I use libpq C API to query data from
> the table. The query is something like this:
> >SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN
> (2033,2499,590,19,201,659)
> >
> >Would the speed of the query be influenced if I would sort the data? I
> can imagine that just querying a bunch of bigint would not make a big
> difference but what about several thousand of values? Would sorting them and
> sending the SQL query with ordered data influence the speed of the query?
> >
> >
> >// Steve
> >-- 
> >GRATIS: Spider-Man 1-3 sowie 300 weitere Videos!
> >Jetzt freischalten! http://portal.gmx.net/de/go/maxdome
> >
> >-- 
> >Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> >To make changes to your subscription:
> >http://www.postgresql.org/mailpref/pgsql-sql

-- 
GRATIS: Spider-Man 1-3 sowie 300 weitere Videos!
Jetzt freischalten! http://portal.gmx.net/de/go/maxdome


Re: Question regarding indices

От
"Steve"
Дата:
-------- Original-Nachricht --------
> Datum: Sat, 11 Sep 2010 11:04:16 -0400
> Von: Tom Lane <tgl@sss.pgh.pa.us>
> An: "Steve" <steeeeeveee@gmx.net>
> CC: pgsql-sql@postgresql.org
> Betreff: Re: [SQL] Question regarding indices

> "Steve" <steeeeeveee@gmx.net> writes:
> > I have a small question about the order of values in a query. Assume I
> have a table with the following fields:
> >   uid INT,
> >   data BIGINT,
> >   hits INT
> 
> > And an unique index on (uid, data). I use libpq C API to query data from
> the table. The query is something like this:
> > SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN
> (2033,2499,590,19,201,659)
> 
> > Would the speed of the query be influenced if I would sort the data? I
> can imagine that just querying a bunch of bigint would not make a big
> difference but what about several thousand of values? Would sorting them and
> sending the SQL query with ordered data influence the speed of the query?
> 
> It's unlikely to make enough difference to be worth the trouble.
> 
Making a quick sort is ultra easy in C. Anyway... is there a difference in the speed of the query with pre-sorted
valuesor not? If there is one then I will go and sort the values. Right now I have a quick sort implemented but I will
probablydo a lazy quick sort and then a final insertion sort, because insertion is faster on a slightly ordered dataset
thanquick sort.
 

Probably I am pulling hairs here about the speed but I really want to minimize the time it needs for PostgreSQL to
returnthe data. I personally am happy with the speed when using PostgreSQL but the application I use has an MySQL
drivertoo and I got some users claiming that MySQL is faster than PostgreSQL, transfers less data over the wire, etc...
andI want to optimize the PostgreSQL part to be on the same level as the MySQL part. So everything that helps to
squeezethe last nanosecond out of the PostgreSQL part is welcome. I already switched to binary transmission in order to
minimizethe data send over the wire when using PostgreSQL and I have added an function to do +/- what the MySQL
proprietary"INSERT ON DUPLICATE KEY UPDATE" does. I hate when users compare apples with oranges but what can I do? You
cannot explain them that PostgreSQL is different and more standard compliant and that the chance to loose data is lower
withPostgreSQL then with all that what MySQL is doing (MyISAM tables, etc...). It's pointless to explain to them. It's
liketrying to explain a mole how the sun is shining.
 

So all I want is to explore the available capabilities of PostgreSQL to get the best out of the libpq engine as
possible.If you have any recommendation at what I should look in order to get better speed, then let me know.
 


>             regards, tom lane
>
// Steve
-- 
Achtung Sicherheitswarnung: GMX warnt vor Phishing-Attacken!
http://portal.gmx.net/de/go/sicherheitspaket


Re: Question regarding indices

От
Tom Lane
Дата:
"Steve" <steeeeeveee@gmx.net> writes:
>> Von: Tom Lane <tgl@sss.pgh.pa.us>
>> It's unlikely to make enough difference to be worth the trouble.
>> 
> Making a quick sort is ultra easy in C. Anyway... is there a
> difference in the speed of the query with pre-sorted values or not?
> If there is one then I will go and sort the values.

I didn't opine on whether it was "easy" or not.  I said it was unlikely
to be worth the trouble.  You could very well spend more time sorting
the values than you buy in whatever you might save on the server side.

Each value in the IN list is going to require a separate index probe.
The sorting might buy something in locality of reference for successive
probes, but most likely not enough to notice.
        regards, tom lane


Re: Question regarding indices

От
Lew
Дата:
On 09/11/2010 08:29 AM, Steve wrote:
> I have a small question about the order of values in a query.
> Assume I have a table with the following fields:
>    uid INT,
>    data BIGINT,
>    hits INT
> And an unique index on (uid, data). I use libpq C API to query
> data from the table. The query is something like this:
> SELECT uid,data,hits FROM mytable WHERE uid=2
> AND data IN (2033,2499,590,19,201,659)
>
> Would the speed of the query be influenced if I would sort the data?

What do you mean by "sort the data"?  Which data?

> I can imagine that just querying a bunch of bigint would not make a
> big difference but what about several thousand of values? Would sorting
> them and sending the SQL query with ordered data influence the speed of the query?

Send the query from where to where?

Are you referring to a sort of the items in the IN subselect?  My guess is 
that sorting that won't matter but it's only a WAG.

-- 
Lew


Re: Question regarding indices

От
"Steve"
Дата:
-------- Original-Nachricht --------
> Datum: Sat, 11 Sep 2010 11:08:00 -0400
> Von: Lew <noone@lewscanon.com>
> An: pgsql-sql@postgresql.org
> Betreff: Re: [SQL] Question regarding indices

> On 09/11/2010 08:29 AM, Steve wrote:
> > I have a small question about the order of values in a query.
> > Assume I have a table with the following fields:
> >    uid INT,
> >    data BIGINT,
> >    hits INT
> > And an unique index on (uid, data). I use libpq C API to query
> > data from the table. The query is something like this:
> > SELECT uid,data,hits FROM mytable WHERE uid=2
> > AND data IN (2033,2499,590,19,201,659)
> >
> > Would the speed of the query be influenced if I would sort the data?
> 
> What do you mean by "sort the data"?  Which data?
> 
I mean sorting the values in the brackets. Instead of:
SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)

I would then send this here:
SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (19,201,590,659,2033,2499)

Off course this is a small dataset but the query usually has thousands of elements and not only the above 6 elements.


> > I can imagine that just querying a bunch of bigint would not make a
> > big difference but what about several thousand of values? Would sorting
> > them and sending the SQL query with ordered data influence the speed of
> the query?
> 
> Send the query from where to where?
> 
Sending the query from my application to the PostgreSQL server.


> Are you referring to a sort of the items in the IN subselect?
>
Yes.


> My guess is
> that sorting that won't matter but it's only a WAG.
> 
What is "WAG"?


> -- 
> Lew
> 
SteveB
-- 
Achtung Sicherheitswarnung: GMX warnt vor Phishing-Attacken!
http://portal.gmx.net/de/go/sicherheitspaket


Re: Question regarding indices

От
Frank Bax
Дата:
Steve wrote:
> -------- Original-Nachricht --------
>> Datum: Sat, 11 Sep 2010 11:08:00 -0400
>> Von: Lew <noone@lewscanon.com>
>> An: pgsql-sql@postgresql.org
>> Betreff: Re: [SQL] Question regarding indices
> 
>> On 09/11/2010 08:29 AM, Steve wrote:
>>> I have a small question about the order of values in a query.
>>> Assume I have a table with the following fields:
>>>    uid INT,
>>>    data BIGINT,
>>>    hits INT
>>> And an unique index on (uid, data). I use libpq C API to query
>>> data from the table. The query is something like this:
>>> SELECT uid,data,hits FROM mytable WHERE uid=2
>>> AND data IN (2033,2499,590,19,201,659)
>>>
>>> Would the speed of the query be influenced if I would sort the data?
>> What do you mean by "sort the data"?  Which data?
>>
> I mean sorting the values in the brackets. Instead of:
> SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (2033,2499,590,19,201,659)
> 
> I would then send this here:
> SELECT uid,data,hits FROM mytable WHERE uid=2 AND data IN (19,201,590,659,2033,2499)
> 
> Off course this is a small dataset but the query usually has thousands of elements and not only the above 6
elements.


If there will be thousands; why not create a temp table containing these 
values then join to table - might that be faster?