Обсуждение: notify duplicate elimination performance

Поиск
Список
Период
Сортировка

notify duplicate elimination performance

От
Hardy Falk
Дата:
I know that it is not a big problem for most users, but allowing a very
large number of notifications while using linear search is a bit dumb.
I can fix this with a very small modification to
struct Notification:
{char *channel ;char *payload ;uint32 hash ;struct Notification *left ;struct Notification *right ;
}
AsyncExistsPendingNotify does an iterative binary tree search.
The tree is insert-only, there is no need for rebalancing, and the code
is quite simple.
Any comments?



Re: notify duplicate elimination performance

От
Andres Freund
Дата:
Hi,

On 2014-02-08 18:56:41 +0100, Hardy Falk wrote:
> I know that it is not a big problem for most users, but allowing a very
> large number of notifications while using linear search is a bit dumb.
> I can fix this with a very small modification to
> struct Notification:
> {
>     char *channel ;
>     char *payload ;
>     uint32 hash ;
>     struct Notification *left ;
>     struct Notification *right ;
> }
> AsyncExistsPendingNotify does an iterative binary tree search.
> The tree is insert-only, there is no need for rebalancing, and the code
> is quite simple.
> Any comments?

Well, you didn't add any code, so it's hard to say... Simple ways of
doing what I think you describe will remove the queue's order. Do you
preserve the ordering guarantees?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: notify duplicate elimination performance

От
Hardy Falk
Дата:
> Well, you didn't add any code, so it's hard to say... Simple ways of
> doing what I think you describe will remove the queue's order. Do you
> preserve the ordering guarantees?
> 
> Greetings,
> 
> Andres Freund
> 
Yes, the order is preserved.
I didn't remove the the original list code.
The tree is just an additional access path.

> oldcontext = MemoryContextSwitchTo(CurTransactionContext);
> 
>     n = (Notification *) palloc(sizeof(Notification));
>     n->channel = pstrdup(channel);
>     if (payload)
>         n->payload = pstrdup(payload);
>     else
>         n->payload = "";
>     n->hash = hash ;
>     n->left = NULL ;
>     n->right= NULL ;
>     *tt = n ; 
tt is a Notification** obtained by the search.

> static Notification **search(const char *channel, const char *payload ) 
> {
>     Notification *t,**tt ; 
>     uint32 hash ; 
>     t  = hashroot ;
>     tt = &hashroot ;
>     hash = hashf(channel,691) ;
>     hash = hashf(payload,hash) ;
>     while ( t )
>     {
>         if ( hash < t->hash )
>         {
>             tt = &t->left ;
>             t = t->left ;
>         }
>         else if ( hash > t->hash )
>         {
>             tt = &t->right ;
>             t = t->right ;
>         }
>         else
>         {
>             if (0==strcmp(t->channel,channel) && 0==strcmp(t->payload,payload))
>             {
>                 return NULL 
>             }
>             else
>             {
>                 tt = &t->left ;
>                 t = t->left ;
>             }
>         }
>     } 
>     return tt ; 
> }




Re: notify duplicate elimination performance

От
Andres Freund
Дата:
Hi,

On 2014-02-08 19:28:56 +0100, Hardy Falk wrote:
> > Well, you didn't add any code, so it's hard to say... Simple ways of
> > doing what I think you describe will remove the queue's order. Do you
> > preserve the ordering guarantees?
> > 
> > Greetings,
> > 
> > Andres Freund
> > 
> Yes, the order is preserved.
> I didn't remove the the original list code.
> The tree is just an additional access path.

How about actually attaching a patch?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: notify duplicate elimination performance

От
Tom Lane
Дата:
Hardy Falk <hardy.falk@blue-cable.de> writes:
>> Well, you didn't add any code, so it's hard to say... Simple ways of
>> doing what I think you describe will remove the queue's order. Do you
>> preserve the ordering guarantees?

> Yes, the order is preserved.
> I didn't remove the the original list code.
> The tree is just an additional access path.

It seems likely that this approach would be a net loss for small numbers
of notify events (which is surely the common case).  Have you done any
measurements of the performance tradeoff?
        regards, tom lane



Re: notify duplicate elimination performance

От
Hardy Falk
Дата:
Tom Lane schrieb:
> Hardy Falk <hardy.falk@blue-cable.de> writes:
>>> Well, you didn't add any code, so it's hard to say... Simple ways of
>>> doing what I think you describe will remove the queue's order. Do you
>>> preserve the ordering guarantees?
> 
>> Yes, the order is preserved.
>> I didn't remove the the original list code.
>> The tree is just an additional access path.
> 
> It seems likely that this approach would be a net loss for small numbers
> of notify events (which is surely the common case).  Have you done any
> measurements of the performance tradeoff?
> 
>             regards, tom lane
> 
> 
I can easily keep the tail test. This avoids the hash computation in a
common case. The tree search compares only uint32 values, this should be
quite fast. Even a degenerate tree is no worse than the list. Im my
first tests I didn't use murmurhash, a select
pg_notify('alasys',format('Hello %s',x) from generate_series(1,1000000)
as x triggered this worst case. With murmurhash2 the average search
depth for 10^6 entries is 24.57.

I am about to prepare a patch, should I do some performance tests with
rtdsc first?