Обсуждение: Implementing a new Join Algorithm

Поиск
Список
Период
Сортировка

Implementing a new Join Algorithm

От
Anagh Lal
Дата:
Hi,
I am trying to test a new join algorithm by
implementing it on Postgresql. 
It would be great if you could give me some start off
pointers so as to where all in the source code I will
have to make changes. (I figure that I need to make
executor nodes, so i might need to write nodeNewjoin.c
etc in src/backend/executor, and somehow get the
planner/optimizer to select the new algorithm)

I tried going through the developers guide (7.3.1) but
it has many figures missing, making my job of
understanding the mechanism  very tough.  The figures
I am referring to are
{PostgreSQL 7.3.1 Documentation 
section 2.3.1: Figure \ref{parsetree},
\ref{where_clause} 
section 2.3.2 Figure \ref{transformed},
\ref{transformed_where} 
2.5.2 \ref{plan} \ref{simple_select}. 
}
Could you please point out where these figures are
available.
Regards
Anagh Lal.



__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com


Re: Implementing a new Join Algorithm

От
Tom Lane
Дата:
Anagh Lal <anaghlal2001@yahoo.com> writes:
> I am trying to test a new join algorithm by
> implementing it on Postgresql. 
> It would be great if you could give me some start off
> pointers so as to where all in the source code I will
> have to make changes.

Lots of places ;-).

You will find that a full-text indexing tool is an essential aid for
working with the Postgres source code.  I am partial to "glimpse" but
others use other things (searching for "glimpse" in the pghackers list
archives should turn up previous discussions of useful tools).

Once you've got one, looking for files mentioning both "merge" and
"hash" should locate all the incidental places you will need to hit.

The primary work will certainly be in adding an executor/nodeNewjoin.c
file and in teaching the planner how to plan the join type.  I think
most of the incidental work will be associated with creating new plan,
executor-state, and path node types and adding support code for these
where needed.  This should be largely boilerplate that you can produce
by emulating the existing join types.

Frankly I'd consider a new join type to be an overly ambitious target
for a person's first venture into the Postgres backend.  You'd probably
find it a good idea to tackle some smaller project(s) first ...
        regards, tom lane


Re: Implementing a new Join Algorithm

От
Sailesh Krishnamurthy
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> Anagh Lal <anaghlal2001@yahoo.com> writes:   >> I am trying to test a new join algorithm by implementing it on
>> Postgresql.  It would be great if you could give me some start   >> off pointers so as to where all in the source
codeI will have   >> to make changes.
 
   Tom> Lots of places ;-).
   Tom> You will find that a full-text indexing tool is an essential   Tom> aid for working with the Postgres source
code. I am partial
 

I've had great success with cscope. Especially with the XEmacs
interfaces. 

I think in fact the easy part is the executor stuff because it's
nicely localized. Getting the planner to choose a new option is a
little messier. 

As part of the TelegraphCQ project we've implemented what smells like
a symmetric hash join. Actually it's more like an N-way symmetric hash
join. This is done using the new SteM operator which can work in a
classical iterator model plan. With the Eddy operator we can avoid
static dataflows.

Since we were a bit chary of jumping headlong into the optimizer, we
"cheated". What we do is let the postgres optimizer do its thing and
produce a plan with Hash Joins. Then we walk the plan and replace Hash
Join nodes with our new SteM nodes. We call our conversion code in
pg_exec_query_string() right after pg_plan_query()

It's more complicated than that because we're actually implementing
the Eddy operator. But if your goal is just to try out your nice new
join algorithm, this would probably work and be a quick fix to get you
started.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh