Re: Use cases for lateral that do not involve a set returning function

Поиск
Список
Период
Сортировка
От Albe Laurenz
Тема Re: Use cases for lateral that do not involve a set returning function
Дата
Msg-id A737B7A37273E048B164557ADEF4A58B17DAA12C@ntex2010a.host.magwien.gv.at
обсуждение исходный текст
Ответ на Use cases for lateral that do not involve a set returning function  (AJ Welch <awelch0100@gmail.com>)
Ответы Re: Use cases for lateral that do not involve a set returning function  (AJ Welch <awelch0100@gmail.com>)
Re: Use cases for lateral that do not involve a set returning function  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-general
AJ Welch wrote:
> http://blog.heapanalytics.com/postgresqls-powerful-new-join-type-lateral/
> 
> I suspected some of the claims in the post may not have been accurate. This one in particular:
> 
> "Without lateral joins, we would need to resort to PL/pgSQL to do this analysis. Or, if our data set
> were small, we could get away with complex, inefficient queries."
> 
> 
> The sum(1) and order by time limit 1 approach seemed less than ideal to me and I thought this analysis
> could be done with normal left joins instead of lateral left joins. So I came up with a proof of
> concept:
> 
> https://github.com/ajw0100/snippets/tree/master/SQL/lateral
> 
> 
> Is my conclusion in the README correct? Does anything beyond select...from...where force a nested
> loop? In that case, is lateral really only useful with set returning functions as the docs suggest?
> Does anyone know of any use cases for lateral that do not involve a set returning function?

Only recently I used lateral joins to optimize a query.

This is a sample of how the query looked bfore:

SELECT ...
FROM people p
     LEFT JOIN names n
        ON (n.people_id = p.people_id
            AND current_timestamp > n.validfrom
            AND NOT EXISTS (SELECT 1 FROM names n2
                            WHERE n2.people_id = p.people_id
                            AND current_timestamp > n2.validfrom
                            AND n2.validfrom > n.validfrom)
           )
WHERE p.id = ...

So basically it is supposed to find the latest valid name for a person.

This required two scans of the "names" table per "person" record.

I rewrote it as

SELECT ...
FROM people p
     LEFT JOIN LATERAL (SELECT * FROM names n
                        WHERE n.people_id = p.people_id
                        AND current_timestamp > n.validfrom
                        ORDER BY n.validfrom DESC LIMIT 1) n
        ON TRUE
WHERE p.id = ...

With the correct index this touched fewer blocks and worked faster.
Also, though this is of course a matter of taste, it is more readable.

Of course this forces a nested loop, but that is not bad as such.
In my case it was not problem (I tried to hint at that with the WHERE clause).

So yes, I think that LATERAL is useful even without set returning functions.

Yours,
Laurenz Albe

В списке pgsql-general по дате отправления:

Предыдущее
От: Anthony Nowocien
Дата:
Сообщение: Detecting corruption
Следующее
От: Daniel Begin
Дата:
Сообщение: Re: Removing duplicate records from a bulk upload (rationale behind selecting a method)