Re: Help Optimizing a Summary Query

Поиск
Список
Период
Сортировка
От David G Johnston
Тема Re: Help Optimizing a Summary Query
Дата
Msg-id 1418338537341-5830198.post@n5.nabble.com
обсуждение исходный текст
Ответ на Help Optimizing a Summary Query  (Robert DiFalco <robert.difalco@gmail.com>)
Список pgsql-general
Robert DiFalco wrote
> I have users, friends, and friend_requests. I need a query that
> essentially
> returns a summary containing:
>
>     * user (name, imageURL, bio, ...)
>     * Friend status (relative to an active user)
>        * Is the user a friend of the active user?
>        * Has the user sent a friend request to the active user?
>        * Has the user received a friend request from the active user?
>     * # of mutualFriends
>     * Exclude the active user from the result set.
>
> So I have mocked this up two ways but both have complicated query plans
> that will be problematic with large data sets. I'm thinking that my lack
> of
> deep SQL knowledge is making me miss the obvious choice.
>
> Here's my two query examples:
>
> SELECT u.id, u.name, u.imageURL, u.bio,
>    CASE
>       WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND
> f.friend_id = u.id)       THEN 'isFriend'
>       WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33   AND
> s.from_id = u.id) THEN 'hasSentRequest'
>       WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id AND
> r.from_id = 33)   THEN 'hasReceivedRequest'
>       ELSE 'none'
>    END AS "friendStatus",
>    (SELECT COUNT(1)
>       FROM friends f1
>          JOIN friends f2 ON f1.friend_id = f2.friend_id
>       WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends
> FROM users u
> WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;
>
> SELECT u.id, u.name, u.imageURL, u.bio,
>    CASE
>       WHEN f.friend_id IS NOT NULL THEN 'isFriend'
>       WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
>       WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
>       ELSE 'none'
>    END AS 'friendStatus',
>    (SELECT COUNT(1) AS d
>       FROM friends f1
>          JOIN friends f2 ON f1.fiend_id = f2.friend_id
>       WHERE f1.user_id = 33 AND f2.user_id = u.id)
> FROM users u
> LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
> LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
> WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;
>
> 33 is just the id of the active user I am using for testing. The WHERE
> clause could be anything. I'm just using "u.name" here but I'm more
> concerned about the construction of the result set than the WHERE clause.
> These have more or less similar query plans, nothing that would change
> things factorially. Is this the best I can do or am I missing the obvious?

I dislike the multiple LEFT JOIN version though I did not try to prove that
it possible to give incorrect results.

The goal is to avoid looping - so you want to create temporary results that
will contain all of the data you plan to need and then join them together.
CTE/WITH is the feature that can do this most easily.

I have no idea how this will perform relative to the "CASE WHEN EXISTS"
version but it seems like it should be faster.  Again, I don't believe your
original LEFT JOIN query is equivalent to either of these but I cannot be
certain without more effort than I am able to put forth.

Hybrid SQL Code (note in particular that you cannot have literals in the
WITH field alias area...)

WITH user_ref (ref_u_id) AS ( VALUES (33) )
, users_vis_a_vis_ref (u_id, ref_id) AS ( ... WHERE u_id != ref_u_id)
, user_friend (u_id, ref_u_id, 'Friend' AS status_uf) AS ( ... )
, user_sent_request (u_id, ref_u_id, 'Sent' AS status_usr) AS ( ... )
, user_recv_request (u_id, ref_u_id, 'Received' AS status_urr) AS ( ... )
, user_mutuals (u_id, ref_u_id, ## AS mutual_count) AS ( ... )

SELECT u_id, ref_u_id
, COALESCE(status_uf, status_usr, status_urr, 'None') AS FriendStatus
, COALESCE(mutual_count, 0) AS MutualFriendCount
FROM users_vis_a_vis_ref
NATURAL LEFT JOIN user_friend
NATURAL LEFT JOIN user_sent_request
NATURAL LEFT JOIN user_recv_request
NATURAL LEFT JOIN user_mutuals

It is safe to use "NATURAL" here since you are fully controlling the source
relations since they all come from the CTE/WITH structure.

David J.



--
View this message in context: http://postgresql.nabble.com/Help-Optimizing-a-Summary-Query-tp5829941p5830198.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


В списке pgsql-general по дате отправления:

Предыдущее
От: Arthur Silva
Дата:
Сообщение: Re: Help Optimizing a Summary Query
Следующее
От: Manuel Kniep
Дата:
Сообщение: Re: ALTER TYPE ADD SEND AND RECEIVE