Обсуждение: Missed compiler optimization issue in function select_rtable_names_for_explain

Поиск
Список
Период
Сортировка

Hi everyone,

I'm a compiler developer working on detecting missed optimization in real-world applications. Recently, we found that LLVM missed a dead store elimination optimization in the PostgreSQL code (https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/ruleutils.c#L3794) in the master branch.

For the example below:

```

int dst[128];

memset(dst, 0, 128);

*unrelated = some_value;

dst[1] = 0;

dst[2] = 0;

```

LLVM cannot eliminate the useless stores after memset since the store to "unrelated" clobbers. But if we put the stores to "dst" ahead of the store to "unrelated", we could prevent confusing the compiler. See also the compiler explorer link: https://godbolt.org/z/P9jnKod3v and the issue of LLVM: https://github.com/llvm/llvm-project/issues/88632

To improve the codegen quality, I think it's also possible to modify the source code, changing the order of initialization of the member, to get better optimization. But I don't know whether this can be considered as a bug, thus post the issue here.

If anyone could confirm this problem or post a patch for it, let me know please. Thanks!


Best regards. Hongyu.

Re: Missed compiler optimization issue in function select_rtable_names_for_explain

От
Daniel Gustafsson
Дата:
> On 22 May 2024, at 11:27, XChy <xxs_chy@outlook.com> wrote:
>
> Hi everyone,
> I'm a compiler developer working on detecting missed optimization in real-world applications. Recently, we found that
LLVMmissed a dead store elimination optimization in the PostgreSQL code
(https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/ruleutils.c#L3794)in the master branch. 

How is the memset in select_rtable_names_for_explain a dead-store?  Even memset
calls could be optimized away from the EXPLAIN codepath I have a feeling it
would have to be many in a tight loop for it to be measurable even?

--
Daniel Gustafsson




How is the memset in select_rtable_names_for_explain a dead-store? Even memset
calls could be optimized away from the EXPLAIN codepath I have a feeling it
would have to be many in a tight loop for it to be measurable even?

--
Daniel Gustafsson

For the first question, I don't mean that the memset is the dead store. I mean that the stores with value "0" after the memset are dead:

```

    dpns.subplans = NIL;
    dpns.ctes = NIL;
    dpns.appendrels = NULL;
```
since the memset has written zeroes to the object "dpns", and these members are known to be zero.

For the second question, you are right, I don't really profile it or measure the performance impact for it. I just think it's worthwhile to improve codegen quality without affecting readability, as adopting performance tips from some static analyzer.

Best regards, Hongyu.

Re: Missed compiler optimization issue in function select_rtable_names_for_explain

От
Daniel Gustafsson
Дата:
> On 22 May 2024, at 12:12, XChy <xxs_chy@outlook.com> wrote:
>
>> How is the memset in select_rtable_names_for_explain a dead-store? Even memset calls could be optimized away from
theEXPLAIN codepath I have a feeling it 
>> would have to be many in a tight loop for it to be measurable even?

> For the first question, I don't mean that the memset is the dead store.

Gotcha

> I mean that the stores with value "0" after the memset are dead:
> ```
>     dpns.subplans = NIL;
>     dpns.ctes = NIL;
>     dpns.appendrels = NULL;
> ```
> since the memset has written zeroes to the object "dpns", and these members are known to be zero.

They are known to be zero, but that's not entirely equivalent though is it?
NIL is defined as ((List *) NULL) and NULL is typically defined as ((void *)
0), so sizeof(0) would be the size of an int and sizeof(NULL) would be the size
of a void pointer.

--
Daniel Gustafsson




On 2024-May-22, XChy wrote:

> Hi everyone,
> 
> I'm a compiler developer working on detecting missed optimization in
> real-world applications. Recently, we found that LLVM missed a dead store
> elimination optimization in the PostgreSQL code
<https://github.com/postgres/postgres/blob/c37267162e889fe783786b9e28d1b65b82365a00/src/backend/utils/adt/ruleutils.c#L3794>
(https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/ruleutils.c#L3794)
> in the master branch.

I think if we want to improve how this code is seen by the compiler by
modifying it, we should just remove the NULL/NIL assignments.  It's a
pretty arbitrary (and rather small) subset of fields being initialized,
fields which very obviously have been zeroed by memset already, so I
don't see any value in the explicit assignments.

The criteria may be different in cases where some struct has all of its
members initialized to something, because at least you can use that kind
of thing to find places you need to modify when adding new struct
members.  Not so in select_rtable_names_for_explain.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I can see support will not be a problem.  10 out of 10."    (Simon Wittber)
      (http://archives.postgresql.org/pgsql-general/2004-12/msg00159.php)




在 2024/5/22 18:55, Daniel Gustafsson 写道:
I mean that the stores with value "0" after the memset are dead:
```     dpns.subplans = NIL;    dpns.ctes = NIL;    dpns.appendrels = NULL;
```
since the memset has written zeroes to the object "dpns", and these members are known to be zero.
They are known to be zero, but that's not entirely equivalent though is it?
NIL is defined as ((List *) NULL) and NULL is typically defined as ((void *)
0), so sizeof(0) would be the size of an int and sizeof(NULL) would be the size
of a void pointer.

The type or size doesn't matter here. At IR or assembly level, they are all zeroes. 

My main point is that "dpns.xxx" are filled with zeroes by the memset firstly, and overwriting them with zeroes in the following stores is redundant. LLVM cannot remove the redundant overwrites due to the initialization order. If we adjust the order of the initialization of "dpns.xxx", the compiler can remove such stores.

Does my explanation make sense to you?

Best regards, Hongyu.

Daniel Gustafsson <daniel@yesql.se> writes:
> They are known to be zero, but that's not entirely equivalent though is it?
> NIL is defined as ((List *) NULL) and NULL is typically defined as ((void *)
> 0), so sizeof(0) would be the size of an int and sizeof(NULL) would be the size
> of a void pointer.

There are other places where we assume that a memset-to-zero will
produce null pointers, so I don't think that that objection has
a lot of force.  My real answer is that this is our coding style
and we are not going to change it: our normal convention is to
initialize struct fields in declaration order, and that's what
we're doing here.  If some particular version of some particular
compiler fails to make an entirely-negligible optimization in
consequence, that is not something we are going to care about.

Maybe we would care if the missed optimization were a serious
performance loss in a very hot code path.  But this is neither.

I'd also say that this is pretty clearly a compiler bug.
If it'd normally optimize away a null-pointer-store following
a memset-to-zero, but does not in

    memset(&dpns, 0, sizeof(dpns));
    dpns.rtable = rtable;
    dpns.subplans = NIL;
    dpns.ctes = NIL;
    dpns.appendrels = NULL;

that would seem to indicate that the optimizer doesn't
really understand that dpns.rtable is a distinct field
from the others.  How is that our problem?  (I kind of
wonder if what's actually blocking the optimization is
the casts inside the NIL macros.  Still not our problem.)

            regards, tom lane



Re: Missed compiler optimization issue in function select_rtable_names_for_explain

От
Daniel Gustafsson
Дата:
> On 22 May 2024, at 13:00, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> I think if we want to improve how this code is seen by the compiler by
> modifying it, we should just remove the NULL/NIL assignments.

*If* the optimization is measurable, IMHO.

> It's a
> pretty arbitrary (and rather small) subset of fields being initialized,
> fields which very obviously have been zeroed by memset already, so I
> don't see any value in the explicit assignments.

My personal opinion is that there is value in immediately knowing the type of
the members from the inits, in this case I know without looking up the
definition that some of the members are list.  I like that readability.  That
being said, I for sure won't fight if there is concensus to remove.

--
Daniel Gustafsson




Re: Missed compiler optimization issue in function select_rtable_names_for_explain

От
Daniel Gustafsson
Дата:
> On 22 May 2024, at 18:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> They are known to be zero, but that's not entirely equivalent though is it?
>> NIL is defined as ((List *) NULL) and NULL is typically defined as ((void *)
>> 0), so sizeof(0) would be the size of an int and sizeof(NULL) would be the size
>> of a void pointer.
>
> There are other places where we assume that a memset-to-zero will
> produce null pointers, so I don't think that that objection has
> a lot of force.

It wasn't really an objection, but (perhaps a badly worded) an attempt to
understand the proposal.

> My real answer is that this is our coding style
> and we are not going to change it: our normal convention is to
> initialize struct fields in declaration order, and that's what
> we're doing here.  If some particular version of some particular
> compiler fails to make an entirely-negligible optimization in
> consequence, that is not something we are going to care about.

+1

--
Daniel Gustafsson




Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> I think if we want to improve how this code is seen by the compiler by
> modifying it, we should just remove the NULL/NIL assignments.  It's a
> pretty arbitrary (and rather small) subset of fields being initialized,
> fields which very obviously have been zeroed by memset already, so I
> don't see any value in the explicit assignments.

I have to concede that it's unclear why we're filling just these
fields explicitly and not any others.  We're at least consistent:
all the callers of select_rtable_names do it the same way.

            regards, tom lane



Hash join and picking which result set to build the hash table with.

От
"Dirschel, Steve"
Дата:
The query and execution plan are shown below.  My question is related to the result set the optimizer is choosing to
buildthe hash table from.  My understanding is for a hash join you want to build the hash table out of the smaller
resultset.  If you look at the execution plan below you can see the optimizer estimates 1000 rows from the seq scan of
tablecollection and 120,000 rows from the seq scan of table docloc_test but is building the hash buckets from those
120000rows rather than from the 1000 rows.  In our case under certain volume that causes that to spill to temp and
underhigh load it hangs up the Aurora Postgres database.  But if it were to build the hash table out of those 1000 rows
itwould fit in work_mem so no problems.  Why is it picking the larger result set? 

Another funky thing here-  here are the table definitions:

pgcci01ap=> \d CCI.COLLECTION

        Column        |              Type              | Collation | Nullable |                Default
----------------------+--------------------------------+-----------+----------+----------------------------------------
 collection_name      | character varying(40)          |           | not null |
 l_stage              | numeric(11,0)                  |           |          |
 p_stage              | numeric(11,0)                  |           |          |
 t_stage              | numeric(11,0)                  |           |          |
 last_upd_datetime    | timestamp(6) without time zone |           | not null |
 last_upd_inits       | character varying(30)          |           | not null |
 owner_name           | character varying(30)          |           |          |
 password             | character varying(30)          |           |          |
 email_address        | character varying(2000)        |           |          |
 available_flag       | character(1)                   |           |          |
 collection_id        | numeric(11,0)                  |           | not null |
 collection_type      | character varying(20)          |           |          |
 retrieval_password   | character varying(40)          |           |          |
 partner_coll_name    | character varying(40)          |           |          |
 relation2partner     | character varying(20)          |           |          |
 reload_flag          | character(1)                   |           |          | 'N'::bpchar
 partner_id           | character varying(40)          |           |          |
 content_timezone     | character varying(40)          |           | not null | 'America/Chicago'::character varying
 token_type           | character varying(30)          |           |          |
 cc_collection_dest   | character varying(40)          |           |          |
 auto_reclaim_enabled | character(1)                   |           | not null | 'N'::bpchar
 collection_family    | character varying(40)          |           | not null | '<COLLECTION NAME>'::character varying
 access_password      | character(40)                  |           |          |
 mic_group            | character varying(40)          |           |          |
 mic_type             | character varying(10)          |           |          |
 retrieval_source     | character varying(40)          |           | not null | 'DOC1'::character varying
Indexes:
    "xpkcollection" PRIMARY KEY, btree (collection_name)
    "xak1collection" UNIQUE CONSTRAINT, btree (collection_id)
    "xie1collection" btree (relation2partner)
    "xie2collection" btree (collection_family, collection_name)
    "xie3collection" btree (mic_group)
Referenced by:
    TABLE "cci.index_update_proc" CONSTRAINT "rfk12_index_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.authority_update_proc" CONSTRAINT "rfk1_authority_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.collection_event" CONSTRAINT "rfk1_collection_event" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.collection_pit" CONSTRAINT "rfk1_collection_pit" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.collection_stage" CONSTRAINT "rfk1_collection_stage" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.csloc_update_proc" CONSTRAINT "rfk1_csloc_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.docloc_update_proc" CONSTRAINT "rfk1_docloc_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.index_set_mrg" CONSTRAINT "rfk1_index_set_mrg" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.index_set_stats" CONSTRAINT "rfk1_index_set_stats" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.index_system_attr" CONSTRAINT "rfk1_index_system_attr" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.load_update_proc" CONSTRAINT "rfk1_load_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.metadoc_update_proc" CONSTRAINT "rfk1_metadoc_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.mm_update_process" CONSTRAINT "rfk1_mm_update_process" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.reclaim" CONSTRAINT "rfk1_reclaim" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.rel_grp_upd_proc" CONSTRAINT "rfk1_rel_grp_upd_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.toc_update_process" CONSTRAINT "rfk1_toc_update_process" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.view_definition" CONSTRAINT "rfk1_view_definition" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.dcsloc_collection_stats" CONSTRAINT "rfk1dcsloc_collection_stats" FOREIGN KEY (collection_name)
REFERENCEScci.collection(collection_name) 
    TABLE "cci.doc_data_domain" CONSTRAINT "rfk1doc_data_domain" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.doc_update_process" CONSTRAINT "rfk2_doc_update_process" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.dcsloc_partition_map" CONSTRAINT "rfk2dcsloc_partition_map" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)
    TABLE "cci.meta_update_proc" CONSTRAINT "rfk3_metadoc_update_proc" FOREIGN KEY (collection_name) REFERENCES
cci.collection(collection_name)

pgcci01ap=> \d docloc_test

      Column       |              Type              | Collation | Nullable |   Default
-------------------+--------------------------------+-----------+----------+-------------
 collection_name   | character varying(40)          |           | not null |
 stage_id          | numeric(11,0)                  |           | not null |
 begin_stage_id    | numeric(11,0)                  |           |          |
 last_upd_datetime | timestamp(6) without time zone |           |          |
 last_upd_inits    | character varying(30)          |           |          |
 docloc_check_type | character(1)                   |           | not null | 'S'::bpchar
 replicate_done    | character(1)                   |           | not null | 'N'::bpchar
 docloc_id         | numeric(11,0)                  |           |          |
Indexes:
    "xpkdocloc" PRIMARY KEY, btree (collection_name, stage_id)
Foreign-key constraints:
    "rfk1_docloc" FOREIGN KEY (collection_name, stage_id) REFERENCES cci.collection_stage(collection_name, stage_id)

pgcci01ap=> \d CCI.COLLECTION_PIT

      Column       |              Type              | Collation | Nullable | Default
-------------------+--------------------------------+-----------+----------+---------
 collection_name   | character varying(40)          |           | not null |
 pit_id            | numeric(11,0)                  |           | not null |
 stage_code        | character(1)                   |           | not null |
 stage_id          | numeric(11,0)                  |           | not null |
 last_upd_datetime | timestamp(6) without time zone |           | not null |
 last_upd_inits    | character varying(30)          |           | not null |
Indexes:
    "xpkcollection_pit" PRIMARY KEY, btree (collection_name, stage_code, pit_id, stage_id)
    "xak1collection_pit" UNIQUE, btree (collection_name, stage_code, pit_id, stage_id, last_upd_datetime DESC,
last_upd_initsDESC) 
    "xak2collection_pit" btree (collection_name, stage_code, stage_id)
Foreign-key constraints:
    "rfk1_collection_pit" FOREIGN KEY (collection_name) REFERENCES cci.collection(collection_name)
Triggers:
    td_collection_pit AFTER DELETE ON cci.collection_pit FOR EACH ROW EXECUTE FUNCTION
cci."td_collection_pit$collection_pit"()


When running some tests I forgot to create the PK on table docloc_test.  When the PK was not on the table the optimizer
decidedto create the hash table off the 1000 rows from collection.  But as soon as I put the PK on that table it then
decidesto use docloc_test to build the hash table.  I can understand how the PK missing or not could impact the
executionplan (full scanning the table and hash join vs nested looping to it) but in both cases docloc_test and
collectionwere hash joined and the difference was which result set was used to build the hash table.  I cannot come up
withany theory on why the existence of non-existence of this PK would impact which result set the hash table was built
from. In both cases the row estimates from collection and docloc_test were exactly the same in both plans (1000 from
collection,120000 fro mdocloc_test). 


explain (analyze, buffers)
SELECT  DOCLOC.BEGIN_STAGE_ID, DOCLOC.COLLECTION_NAME, DOCLOC.DOCLOC_CHECK_TYPE, DOCLOC.DOCLOC_ID,
DOCLOC.LAST_UPD_DATETIME,DOCLOC.LAST_UPD_INITS, 
DOCLOC.REPLICATE_DONE, DOCLOC.STAGE_ID, COLLECTION.ACCESS_PASSWORD, COLLECTION.AUTO_RECLAIM_ENABLED,
COLLECTION.AVAILABLE_FLAG,COLLECTION.CC_COLLECTION_DEST, 
COLLECTION.COLLECTION_FAMILY, COLLECTION.COLLECTION_ID, COLLECTION.COLLECTION_NAME, COLLECTION.COLLECTION_TYPE,
COLLECTION.CONTENT_TIMEZONE,COLLECTION.EMAIL_ADDRESS, 
COLLECTION.LAST_UPD_DATETIME, COLLECTION.LAST_UPD_INITS, COLLECTION.L_STAGE, COLLECTION.MIC_GROUP, COLLECTION.MIC_TYPE,
COLLECTION.OWNER_NAME,
COLLECTION.PARTNER_COLL_NAME, COLLECTION.PARTNER_ID, COLLECTION.PASSWORD, COLLECTION.P_STAGE,
COLLECTION.RELATION2PARTNER,COLLECTION.RELOAD_FLAG, 
COLLECTION.RETRIEVAL_PASSWORD, COLLECTION.RETRIEVAL_SOURCE, COLLECTION.TOKEN_TYPE, COLLECTION.T_STAGE
FROM DOCLOC_test docloc, CCI.COLLECTION
WHERE  COLLECTION.COLLECTION_ID IN
(2188,15418,1427,1425,4584,1424,1429,1426,1638,1639,1640,1641,1642,1684,1685,1686,1428,13727,7421,7422,2189,5145,2599,2992,9245,9246,9247,9248,9249,9250,9251,9252,5121,19614,-6322,928,1544,-14765,929,-6323,930,931,932,1542,3890,6594,8986,-6279,19631,1466,19639,1467,1468,1475,1540,3892,-6348,

-6358,4753,1978,1977,1966,2687,2686,2674,-6350,3060,3600,426,3654,2718,1824,445,496,556,432,421,423,471,16152,16362,16154,16155,16156,16157,16799,507,573,574,404,524,743,15388,694,3537,1408,10097,9376,9375,10098,9377,8559,8557,1599,1963,1976,8551,8555,8848,8552,8554,8734,8735,10306,8739,8736,

8737,9253,9254,9255,9256,9257,9258,9063,9194,9065,9066,9067,9068,9069,9070,9071,9072,9177,9178,9179,9180,9181,9182,9183,9184,9093,4222,3048,3116,6971,6972,6973,6974,2675,8387,2360,2672,2673,2681,2682,2683,1528,2671,2676,7438,2678,2677,2679,2685,2684,2361,2680,2688,3533,10605,349,385,396,631,

2541,1396,1979,3022,1980,138,12,6835,10609,7164,5307,6863,6864,20444,695,1815,514,600,601,602,603,604,605,3069,2321,790,446,490,6520,2006,3157,3158,3917,419,461,437,555,748,464,427,3539,16158,8738,8733,18,3628,3626,3627,4805,4806,5054,3630,3629,3631,4807,3634,4714,240,13710,7227,8909,8910,

8911,10730,8912,8951,2019,7315,7078,7930,7928,7929,693,7261,7262,239,242,8952,7228,330,328,7079,4812,4813,6975,14410,6976,6977,6978,6979,6980,6981,6982,6983,6984,6985,6986,6987,6988,6989,6990,6991,6992,6993,6994,6995,6996,6997,6998,6999,7000,7001,7002,7003,7004,7828,7829,4479,6544,10706,

10707,6545,4010,4011,6598,6847,6850,6848,6851,2892,6484,6486,6485,4009,4012,3040,3039,10746,8233,4476,4477,7255,6889,5139,7464,7465,7466,8388,7469,16159,16160,752,3540,831,3147,3148,3149,3137,5481,9161,1825,2028,9230,460,642,1827,8950,751,2833,3162,1990,1523,755,2013,754,1829,467,753,474,

2793,703,1632,412,4814,4815,4816,540,541,1406,2020,6795,7220,9554,9555,9556,2999,17387,17388,17443,17444,10682,10681,10683,10686,10684,10687,10685,10688,20,2510,155,21,22,26,27,28,29,30,31,32,33,7005,7006,7007,7008,7009,7010,7011,7125,7013,7014,7015,7016,7017,7018,7019,7020,6970,1512,2439,

17844,2959,16211,2506,1450,3211,1598,2037,3587,8563,8564,8565,8566,7468,8393,7467,8276,8392,3575,3576,9576,10516,3557,6900,1819,1809,1808,1805,1804,1807,1806,8558,10610,16806,16807,16808,16809,16810,2523,16811,16812,2522,3907,15,17,3817,3798,3810,3794,3821,3808,837,758,15389,757,814,548,

1419,3163,2957,15498,15499,2539,10295,16381,2032,1315,716,1421,16800,762,1833,8612,2039,17684,17685,15500,1834,1566,425,468,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,19880,379,19881,19884,19885,383,19886,19888,203,50,51,52,53,54,55,8567,8568,8569,8570,8571,8572,8573,8574,8575,8576,

8577,8578,8579,8580,8581,8582,8583,8584,8585,8586,8587,8588,8589,8590,8591,8592,8593,8594,8595,8596,8953,3811,7196,3784,3785,7216,7211,3800,9175,4701,3781,7215,7210,3813,3828,3789,7217,7212,3818,3792,3832,3829,3804,3807,7218,7213,5102,3809,7214,5095,3830,10034,4988,2924,2925,4553,4554,7221,

7222,7869,8271,8275,8410,16161,16162,8411,9428,414,476,935,20533,20534,20668,20679,20680,710,1401,433,2040,1838,3164,1853,8835,20681,2844,3165,16163,16164,771,56,2499,2501,235,57,58,59,60,61,62,63,141,202,2502,190,191,192,189,140,534,535,2508,139,533,536,199,200,2500,304,305,306,307,8597,

8598,8599,8600,8601,8602,8785,3586,3585,1612,1613,1614,1616,1617,1647,1648,1649,1652,1820,16375,13728,14411,2046,2047,2186,2263,2318,2432,2448,2442,2487,2525,2527,2535,2537,2544,22013,22014,22015,22016,769,2794,2200,484,1949,7360,8896,1812,3139,7115,2660,6912,659,3071,8647,711,1962,4002,16801,

705,6911,774,16166,16167,4495,778,7205,706,672,3054,449,2958,308,309,310,311,312,313,314,315,316,317,318,319,15400,15401,15402,15403,16685,15405,15406,15407,15408,15409,15410,15411,15412,15413,15414,16210,233,2507,2691,2692,2695,2801,2805,2806,2818,2905,2845,2987,2995,3026,3049,3050,3052,3066,

3091,3095,3186,3526,3649,3888,3884,3886,4288,4378,4517,4916,4917,5003,8561,21903,21911,21927,14644,1411,707,463,15501,1950,405,671,3055,15394,1170,1168,3152,9163,16382,16217,1844,417,2444,1841,2035,781,16363,2721,684,784,10571,782,462,646,3056,443,479,2007,3167,3168,118,2509,17019,17020,17021,

17022,17023,17445,17446,14395,17447,17448,17449,1447,247,3908,2187,1646,2511,3974,3975,3982,3983,3984,3985,3986,3987,3988,3989,3990,3991,3992,21880,21881,21885,21886,21887,21889,21890,21891,21939,21942,13714,21943,21945,21946,21962,4556,16218,4557,6539,4555,7452,1410,420,454,16169,16170,16171,

16172,16173,1872,16174,785,1847,1454,16175,-703,406,525,526,788,470,742,1947,7195,793,3153,503,563,564,2834,444,3993,3994,3995,3996,3997,3998,3999,4297,4298,4299,4300,4301,4302,4303,4434,4435,4436,4437,4438,4439,4440,4441,4442,4443,4444,4445,4446,4447,4448,4449,4450,4451,4452,4453,21940,21941,

21975,21977,21978,21979,21981,18232,8054,8055,8060,8058,8059,8056,8057,18233,18234,18235,18236,18237,8042,8043,8044,8045,8046,8047,8048,8049,8050,8051,8052,21985,21986,21969,21982,21983,4481,2720,741,16176,15395,3602,409,1811,16365,599,3072,4044,799,1869,3141,403,480)
AND COLLECTION.COLLECTION_NAME=DOCLOC.COLLECTION_NAME
AND DOCLOC.STAGE_ID=
    (SELECT MAX (STAGE_ID)
       FROM CCI.COLLECTION_PIT
      WHERE COLLECTION_PIT.COLLECTION_NAME=COLLECTION.COLLECTION_NAME
        AND COLLECTION_PIT.PIT_ID<=2147483647
        AND COLLECTION_PIT.STAGE_CODE='F');

The plan below was executed with work_mem at 12 MB so it's not spilling to temp but this is only for my session.  At
4MBit will spill to temp. 

Hash Join  (cost=4302.50..5658.10 rows=1 width=529) (actual time=67.959..81.185 rows=496 loops=1)
   Hash Cond: (((collection.collection_name)::text = (docloc.collection_name)::text) AND ((SubPlan 2) =
docloc.stage_id))
   Buffers: shared hit=7735
   ->  Seq Scan on collection  (cost=2.50..1040.08 rows=1000 width=478) (actual time=0.058..4.327 rows=1000 loops=1)
         Filter: (collection_id = ANY
('{2188,15418,1427,1425,4584,1424,1429,1426,1638,1639,1640,1641,1642,1684,1685,1686,1428,13727,7421,7422,2189,5145,2599,2992,9245,9246,9247,9248,9249,9250,9251,9252,5

121,19614,-6322,928,1544,-14765,929,-6323,930,931,932,1542,3890,6594,8986,-6279,19631,1466,19639,1467,1468,1475,1540,3892,-6348,-6358,4753,1978,1977,1966,2687,2686,2674,-6350,3060,3600,426,3654,2718,1824,

445,496,556,432,421,423,471,16152,16362,16154,16155,16156,16157,16799,507,573,574,404,524,743,15388,694,3537,1408,10097,9376,9375,10098,9377,8559,8557,1599,1963,1976,8551,8555,8848,8552,8554,8734,8735,103

06,8739,8736,8737,9253,9254,9255,9256,9257,9258,9063,9194,9065,9066,9067,9068,9069,9070,9071,9072,9177,9178,9179,9180,9181,9182,9183,9184,9093,4222,3048,3116,6971,6972,6973,6974,2675,8387,2360,2672,2673,2

681,2682,2683,1528,2671,2676,7438,2678,2677,2679,2685,2684,2361,2680,2688,3533,10605,349,385,396,631,2541,1396,1979,3022,1980,138,12,6835,10609,7164,5307,6863,6864,20444,695,1815,514,600,601,602,603,604,6

05,3069,2321,790,446,490,6520,2006,3157,3158,3917,419,461,437,555,748,464,427,3539,16158,8738,8733,18,3628,3626,3627,4805,4806,5054,3630,3629,3631,4807,3634,4714,240,13710,7227,8909,8910,8911,10730,8912,8

951,2019,7315,7078,7930,7928,7929,693,7261,7262,239,242,8952,7228,330,328,7079,4812,4813,6975,14410,6976,6977,6978,6979,6980,6981,6982,6983,6984,6985,6986,6987,6988,6989,6990,6991,6992,6993,6994,6995,6996

,6997,6998,6999,7000,7001,7002,7003,7004,7828,7829,4479,6544,10706,10707,6545,4010,4011,6598,6847,6850,6848,6851,2892,6484,6486,6485,4009,4012,3040,3039,10746,8233,4476,4477,7255,6889,5139,7464,7465,7466,

8388,7469,16159,16160,752,3540,831,3147,3148,3149,3137,5481,9161,1825,2028,9230,460,642,1827,8950,751,2833,3162,1990,1523,755,2013,754,1829,467,753,474,2793,703,1632,412,4814,4815,4816,540,541,1406,2020,6

795,7220,9554,9555,9556,2999,17387,17388,17443,17444,10682,10681,10683,10686,10684,10687,10685,10688,20,2510,155,21,22,26,27,28,29,30,31,32,33,7005,7006,7007,7008,7009,7010,7011,7125,7013,7014,7015,7016,7

017,7018,7019,7020,6970,1512,2439,17844,2959,16211,2506,1450,3211,1598,2037,3587,8563,8564,8565,8566,7468,8393,7467,8276,8392,3575,3576,9576,10516,3557,6900,1819,1809,1808,1805,1804,1807,1806,8558,10610,1

6806,16807,16808,16809,16810,2523,16811,16812,2522,3907,15,17,3817,3798,3810,3794,3821,3808,837,758,15389,757,814,548,1419,3163,2957,15498,15499,2539,10295,16381,2032,1315,716,1421,16800,762,1833,8612,203

9,17684,17685,15500,1834,1566,425,468,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,19880,379,19881,19884,19885,383,19886,19888,203,50,51,52,53,54,55,8567,8568,8569,8570,8571,8572,8573,8574,8575,8576,85

77,8578,8579,8580,8581,8582,8583,8584,8585,8586,8587,8588,8589,8590,8591,8592,8593,8594,8595,8596,8953,3811,7196,3784,3785,7216,7211,3800,9175,4701,3781,7215,7210,3813,3828,3789,7217,7212,3818,3792,3832,3

829,3804,3807,7218,7213,5102,3809,7214,5095,3830,10034,4988,2924,2925,4553,4554,7221,7222,7869,8271,8275,8410,16161,16162,8411,9428,414,476,935,20533,20534,20668,20679,20680,710,1401,433,2040,1838,3164,18

53,8835,20681,2844,3165,16163,16164,771,56,2499,2501,235,57,58,59,60,61,62,63,141,202,2502,190,191,192,189,140,534,535,2508,139,533,536,199,200,2500,304,305,306,307,8597,8598,8599,8600,8601,8602,8785,3586

,3585,1612,1613,1614,1616,1617,1647,1648,1649,1652,1820,16375,13728,14411,2046,2047,2186,2263,2318,2432,2448,2442,2487,2525,2527,2535,2537,2544,22013,22014,22015,22016,769,2794,2200,484,1949,7360,8896,181

2,3139,7115,2660,6912,659,3071,8647,711,1962,4002,16801,705,6911,774,16166,16167,4495,778,7205,706,672,3054,449,2958,308,309,310,311,312,313,314,315,316,317,318,319,15400,15401,15402,15403,16685,15405,154

06,15407,15408,15409,15410,15411,15412,15413,15414,16210,233,2507,2691,2692,2695,2801,2805,2806,2818,2905,2845,2987,2995,3026,3049,3050,3052,3066,3091,3095,3186,3526,3649,3888,3884,3886,4288,4378,4517,491

6,4917,5003,8561,21903,21911,21927,14644,1411,707,463,15501,1950,405,671,3055,15394,1170,1168,3152,9163,16382,16217,1844,417,2444,1841,2035,781,16363,2721,684,784,10571,782,462,646,3056,443,479,2007,3167,

3168,118,2509,17019,17020,17021,17022,17023,17445,17446,14395,17447,17448,17449,1447,247,3908,2187,1646,2511,3974,3975,3982,3983,3984,3985,3986,3987,3988,3989,3990,3991,3992,21880,21881,21885,21886,21887,

21889,21890,21891,21939,21942,13714,21943,21945,21946,21962,4556,16218,4557,6539,4555,7452,1410,420,454,16169,16170,16171,16172,16173,1872,16174,785,1847,1454,16175,-703,406,525,526,788,470,742,1947,7195,

793,3153,503,563,564,2834,444,3993,3994,3995,3996,3997,3998,3999,4297,4298,4299,4300,4301,4302,4303,4434,4435,4436,4437,4438,4439,4440,4441,4442,4443,4444,4445,4446,4447,4448,4449,4450,4451,4452,4453,2194

0,21941,21975,21977,21978,21979,21981,18232,8054,8055,8060,8058,8059,8056,8057,18233,18234,18235,18236,18237,8042,8043,8044,8045,8046,8047,8048,8049,8050,8051,8052,21985,21986,21969,21982,21983,4481,2720,
741,16176,15395,3602,409,1811,16365,599,3072,4044,799,1869,3141,403,480}'::numeric[]))
         Rows Removed by Filter: 15110
         Buffers: shared hit=598
   ->  Hash  (cost=2500.00..2500.00 rows=120000 width=51) (actual time=67.466..67.467 rows=120000 loops=1)
         Buckets: 131072  Batches: 1  Memory Usage: 11485kB
         Buffers: shared hit=1300
         ->  Seq Scan on docloc_test docloc  (cost=0.00..2500.00 rows=120000 width=51) (actual time=0.004..22.313
rows=120000loops=1) 
               Buffers: shared hit=1300
   SubPlan 2
     ->  Result  (cost=6.25..6.26 rows=1 width=32) (actual time=0.005..0.005 rows=1 loops=1496)
           Buffers: shared hit=5837
           InitPlan 1 (returns $1)
             ->  Limit  (cost=0.42..6.25 rows=1 width=6) (actual time=0.005..0.005 rows=1 loops=1496)
                   Buffers: shared hit=5837
                   ->  Index Scan Backward using xak2collection_pit on collection_pit  (cost=0.42..64.50 rows=11
width=6)(actual time=0.004..0.004 rows=1 loops=1496) 
                         Index Cond: (((collection_name)::text = (collection.collection_name)::text) AND (stage_code =
'F'::bpchar)AND (stage_id IS NOT NULL)) 
                         Filter: (pit_id <= '2147483647'::numeric)
                         Buffers: shared hit=5837
 Planning:
   Buffers: shared hit=78
 Planning Time: 2.287 ms
 Execution Time: 83.361 ms

Thanks in advance.
This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or
confidential.If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and
anyattachments. Certain required legal entity disclosures can be accessed on our website:
https://www.thomsonreuters.com/en/resources/disclosures.html



"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
> The query and execution plan are shown below.  My question is
> related to the result set the optimizer is choosing to build the
> hash table from.  My understanding is for a hash join you want to
> build the hash table out of the smaller result set.

That's *a* consideration, but not the only one.  We also consider
whether the hash key has a flat distribution; if it is too skewed,
we might find specific hash chains getting too long.

> When running some tests I forgot to create the PK on table
> docloc_test.  When the PK was not on the table the optimizer decided
> to create the hash table off the 1000 rows from collection.  But as
> soon as I put the PK on that table it then decides to use
> docloc_test to build the hash table.

I think that the presence of a unique index overrides the statistics
(or the lack of any) so that the planner knows that the column is
unique and thus safe to use as a hash key.  Now, it should have
known that anyway, unless maybe this is a freshly-built table that
auto-analyze hasn't gotten to yet?

            regards, tom lane