Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

From Alexander Lakhin
Subject Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date
Msg-id 2132c88f-7e32-6dba-1057-2ecc5ce66509@gmail.com
In reply to Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)  (Robert Haas <robertmhaas@gmail.com>)
Replies Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)  (Thomas Munro <thomas.munro@gmail.com>)
Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers
Hello Robert,

01.09.2023 23:21, Robert Haas wrote:
> On Fri, Sep 1, 2023 at 6:13 AM Alexander Lakhin<exclusion@gmail.com>  wrote:
>> (Placing "pg_compiler_barrier();" just after "waiting = true;" fixed the
>> issue for us.)
> Maybe it'd be worth trying something stronger, like
> pg_memory_barrier(). A compiler barrier doesn't prevent the CPU from
> reordering loads and stores as it goes, and ARM64 has weak memory
> ordering.

Indeed, thank you for the tip!
So perhaps what we are dealing with here is a CPU optimization rather than
a compiler one.
The wider code fragment is:
   805c48: 52800028      mov     w8, #1 // true
   805c4c: 52800319      mov     w25, #24
   805c50: 5280073a      mov     w26, #57
   805c54: fd446128      ldr     d8, [x9, #2240]
   805c58: 90000d7b      adrp    x27, 0x9b1000 <ModifyWaitEvent+0xb0>
   805c5c: fd415949      ldr     d9, [x10, #688]
   805c60: f9071d68      str     x8, [x11, #3640] // waiting = true (x8 = w8)
   805c64: f90003f3      str     x19, [sp]
   805c68: 14000010      b       0x805ca8 <WaitEventSetWait+0x108>

   805ca8: f9400a88      ldr     x8, [x20, #16] // if (set->latch && set->latch->is_set)
   805cac: b4000068      cbz     x8, 0x805cb8 <WaitEventSetWait+0x118>
   805cb0: f9400108      ldr     x8, [x8]
   805cb4: b5001248      cbnz    x8, 0x805efc <WaitEventSetWait+0x35c>
   805cb8: f9401280      ldr     x0, [x20, #32]

If that CPU can delay the write to the variable waiting
(str x8, [x11, #3640]), holding it internally as something like
"store 1 to [address]" until 805cb0 or a later instruction, then the load of
latch->is_set can complete before the store becomes visible, and we can get
the behavior discussed. Something similar is shown in the ARM documentation:
https://developer.arm.com/documentation/102336/0100/Memory-ordering?lang=en
I'll try to test this guess on the target machine...
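
To make the suspected reordering concrete, here is a rough model of the pattern in portable C11. It is only a sketch under my own assumptions, not the actual latch code: the names waiter_prepare_to_sleep, setter_set_latch, latch_is_set and wakeups_sent are hypothetical, and atomic_thread_fence(memory_order_seq_cst) stands in for pg_memory_barrier(). Each side stores its own flag and then loads the other side's flag; without a full barrier between the store and the load, a weakly ordered CPU may let both sides read the old value, so the waiter goes to sleep and the setter sends no wakeup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's definitions. */
static atomic_bool waiting;       /* models the "waiting" flag */
static atomic_bool latch_is_set;  /* models set->latch->is_set */
static atomic_int  wakeups_sent;  /* counts wakeups the setter decided to send */

/*
 * Waiter side: publish "waiting", then check the latch.
 * Returns true if the latch was not seen as set, i.e. the waiter would sleep.
 */
static bool
waiter_prepare_to_sleep(void)
{
    atomic_store_explicit(&waiting, true, memory_order_relaxed);

    /*
     * Full barrier (assumed equivalent of pg_memory_barrier()): the store
     * above may not be reordered after the load below. A compiler-only
     * barrier would not constrain the CPU here.
     */
    atomic_thread_fence(memory_order_seq_cst);

    return !atomic_load_explicit(&latch_is_set, memory_order_relaxed);
}

/*
 * Setter side: set the latch, then check whether anyone is waiting.
 * Returns true if a wakeup was sent.
 */
static bool
setter_set_latch(void)
{
    atomic_store_explicit(&latch_is_set, true, memory_order_relaxed);

    /* Matching full barrier on the setter side. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&waiting, memory_order_relaxed))
    {
        atomic_fetch_add_explicit(&wakeups_sent, 1, memory_order_relaxed);
        return true;
    }
    return false;
}
```

With the fences on both sides, at least one of the two loads must observe the other side's store, so either the waiter notices the latch or the setter notices the waiter. If either fence is weakened to a compiler barrier, that guarantee is lost on a weakly ordered CPU, which would match the lockup described above.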

Best regards,
Alexander


