Thread: Buffer Cache Problem

Buffer Cache Problem

From
jacktby jacktby
Date:
Hi, postgres hackers, I'm studying the buffer cache part of postgres, so I'm opening this thread to discuss the design of some of the buffer cache code and to try to improve some of its tricky parts.

As we know, the buffer cache is an array of buffers; each slot of this array consists of a data page and a header that describes the state of the buffer.

This is the original code of the buffer header:
typedef struct BufferDesc
{
    BufferTag   tag;                    /* ID of page contained in buffer */
    int         buf_id;                 /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pgprocno;  /* backend of pin-count waiter */
    int         freeNext;               /* link in freelist chain */
    LWLock      content_lock;           /* to lock access to buffer contents */
} BufferDesc;
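
The comment on state says that a single 32-bit word holds the flags, the refcount and the usagecount. A minimal sketch of such a packed word follows. The split used here (18 bits of refcount, 4 bits of usage count, 10 bits of flags) is an assumption about the layout in buf_internals.h, and every SK_/sk_ name is hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, high to low bits: [ flags:10 | usagecount:4 | refcount:18 ] */
#define SK_REFCOUNT_BITS    18
#define SK_USAGECOUNT_BITS  4
#define SK_REFCOUNT_MASK    ((1U << SK_REFCOUNT_BITS) - 1)
#define SK_USAGECOUNT_SHIFT SK_REFCOUNT_BITS
#define SK_USAGECOUNT_MASK  (((1U << SK_USAGECOUNT_BITS) - 1) << SK_USAGECOUNT_SHIFT)
#define SK_FLAG_MASK        (~(SK_REFCOUNT_MASK | SK_USAGECOUNT_MASK))

/* Extract the pin count from the low bits. */
static inline uint32_t sk_refcount(uint32_t state)
{
    return state & SK_REFCOUNT_MASK;
}

/* Extract the usage count from the middle bits. */
static inline uint32_t sk_usagecount(uint32_t state)
{
    return (state & SK_USAGECOUNT_MASK) >> SK_USAGECOUNT_SHIFT;
}

/* Extract the flag bits, left in their original positions. */
static inline uint32_t sk_flags(uint32_t state)
{
    return state & SK_FLAG_MASK;
}

/* Build a state word; flags must already sit in the high bits. */
static inline uint32_t sk_pack(uint32_t flags, uint32_t usagecount, uint32_t refcount)
{
    assert((flags & ~SK_FLAG_MASK) == 0);
    assert(usagecount < (1U << SK_USAGECOUNT_BITS));
    assert(refcount <= SK_REFCOUNT_MASK);
    return flags | (usagecount << SK_USAGECOUNT_SHIFT) | refcount;
}
```

Because the refcount occupies the lowest bits, adding 1 to the whole word increments the pin count without disturbing the other two fields, as long as the count does not overflow its 18 bits.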

For the field wait_backend_pgprocno, the comment says "backend of pin-count waiter". I have the following questions:
1. It records which process is waiting for this buffer, right?
2. If wait_backend_pgprocno is valid, it means this buffer is in use by some process, right?
3. If a buffer is being waited on by another process, it means all buffers are in use, right? Let's try an example: we have 5 buffers with ids (1,2,3,4,5), and they are all in use. Now another process with process ID 8017 comes along and chooses buffer 1, so buffer 1's wait_backend_pgprocno is 8017. If buffer 4 is released later, can process 8017 switch to buffer 4? How?
4. wait_backend_pgprocno is an integer, not an array. Why can a buffer be waited on by only one process?

I look forward to your reply, thanks! I'd like to contribute once I have studied the buffer cache implementation.

Re: Buffer Cache Problem

From
Matthias van de Meent
Date:
On Tue, 7 Nov 2023 at 14:28, jacktby jacktby <jacktby@gmail.com> wrote:
>
> Hi, postgres hackers, I’m studying postgres buffer cache part. So I open this thread to communicate some buffer cache codes design and try to improve some tricky codes.
>
> For Buffer Cache, we know it’s a buffer array, every bucket of this array is consist of a data page and its header which is used to describe the state of the buffer.
>
> For field wait_backend_pgprocno, the comment is "backend of pin-count waiter”, I have problems below:

Did you read the README at src/backend/storage/buffer/README, as well
as the comments and documentation in and around the buffer-locking
functions?

> 1. it means which processId is waiting this buffer, right?
> 2. and if wait_backend_pgprocno is valid, so it says this buffer is in use by one process, right?
> 3. if one buffer is wait by another process, it means all buffers are out of use, right? So let’s try this: we have 5 buffers with ids (1,2,3,4,5), and they are all in use, now another process with processId 8017 is coming, and it choose buffer id 1, so buffer1’s wait_backend_pgprocno is 8017, but later
> buffer4 is released, can process 8017 change to get buffer4? how?

I believe these questions are generally answered by the README and the
comments in bufmgr.c/buf_internals.h for the functions that try to lock
buffers.

> 4. wait_backend_pgprocno is a “integer” type, not an array, why can one buffer be wait by only one process?

Yes, that is correct. It seems like PostgreSQL has yet to find a
workload that requires more than one backend to wait for super-exclusive
access to a buffer at the same time.
VACUUM seems to be the only workload that currently can wait and sleep
for this exclusive buffer access, and that is already limited to one
process per relation, so there are no explicit concurrent
super-exclusive waits in the system right now.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)
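
The single-waiter design described in the reply above can be illustrated with a small model: one flag plus one integer slot per buffer, so a second would-be waiter finds the slot occupied. This is a single-threaded sketch with hypothetical names (SketchBuffer, sk_register_waiter, sk_unpin); it assumes that a pgprocno is an index into the shared process array rather than an operating-system PID, and that the real code performs these steps under the buffer header lock.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_NO_WAITER (-1)           /* hypothetical "nobody to wake" result */

typedef struct SketchBuffer
{
    int  refcount;                  /* number of pins, including the waiter's own */
    bool pin_count_waiter;          /* stands in for the BM_PIN_COUNT_WAITER flag */
    int  wait_backend_pgprocno;     /* meaningful only while the flag is set */
} SketchBuffer;

/* Claim the single waiter slot; returns false if it is already taken. */
static bool sk_register_waiter(SketchBuffer *buf, int my_pgprocno)
{
    if (buf->pin_count_waiter)
        return false;
    buf->wait_backend_pgprocno = my_pgprocno;
    buf->pin_count_waiter = true;
    return true;
}

/*
 * Drop one pin. If a waiter is registered and only its own pin remains,
 * clear the slot and return the pgprocno that should be woken up.
 */
static int sk_unpin(SketchBuffer *buf)
{
    assert(buf->refcount > 0);
    buf->refcount--;
    if (buf->pin_count_waiter && buf->refcount == 1)
    {
        buf->pin_count_waiter = false;
        return buf->wait_backend_pgprocno;
    }
    return SK_NO_WAITER;
}
```

In this model the waiter is tied to one specific buffer and waits for the other pins on that buffer to go away; it is not waiting for "any free buffer", which is why the slot lives in the buffer header.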



Re: Buffer Cache Problem

From
jacktby jacktby
Date:
In buf_internals.h, I see:
====================================================
 Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change  tag, state or wait_backend_pgprocno fields.
====================================================
As we all know, this buffer header lock is implemented by a bit in the state field, and the state field is an atomic 32-bit type (pg_atomic_uint32), so in fact we don't need to hold the buffer header lock when we update state. So this comment is wrong, right?

Re: Buffer Cache Problem

From
jacktby jacktby
Date:

On 10 Nov 2023, at 22:31, jacktby jacktby <jacktby@gmail.com> wrote:

> In buf_internals.h, I see
> ====================================================
>  Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change  tag, state or wait_backend_pgprocno fields.
> ====================================================
> As we all know, this buffer header lock is implemented by a bit in state field, and this state field is a atomic_u32 type, so in fact we don’t need to
> hold buffer lock when we update state, this comment has error, right?

Oh, sorry, this is true. In fact we never acquire a spinlock when updating the state.
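
To make the distinction in this exchange concrete, here is a minimal sketch of a lock bit that lives inside the atomic state word itself, written with C11 atomics. The names (SketchHeader, SK_LOCKED, sk_lock_header, sk_unlock_header, sk_pin) and the bit position are hypothetical. The sketch assumes the usual reading of the quoted note: a single-word change can be done lock-free with compare-and-swap, while the lock bit serializes updates that must stay consistent with plain fields such as wait_backend_pgprocno.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SK_LOCKED (1U << 22)        /* hypothetical position of the lock bit */

typedef struct SketchHeader
{
    _Atomic uint32_t state;                 /* low bits: refcount; SK_LOCKED: header lock */
    int              wait_backend_pgprocno; /* plain field, guarded by the lock bit */
} SketchHeader;

/* Spin until this caller is the one that flips the lock bit from 0 to 1. */
static uint32_t sk_lock_header(SketchHeader *h)
{
    for (;;)
    {
        uint32_t old = atomic_fetch_or(&h->state, SK_LOCKED);
        if ((old & SK_LOCKED) == 0)
            return old | SK_LOCKED;
        /* another holder owns the bit; real code would back off here */
    }
}

/* Release the lock by publishing a state word without the lock bit. */
static void sk_unlock_header(SketchHeader *h, uint32_t new_state)
{
    assert(atomic_load(&h->state) & SK_LOCKED);   /* caller must hold the lock */
    atomic_store(&h->state, new_state & ~SK_LOCKED);
}

/* Lock-free refcount increment: CAS, retried while the lock bit is set. */
static void sk_pin(SketchHeader *h)
{
    uint32_t old = atomic_load(&h->state);
    for (;;)
    {
        if (old & SK_LOCKED)
        {
            old = atomic_load(&h->state);   /* wait for the holder to finish */
            continue;
        }
        if (atomic_compare_exchange_weak(&h->state, &old, old + 1))
            return;
        /* a failed CAS refreshed 'old'; loop and try again */
    }
}
```

Note that sk_pin never takes the lock yet still respects it: it refuses to change the word while SK_LOCKED is set, so a lock holder sees a stable state for the duration of its critical section.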

Re: Buffer Cache Problem

From
jacktby jacktby
Date:
Hi, I have 3 questions here:
1. I see the comment below in buf_internals.h:
========================================
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
========================================
In fact, the flags are in the state field, which is a pg_atomic_uint32, so we don't need to acquire the buffer header lock in any case. But this comment seems to say that, in general, we need to hold the buffer header lock when reading a flag.

2. Another question:
========================================
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own pgprocno into
 * wait_backend_pgprocno and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
========================================
Regarding the comment above: if a backend plans to remove items from a disk page, that is a mutating operation, so the backend must hold an exclusive lock on the buffer page. In that case no other backends are pinning the buffer, so the pin refcount must be 1 (this backend's own pin), and the backend can remove the items safely without waiting for other backends (because no other backends are pinning the buffer). So my question is:
Isn't the operation "storing its own pgprocno into wait_backend_pgprocno and setting flag bit BM_PIN_COUNT_WAITER" too expensive? We should not do it like this, right?

3. Where is the array mentioned in the comment below?
========================================
 * Per-buffer I/O condition variables are currently kept outside this struct in
 * a separate array.  They could be moved in here and still fit within that
 * limit on common systems, but for now that is not done.
========================================