patch: avoid heavyweight locking on hash metapage
From | Robert Haas
---|---
Subject | patch: avoid heavyweight locking on hash metapage
Date |
Msg-id | CA+Tgmoaf=nOJxLyzGcbrrY+pe-0VLL0vfHi6tjdM3fFtVwsOmw@mail.gmail.com
Replies | Re: patch: avoid heavyweight locking on hash metapage ("Dickson S. Guedes" <listas@guedesoft.net>); Re: patch: avoid heavyweight locking on hash metapage (Jeff Janes <jeff.janes@gmail.com>)
List | pgsql-hackers
I developed the attached patch to avoid taking a heavyweight lock on the metapage of a hash index. Instead, an exclusive buffer content lock is viewed as sufficient permission to modify the metapage, and a shared buffer content lock is used when such modifications need to be prevented. For the most part this is a trivial change, because we were already taking these locks: we were just taking the heavyweight locks in addition.

The only sticking point is that, when we're searching or inserting, we previously locked the bucket before releasing the heavyweight metapage lock, which is unworkable when holding only a buffer content lock because (1) we might deadlock and (2) buffer content locks can't be held for long periods of time even when there's no deadlock risk. To fix this, I implemented a simple loop-and-retry system: we release the metapage content lock, acquire the heavyweight lock on the target bucket, and then reacquire the metapage content lock and check that the bucket mapping has not changed. Normally it hasn't, and we're done. But if by chance it has, we simply unlock the metapage, release the heavyweight lock we acquired previously, lock the new bucket, and loop around again. Even in the worst case we cannot loop very many times here, since we don't split the same bucket again until we've split all the other buckets, and 2^N gets big pretty fast.

I tested the effect of this by setting up a series of 5-minute read-only pgbench runs at scale factor 300 with 8GB of shared buffers on the IBM POWER7 machine. For these runs, I dropped the primary key constraint on pgbench_accounts (aid) and created a hash index on that column instead. I ran each test three times and took the median result.
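The loop-and-retry protocol can be sketched roughly as below. This is an illustrative simplification, not the patch itself: the lock calls are reduced to comments, the metapage fields and function names mirror the spirit of the real hash-index code (e.g. _hash_hashkey2bucket) but are hypothetical here, and a real concurrent split would change the metapage between the two reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified metapage state (illustrative field names). */
typedef struct {
    uint32_t maxbucket;  /* highest bucket number currently in use */
    uint32_t highmask;   /* mask giving the candidate bucket */
    uint32_t lowmask;    /* fallback mask when candidate > maxbucket */
} MetaPage;

/* Map a hash key to a bucket, in the style of _hash_hashkey2bucket(). */
static uint32_t hashkey_to_bucket(uint32_t hashkey, const MetaPage *meta)
{
    uint32_t bucket = hashkey & meta->highmask;
    if (bucket > meta->maxbucket)
        bucket &= meta->lowmask;
    return bucket;
}

/* Sketch of the loop-and-retry system: compute the target bucket under the
 * metapage content lock, drop that lock, take the heavyweight bucket lock,
 * then relock the metapage and recheck the mapping.  If a concurrent split
 * changed it, release the bucket lock and retry with the new target. */
static uint32_t lock_target_bucket(uint32_t hashkey, const MetaPage *meta)
{
    /* LockBuffer(metabuf, BUFFER_LOCK_SHARE); */
    uint32_t bucket = hashkey_to_bucket(hashkey, meta);
    for (;;) {
        /* LockBuffer(metabuf, BUFFER_LOCK_UNLOCK);  release content lock   */
        /* ... acquire heavyweight lock on 'bucket' ...                     */
        /* LockBuffer(metabuf, BUFFER_LOCK_SHARE);   reacquire and recheck  */
        uint32_t recheck = hashkey_to_bucket(hashkey, meta);
        if (recheck == bucket)
            return bucket;   /* mapping unchanged: our bucket lock is good */
        /* ... release heavyweight lock on 'bucket' ... */
        bucket = recheck;    /* a split moved us: loop with the new bucket */
    }
}
```

Because a bucket is not split again until every other bucket has been split, the mapping for a given key can change at most once per doubling of the index, which is why the retry loop terminates quickly in practice.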
Here are the results on unpatched master, at various client counts:

m01 tps = 9004.070422 (including connections establishing)
m04 tps = 34838.126542 (including connections establishing)
m08 tps = 70584.356826 (including connections establishing)
m16 tps = 128726.248198 (including connections establishing)
m32 tps = 123639.248172 (including connections establishing)
m64 tps = 104650.296143 (including connections establishing)
m80 tps = 88412.736416 (including connections establishing)

And here are the results with the patch:

h01 tps = 9110.561413 (including connections establishing) [+1.2%]
h04 tps = 36012.787524 (including connections establishing) [+3.4%]
h08 tps = 72606.302993 (including connections establishing) [+2.9%]
h16 tps = 141938.762793 (including connections establishing) [+10%]
h32 tps = 205325.232316 (including connections establishing) [+66%]
h64 tps = 274156.881975 (including connections establishing) [+162%]
h80 tps = 291224.012066 (including connections establishing) [+229%]

Obviously, even with this change, there's a lot not to like about hash indexes: they still won't be crash-safe, and they still won't perform as well under high concurrency as btree indexes. But neither of those problems seems like a good reason not to fix this problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments