Re: Transactions involving multiple postgres foreign servers, take 2
From: Masahiro Ikeda
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date:
Msg-id: c043e14f-2c63-786f-9284-fdf8d6760835@oss.nttdata.com
In reply to: Re: Transactions involving multiple postgres foreign servers, take 2 (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: RE: Transactions involving multiple postgres foreign servers, take 2 ("r.takahashi_2@fujitsu.com" <r.takahashi_2@fujitsu.com>)
           Re: Transactions involving multiple postgres foreign servers, take 2 (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-hackers

On 2021/06/30 10:05, Masahiko Sawada wrote:
> On Fri, Jun 25, 2021 at 9:53 AM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote:
>>
>> Hi Jamison-san, Sawada-san,
>>
>> Thanks for testing!
>>
>> FWIW, I tested using pgbench with the "--rate=" option to check that the server
>> can execute transactions at a stable throughput. As Sawada-san said, the latest
>> patch resolves the second phase of 2PC asynchronously, so it's difficult to keep
>> the throughput stable without the "--rate=" option.
>>
>> I also wondered what I should do when the error happens, because increasing
>> "max_prepared_foreign_transaction" doesn't help. Since overload can also trigger
>> the error, would it be better to add that case to the HINT message?
>>
>> BTW, if Sawada-san has already developed running the resolver processes in
>> parallel, why not measure the performance improvement? Although Robert-san,
>> Tsunakawa-san and others are discussing which architecture is best, one
>> discussion point is that there is a performance risk in adopting the
>> asynchronous approach. If we have promising solutions, I think we can move
>> the discussion forward.
>
> Yeah, if we can asynchronously resolve the distributed transactions
> without worrying about the max_prepared_foreign_transaction error, it
> would be good. But we will need synchronous resolution at some point.
> I think we at least need to discuss it at this point.
>
> I've attached the new version of the patch that incorporates the comments
> from Fujii-san and Ikeda-san I have got so far. We launch a resolver
> process per foreign server, committing prepared foreign transactions
> on foreign servers in parallel. To get better performance based on
> the current architecture, we can have multiple resolver processes per
> foreign server, but that seems not easy to tune in practice. Perhaps
> it is better if we simply have a pool of resolver processes and
> assign a resolver process to the resolution of one distributed
> transaction at a time? That way, we need to launch as many resolver
> processes as there are concurrent backends using 2PC.

Thanks for updating the patches.

I have tested on my local laptop; the summary is as follows.

(1) The latest patch (v37) improves throughput by 1.5 times compared to v36.

I expected an improvement of 2.0 times because the workload is such that each
transaction accesses two remote servers. I think the reason is that the disk is
the bottleneck and I couldn't prepare a separate disk for each PostgreSQL server.
If I could, I think the performance would improve by 2.0 times.

(2) With the latest patch (v37), the throughput with foreign_twophase_commit =
required is about 36% of the throughput with foreign_twophase_commit = disabled.

Although the throughput is improved, the absolute performance is not good; that
may be the fate of 2PC. I think the reason is that the number of WAL writes
increases considerably and the disk writes on my laptop are the bottleneck. I
would like to see results from testing in richer environments if someone can do
that.

(3) The latest patch (v37) has no overhead if foreign_twophase_commit = disabled.
On the contrary, the performance improved by 3%, which may be within the margin
of error.

The test details are as follows.

# condition
* 1 coordinator and 3 foreign servers
* the 4 instances share one SSD disk
* each transaction queries two different foreign servers
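For reference, the coordinator-side setup this workload implies looks roughly like
the following. This is only a minimal sketch under my assumptions (the server names
fs1..fs3, local ports, and the (id, md5) column layout are illustrative; the actual
test DDL is not included in this mail).

```
-- Minimal sketch of the coordinator-side setup (server names, ports and
-- column definitions are assumptions; the actual test DDL is not shown here).
CREATE EXTENSION postgres_fdw;

CREATE SERVER fs1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5433', dbname 'postgres');
CREATE SERVER fs2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5434', dbname 'postgres');
CREATE SERVER fs3 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5435', dbname 'postgres');

CREATE USER MAPPING FOR CURRENT_USER SERVER fs1 OPTIONS (user 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER fs2 OPTIONS (user 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER fs3 OPTIONS (user 'postgres');

-- part1..part3 map to tables on the three foreign servers; each pgbench
-- transaction below updates two of them, so every COMMIT involves two
-- remote servers.
CREATE FOREIGN TABLE part1 (id int, md5 text) SERVER fs1 OPTIONS (table_name 'part1');
CREATE FOREIGN TABLE part2 (id int, md5 text) SERVER fs2 OPTIONS (table_name 'part2');
CREATE FOREIGN TABLE part3 (id int, md5 text) SERVER fs3 OPTIONS (table_name 'part3');
```

With part1..part3 on different servers, each transaction of the script below
touches exactly two foreign servers, so foreign_twophase_commit = required has to
prepare and commit on two remote servers per COMMIT.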
```
-- fxact_update.pgbench
\set id random(1, 1000000)
\set partnum 3
\set p1 random(1, :partnum)
\set p2 ((:p1 + 1) % :partnum) + 1
BEGIN;
UPDATE part:p1 SET md5 = md5(clock_timestamp()::text) WHERE id = :id;
UPDATE part:p2 SET md5 = md5(clock_timestamp()::text) WHERE id = :id;
COMMIT;
```

* pgbench generates the load. I increased ${RATE} little by little until the
  "maximum number of foreign transactions reached" error happened.

```
pgbench -f fxact_update.pgbench -R ${RATE} -c 8 -j 8 -T 180
```

* parameters
  max_prepared_transactions = 100
  max_prepared_foreign_transactions = 200
  max_foreign_transaction_resolvers = 4

# test source code patterns
1. 2pc patches (v36) based on 6d0eb385 (foreign_twophase_commit = required)
2. 2pc patches (v37) based on 2595e039 (foreign_twophase_commit = required)
3. 2pc patches (v37) based on 2595e039 (foreign_twophase_commit = disabled)
4. 2595e039 without the 2pc patches (v37)

# results
1. tps = 241.8000
   latency average = 10.413 ms
2. tps = 359.017519 (1.5 times pattern 1; about 36% of pattern 3)
   latency average = 15.427 ms
3. tps = 987.372220 (1.03 times pattern 4)
   latency average = 8.102 ms
4. tps = 955.984574
   latency average = 8.368 ms

The disk is the bottleneck in my environment because disk utilization is almost
100% in every pattern. If a separate disk could be prepared for each instance, I
think we could expect further performance improvements.

>> In my understanding, there are three improvement ideas. The first is to make
>> the resolver processes run in parallel. The second is to send "COMMIT/ABORT
>> PREPARED" to remote servers in bulk. The third is to stop syncing the WAL in
>> remove_fdwxact() after resolving is done, which I addressed in the mail sent
>> on June 3rd, 13:56. Since the third idea has not yet been discussed, there
>> may be some misunderstanding on my side.
>
> Yes, those optimizations are promising. On the other hand, they could
> introduce complexity to the code and APIs. I'd like to keep the first
> version simple. I think we need to discuss them at this stage but can
> leave the implementation of both parallel execution and batch
> execution as future improvements.

OK, I agree.

> For the third idea, I think the implementation was wrong; it removes
> the state file then flushes the WAL record. I think these should be
> performed in the reverse order. Otherwise, the FdwXactState entry could be
> left on the standby if the server crashes between them. I might be
> missing something though.

Oh, I see. I think you're right, though what you wanted to say is that it flushes
the WAL records and then removes the state file. If the "COMMIT/ABORT PREPARED"
statements are executed in bulk, it seems enough to sync the WAL only once and
then remove all the related state files (an illustrative per-server sequence is at
the end of this mail).

BTW, I tested the binary built with -O2, and I got the following warning. It needs
to be fixed.

```
fdwxact.c: In function 'PrepareAllFdwXacts':
fdwxact.c:897:13: warning: 'flush_lsn' may be used uninitialized in this function [-Wmaybe-uninitialized]
  897 |             canceled = SyncRepWaitForLSN(flush_lsn, false);
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```

Regards,

-- 
Masahiro Ikeda
NTT DATA CORPORATION
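P.S. To illustrate the bulk execution idea above, this is the kind of second-phase
sequence a resolver could send to a single foreign server after one coordinator-side
WAL sync. The GIDs are invented for this example; the patch generates its own
identifiers.

```
-- Illustrative only: second phase for a batch of prepared foreign
-- transactions on one foreign server.  With bulk execution, a single WAL
-- sync on the coordinator could cover the whole group before these are
-- sent, and the related state files would be removed afterwards.
COMMIT PREPARED 'fx_1_16389_101';
COMMIT PREPARED 'fx_1_16389_102';
COMMIT PREPARED 'fx_1_16389_103';
```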