Thread: PostgreSQL performance on ARM i.MX6

PostgreSQL performance on ARM i.MX6

From
"Druckenmueller, Marc"
Date:

Hi there,

I am investigating possible throughput with PostgreSQL 14.4 on an ARM i.MX6 Quad CPU (NXP sabre board).
Testing with a simple python script (running on the same CPU), I get ~1000 request/s.

import psycopg as pg

conn = pg.connect('dbname=test')
conn.autocommit = True
cur = conn.cursor()

while True:
    cur.execute("call dummy_call(%s,%s,%s, ARRAY[%s, %s, %s]::real[]);", (1,2,3, 4.0, 5.0, 6.0), binary=True )

where the called procedure is basically a no-op:

CREATE OR REPLACE PROCEDURE dummy_call(
    in arg1 int,
    in arg2 int,
    in arg3 int,
    in arg4 double precision[])
AS $$
BEGIN
END
$$ LANGUAGE plpgsql;

This seems like quite a low number of requests/s, given that there are no changes to the database.
I am looking for suggestions as to what could cause this poor performance and where to start investigating.

Thanks,
Marc




Re: PostgreSQL performance on ARM i.MX6

From
Daniele Varrazzo
Date:
On Tue, 23 May 2023 at 13:43, Druckenmueller, Marc
<marc.druckenmueller@philips.com> wrote:

> Testing with a simple python script (running on the same CPU), I get ~1000 request/s.

Is the time spent in the client or in the server? Are there noticeable
differences if you execute that statement in a loop in psql (with the
variables already bound)?
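
One rough way to approach that question from the Python side is to compare wall-clock time with the CPU time consumed by the client process. The sketch below is only an illustration: `profile_loop` is a hypothetical helper, not part of psycopg, and it assumes a single-threaded client. If the client's CPU fraction is close to 1, most of the time is spent in Python and libpq; if it is small, the client is mostly waiting on the server or the kernel.

```python
import time


def profile_loop(fn, n):
    """Call fn() n times and report wall time versus client CPU time.

    time.process_time() counts user+system CPU of this process only,
    so time spent waiting on the server is excluded from it.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(n):
        fn()
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    return {
        "calls": n,
        "wall_s": wall_s,
        "client_cpu_s": cpu_s,
        "calls_per_s": n / wall_s,
        "client_cpu_fraction": cpu_s / wall_s,
    }


# Stand-in workload: a short sleep plays the role of waiting for the server.
# With a real connection one would pass something like
#   lambda: cur.execute("call dummy_call(...)", params)
result = profile_loop(lambda: time.sleep(0.001), 20)
print(result)
```

This does not replace running the statement in psql or pgbench, but it gives a first hint without leaving the original script.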

-- Daniele



Re: PostgreSQL performance on ARM i.MX6

From
Tom Lane
Date:
"Druckenmueller, Marc" <marc.druckenmueller@philips.com> writes:
> I am investigating possible throughput with PostgreSQL 14.4 on an ARM i.MX6 Quad CPU (NXP sabre board).
> Testing with a simple python script (running on the same CPU), I get ~1000 request/s.

That does seem pretty awful for modern hardware, but it's hard to
tease apart the various potential causes.  How beefy is that CPU
really?  Maybe the overhead is all down to client/server network round
trips?  Maybe psycopg is doing something unnecessarily inefficient?

For comparison, on my development workstation I get

[ create the procedure manually in db test ]
$ cat bench.sql
call dummy_call(1,2,3,array[1,2,3]::float8[]);
$ pgbench -f bench.sql -n -T 10 test
pgbench (16beta1)
transaction type: bench.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 353891
number of failed transactions: 0 (0.000%)
latency average = 0.028 ms
initial connection time = 7.686 ms
tps = 35416.189844 (without initial connection time)

and it'd be more if I weren't using an assertions-enabled
debug build.  It would be interesting to see what you get
from exactly that test case on your ARM board.

BTW, one thing I see that's definitely an avoidable inefficiency in
your test is that you're forcing the array parameter to real[]
(i.e. float4) when the procedure takes double precision[]
(i.e. float8).  That forces an extra run-time conversion.  Swapping
between float4 and float8 in my pgbench test doesn't move the needle
a lot, but it's noticeable.

Another thing to think about is that psycopg might be defaulting
to a TCP rather than Unix-socket connection, and that might add
overhead depending on what kernel you're using.  Although, rather
than try to micro-optimize that, you probably ought to be thinking
of how to remove network round trips altogether.  I can get upwards
of 300K calls/second if I push the loop to the server side:

test=# \timing
Timing is on.
test=# do $$
declare x int := 1; a float8[] := array[1,2,3];
begin
for i in 1..1000000 loop
  call dummy_call (x,x,x,a);
end loop;
end $$;
DO
Time: 3256.023 ms (00:03.256)
test=# select 1000000/3.256023;
      ?column?
---------------------
 307123.137643683721
(1 row)

Again, it would be interesting to compare exactly that
test case on your ARM board.
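
If the loop has to stay in the Python client, one option for reducing round trips is psycopg 3's pipeline mode, which queues statements without waiting for each reply. The following is a minimal sketch, assuming psycopg 3.1 or later with a libpq that supports pipelining; `call_in_pipeline` is a hypothetical helper name, and the cast is written as float8[] to match the procedure's signature. It has not been measured on the ARM board.

```python
def call_in_pipeline(conn, n):
    """Queue n calls to dummy_call inside one psycopg pipeline block.

    `conn` is assumed to be a psycopg (version 3.1+) connection, e.g.
    psycopg.connect('dbname=test', autocommit=True).
    """
    sql = "call dummy_call(%s, %s, %s, %s::float8[])"
    params = (1, 2, 3, [4.0, 5.0, 6.0])
    # Inside the block, execute() sends the statement without waiting
    # for its result; results are collected when the block exits.
    with conn.pipeline():
        cur = conn.cursor()
        for _ in range(n):
            cur.execute(sql, params)
    return n
```

Whether this helps on a given board depends on how much of the per-call cost is round-trip latency rather than CPU work, so it is worth timing both variants.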

            regards, tom lane



Re: PostgreSQL performance on ARM i.MX6

From
Richard Huxton
Date:
On 2023-05-23 12:42, Druckenmueller, Marc wrote:
> Hi there,
> 
> I am investigating possible throughput with PostgreSQL 14.4 on an ARM
> i.MX6 Quad CPU (NXP sabre board).
> 
> Testing with a simple python script (running on the same CPU), I get
> ~1000 request/s.

I tweaked your script slightly, but this is what I got on the Raspberry 
Pi 4 that I have in the corner of the room. Almost twice the speed you 
are seeing.

     0: this = 0.58 tot = 0.58
     1: this = 0.55 tot = 1.13
     2: this = 0.59 tot = 1.72
     3: this = 0.55 tot = 2.27
     4: this = 0.56 tot = 2.83
     5: this = 0.57 tot = 3.40
     6: this = 0.56 tot = 3.96
     7: this = 0.55 tot = 4.51
     8: this = 0.59 tot = 5.11
     9: this = 0.60 tot = 5.71
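
For reference, the timings above can be turned into a call rate with a little arithmetic. This is just a sketch: `calls_per_second` and `latency_ms` are hypothetical helper names, the batch size of 1000 is taken from the script, and 5.71 s is the reported total for the ten batches.

```python
def calls_per_second(calls, seconds):
    """Average call rate for `calls` completed in `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return calls / seconds


def latency_ms(rate):
    """Average time per call in milliseconds for a given calls/s rate."""
    return 1000.0 / rate


CALLS_PER_BATCH = 1000   # inner loop size in the script
BATCHES = 10             # outer loop size in the script
TOTAL_SECONDS = 5.71     # "tot" reported after the last batch

pi_rate = calls_per_second(CALLS_PER_BATCH * BATCHES, TOTAL_SECONDS)
reported_rate = 1000.0   # the ~1000 request/s quoted for the i.MX6

print(f"Pi 4:   {pi_rate:.0f} calls/s, {latency_ms(pi_rate):.2f} ms per call")
print(f"i.MX6:  {reported_rate:.0f} calls/s, {latency_ms(reported_rate):.2f} ms per call")
print(f"ratio:  {pi_rate / reported_rate:.2f}x")
```

That works out to roughly 1750 calls/s on the Pi, or a bit under 0.6 ms per call, against about 1 ms per call at 1000 request/s.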

That's with governor=performance and a couple of background tasks 
running as well as the python. PostgreSQL 15 in a container on a Debian 
O.S. I've not done any tuning on PostgreSQL (but your call isn't doing 
anything really) nor the Pi.

The minor tweaks to your script were as below:

     import psycopg as pg
     import time

     conn = pg.connect('')
     conn.autocommit = True
     cur = conn.cursor()
     start = time.time()
     prev = start
     end = start
     for j in range(10):
         for i in range(1000):
             cur.execute("call dummy_call(%s,%s,%s, ARRAY[%s, %s, %s]::real[]);", (1,2,3, 4.0, 5.0, 6.0), binary=True )
         end = time.time()
         print(f"{j}: this = {(end - prev):.2f} tot = {(end - start):.2f}")
         prev = end

-- 
   Richard Huxton



Re: PostgreSQL performance on ARM i.MX6

From
Ranier Vilela
Date:
On Tue, 23 May 2023 at 08:43, Druckenmueller, Marc <marc.druckenmueller@philips.com> wrote:

> Hi there,
>
> I am investigating possible throughput with PostgreSQL 14.4 on an ARM i.MX6 Quad CPU (NXP sabre board).
> Testing with a simple python script (running on the same CPU), I get ~1000 request/s.

Can you share kernel and Python details (versions, etc.)?

regards,
Ranier Vilela