Обсуждение: select limit error in file_fdw
Hello, I have noticed that since ffa4cbd623, a foreign table that pulls data from a PROGRAM (in this case an unzip call) will fail if there is a LIMIT on the SELECT (while succeeding without LIMIT). Below is an example. (Table size matters, so larger machines than mine may need more than those 100000 rows.) A pre-ffa4cbd623 instance did not have this problem (and neither does 11.1) This seems like a bug to me. --------------- 8< ------------------------------ #!/bin/bash # service=prod service=dev11 # this is latest dev11 (in 11.1 it still worked; # commit ffa4cbd623 looks pertinent) csv_file=/tmp/t.txt echo "select n from generate_series(1, 100000) as f(n)" | psql -qtAX service=$service > $csv_file zip ~/t.zip $csv_file echo " drop server if exists test_server cascade; create server if not exists test_server foreign data wrapper file_fdw; create schema if not exists tmp; drop foreign table if exists tmp.t cascade; create foreign table tmp.t (n int) server test_server options ( program 'unzip -p \"/home/aardvark/t.zip\" \"tmp/t.txt\"' , format 'csv' , header 'TRUE' , delimiter E'\t' ); " | psql -X service=$service # this works OK: echo "table tmp.t;" | psql -Xa service=$service | head # this fails in latest dev11 : echo "table tmp.t limit 10;" | psql -Xa service=$service --------------- 8< ------------------------------ Output: updating: tmp/t.txt (deflated 63%) DROP SERVER CREATE SERVER CREATE SCHEMA DROP FOREIGN TABLE CREATE FOREIGN TABLE table tmp.t; n -------- 2 3 4 5 6 7 8 table tmp.t limit 10; ERROR: program "unzip -p "/home/aardvark/t.zip" "tmp/t.txt"" failed DETAIL: child process exited with exit code 141 it would be nice to get this working again. Thanks, Erik Rijkers
Erik Rijkers <er@xs4all.nl> writes: > I have noticed that since ffa4cbd623, a foreign table that pulls data > from a PROGRAM (in this case an unzip call) will fail if there is a > LIMIT on the SELECT > (while succeeding without LIMIT). Below is an example. Um ... this example works for me, in both HEAD and v11 branch tip. Moreover, the behavior you describe is exactly what ffa4cbd623 was intended to fix. Is there any chance that you got 11.1 and v11 branch tip mixed up? If not, there must be some platform-specific behavior involved. What are you testing on, exactly? regards, tom lane
On 2018-12-16 07:03, Tom Lane wrote: > Erik Rijkers <er@xs4all.nl> writes: >> I have noticed that since ffa4cbd623, a foreign table that pulls data >> from a PROGRAM (in this case an unzip call) will fail if there is a >> LIMIT on the SELECT >> (while succeeding without LIMIT). Below is an example. > > Um ... this example works for me, in both HEAD and v11 branch tip. > Moreover, the behavior you describe is exactly what ffa4cbd623 was > intended to fix. Is there any chance that you got 11.1 and v11 > branch tip mixed up? I admit it's suspicious. I am assuming I pull the latest, from REL_11_STABLE, but I will have another hard look at my build stuff. On the other hand, in that test.sh, have you tried enlarging the test table? It works for me too with small enough values in that generate_series. > If not, there must be some platform-specific behavior involved. > What are you testing on, exactly? This is debian 9/Stretch: /etc/os-release: "Debian GNU/Linux 9 (stretch)" uname -a Linux gulo 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 42 model name : Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz stepping : 7 microcode : 0x25 cpu MHz : 2299.645 cache size : 6144 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp l bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf bogomips : 6185.58 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: $ (PGSERVICE=dev11 psql -c "select version()") PostgreSQL 11.1_REL_11_STABLE_20181216_0458_171c on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit (1 row) (note that '171c', tacked on via --with-extra-version) What other info could be useful?
On 2018-12-16 11:19, Erik Rijkers wrote: > On 2018-12-16 07:03, Tom Lane wrote: >> Erik Rijkers <er@xs4all.nl> writes: >>> I have noticed that since ffa4cbd623, a foreign table that pulls data >>> from a PROGRAM (in this case an unzip call) will fail if there is a >>> LIMIT on the SELECT >>> (while succeeding without LIMIT). Below is an example. >> >> Um ... this example works for me, in both HEAD and v11 branch tip. >> Moreover, the behavior you describe is exactly what ffa4cbd623 was >> intended to fix. Is there any chance that you got 11.1 and v11 >> branch tip mixed up? > > I admit it's suspicious. I am assuming I pull the latest, from > REL_11_STABLE, but I will have another hard look at my build stuff. To circumvent a possible bug in my normal build stuff, I built an instance from scratch, maybe someone could check for any errors that I may have overlooked? The instance built with this still has the LIMIT error. I did notice that the error (as provoked by the earlier posted test.sh) can be avoided by adding 'count(*) over ()' to the select list. Not really surprising, I suppose. Here is my scratch_build.sh: ------------ #!/bin/bash git --version project=scratch # shutdown - just in case it's running /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/pg_ctl \ -D /home/aardvark/pg_stuff/pg_installations/pgsql.scratch/data \ -l logfile -w stop cd ~/tmp/ # if [[ 0 -eq 1 ]]; then git clone https://git.postgresql.org/git/postgresql.git cd postgresql git checkout REL_11_STABLE # else # cd postgresql # # git pull # fi echo "deleting stuff" make distclean &> /dev/null echo "rebuilding stuff" time ( ./configure \ --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.$project \ --bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast \ --libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.$project/lib.fast \ --with-pgport=6011 --quiet --enable-depend --with-openssl --with-libxml \ --with-libxslt --with-zlib --enable-tap-tests \ --with-extra-version=_$project \ && make -j 6 && ( cd contrib; make ) \ && make check \ && make install && ( cd contrib; make install ) \ && /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/initdb \ -D /home/aardvark/pg_stuff/pg_installations/pgsql.$project/data -E UTF8 \ -A scram-sha-256 --pwfile=/home/aardvark/pg_stuff/.11devel --data-checksums \ --waldir=/home/aardvark/pg_stuff/pg_installations_wal/pgsql.$project rc=$? echo "rc [$rc]" ) /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/pg_ctl \ -D /home/aardvark/pg_stuff/pg_installations/pgsql.scratch/data \ -l logfile -w start echo "select current_setting('port'), version()" \ | /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/psql -qX service=scratch echo " create extension file_fdw; \\dx " | /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/psql -qX service=scratch /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/pg_ctl \ -D /home/aardvark/pg_stuff/pg_installations/pgsql.scratch/data \ -l logfile -w stop ------------ comments welcome. thanks, Erik Rijkers > On the other hand, in that test.sh, have you tried enlarging the test > table? It works for me too with small enough values in that > generate_series. > >> If not, there must be some platform-specific behavior involved. >> What are you testing on, exactly? > > This is debian 9/Stretch: > > /etc/os-release: > "Debian GNU/Linux 9 (stretch)" > > uname -a > Linux gulo 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 > GNU/Linux > > /proc/cpuinfo > processor : 0 > vendor_id : GenuineIntel > cpu family : 6 > model : 42 > model name : Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz > stepping : 7 > microcode : 0x25 > cpu MHz : 2299.645 > cache size : 6144 KB > physical id : 0 > siblings : 4 > core id : 0 > cpu cores : 4 > apicid : 0 > initial apicid : 0 > fpu : yes > fpu_exception : yes > cpuid level : 13 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge > mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe > syscall nx rdtscp l > bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass > l1tf > bogomips : 6185.58 > clflush size : 64 > cache_alignment : 64 > address sizes : 36 bits physical, 48 bits virtual > power management: > > > $ (PGSERVICE=dev11 psql -c "select version()") > PostgreSQL 11.1_REL_11_STABLE_20181216_0458_171c on > x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 > 20170516, 64-bit > (1 row) > > (note that '171c', tacked on via --with-extra-version) > > > What other info could be useful?
Erik Rijkers <er@xs4all.nl> writes: > On 2018-12-16 07:03, Tom Lane wrote: >> Um ... this example works for me, in both HEAD and v11 branch tip. >> Moreover, the behavior you describe is exactly what ffa4cbd623 was >> intended to fix. Is there any chance that you got 11.1 and v11 >> branch tip mixed up? > [ nope ] > On the other hand, in that test.sh, have you tried enlarging the test > table? Yeah, I tried sizes ranging up to 1m tuples without success. However, something else occurred to me this morning, and a bit later I can reproduce the problem! I did it by changing the table's definition to use a shell pipeline: program 'unzip -p \"/tmp/t.zip\" \"tmp/t.txt\" | cat' ERROR: program "unzip -p "/tmp/t.zip" "tmp/t.txt" | cat" failed DETAIL: child process exited with exit code 141 What is happening for me, I think, is that if the PROGRAM is just "unzip" then the shell exec's it directly, and so the SIGPIPE result is reported directly to the PG backend. But if the PROGRAM is too complicated to handle that way, then unzip and cat are children of a shell process, and one or both of them get SIGPIPE, and the shell reports that as exit status 128 + SIGPIPE. So we need to consider that result as indicating a sigpipe failure, too. It remains unclear why you had an intervening shell process when I didn't, but perhaps that can be chalked up to use of a different shell? regards, tom lane
I wrote: > It remains unclear why you had an intervening shell process when > I didn't, but perhaps that can be chalked up to use of a different > shell? To provide some data on that: popen() is presumably invoking /bin/sh, which on my box is $ /bin/sh --version GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu) $ rpm -qf /bin/sh bash-4.1.2-48.el6.x86_64 regards, tom lane
On 2018-12-16 16:52, Tom Lane wrote: > Erik Rijkers <er@xs4all.nl> writes: >> On 2018-12-16 07:03, Tom Lane wrote: >>> Um ... this example works for me, in both HEAD and v11 branch tip. >>> Moreover, the behavior you describe is exactly what ffa4cbd623 was >>> intended to fix. Is there any chance that you got 11.1 and v11 >>> branch tip mixed up? > >> [ nope ] >> On the other hand, in that test.sh, have you tried enlarging the test >> table? > > Yeah, I tried sizes ranging up to 1m tuples without success. > > However, something else occurred to me this morning, and a bit > later I can reproduce the problem! I did it by changing the > table's definition to use a shell pipeline: > > program 'unzip -p \"/tmp/t.zip\" \"tmp/t.txt\" | cat' > > ERROR: program "unzip -p "/tmp/t.zip" "tmp/t.txt" | cat" failed > DETAIL: child process exited with exit code 141 curious... Adding that ' | cat' tail makes all 3 instances (that I have tried) fail: 11.1 as released, REL_11_STABLE, and instance from d56e0fde /bin/sh seems to be dash, here. bash version is: $ /bin/bash --version GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
Erik Rijkers <er@xs4all.nl> writes: > On 2018-12-16 16:52, Tom Lane wrote: >> However, something else occurred to me this morning, and a bit >> later I can reproduce the problem! I did it by changing the >> table's definition to use a shell pipeline: > /bin/sh seems to be dash, here. Hm. That must be the relevant difference. I just repeated the experiment on a Fedora 28 box with reasonably up-to-date bash: $ /bin/sh --version GNU bash, version 4.4.23(1)-release (x86_64-redhat-linux-gnu) and it behaves the same as my RHEL6 box: no problem with the direct invocation of unzip, problem if use a pipeline. Anyway, we know what to do, so I'll go do it. regards, tom lane
On 2018-12-16 19:10, Tom Lane wrote: > > Anyway, we know what to do, so I'll go do it. > Thank you very much. I've now also tested with the original, much larger file, which gives no problem anymore. I am really glad this works again, we use this stuff a lot. Erik Rijkers
Erik Rijkers <er@xs4all.nl> writes: > Thank you very much. I've now also tested with the original, much larger > file, which gives no problem anymore. I am really glad this works again, > we use this stuff a lot. We appreciate you noticing the problem before 11.2 got out ... regards, tom lane