Thread: Building PosgresSQL with LLVM fails on Solaris 11.4

Building PosgresSQL with LLVM fails on Solaris 11.4

From
Sacha Hottinger
Date:

Hi all,

Compiling PostgreSQL 13.13 with the --with-llvm option fails with Developer Studio 12.6 as well as with gcc 13.2.0.

I have installed the developer/llvm/clang and developer/llvm/clang-build packages (13.0.1).

- It works without the --with-llvm option.
- I have also tried it with 16.1, again without success.

o With Developer Studio (PostgreSQL 13.13):

# ./configure CC='/opt/developerstudio12.6/bin/cc -m64 -xarch=native' --enable-dtrace DTRACEFLAGS='-64' --with-system-tzdata=/usr/share/lib/zoneinfo --with-llvm

# gmake all
...
/opt/developerstudio12.6/bin/cc -m64 -xarch=native -Xa -v -O -I../../../src/include    -c -o pg_shmem.o pg_shmem.c
gmake[3]: *** No rule to make target 'tas.bc', needed by 'objfiles.txt'.  Stop.
gmake[3]: Leaving directory '/opt/cnd/opt24_13.13_gmake_all_llvm/src/backend/port'
gmake[2]: *** [common.mk:39: port-recursive] Error 2
gmake[2]: Leaving directory '/opt/cnd/opt24_13.13_gmake_all_llvm/src/backend'
gmake[1]: *** [Makefile:42: all-backend-recurse] Error 2
gmake[1]: Leaving directory '/opt/cnd/opt24_13.13_gmake_all_llvm/src'
gmake: *** [GNUmakefile:11: all-src-recurse] Error 2

o With gcc (PostgreSQL 13.13):

# ./configure CC='/usr/bin/gcc -m64' --with-system-tzdata=/usr/share/lib/zoneinfo --with-llvm

# time gmake all
...
-Wl,--as-needed -Wl,-R'/usr/local/pgsql/lib'  -lLLVM-13
Undefined                       first referenced
symbol                             in file
TTSOpsHeapTuple                     llvmjit_deform.o
pfree                               llvmjit.o
MemoryContextAllocZero              llvmjit.o
pkglib_path                         llvmjit.o
ExecEvalStepOp                      llvmjit_expr.o
errhidestmt                         llvmjit.o
ld: warning: symbol referencing errors
/usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -O2  -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -I/usr/include  -I../../../../src/include   -flto=thin -emit-llvm -c -o llvmjit_types.bc llvmjit_types.c
gmake[2]: Leaving directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/src/backend/jit/llvm'
gmake[1]: Leaving directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/src'
gmake -C config all
gmake[1]: Entering directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/config'
gmake[1]: Nothing to be done for 'all'.
gmake[1]: Leaving directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/config'

Kind regards

Sasha


This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager.

Re: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Andres Freund
Date:
Hi,

On 2023-12-01 17:02:25 +0000, Sacha Hottinger wrote:
> Compiling PostgreSQL 13.13 with the --with-llvm option fails with Developer Studio 12.6 as well as with gcc 13.2.0.
> I have installed the developer/llvm/clang and developer/llvm/clang-build packages (13.0.1).

Uh, huh. I did not expect that anybody would ever really do that on
solaris. Not that the breakage was intentional, that's a separate issue.

Is this on x86-64 or sparc?


I'm somewhat confused that you report this to happen with gcc as well. We
don't use .s files there. Oh, I guess you see a different error
there:

> o With gcc (PostgreSQL 13.13):
>
> # ./configure CC='/usr/bin/gcc -m64' --with-system-tzdata=/usr/share/lib/zoneinfo --with-llvm
>
> # time gmake all
> ...
> -Wl,--as-needed -Wl,-R'/usr/local/pgsql/lib'  -lLLVM-13
> Undefined                       first referenced
> symbol                             in file
> TTSOpsHeapTuple                     llvmjit_deform.o
> pfree                               llvmjit.o
> …
> MemoryContextAllocZero              llvmjit.o
> pkglib_path                         llvmjit.o
> ExecEvalStepOp                      llvmjit_expr.o
> errhidestmt                         llvmjit.o
> ld: warning: symbol referencing errors

This is odd. I think this is when building llvmjit.so - unfortunately there's
not enough details to figure out what's wrong here.

Oh, one thing that might be going wrong is that you just set the C compiler to
be gcc, but not C++ - what happens if you additionally set CXX to g++?



I did not think about .o files generated from .s when writing the make
infrastructure for JITing.  At first I thought the easiest solution would be
to just add a rule to build .bc from .s - but that doesn't work in the
sunstudio case, because it relies on preprocessor logic that's specific to sun
studio - which clang can't parse. Gah.

Thus the attached hack - I think that should work.  It'd mostly be interesting
to see if this is the only roadblock or if there's more.


To be honest, the only case where .s files matter today is building with sun
studio, and that's a compiler we're planning to remove support for. So I'm not
sure it's worth fixing, if it adds complexity.

Greetings,

Andres Freund

Attachments

AW: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Sacha Hottinger
Date:

Hi Andres

Many thanks for your help and the fix.

> Is this on x86-64 or sparc?

It is SPARC.

> Oh, one thing that might be going wrong is that you just set the C compiler to
> be gcc, but not C++ - what happens if you additionally set CXX to g++?

// That seems to get set correctly:

# grep ^'CXX=' config.log

CXX='g++'
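
A slightly wider version of the same check, as a minimal sketch: it prints every compiler-related variable that configure recorded, so a mismatch between the C and C++ compilers is visible at a glance. The function name show_compilers is made up for illustration, and CLANG and LLVM_CONFIG are assumed to appear in config.log in the same VAR='value' form as CXX above.

```shell
# Print the compiler-related settings recorded by configure.
# Assumption: config.log holds one VAR='value' line per variable,
# as the CXX line above suggests. Defaults to ./config.log.
show_compilers() {
    grep -E "^(CC|CXX|CLANG|LLVM_CONFIG)=" "${1:-config.log}"
}
```

Calling show_compilers with no argument from the top of the build tree reads ./config.log; a different path can be passed as the first argument.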

 

// I used the patch command to patch the src/backend/port/Makefile with your attached file and tried again with the Sun Studio compiler. There is now a different error at this stage:

/opt/developerstudio12.6/bin/cc -m64 -xarch=native -Xa -v -O -I../../../src/include    -c -o pg_shmem.o pg_shmem.c
echo | /usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2  -I../../../src/include   -flto=thin -emit-llvm -c -xc -o tas.bc tas.s
tas.s:1:1: error: expected identifier or '('
!-------------------------------------------------------------------------
^
1 error generated.
gmake[3]: *** [Makefile:42: tas.bc] Error 1
gmake[3]: Leaving directory '/opt/cnd/opt28_13.3_gmake_all_llvm_fix/src/backend/port'
gmake[2]: *** [common.mk:39: port-recursive] Error 2
gmake[2]: Leaving directory '/opt/cnd/opt28_13.3_gmake_all_llvm_fix/src/backend'
gmake[1]: *** [Makefile:42: all-backend-recurse] Error 2
gmake[1]: Leaving directory '/opt/cnd/opt28_13.3_gmake_all_llvm_fix/src'
gmake: *** [GNUmakefile:11: all-src-recurse] Error 2

// Have attached the config.log, gmake all full log, and patched Makefile.

Best regards

Sasha

 

From: Andres Freund <andres@anarazel.de>
Date: Friday, 1 December 2023 at 20:49
To: Sacha Hottinger <itdo@cndag.onmicrosoft.com>
Cc: pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: Re: Building PosgresSQL with LLVM fails on Solaris 11.4


Attachments

Re: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Andres Freund
Date:
Hi,

On 2023-12-01 23:06:59 +0000, Sacha Hottinger wrote:
> // I used the patch command to patch the src/backend/port/Makefile with your attached file and tried again with the Sun Studio compiler. There is now a different error at this stage:
> …
> /opt/developerstudio12.6/bin/cc -m64 -xarch=native -Xa -v -O -I../../../src/include    -c -o pg_shmem.o pg_shmem.c
> echo | /usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2  -I../../../src/include   -flto=thin -emit-llvm -c -xc -o tas.bc tas.s
> tas.s:1:1: error: expected identifier or '('
> !-------------------------------------------------------------------------
> ^
> 1 error generated.

That's me making a silly mistake...  I've attached an updated, but still
blindly written, diff.
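
For context on the error quoted above: the command passed tas.s as an input while -xc told clang to treat every input as C, so the SPARC assembly comment on line 1 was parsed as C source. The sketch below shows one way to produce an empty bitcode file from standard input instead. It only illustrates the idea and is not the attached diff, whose contents are not shown in this thread; the function name make_empty_bc and the CLANG override are hypothetical.

```shell
# Write an empty LLVM bitcode module to the file named by $1.
# clang reads an empty C translation unit from stdin ("-"), so the
# assembly source is never handed to the C front end.
# CLANG is a hypothetical override; it defaults to "clang" on PATH.
make_empty_bc() {
    echo | "${CLANG:-clang}" -flto=thin -emit-llvm -c -xc -o "$1" -
}
```

With clang installed, make_empty_bc tas.bc should leave a small tas.bc in the current directory. Whether an empty module is acceptable to the rest of the JIT build is a separate question that this sketch does not answer.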


> // Have attached the config.log, gmake all full log, and patched Makefile.

Could you attach config.log and the gmake output for the gcc-based build? So far
I have no idea what causes the linker issue there.

Greetings,

Andres Freund

Attachments

AW: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Sacha Hottinger
Date:

Hi Andres

Thanks a lot.

It now got much further but failed here with Sun Studio:

gmake[2]: Leaving directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src/test/perl'
gmake -C backend/jit/llvm all
gmake[2]: Entering directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src/backend/jit/llvm'
/opt/developerstudio12.6/bin/cc -m64 -xarch=native -Xa -v -O  -KPIC -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -I/usr/include  -I../../../../src/include    -c -o llvmjit.o llvmjit.c
"llvmjit.c", line 493: warning: argument #1 is incompatible with prototype:
        prototype: pointer to void : "../../../../src/include/jit/llvmjit_emit.h", line 27
        argument : pointer to function(pointer to struct FunctionCallInfoBaseData {pointer to struct FmgrInfo {..} flinfo, pointer to struct Node {..} context, pointer to struct Node {..} resultinfo, unsigned int fncollation, _Bool isnull, short nargs, array[-1] of struct NullableDatum {..} args}) returning unsigned long
g++ -O -std=c++14 -KPIC -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -I/usr/include  -I../../../../src/include    -c -o llvmjit_error.o llvmjit_error.cpp
g++: error: unrecognized command-line option ‘-KPIC’; did you mean ‘-fPIC’?
gmake[2]: *** [<builtin>: llvmjit_error.o] Error 1
gmake[2]: Leaving directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src/backend/jit/llvm'
gmake[1]: *** [Makefile:42: all-backend/jit/llvm-recurse] Error 2
gmake[1]: Leaving directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src'
gmake: *** [GNUmakefile:11: all-src-recurse] Error 2

With gcc it fails at the same step as before.

I have attached the log files of the Sun Studio and gcc runs to the email.

 

Many thanks for your help.

 

Best regards

Sacha

 

From: Andres Freund <andres@anarazel.de>
Date: Wednesday, 6 December 2023 at 19:01
To: Sacha Hottinger <itdo@cndag.onmicrosoft.com>
Cc: pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: Re: Building PosgresSQL with LLVM fails on Solaris 11.4


Attachments

Re: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Andres Freund
Date:
Hi,

On 2023-12-07 13:43:55 +0000, Sacha Hottinger wrote:
> Thanks a lot.
> It now got much further but failed here with Sun Studio:
> …
> gmake[2]: Leaving directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src/test/perl'
> gmake -C backend/jit/llvm all
> gmake[2]: Entering directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src/backend/jit/llvm'
> /opt/developerstudio12.6/bin/cc -m64 -xarch=native -Xa -v -O  -KPIC -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -I/usr/include  -I../../../../src/include    -c -o llvmjit.o llvmjit.c
> "llvmjit.c", line 493: warning: argument #1 is incompatible with prototype:
>         prototype: pointer to void : "../../../../src/include/jit/llvmjit_emit.h", line 27
>         argument : pointer to function(pointer to struct FunctionCallInfoBaseData {pointer to struct FmgrInfo {..} flinfo, pointer to struct Node {..} context, pointer to struct Node {..} resultinfo, unsigned int fncollation, _Bool isnull, short nargs, array[-1] of struct NullableDatum {..} args}) returning unsigned long
> g++ -O -std=c++14 -KPIC -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -I/usr/include  -I../../../../src/include    -c -o llvmjit_error.o llvmjit_error.cpp
> g++: error: unrecognized command-line option ‘-KPIC’; did you mean ‘-fPIC’?
> gmake[2]: *** [<builtin>: llvmjit_error.o] Error 1
> gmake[2]: Leaving directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src/backend/jit/llvm'
> gmake[1]: *** [Makefile:42: all-backend/jit/llvm-recurse] Error 2
> gmake[1]: Leaving directory '/opt/cnd/opt28-2_13.3_gmake_all_llvm_fixV2/src'
> gmake: *** [GNUmakefile:11: all-src-recurse] Error 2

I don't know where the -KPIC is coming from. And TBH, I don't see much point
trying to fix a scenario involving matching sun studio C with g++.


> With gcc it fails at the same step as before.
> I have attached the log files of the SunStudio and gcc runs to the email.

I don't see a failure with gcc.

The warnings are emitted for every extension and compilation succeeds.
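
The distinction drawn here, linker warnings versus an actual build failure, can be checked mechanically on a saved build log. The following is a rough sketch under stated assumptions: the function name summarize_build_log is invented for illustration, and the patterns are taken only from the log lines quoted in this thread ("ld: warning: ..." for warnings, "gmake...: *** ..." for make-level errors).

```shell
# Summarise a saved gmake log: count linker warnings and make-level errors.
# Assumption: gmake reports a failure with a line starting "gmake: *** " or
# "gmake[N]: *** ", as in the logs quoted in this thread. A line such as
# "ld: warning: symbol referencing errors" is counted as a warning only.
# Returns 0 when no make-level error line was found.
summarize_build_log() {
    warnings=$(grep -c '^ld: warning:' "$1" || true)
    errors=$(grep -cE '^gmake(\[[0-9]+\])?: \*\*\* ' "$1" || true)
    echo "ld warnings: $warnings, gmake errors: $errors"
    [ "$errors" -eq 0 ]
}
```

A zero error count only says that gmake did not stop; it says nothing about whether the resulting binaries behave correctly.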

Greetings,

Andres Freund



AW: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Sacha Hottinger
Date:

Hi Andres

Thanks for your reply.

I was suspicious of the warnings in the gcc build because gmake check reported that 138 out of 202 tests failed. I have attached the output of gmake check.

After you mentioned that gcc did not report any errors, just warnings, we installed the build.

At first it seemed to work, and SELECT pg_jit_available(); showed "pg_jit_available" as "t", but the DB showed strange behaviour: sometimes, though not always, running "show parallel_tuple_cost" caused the postmaster to restart a server process.

We had to go back to the previous installation.

It seems there is definitely something wrong with the result gcc created.

 

Best regards

Sacha

 

From: Andres Freund <andres@anarazel.de>
Date: Thursday, 7 December 2023 at 17:50
To: Sacha Hottinger <itdo@cndag.onmicrosoft.com>
Cc: pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: Re: Building PosgresSQL with LLVM fails on Solaris 11.4


Attachments

Re: Building PosgresSQL with LLVM fails on Solaris 11.4

From
Andres Freund
Date:
Hi,

On 2023-12-13 15:18:02 +0000, Sacha Hottinger wrote:
> Thanks for your reply.
> The reason I was suspicious with the warnings of the gcc build was, because gmake check reported 138 out of 202 tests to have failed. I have attached the output of gmake check.

That'll likely be due to assertion / segmentation failures.

You'd need to enable core dumps and show a backtrace.

I assume that if you run tests without JIT support (e.g. by export
PGOPTIONS='-c jit=0'; gmake check), no such problem occurs?
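
To make that comparison concrete, the output of both runs can be saved and the sets of failed tests compared. This is a minimal sketch, assuming the pg_regress line format "<name> ... FAILED" (optionally prefixed with "test ") and two log files captured by the caller; the function names and log file names are hypothetical.

```shell
# List the names of failed tests found in saved "gmake check" output.
# Assumption: failures appear as "<name> ... FAILED", optionally with a
# leading "test " word, one test per line.
failed_tests() {
    awk '/\.\.\. FAILED/ { print ($1 == "test" ? $2 : $1) }' "$1" | sort -u
}

# Print tests that fail in the first log but not in the second, e.g.
# failures seen with JIT enabled that disappear with jit=0.
jit_only_failures() {
    on=$(mktemp) && off=$(mktemp) || return 1
    failed_tests "$1" > "$on"
    failed_tests "$2" > "$off"
    comm -23 "$on" "$off"
    rm -f "$on" "$off"
}
```

For example, after saving a normal run with gmake check > check_jit_on.log 2>&1 and a second run with PGOPTIONS='-c jit=0' gmake check > check_jit_off.log 2>&1, calling jit_only_failures check_jit_on.log check_jit_off.log lists the tests whose failure depends on JIT. An empty result would suggest the failures are not JIT-specific.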


> After you mentioned that gcc did not report any errors, just warnings, we installed the build.
> First, it seemed to work and SELECT pg_jit_available(); showed "pg_jit_available" as "t" but the DB showed strange behaviour. I.e. not always, but sometimes running "show parallel_tuple_cost" caused postmaster to restart a server process.
> We had to go back to the previous installation.
>
> It seems there is definitely something wrong with the result gcc created.

I suspect that the LLVM version you used does something wrong on sparc. Which
version of LLVM is it?

Greetings,

Andres Freund