Обсуждение: pg_trgm comparison bug on cross-architecture replication due to different char implementation

Поиск
Список
Период
Сортировка

Hi all,

 

I would like to report an issue with the pg_trgm extension on cross-architecture replication scenarios. When an x86_64 standby server is replicating from an aarch64 primary server or vice versa, the gist_trgm_ops opclass returns different results on the primary and standby. Masahiko previously reported a similar issue affecting the pg_bigm extension [1].

 

To reproduce, execute the following on the x86_64 primary server:

 

CREATE EXTENSION pg_trgm;

CREATE TABLE tbl (c text);

CREATE INDEX ON tbl USING gist (c gist_trgm_ops);

INSERT INTO tbl VALUES ('Bóbr');

 

On the x86_64 primary server:

 

postgres=> select * from tbl where c like '%Bób%';

  c

------

Bóbr

(1 row)

 

On the aarch64 replica server:

 

postgres=> select * from tbl where c like '%Bób%';

c

---

(0 rows)

 

The root cause looks the same as the pg_bigm issue that Masahiko reported. To compare trigrams, pg_trgm uses a numerical comparison of chars [2]. On x86_64 a char is signed by default, whereas on aarch64 it is unsigned by default. gist_trgm_ops expects the trigram list to be sorted, but due to the different signedness of chars, the sort order is broken when replicating the values across architectures.

 

The different sort behaviour can be demonstrated using show_trgm.

 

On the x86_64 primary server:

 

postgres=> SELECT show_trgm('Bóbr');

                show_trgm                

------------------------------------------

{0x89194c,"  b","br ",0x707c72,0x7f7849}

(1 row)

 

On the aarch64 replica server:

 

postgres=> SELECT show_trgm('Bóbr');

                show_trgm                

------------------------------------------

{"  b","br ",0x707c72,0x7f7849,0x89194c}

(1 row)

 

One simple solution for this specific case is to declare the char signedness in the CMPPCHAR macro.

 

--- a/contrib/pg_trgm/trgm.h

+++ b/contrib/pg_trgm/trgm.h

@@ -42,7 +42,7 @@

typedef char trgm[3];

 #define CMPCHAR(a,b) ( ((a)==(b)) ? 0 : ( ((a)<(b)) ? -1 : 1 ) )

-#define CMPPCHAR(a,b,i)  CMPCHAR( *(((const char*)(a))+i), *(((const char*)(b))+i) )

+#define CMPPCHAR(a,b,i)  CMPCHAR( *(((unsigned char*)(a))+i), *(((unsigned char*)(b))+i) )

#define CMPTRGM(a,b) ( CMPPCHAR(a,b,0) ? CMPPCHAR(a,b,0) : ( CMPPCHAR(a,b,1) ? CMPPCHAR(a,b,1) : CMPPCHAR(a,b,2) ) )

 #define CPTRGM(a,b) do {                                                           \

 

Alternatively, Postgres can be compiled with -funsigned-char or -fsigned-char. I came across a code comment suggesting that this may not be a good idea in general [3].

 

Given that this has problem has come up before and seems likely to come up again, I'm curious what other broad solutions there might be to resolve it? Looking forward to any feedback, thanks!

 

Best,

 

Adam Guo

Amazon Web Services: https://aws.amazon.com

 

[1] https://osdn.net/projects/pgbigm/lists/archive/hackers/2024-February/000370.html

[2] https://github.com/postgres/postgres/blob/480bc6e3ed3a5719cdec076d4943b119890e8171/contrib/pg_trgm/trgm.h#L45

[3] https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/cash.c#L114-L123

"Guo, Adam" <adamguo@amazon.com> writes:
> I would like to report an issue with the pg_trgm extension on
> cross-architecture replication scenarios. When an x86_64 standby
> server is replicating from an aarch64 primary server or vice versa,
> the gist_trgm_ops opclass returns different results on the primary
> and standby.

I do not think that is a supported scenario.  Hash functions and
suchlike are not guaranteed to produce the same results on different
CPU architectures.  As a quick example, I get

regression=# select hashfloat8(34);
 hashfloat8 
------------
   21570837
(1 row)

on x86_64 but

postgres=# select hashfloat8(34);
 hashfloat8 
------------
 -602898821
(1 row)

on ppc32 thanks to the endianness difference.

> Given that this has problem has come up before and seems likely to
> come up again, I'm curious what other broad solutions there might be
> to resolve it?

Reject as not a bug.  Discourage people from thinking that physical
replication will work across architectures.

            regards, tom lane



On Tue, Apr 23, 2024 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Guo, Adam" <adamguo@amazon.com> writes:
> > I would like to report an issue with the pg_trgm extension on
> > cross-architecture replication scenarios. When an x86_64 standby
> > server is replicating from an aarch64 primary server or vice versa,
> > the gist_trgm_ops opclass returns different results on the primary
> > and standby.
>
> I do not think that is a supported scenario.  Hash functions and
> suchlike are not guaranteed to produce the same results on different
> CPU architectures.  As a quick example, I get
>
> regression=# select hashfloat8(34);
>  hashfloat8
> ------------
>    21570837
> (1 row)
>
> on x86_64 but
>
> postgres=# select hashfloat8(34);
>  hashfloat8
> ------------
>  -602898821
> (1 row)
>
> on ppc32 thanks to the endianness difference.
>
> > Given that this has problem has come up before and seems likely to
> > come up again, I'm curious what other broad solutions there might be
> > to resolve it?
>
> Reject as not a bug.  Discourage people from thinking that physical
> replication will work across architectures.

While cross-arch physical replication is not supported, I think having
architecture dependent differences is not good and It's legitimate to
fix it. FYI the 'char' data type comparison is done as though char is
unsigned. I've attached a small patch to fix it. What do you think?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Вложения
Masahiko Sawada <sawada.mshk@gmail.com> writes:
> On Tue, Apr 23, 2024 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Reject as not a bug.  Discourage people from thinking that physical
>> replication will work across architectures.

> While cross-arch physical replication is not supported, I think having
> architecture dependent differences is not good and It's legitimate to
> fix it. FYI the 'char' data type comparison is done as though char is
> unsigned. I've attached a small patch to fix it. What do you think?

I think this will break existing indexes that are working fine.
Yeah, it would have been better to avoid the difference, but
it's too late now.

            regards, tom lane



On Tue, Apr 30, 2024 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
> > On Tue, Apr 23, 2024 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Reject as not a bug.  Discourage people from thinking that physical
> >> replication will work across architectures.
>
> > While cross-arch physical replication is not supported, I think having
> > architecture dependent differences is not good and It's legitimate to
> > fix it. FYI the 'char' data type comparison is done as though char is
> > unsigned. I've attached a small patch to fix it. What do you think?
>
> I think this will break existing indexes that are working fine.
> Yeah, it would have been better to avoid the difference, but
> it's too late now.

True. So it will be a PG18 item.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> On Tue, Apr 30, 2024 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think this will break existing indexes that are working fine.
>> Yeah, it would have been better to avoid the difference, but
>> it's too late now.

> True. So it will be a PG18 item.

How will it be any better in v18?  It's still an on-disk
compatibility break for affected platforms.

Now, people could recover by reindexing affected indexes,
but I think we need to have a better justification than this
for making them do so.

            regards, tom lane



On Tue, Apr 23, 2024 at 5:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Guo, Adam" <adamguo@amazon.com> writes:
> > I would like to report an issue with the pg_trgm extension on
> > cross-architecture replication scenarios. When an x86_64 standby
> > server is replicating from an aarch64 primary server or vice versa,
> > the gist_trgm_ops opclass returns different results on the primary
> > and standby.
>
> I do not think that is a supported scenario.  Hash functions and
> suchlike are not guaranteed to produce the same results on different
> CPU architectures.  As a quick example, I get
>
> regression=# select hashfloat8(34);
>  hashfloat8
> ------------
>    21570837
> (1 row)
>
> on x86_64 but
>
> postgres=# select hashfloat8(34);
>  hashfloat8
> ------------
>  -602898821
> (1 row)
>
> on ppc32 thanks to the endianness difference.

Given this, should we try to do better with binary compatibility
checks using ControlFileData?  AFAICS they are supposed to check if
the database cluster is binary compatible with the running
architecture.  But it obviously allows incompatibilities.

------
Regards,
Alexander Korotkov



Alexander Korotkov <aekorotkov@gmail.com> writes:
> Given this, should we try to do better with binary compatibility
> checks using ControlFileData?  AFAICS they are supposed to check if
> the database cluster is binary compatible with the running
> architecture.  But it obviously allows incompatibilities.

Perhaps.  pg_control already covers endianness, which I think
is the root of the hashing differences I showed.  Adding a field
for char signedness feels a little weird, since it's not directly
a property of the bits-on-disk, but maybe we should.

            regards, tom lane



On Tue, Apr 30, 2024 at 7:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > Given this, should we try to do better with binary compatibility
> > checks using ControlFileData?  AFAICS they are supposed to check if
> > the database cluster is binary compatible with the running
> > architecture.  But it obviously allows incompatibilities.
>
> Perhaps.  pg_control already covers endianness, which I think
> is the root of the hashing differences I showed.  Adding a field
> for char signedness feels a little weird, since it's not directly
> a property of the bits-on-disk, but maybe we should.

I agree that storing char signedness might seem weird.  But it appears
that we already store indexes that depend on char signedness.  So,
it's effectively property of bits-on-disk even though it affects
indirectly.  Then I see two options to make the picture consistent.
1) Assume that char signedness is somehow a property of bits-on-disk
even though it's weird.  Then pg_trgm indexes are correct, but we need
to store char signedness in pg_control.
2) Assume that char signedness is not a property of bits-on-disk.
Then pg_trgm indexes are buggy and need to be fixed.
What do you think?

------
Regards,
Alexander Korotkov



Alexander Korotkov <aekorotkov@gmail.com> writes:
> I agree that storing char signedness might seem weird.  But it appears
> that we already store indexes that depend on char signedness.  So,
> it's effectively property of bits-on-disk even though it affects
> indirectly.  Then I see two options to make the picture consistent.
> 1) Assume that char signedness is somehow a property of bits-on-disk
> even though it's weird.  Then pg_trgm indexes are correct, but we need
> to store char signedness in pg_control.
> 2) Assume that char signedness is not a property of bits-on-disk.
> Then pg_trgm indexes are buggy and need to be fixed.
> What do you think?

The problem with option (2) is the assumption that pg_trgm's behavior
is the only bug of this kind, either now or in the future.  I think
that's just about an impossible standard to meet, because there's no
realistic way to test whether char signedness is affecting things.
(Sure, you can compare results across platforms, but maybe you
just didn't test the right case.)

Also, the bigger picture here is the seeming assumption that "if
we change pg_trgm then it will be safe to replicate from x86 to
arm".  I don't believe that that's a good idea and I'm unwilling
to promise that it will work, regardless of what we do about
char signedness.  That being the case, I don't want to invest a
lot of effort in the signedness issue.  Option (1) is clearly
a small change with little if any risk of future breakage.
Option (2) ... not so much.

            regards, tom lane



On Wed, May 1, 2024 at 2:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > I agree that storing char signedness might seem weird.  But it appears
> > that we already store indexes that depend on char signedness.  So,
> > it's effectively property of bits-on-disk even though it affects
> > indirectly.  Then I see two options to make the picture consistent.
> > 1) Assume that char signedness is somehow a property of bits-on-disk
> > even though it's weird.  Then pg_trgm indexes are correct, but we need
> > to store char signedness in pg_control.
> > 2) Assume that char signedness is not a property of bits-on-disk.
> > Then pg_trgm indexes are buggy and need to be fixed.
> > What do you think?
>
> The problem with option (2) is the assumption that pg_trgm's behavior
> is the only bug of this kind, either now or in the future.  I think
> that's just about an impossible standard to meet, because there's no
> realistic way to test whether char signedness is affecting things.
> (Sure, you can compare results across platforms, but maybe you
> just didn't test the right case.)
>
> Also, the bigger picture here is the seeming assumption that "if
> we change pg_trgm then it will be safe to replicate from x86 to
> arm".  I don't believe that that's a good idea and I'm unwilling
> to promise that it will work, regardless of what we do about
> char signedness.  That being the case, I don't want to invest a
> lot of effort in the signedness issue.

I think that the char signedness issue is an issue also for developers
(and extension authors) since it could lead to confusion and potential
bugs in the future due to that. x86 developers would think of char as
always being signed and write code that will misbehave on arm
machines. For example, since logical replication should behave
correctly even in cross-arch replication all developers need to be
aware of that. I thought of using the -funsigned-char (or
-fsigned-char) compiler flag to avoid that but it would have a broader
impact not only on indexes.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On 30.04.24 19:29, Tom Lane wrote:
>> 1) Assume that char signedness is somehow a property of bits-on-disk
>> even though it's weird.  Then pg_trgm indexes are correct, but we need
>> to store char signedness in pg_control.
>> 2) Assume that char signedness is not a property of bits-on-disk.
>> Then pg_trgm indexes are buggy and need to be fixed.
>> What do you think?
> Also, the bigger picture here is the seeming assumption that "if
> we change pg_trgm then it will be safe to replicate from x86 to
> arm".  I don't believe that that's a good idea and I'm unwilling
> to promise that it will work, regardless of what we do about
> char signedness.  That being the case, I don't want to invest a
> lot of effort in the signedness issue.  Option (1) is clearly
> a small change with little if any risk of future breakage.

But note that option 1 would prevent some replication that is currently 
working.




Peter Eisentraut <peter@eisentraut.org> writes:
> On 30.04.24 19:29, Tom Lane wrote:
>> Also, the bigger picture here is the seeming assumption that "if
>> we change pg_trgm then it will be safe to replicate from x86 to
>> arm".  I don't believe that that's a good idea and I'm unwilling
>> to promise that it will work, regardless of what we do about
>> char signedness.  That being the case, I don't want to invest a
>> lot of effort in the signedness issue.  Option (1) is clearly
>> a small change with little if any risk of future breakage.

> But note that option 1 would prevent some replication that is currently 
> working.

The point of this thread though is that it's working only for small
values of "work".  People are rightfully unhappy if it seems to work
and then later they get bitten by compatibility problems.

Treating char signedness as a machine property in pg_control would
signal that we don't intend to make it work, and would ensure that
even the most minimal testing would find out that it doesn't work.

If we do not do that, it seems to me we have to buy into making
it work.  That would mean dealing with the consequences of an
incompatible change in pg_trgm indexes, and then going through
the same dance again the next time(s) similar problems are found.

            regards, tom lane



On 03.05.24 16:13, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> On 30.04.24 19:29, Tom Lane wrote:
>>> Also, the bigger picture here is the seeming assumption that "if
>>> we change pg_trgm then it will be safe to replicate from x86 to
>>> arm".  I don't believe that that's a good idea and I'm unwilling
>>> to promise that it will work, regardless of what we do about
>>> char signedness.  That being the case, I don't want to invest a
>>> lot of effort in the signedness issue.  Option (1) is clearly
>>> a small change with little if any risk of future breakage.
> 
>> But note that option 1 would prevent some replication that is currently
>> working.
> 
> The point of this thread though is that it's working only for small
> values of "work".  People are rightfully unhappy if it seems to work
> and then later they get bitten by compatibility problems.
> 
> Treating char signedness as a machine property in pg_control would
> signal that we don't intend to make it work, and would ensure that
> even the most minimal testing would find out that it doesn't work.
> 
> If we do not do that, it seems to me we have to buy into making
> it work.  That would mean dealing with the consequences of an
> incompatible change in pg_trgm indexes, and then going through
> the same dance again the next time(s) similar problems are found.

Yes, that is understood.  But anecdotally, replicating between x86-64 
arm64 is occasionally used for upgrades or migrations.  In practice, 
this appears to have mostly worked.  If we now discover that it won't 
work with certain index extension modules, it's usable for most users. 
Even if we say, you have to reindex everything afterwards, it's probably 
still useful for these scenarios.

The way I understand the original report, the issue has to do 
specifically with how signed and unsigned chars compare differently.  I 
don't imagine this is used anywhere in the table/heap code.  So it's 
plausible that this issue is really contained to indexes.

On the other hand, if we put in a check against this, then at least we 
can answer any user questions about this with more certainty: No, won't 
work, here is why.




On 5/3/24 11:44, Peter Eisentraut wrote:
> On 03.05.24 16:13, Tom Lane wrote:
>> Peter Eisentraut <peter@eisentraut.org> writes:
>>> On 30.04.24 19:29, Tom Lane wrote:
>>>> Also, the bigger picture here is the seeming assumption that "if
>>>> we change pg_trgm then it will be safe to replicate from x86 to
>>>> arm".  I don't believe that that's a good idea and I'm unwilling
>>>> to promise that it will work, regardless of what we do about
>>>> char signedness.  That being the case, I don't want to invest a
>>>> lot of effort in the signedness issue.  Option (1) is clearly
>>>> a small change with little if any risk of future breakage.
>>
>>> But note that option 1 would prevent some replication that is currently
>>> working.
>>
>> The point of this thread though is that it's working only for small
>> values of "work".  People are rightfully unhappy if it seems to work
>> and then later they get bitten by compatibility problems.
>>
>> Treating char signedness as a machine property in pg_control would
>> signal that we don't intend to make it work, and would ensure that
>> even the most minimal testing would find out that it doesn't work.
>>
>> If we do not do that, it seems to me we have to buy into making
>> it work.  That would mean dealing with the consequences of an
>> incompatible change in pg_trgm indexes, and then going through
>> the same dance again the next time(s) similar problems are found.
> 
> Yes, that is understood.  But anecdotally, replicating between x86-64 arm64 is 
> occasionally used for upgrades or migrations.  In practice, this appears to have 
> mostly worked.  If we now discover that it won't work with certain index 
> extension modules, it's usable for most users. Even if we say, you have to 
> reindex everything afterwards, it's probably still useful for these scenarios.

+1

I have heard similar anecdotes, and the reported experience goes even further -- 
many such upgrade/migration uses, with exceedingly rare reported failures.

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




On Sat, May 4, 2024 at 7:36 AM Joe Conway <mail@joeconway.com> wrote:
>
> On 5/3/24 11:44, Peter Eisentraut wrote:
> > On 03.05.24 16:13, Tom Lane wrote:
> >> Peter Eisentraut <peter@eisentraut.org> writes:
> >>> On 30.04.24 19:29, Tom Lane wrote:
> >>>> Also, the bigger picture here is the seeming assumption that "if
> >>>> we change pg_trgm then it will be safe to replicate from x86 to
> >>>> arm".  I don't believe that that's a good idea and I'm unwilling
> >>>> to promise that it will work, regardless of what we do about
> >>>> char signedness.  That being the case, I don't want to invest a
> >>>> lot of effort in the signedness issue.  Option (1) is clearly
> >>>> a small change with little if any risk of future breakage.
> >>
> >>> But note that option 1 would prevent some replication that is currently
> >>> working.
> >>
> >> The point of this thread though is that it's working only for small
> >> values of "work".  People are rightfully unhappy if it seems to work
> >> and then later they get bitten by compatibility problems.
> >>
> >> Treating char signedness as a machine property in pg_control would
> >> signal that we don't intend to make it work, and would ensure that
> >> even the most minimal testing would find out that it doesn't work.
> >>
> >> If we do not do that, it seems to me we have to buy into making
> >> it work.  That would mean dealing with the consequences of an
> >> incompatible change in pg_trgm indexes, and then going through
> >> the same dance again the next time(s) similar problems are found.
> >
> > Yes, that is understood.  But anecdotally, replicating between x86-64 arm64 is
> > occasionally used for upgrades or migrations.  In practice, this appears to have
> > mostly worked.  If we now discover that it won't work with certain index
> > extension modules, it's usable for most users. Even if we say, you have to
> > reindex everything afterwards, it's probably still useful for these scenarios.
>
> +1

+1

How about extending amcheck to support GIN and GIst indexes so that it
can detect potential data incompatibility due to changing 'char' to
'unsigned char'? I think these new tests would be useful also for
users to check if they really need to reindex indexes due to such
changes. Also we fix pg_trgm so that it uses 'unsigned char' in PG18.
Users who upgraded to PG18 can run the new amcheck tests on the
primary as well as the standby.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Wed, May 15, 2024 at 02:56:54PM +0900, Masahiko Sawada wrote:
> On Sat, May 4, 2024 at 7:36 AM Joe Conway <mail@joeconway.com> wrote:
> > On 5/3/24 11:44, Peter Eisentraut wrote:
> > > On 03.05.24 16:13, Tom Lane wrote:
> > >> Peter Eisentraut <peter@eisentraut.org> writes:
> > >>> On 30.04.24 19:29, Tom Lane wrote:
> > >>>> Also, the bigger picture here is the seeming assumption that "if
> > >>>> we change pg_trgm then it will be safe to replicate from x86 to
> > >>>> arm".  I don't believe that that's a good idea and I'm unwilling
> > >>>> to promise that it will work, regardless of what we do about
> > >>>> char signedness.  That being the case, I don't want to invest a
> > >>>> lot of effort in the signedness issue.  Option (1) is clearly
> > >>>> a small change with little if any risk of future breakage.
> > >>
> > >>> But note that option 1 would prevent some replication that is currently
> > >>> working.
> > >>
> > >> The point of this thread though is that it's working only for small
> > >> values of "work".  People are rightfully unhappy if it seems to work
> > >> and then later they get bitten by compatibility problems.
> > >>
> > >> Treating char signedness as a machine property in pg_control would
> > >> signal that we don't intend to make it work, and would ensure that
> > >> even the most minimal testing would find out that it doesn't work.
> > >>
> > >> If we do not do that, it seems to me we have to buy into making
> > >> it work.  That would mean dealing with the consequences of an
> > >> incompatible change in pg_trgm indexes, and then going through
> > >> the same dance again the next time(s) similar problems are found.
> > >
> > > Yes, that is understood.  But anecdotally, replicating between x86-64 arm64 is
> > > occasionally used for upgrades or migrations.  In practice, this appears to have
> > > mostly worked.  If we now discover that it won't work with certain index
> > > extension modules, it's usable for most users. Even if we say, you have to
> > > reindex everything afterwards, it's probably still useful for these scenarios.

Like you, I would not introduce a ControlFileData block for sign-of-char,
given the signs of breakage in extension indexing only.  That would lose much
useful migration capability.  I'm sympathetic to Tom's point, which I'll
attempt to summarize as: a ControlFileData block is a promise we know how to
keep, whereas we should expect further bug discoveries without it.  At the
same time, I put more weight on the apparently-wide span of functionality that
works fine.  (I wonder whether any static analyzer does or practically could
detect char sign dependence with a decent false positive rate.)

> +1
> 
> How about extending amcheck to support GIN and GIst indexes so that it
> can detect potential data incompatibility due to changing 'char' to
> 'unsigned char'? I think these new tests would be useful also for
> users to check if they really need to reindex indexes due to such
> changes. Also we fix pg_trgm so that it uses 'unsigned char' in PG18.
> Users who upgraded to PG18 can run the new amcheck tests on the
> primary as well as the standby.

If I were standardizing pg_trgm on one or the other notion of "char", I would
choose signed char, since I think it's still the majority.  More broadly, I
see these options to fix pg_trgm:

1. Change to signed char.  Every arm64 system needs to scan pg_trgm indexes.
2. Change to unsigned char.  Every x86 system needs to scan pg_trgm indexes.
3. Offer both, as an upgrade path.  For example, pg_trgm could have separate
   operator classes gin_trgm_ops and gin_trgm_ops_unsigned.  Running
   pg_upgrade on an unsigned-char system would automatically map v17
   gin_trgm_ops to v18 gin_trgm_ops_unsigned.  This avoids penalizing any
   architecture with upgrade-time scans.

Independently, having amcheck for GIN and/or GiST would be great.