Обсуждение: Collations and Replication; Next Steps

Поиск

Список

Период

Сортировка

Collations and Replication; Next Steps

От

Matthew Kelly

Дата:

16 сентября 2014 г., 21:41:17

Hello,

Last month, I brought up the following issue to the general mailing list about how running streaming replication between machines running different versions of glibc can cause corrupt indexes.

http://www.postgresql.org/message-id/BA6132ED-1F6B-4A0B-AC22-81278F5AB81E@tripadvisor.com

In the month following, we have done further investigation here at TripAdvisor and have found that scope of this issue is far more troubling than initially thought. Hackers seems like appropriate place to present this update because it will certainly motivate some discussion about the approach to collation support going forward.

After the initial episode, we thought it was necessary to find the true scope of the problem. We developed a quick smoke test to evaluate the integrity of the indexes on a given machine. We understood that the test was not exhaustive, but it would catch most instances of corrupt indexes given TripAdvisor's normal database usage pattern. The source code with documentation about how it works is available at (https://github.com/mkellycs/postgres_index_integrity_check) for those interested.

What we found with this simple check was simply frightening. In every single streaming replica cluster where one or more machines had been commissioned at a different time, that member was found to be corrupt. When hardware upgrades of the master had been accomplished with a streaming replication, the new master was also found to have similar issues. The following numbers are only as small as they are because our adoption of streaming replication has barely just begun. So far we have found:

8 internal production databases, and 2 live site database servers effected.
Up to 3771 rows out of place in a single index (more correctly: 3771 times a row was smaller then the row before it when sorted in ascending order, the actual number of incorrectly placed rows is probably much higher)
On the worst offender, there were 12806 rows out of place across 26 indexes
On average roughly 15% of indexes containing text keys on tables larger 100MB were found to exhibit this issue
In at least one case, rebuilding a unique index on a master revealed that the database had allowed 100+ primary key violations.

It sounds like we as a community knew that these issues were theoretically possible, but I now have empirical evidence demonstrating the prevalence of this issue on our corpus of international data. Instances of this issue showed up in indexes of member usernames, location/property names, and even Facebook url's. I encourage other sufficiently large operations who index internationalized text to run similar tests; its highly likely they have similar latent issues that they just have not detected yet.

Here is the simple reality. Collation based indexes, streaming replication, and multiple versions of glibc/os cannot coexist in a sufficiently large operation and not cause corrupt indexes. The current options are to collate all of your indexes in C, or to ensure that all of your machines run exactly the same OS version.

The first and immediate TODO is to patch the documentation to add warnings regarding this issue. I can propose a doc patch explaining the issue, if no one has any objections.

The second and far more challenging problem is how do we fix this issue? As of our last discussion, Peter Geoghegan revived the proposal of using ICU as an alternative. (http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=Cyf1xyzjwn-1CVo6kZEaWMKbxTS3jPhtjOig@mail.gmail.com) I do not feel qualified to compare the value of this library to other options, but I am certainly willing to help with the patch process once a direction has been selected.

I will be at Postgres Open in Chicago this week, and I will be more than willing to further discuss the details of what we have found.

Regards,

Matt Kelly

Re: Collations and Replication; Next Steps

От

Peter Eisentraut

Дата:

17 сентября 2014 г., 00:11:49

On 9/16/14 12:06 PM, Matthew Kelly wrote:
> The second and far more challenging problem is how do we fix this issue?
>  As of our last discussion, Peter Geoghegan revived the proposal of
> using ICU as an alternative.
>  (http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=Cyf1xyzjwn-1CVo6kZEaWMKbxTS3jPhtjOig@mail.gmail.com)
>  I do not feel qualified to compare the value of this library to other
> options, but I am certainly willing to help with the patch process once
> a direction has been selected.

It seems to me that this is a more general problem that can affect any
data type that relies on anything external.  For example, you could
probably create a case where indexes are corrupted if you have two
different time zone databases.  Or what if you use PostGIS and one of
the libraries it uses has different rounding behaviors in different
versions?

Even in the absence of such external dependencies, there will still be
problems like this if you don't upgrade all nodes participating in
replication at the same time.

Clearly, this is worth documenting, but I don't think we can completely
prevent the problem.  There has been talk of a built-in index integrity
checking tool.  That would be quite useful.

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

17 сентября 2014 г., 00:57:07

On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Clearly, this is worth documenting, but I don't think we can completely
> prevent the problem.  There has been talk of a built-in index integrity
> checking tool.  That would be quite useful.

We could at least use the GNU facility for versioning collations where
available, LC_IDENTIFICATION [1]. By not versioning collations, we are
going against the express advice of the Unicode consortium (they also
advise to do a strcmp() tie-breaker, something that I think we
independently discovered in 2005, because of a bug report - this is
what I like to call "the Hungarian issue". They know what our
constraints are.). I recognize it's a tricky problem, because of our
historic dependence on OS collations, but I think we should definitely
do something. That said, I'm not volunteering for the task, because I
don't have time. While I'm not sure of what the long term solution
should be, it *is not* okay that we don't version collations. I think
that even the best possible B-Tree check tool is a not a solution.

[1] http://www.postgresql.org/message-id/CAEYLb_UTMgM2V_pP7qnuKZYmTYXoym-zNYVbwoU79=TuP8HE3A@mail.gmail.com
-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

17 сентября 2014 г., 01:41:49

On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> It seems to me that this is a more general problem that can affect any
> data type that relies on anything external.  For example, you could
> probably create a case where indexes are corrupted if you have two
> different time zone databases.  Or what if you use PostGIS and one of
> the libraries it uses has different rounding behaviors in different
> versions?

The timezone case you highlight here seems quite distinct from what
Matthew is talking about, because in point of fact the on-disk
representation is merely *interpreted* with reference to the timezone
database. So, you could have an inconsistency between standbys
concerning what the time was in a particular timezone at a particular
timestamp value as reported by the timestamptz output function, but
both standbys would be correct on their own terms, which isn't too
bad. You still cannot have a situation where on a single standby, a
value isn't returned by an index scan that patently exists in the
table on the same standby (i.e. index corruption); the timezone isn't
actually stored (just an offset from a special Postgres epoch). As for
the PostGIS example, I think that they'd know better than to change
the behavior of an established opclass B-Tree support function 1. If
people that author opclasses don't read the documentation on how to do
so correctly, what chance do regular DBAs have? Should they make sure
that operator classes are authored correctly in each and every
instance? Surely not.

Even if I was wrong about all of this, we should treat text as a
special case, a case worth making every effort for.

-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Martijn van Oosterhout

Дата:

17 сентября 2014 г., 15:39:25

On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
> On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> > Clearly, this is worth documenting, but I don't think we can completely
> > prevent the problem.  There has been talk of a built-in index integrity
> > checking tool.  That would be quite useful.
>
> We could at least use the GNU facility for versioning collations where
> available, LC_IDENTIFICATION [1]. By not versioning collations, we are
> going against the express advice of the Unicode consortium (they also
> advise to do a strcmp() tie-breaker, something that I think we
> independently discovered in 2005, because of a bug report - this is
> what I like to call "the Hungarian issue". They know what our
> constraints are.). I recognize it's a tricky problem, because of our
> historic dependence on OS collations, but I think we should definitely
> do something. That said, I'm not volunteering for the task, because I
> don't have time. While I'm not sure of what the long term solution
> should be, it *is not* okay that we don't version collations. I think
> that even the best possible B-Tree check tool is a not a solution.

Personally I think we should just support ICU as an option. FreeBSD has
been maintaining an out of tree patch for 10 years now so we know it
works.

The FreeBSD patch is not optimal though, these days ICU supports UTF-8
directly so many of the push-ups FreeBSD does are no longer necessary.
It is often faster than glibc and the key sizes for strxfrm are more
compact [1] which is relevent for the recent optimisation patch.

Lets solve this problem for once and for all.

[1] http://site.icu-project.org/charts/collation-icu4c48-glibc

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: Collations and Replication; Next Steps

От

Matthew Kelly

Дата:

17 сентября 2014 г., 16:08:12

Here is where I think the timezone and PostGIS cases are fundamentally different:
I can pretty easily make sure that all my servers run in the same timezone.  That's just good practice.  I'm also going
toinstall the same version of PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from the exact
samesource files, regardless of when I build the machine. 

Timezone is a user level setting; PostGIS is a user level library used by a subset.

glibc is a system level library, and text is a core data type, however.  Changing versions to something that doesn't
matchthe kernel can lead to system level instability, broken linkers, etc.  (I know because I tried).  Here are some
subtleother problems that fall out: 
* Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the
corruption.*A basebackup that is taken in production and placed on a backup server might not be valid on that server,
oryour desktop machine, or on the spare you keep to do PITR when someone screws up.* Unless you keep _all_ of your
clusterson the same OS, machines from your database spare pool probably won't be the right OS when you add them to the
clusterbecause a member failed. 

Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of late 5.x and 6.x, because of our numerous
issueswith the 6.x kernel) 

The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24".  It
doesn'tseem like the versioning is being actively maintained. 

I'm with Martjin here, lets go ICU, if only because it moves sorting to a user level library, instead of a system
level. Martjin do you have a link to the out of tree patch?  If not I'll find it.  I'd like to apply it to a branch and
startplaying with it. 

- Matt K

On Sep 17, 2014, at 7:39 AM, Martijn van Oosterhout <kleptog@svana.org>wrote:

> On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
>> On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>>> Clearly, this is worth documenting, but I don't think we can completely
>>> prevent the problem.  There has been talk of a built-in index integrity
>>> checking tool.  That would be quite useful.
>>
>> We could at least use the GNU facility for versioning collations where
>> available, LC_IDENTIFICATION [1]. By not versioning collations, we are
>> going against the express advice of the Unicode consortium (they also
>> advise to do a strcmp() tie-breaker, something that I think we
>> independently discovered in 2005, because of a bug report - this is
>> what I like to call "the Hungarian issue". They know what our
>> constraints are.). I recognize it's a tricky problem, because of our
>> historic dependence on OS collations, but I think we should definitely
>> do something. That said, I'm not volunteering for the task, because I
>> don't have time. While I'm not sure of what the long term solution
>> should be, it *is not* okay that we don't version collations. I think
>> that even the best possible B-Tree check tool is a not a solution.
>
> Personally I think we should just support ICU as an option. FreeBSD has
> been maintaining an out of tree patch for 10 years now so we know it
> works.
>
> The FreeBSD patch is not optimal though, these days ICU supports UTF-8
> directly so many of the push-ups FreeBSD does are no longer necessary.
> It is often faster than glibc and the key sizes for strxfrm are more
> compact [1] which is relevent for the recent optimisation patch.
>
> Lets solve this problem for once and for all.
>
> [1] http://site.icu-project.org/charts/collation-icu4c48-glibc
>
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
>> He who writes carelessly confesses thereby at the very outset that he does
>> not attach much importance to his own thoughts.
>   -- Arthur Schopenhauer

Re: Collations and Replication; Next Steps

От

Robert Haas

Дата:

17 сентября 2014 г., 16:17:29

On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly <mkelly@tripadvisor.com> wrote:
> Here is where I think the timezone and PostGIS cases are fundamentally different:
> I can pretty easily make sure that all my servers run in the same timezone.  That's just good practice.  I'm also
goingto install the same version of PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from the
exactsame source files, regardless of when I build the machine. 
>
> Timezone is a user level setting; PostGIS is a user level library used by a subset.
>
> glibc is a system level library, and text is a core data type, however.  Changing versions to something that doesn't
matchthe kernel can lead to system level instability, broken linkers, etc.  (I know because I tried).  Here are some
subtleother problems that fall out: 
>
>  * Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the
corruption.
>  * A basebackup that is taken in production and placed on a backup server might not be valid on that server, or your
desktopmachine, or on the spare you keep to do PITR when someone screws up. 
>  * Unless you keep _all_ of your clusters on the same OS, machines from your database spare pool probably won't be
theright OS when you add them to the cluster because a member failed. 
>
> Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of late 5.x and 6.x, because of our numerous
issueswith the 6.x kernel) 
>
> The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24".  It
doesn'tseem like the versioning is being actively maintained. 
>
> I'm with Martjin here, lets go ICU, if only because it moves sorting to a user level library, instead of a system
level. Martjin do you have a link to the out of tree patch?  If not I'll find it.  I'd like to apply it to a branch and
startplaying with it. 

What I find astonishing is that whoever maintains glibc (or the Red
Hat packaging for it) thinks it's OK to change the collation order in
a minor release.  I'd understand changing it between, say, RHEL 6 and
RHEL 7.  But the idea that minor release, supposedly safe updates
think they can whack this around without breaking applications really
kind of blows my mind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Collations and Replication; Next Steps

От

Matthew Kelly

Дата:

17 сентября 2014 г., 17:10:15

Let me double check that assertion before we go too far with it.

Most of the problems I've seen are across 5 and 6 boundaries.  I thought I had case where it was within a minor release
butI can't find it right now.  I'm going to dig. 

That being said the sort order changes whether you statically or dynamically link (demonstrated on 4+ machines running
differentlinux flavors), so at the point I have no reason to trust the stability of the sort across any build.  I
legitimatelyquestion whether strcoll is buggy.  Ex. I have cases where for three strings a, b and c:  a > b, but  (a ||
c)< (b || c).  That's right postfixing doesn't hold.  It actually calls into question the index scan optimization that
occurswhen you do LIKE 'test%' even on a single machine, but I don't want to bite that off at the moment. 

My mentality has switched to 'don't trust any change until shown otherwise', so that may have bled into my last email.

- Matt K.




On Sep 17, 2014, at 8:17 AM, Robert Haas <robertmhaas@gmail.com>wrote:

> On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly <mkelly@tripadvisor.com> wrote:
>> Here is where I think the timezone and PostGIS cases are fundamentally different:
>> I can pretty easily make sure that all my servers run in the same timezone.  That's just good practice.  I'm also
goingto install the same version of PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from the
exactsame source files, regardless of when I build the machine. 
>>
>> Timezone is a user level setting; PostGIS is a user level library used by a subset.
>>
>> glibc is a system level library, and text is a core data type, however.  Changing versions to something that doesn't
matchthe kernel can lead to system level instability, broken linkers, etc.  (I know because I tried).  Here are some
subtleother problems that fall out: 
>>
>> * Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the
corruption.
>> * A basebackup that is taken in production and placed on a backup server might not be valid on that server, or your
desktopmachine, or on the spare you keep to do PITR when someone screws up. 
>> * Unless you keep _all_ of your clusters on the same OS, machines from your database spare pool probably won't be
theright OS when you add them to the cluster because a member failed. 
>>
>> Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of late 5.x and 6.x, because of our numerous
issueswith the 6.x kernel) 
>>
>> The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24".  It
doesn'tseem like the versioning is being actively maintained. 
>>
>> I'm with Martjin here, lets go ICU, if only because it moves sorting to a user level library, instead of a system
level. Martjin do you have a link to the out of tree patch?  If not I'll find it.  I'd like to apply it to a branch and
startplaying with it. 
>
> What I find astonishing is that whoever maintains glibc (or the Red
> Hat packaging for it) thinks it's OK to change the collation order in
> a minor release.  I'd understand changing it between, say, RHEL 6 and
> RHEL 7.  But the idea that minor release, supposedly safe updates
> think they can whack this around without breaking applications really
> kind of blows my mind.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Re: Collations and Replication; Next Steps

От

Greg Stark

Дата:

17 сентября 2014 г., 17:47:33

On Tue, Sep 16, 2014 at 11:41 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The timezone case you highlight here seems quite distinct from what
> Matthew is talking about, because in point of fact the on-disk
> representation is merely *interpreted* with reference to the timezone
> database. So, you could have an inconsistency between standbys
> concerning what the time was in a particular timezone at a particular
> timestamp value as reported by the timestamptz output function, but
> both standbys would be correct on their own terms, which isn't too
> bad.

You could have a problem if you have an expression index on (timestamp
AT TIME ZONE '...'). I may have the expression slightly wrong but I
believe it is posisble to write an immutable expression that depends
on the tzdata data as long as it doesn't depend on not the user's
current time zone (which would be stable but not immutable). The
actual likelihood of that situation might be much lower and the
ability to avoid it higher but in theory I think Peter's right that
it's the same class of problem.

Generally speaking we try to protect against most environment
dependencies that lead to corrupt databases by encoding them in the
control file. Obviously we can't encode an entire collation in the
controlfile though. We could conceivably have a corpus of
representative strings that we sort and then checksum in the
controlfile. It wouldn't be foolproof but if we collect interesting
examples as we find them it might be a worthwhile safety check.

Just brainstorming... I wonder if it would be possible to include any
collation comparisons made in handling an index insert in the xlog
record and have the standby verify those comparisons are valid on the
standby. I guess that would be pretty hard to arrange code-wise since
the comparisons could be coming from anywhere to say nothing of the
wal bloat.

Peter G, could go into more detail about collation versioning? What
would the implications be for Postgres?

-- 
greg

Re: Collations and Replication; Next Steps

От

Tatsuo Ishii

Дата:

17 сентября 2014 г., 17:47:42

Why don't we have our collation data? It seems MySQL has already done this.

http://dev.mysql.com/doc/refman/5.0/en/charset-collation-implementations.html

I don't think we cannot achieve that because even MySQL accomplishes:-)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: Collations and Replication; Next Steps

От

Greg Stark

Дата:

17 сентября 2014 г., 17:58:25

On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> I don't think we cannot achieve that because even MySQL accomplishes:-)

We've always considered it an advantage that we're consistent with the
collations in the rest of the system. Generally speaking the fact that
Postgres integrates with the system rather than be a separate system
unto itself.

Consider bug reports like "I've configured my system to use
fr_FR.UTF-8 and "sort" produces output in this order why is Postgres
producing output in a different order? Or extension authors using
strcoll and being surprised that the module gets inconsistent data
from the database.

Separately we always had a huge problem with ICU that it depended on
storing everything in a UCS-16 native encoding and required converting
to and from UTF-8 using an iterator interface. I heard that improved
somewhat but from what I understand it would be a struggle to avoid
copying every string before using it and consuming twice as much
memory. No more using strings directly out of disk buffers.

Then there's the concern that ICU is a *huge* dependency. ICU is
itself larger than the entire Postgres install. It's a big burden on
users to have to install and configure a second collation library in
addition to the system library and a complete non-starter for embedded
systems or low-memory systems.


-- 
greg

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

17 сентября 2014 г., 20:56:01

On Wed, Sep 17, 2014 at 6:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> What I find astonishing is that whoever maintains glibc (or the Red
> Hat packaging for it) thinks it's OK to change the collation order in
> a minor release.  I'd understand changing it between, say, RHEL 6 and
> RHEL 7.  But the idea that minor release, supposedly safe updates
> think they can whack this around without breaking applications really
> kind of blows my mind.

Why wouldn't they feel entitled to? To quote UTS #10 [1]:

"""
Collation order is not fixed.

Over time, collation order will vary: there may be fixes needed as
more information becomes available about languages; there may be new
government or industry standards for the language that require
changes; and finally, new characters added to the Unicode Standard
will interleave with the previously-defined ones. This means that
collations must be carefully versioned.
"""

Indeed, they do version collations with LC_IDENTIFICATION. We just
don't make any attempt to use the version information. In short, this
is our fault.   :-(

[1] http://www.unicode.org/reports/tr10/#Stability
-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Peter Eisentraut

Дата:

17 сентября 2014 г., 20:59:11

On 9/17/14 10:47 AM, Tatsuo Ishii wrote:
> Why don't we have our collation data? It seems MySQL has already done this.

Where would you get the source data from?  How would you maintain it?

Re: Collations and Replication; Next Steps

От

Peter Eisentraut

Дата:

17 сентября 2014 г., 21:05:21

On 9/16/14 5:57 PM, Peter Geoghegan wrote:
> On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> Clearly, this is worth documenting, but I don't think we can completely
>> prevent the problem.  There has been talk of a built-in index integrity
>> checking tool.  That would be quite useful.
> 
> We could at least use the GNU facility for versioning collations where
> available, LC_IDENTIFICATION [1].

It looks like the revisions or dates reported by LC_IDENTIFICATION
aren't ever updated for most locales.

Re: Collations and Replication; Next Steps

От

Martijn van Oosterhout

Дата:

17 сентября 2014 г., 21:05:25

On Wed, Sep 17, 2014 at 01:07:56PM +0000, Matthew Kelly wrote:
> I'm with Martjin here, lets go ICU, if only because it moves sorting
> to a user level library, instead of a system level.  Martjin do you
> have a link to the out of tree patch?  If not I'll find it.  I'd like
> to apply it to a branch and start playing with it.

http://people.freebsd.org/~girgen/postgresql-icu/README.html
http://people.freebsd.org/~girgen/postgresql-icu/

Note I said optional. It is a large library for sure, but for some
installations I think the benefits are sufficient.

Mvg,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: Collations and Replication; Next Steps

От

Peter Eisentraut

Дата:

17 сентября 2014 г., 21:08:43

On 9/17/14 9:07 AM, Matthew Kelly wrote:
> Here is where I think the timezone and PostGIS cases are fundamentally different:
> I can pretty easily make sure that all my servers run in the same timezone.  That's just good practice.  I'm also
goingto install the same version of PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from the
exactsame source files, regardless of when I build the machine.

I wrote time zone *database*, not time zone.  The time zone database  is
(in some configurations) part of glibc.

I also wrote PostGIS dependent libraries, not PostGIS itself.  If you
are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those
will most likely be different.  (Heck, glibc could be different.  Is
glibc never allowed to fix insufficiencies in its floating-point
implementation, for example?)

Also, there is nothing that guarantees that the PostGIS version will be
the same on both sides.

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

17 сентября 2014 г., 21:13:21

On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> We could at least use the GNU facility for versioning collations where
>> available, LC_IDENTIFICATION [1].
>
> It looks like the revisions or dates reported by LC_IDENTIFICATION
> aren't ever updated for most locales.

That's not surprising. There is zero controversy about how to
correctly sort English text, for example. For other languages, that
might be much less true.

-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Peter Eisentraut

Дата:

17 сентября 2014 г., 21:15:43

On 9/17/14 10:46 AM, Greg Stark wrote:
> You could have a problem if you have an expression index on (timestamp
> AT TIME ZONE '...'). I may have the expression slightly wrong but I
> believe it is posisble to write an immutable expression that depends
> on the tzdata data as long as it doesn't depend on not the user's
> current time zone (which would be stable but not immutable). The
> actual likelihood of that situation might be much lower and the
> ability to avoid it higher but in theory I think Peter's right that
> it's the same class of problem.

I was thinking of something like a text column with "natural" input of
time stamp information, and and index on that_column::timestamp.

> Generally speaking we try to protect against most environment
> dependencies that lead to corrupt databases by encoding them in the
> control file. Obviously we can't encode an entire collation in the
> controlfile though. We could conceivably have a corpus of
> representative strings that we sort and then checksum in the
> controlfile. It wouldn't be foolproof but if we collect interesting
> examples as we find them it might be a worthwhile safety check.

I think it could be useful in a number of situations if a type could
stick some arbitrary additional information into a new column in
pg_type, such as versions of libraries it depends on or storage format
versions.

Then again, collation isn't actually a property of any single type.

Re: Collations and Replication; Next Steps

От

Peter Eisentraut

Дата:

17 сентября 2014 г., 21:20:14

On 9/17/14 2:07 PM, Peter Geoghegan wrote:
> On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>>> We could at least use the GNU facility for versioning collations where
>>> available, LC_IDENTIFICATION [1].
>>
>> It looks like the revisions or dates reported by LC_IDENTIFICATION
>> aren't ever updated for most locales.
> 
> 
> That's not surprising. There is zero controversy about how to
> correctly sort English text, for example. For other languages, that
> might be much less true.
> 

a) There is plenty of controversy about how to sort English text on
Stack Overflow. ;-)

b) Even in an English locale you have to maintain a sort order for all
Unicode characters, and that changes more than zero times.  But it's
quite clear from looking at the glibc git logs that no one is
maintaining these version numbers.

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

17 сентября 2014 г., 21:22:40

On Wed, Sep 17, 2014 at 11:08 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> I also wrote PostGIS dependent libraries, not PostGIS itself.  If you
> are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those
> will most likely be different.  (Heck, glibc could be different.  Is
> glibc never allowed to fix insufficiencies in its floating-point
> implementation, for example?)

The operator class author has a responsibility to make sure that
doesn't happen. If he or she should fail, then it's a bug, and
possibly a failure of imagination on their part. This is the only way
of thinking about it that makes sense. If you want to use a library
feature in your opclass B-Tree support function 1, then you'd better
be damned sure that it implies immutability insofar as that's
possible. Sure, it's also possible that your users could be the victim
on an unfortunate upstream bug that you couldn't reasonably predict,
but when is that not true?

In general, I am totally unconvinced by this line of argument. It
implies that everyone has to be an expert on everything just to use
Postgres.

-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

17 сентября 2014 г., 23:41:30

On Wed, Sep 17, 2014 at 7:46 AM, Greg Stark <stark@mit.edu> wrote:
> You could have a problem if you have an expression index on (timestamp
> AT TIME ZONE '...'). I may have the expression slightly wrong but I
> believe it is posisble to write an immutable expression that depends
> on the tzdata data as long as it doesn't depend on not the user's
> current time zone (which would be stable but not immutable). The
> actual likelihood of that situation might be much lower and the
> ability to avoid it higher but in theory I think Peter's right that
> it's the same class of problem.

Really? If you have an "IMMUTABLE" expression predicated on something
that isn't immutable, then it isn't actually immutable, regardless of
the ability of Postgres to determine that. It's that simple. So, if
that's true, then it sounds like a bug to me. I didn't check, but I
think that AT TIME ZONE just changes the display format, and the
relevant output function is only stable, and so this shouldn't be a
problem.

> Peter G, could go into more detail about collation versioning? What
> would the implications be for Postgres?

Well, I think one implication might be that Postgres won't start, just
because you updated the OS (we have no way to get back results
consistent with the old collation file, I would think). At least your
database isn't corrupt, but that's still a really bad loss of
availability. This makes me lean towards pursuing ICU support as part
of any versioning scheme. There is a reason why everyone else does
something similar. Apparently DB2 uses ICU.

-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Tatsuo Ishii

Дата:

18 сентября 2014 г., 02:41:43

> On 9/17/14 10:47 AM, Tatsuo Ishii wrote:
>> Why don't we have our collation data? It seems MySQL has already done this.
> 
> Where would you get the source data from?  How would you maintain it?

Don't know. However seeing that that MySQL manages it, it should be
possible for us.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: Collations and Replication; Next Steps

От

Tatsuo Ishii

Дата:

18 сентября 2014 г., 02:46:28

> On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> I don't think we cannot achieve that because even MySQL accomplishes:-)
> 
> We've always considered it an advantage that we're consistent with the
> collations in the rest of the system. Generally speaking the fact that
> Postgres integrates with the system rather than be a separate system
> unto itself.
>
> Consider bug reports like "I've configured my system to use
> fr_FR.UTF-8 and "sort" produces output in this order why is Postgres
> producing output in a different order? Or extension authors using
> strcoll and being surprised that the module gets inconsistent data
> from the database.

I doubt it. glibc takes liberty to change the collation data release
by release, but people don't seem to complain it. Then why would
people complain the collation difference between PostgreSQL and glibc
if there's any?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: Collations and Replication; Next Steps

От

Robert Haas

Дата:

18 сентября 2014 г., 03:16:43

On Wed, Sep 17, 2014 at 10:06 AM, Matthew Kelly <mkelly@tripadvisor.com> wrote:
> Let me double check that assertion before we go too far with it.
>
> Most of the problems I've seen are across 5 and 6 boundaries.  I thought I had case where it was within a minor
releasebut I can't find it right now.  I'm going to dig. 
>
> That being said the sort order changes whether you statically or dynamically link (demonstrated on 4+ machines
runningdifferent linux flavors), so at the point I have no reason to trust the stability of the sort across any build.
Ilegitimately question whether strcoll is buggy.  Ex. I have cases where for three strings a, b and c:  a > b, but  (a
||c) < (b || c).  That's right postfixing doesn't hold.  It actually calls into question the index scan optimization
thatoccurs when you do LIKE 'test%' even on a single machine, but I don't want to bite that off at the moment. 
>
> My mentality has switched to 'don't trust any change until shown otherwise', so that may have bled into my last
email.

Of course, there's also the question of whether ICU would have similar
issues.  You're assuming that they *don't* whack the collation order
around in minor releases, or at least that they do so to some lesser
degree than glibc, but is that actually true?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

18 сентября 2014 г., 03:26:15

On Wed, Sep 17, 2014 at 5:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Of course, there's also the question of whether ICU would have similar
> issues.  You're assuming that they *don't* whack the collation order
> around in minor releases, or at least that they do so to some lesser
> degree than glibc, but is that actually true?

No, but they're disciplined about it. They clearly do versioning
properly, which seems to not be the case with glibc, based on Peter's
remarks: http://userguide.icu-project.org/collation/architecture#TOC-Versioning
(they talk about a 32-bit identifier here).

PostgreSQL's problems in this area are exactly the same as every other
database system's (the Unicode consortium anticipated these problems
too, and as I pointed out have commented on these problems.). A bunch
of prominent database systems are listed as using ICU.
-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Oleg Bartunov

Дата:

18 сентября 2014 г., 07:07:23

<div dir="ltr">We use ICU with postgres for many years in our  mchar extension, which provides  case-insensitive  text
datatype for popular  russian financial system.  I don't know if we may ask ICU to give us special BSD-compatible
license?<br /></div>

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

18 сентября 2014 г., 07:09:38

On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
> We use ICU with postgres for many years in our  mchar extension, which
> provides  case-insensitive  text data type for popular  russian financial
> system.  I don't know if we may ask ICU to give us special BSD-compatible
> license ?

I don't think that's necessary. Firebird uses ICU, and has similar
licensing terms to PostgreSQL.

-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Craig Ringer

Дата:

18 сентября 2014 г., 07:11:18

On 09/17/2014 09:17 PM, Robert Haas wrote:
> What I find astonishing is that whoever maintains glibc (or the Red
> Hat packaging for it) thinks it's OK to change the collation order in
> a minor release.  I'd understand changing it between, say, RHEL 6 and
> RHEL 7.  But the idea that minor release, supposedly safe updates
> think they can whack this around without breaking applications really
> kind of blows my mind.


If confirmed, it certainly requires some ... firm ... bug reports.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Collations and Replication; Next Steps

От

Oleg Bartunov

Дата:

18 сентября 2014 г., 07:17:27

On Thu, Sep 18, 2014 at 1:09 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
> We use ICU with postgres for many years in our mchar extension, which
> provides case-insensitive text data type for popular russian financial
> system. I don't know if we may ask ICU to give us special BSD-compatible
> license ?

I don't think that's necessary. Firebird uses ICU, and has similar
licensing terms to PostgreSQL.

Firebird uses MPL license, which is compatible with GPL
http://en.wikipedia.org/wiki/Mozilla_Public_License

--
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Tatsuo Ishii

Дата:

18 сентября 2014 г., 07:35:17

> On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
>> We use ICU with postgres for many years in our  mchar extension, which
>> provides  case-insensitive  text data type for popular  russian financial
>> system.  I don't know if we may ask ICU to give us special BSD-compatible
>> license ?
> 
> I don't think that's necessary. Firebird uses ICU, and has similar
> licensing terms to PostgreSQL.

http://source.icu-project.org/repos/icu/icu/trunk/license.html

........permission notice appear in supporting documentation.Permission ishereby granted, free of charge, to any person
obtaininga copy ofthis software and associated documentation files (the "Software"), todeal in the Software without
restriction,including withoutlimitation the rights to use, copy, modify, merge, publish,distribute, and/or sell copies
ofthe Software, and to permit personsto whom the Software is furnished to do so, provided that the abovecopyright
notice(s)and this permission notice appear in all copiesof the Software and that both the above copyright notice(s) and
thispermissionnotice appear in supporting documentation.

........

In my understanding PostgreSQL's manual MUST include the ICU license
term (this is not a problem).  What I am not so sure is, any software
uses PostgreSQL also MUST include the ICU license or not. If yes, I
think this is surely a problem.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

18 сентября 2014 г., 08:36:40

On Wed, Sep 17, 2014 at 9:35 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> In my understanding PostgreSQL's manual MUST include the ICU license
> term (this is not a problem).  What I am not so sure is, any software
> uses PostgreSQL also MUST include the ICU license or not. If yes, I
> think this is surely a problem.


It uses the MIT license, which is less restrictive than even the BSD
license. I believe one part (the Chinese/Japanese Word Break
Dictionary Data) is BSD Licensed, though:

https://ssl.icu-project.org/repos/icu/icu/trunk/license.html

I don't think licensing would be a problem.
-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Martijn van Oosterhout

Дата:

18 сентября 2014 г., 14:26:13

On Thu, Sep 18, 2014 at 01:35:10PM +0900, Tatsuo Ishii wrote:
> In my understanding PostgreSQL's manual MUST include the ICU license
> term (this is not a problem).  What I am not so sure is, any software
> uses PostgreSQL also MUST include the ICU license or not. If yes, I
> think this is surely a problem.

Only if we're thinking of distributing it. If the user gets ICU from
their distribution then there is no need to list the licence (just like
we don't need to mention the licence of glibc).  We only need link
against it, not distribute it.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: Collations and Replication; Next Steps

От

Martijn van Oosterhout

Дата:

18 сентября 2014 г., 14:47:12

On Wed, Sep 17, 2014 at 03:57:38PM +0100, Greg Stark wrote:
> Then there's the concern that ICU is a *huge* dependency. ICU is
> itself larger than the entire Postgres install. It's a big burden on
> users to have to install and configure a second collation library in
> addition to the system library and a complete non-starter for embedded
> systems or low-memory systems.

$ apt-cache show libicu52|grep Installed-Size
Installed-Size: 27283

That's 27MB or less than 2 WAL files. Or about 4 times the template
database, or which 3 are created during install and the first user
database will be a fourth.

The installed size of Postgres is 11MB not including any of the
libraries it already depends on.

Yes, it's not a small library but lets not get carried away.

And if it's optional then low memory systems can configure it out.

As for configuration, ICU doesn't require configuration just like glibc
doesn't require configuration.

Mvg,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: Collations and Replication; Next Steps

От

Oleg Bartunov

Дата:

18 сентября 2014 г., 16:13:30

On Thu, Sep 18, 2014 at 3:25 PM, Martijn van Oosterhout <kleptog@svana.org> wrote:

On Thu, Sep 18, 2014 at 01:35:10PM +0900, Tatsuo Ishii wrote:
> In my understanding PostgreSQL's manual MUST include the ICU license
> term (this is not a problem). What I am not so sure is, any software
> uses PostgreSQL also MUST include the ICU license or not. If yes, I
> think this is surely a problem.

Only if we're thinking of distributing it. If the user gets ICU from
their distribution then there is no need to list the licence (just like
we don't need to mention the licence of glibc). We only need link
against it, not distribute it.

I understand how it'd works with extension, but not with core.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIVAwUBVBrBQkvt++dL5i1EAQhDehAAiQg7iLKficvXMSfAvwR+8Qp7yTG3wiNu
oZ5ieomZCYhlTpXvHae9dWTJPKehjiCCy/zjVyLpFvfHQVptadjBVefkIFSS/d+V
Y7X3gPI3d+8Tc+ZGVZF/CX/5eG9iPgKbsDRpK0zs5j+C4D1HYjxLFf4jWYI/gTXN
Abfr6taSi4YrSAw/4bSMlSQMFDU/wmLx175R4f8j2CvEYmspgXe89i4QU2V14m6Y
bB+zGhkxxJZAubTq3UcPzDDeX3FH4KqS/4NTSZ1V1ceIWo3r9jRPLHpceAvQlHwP
e4eNnRkFZbTeLlOUcvd7N7qkEc2kDYEGXKyaNqr868N022mFiYx4AHwOU/U/dbmm
Xw22FpSTwkokmwugohr7wrL7tUJV7NtHrcEUyPd/2cuddIRvO2H2iV4wqR6ct0TZ
2RCd1b7bIKq+ywrGkySW1xplMhGmGygfPnUzqzlZ2f1YxcmK6PNMnpxldy5nvl3V
2rNnPOoWS3H+R6aE31sSGH+Gl9w6J4lvpPiTAwx8pGoBPi4fWWqvwHUPp7FNqsAO
8fjo00+MN9Vbg0YsqHiE6oCp2pKs3BJy3IHfZiw2nefh9UcEV29666atUBJjb3bw
hO65Km7uzoacX+WKm0XdmaQhdUwVJKIFFoH3sxnawA6+CUr4M8mUDtQicnd/ajRo
HX0lbgRk9z4=
=31r4
-----END PGP SIGNATURE-----

Re: Collations and Replication; Next Steps

От

Heikki Linnakangas

Дата:

18 сентября 2014 г., 16:51:47

On 09/18/2014 04:12 PM, Oleg Bartunov wrote:
> On Thu, Sep 18, 2014 at 3:25 PM, Martijn van Oosterhout <kleptog@svana.org>
> wrote:
>
>> On Thu, Sep 18, 2014 at 01:35:10PM +0900, Tatsuo Ishii wrote:
>>> In my understanding PostgreSQL's manual MUST include the ICU license
>>> term (this is not a problem).  What I am not so sure is, any software
>>> uses PostgreSQL also MUST include the ICU license or not. If yes, I
>>> think this is surely a problem.
>>
>> Only if we're thinking of distributing it. If the user gets ICU from
>> their distribution then there is no need to list the licence (just like
>> we don't need to mention the licence of glibc).  We only need link
>> against it, not distribute it.
>
> I understand how it'd works with extension, but not with core.

The same it works with libxml, openssl, libreadline and all the other 
libraries you can build with.

- Heikki

Re: Collations and Replication; Next Steps

От

Peter Geoghegan

Дата:

18 сентября 2014 г., 21:59:07

On Thu, Sep 18, 2014 at 6:51 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> The same it works with libxml, openssl, libreadline and all the other
> libraries you can build with.

I like the comparison with libxml. If we were to adopt ICU, it would
be as a core component that makes collation versioning work, that in
practice all packages use. It wouldn't actually be mandatory, but
almost universally available in practice.

I really think that long term, relying on the OS collations is not a
good plan. It's a big contributing factor to our reticence on the
question of when two different systems should be considered compatible
for the purposes of physical replication. Not that I'm volunteering to
work on it!
-- 
Peter Geoghegan

Re: Collations and Replication; Next Steps

От

Bruce Momjian

Дата:

30 сентября 2014 г., 17:57:07

On Wed, Sep 17, 2014 at 01:07:56PM +0000, Matthew Kelly wrote:
>  * Unless you keep _all_ of your clusters on the same OS, machines
>  from your database spare pool probably won't be the right OS when you
>  add them to the cluster because a member failed.

There has been discussion about having master/streaming slaves use the
same OS version, but a simple OS update of an existing master can break
indexes too.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Collations and Replication; Next Steps