Обсуждение: Collations and Replication; Next Steps
Hello,
Last month, I brought up the following issue to the general mailing list about how running streaming replication between machines running different versions of glibc can cause corrupt indexes.
In the month following, we have done further investigation here at TripAdvisor and have found that scope of this issue is far more troubling than initially thought. Hackers seems like appropriate place to present this update because it will certainly motivate some discussion about the approach to collation support going forward.
After the initial episode, we thought it was necessary to find the true scope of the problem. We developed a quick smoke test to evaluate the integrity of the indexes on a given machine. We understood that the test was not exhaustive, but it would catch most instances of corrupt indexes given TripAdvisor's normal database usage pattern. The source code with documentation about how it works is available at (https://github.com/mkellycs/postgres_index_integrity_check) for those interested.
What we found with this simple check was simply frightening. In every single streaming replica cluster where one or more machines had been commissioned at a different time, that member was found to be corrupt. When hardware upgrades of the master had been accomplished with a streaming replication, the new master was also found to have similar issues. The following numbers are only as small as they are because our adoption of streaming replication has barely just begun. So far we have found:
- 8 internal production databases, and 2 live site database servers effected.
- Up to 3771 rows out of place in a single index (more correctly: 3771 times a row was smaller then the row before it when sorted in ascending order, the actual number of incorrectly placed rows is probably much higher)
- On the worst offender, there were 12806 rows out of place across 26 indexes
- On average roughly 15% of indexes containing text keys on tables larger 100MB were found to exhibit this issue
- In at least one case, rebuilding a unique index on a master revealed that the database had allowed 100+ primary key violations.
It sounds like we as a community knew that these issues were theoretically possible, but I now have empirical evidence demonstrating the prevalence of this issue on our corpus of international data. Instances of this issue showed up in indexes of member usernames, location/property names, and even Facebook url's. I encourage other sufficiently large operations who index internationalized text to run similar tests; its highly likely they have similar latent issues that they just have not detected yet.
Here is the simple reality. Collation based indexes, streaming replication, and multiple versions of glibc/os cannot coexist in a sufficiently large operation and not cause corrupt indexes. The current options are to collate all of your indexes in C, or to ensure that all of your machines run exactly the same OS version.
The first and immediate TODO is to patch the documentation to add warnings regarding this issue. I can propose a doc patch explaining the issue, if no one has any objections.
The second and far more challenging problem is how do we fix this issue? As of our last discussion, Peter Geoghegan revived the proposal of using ICU as an alternative. (http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=Cyf1xyzjwn-1CVo6kZEaWMKbxTS3jPhtjOig@mail.gmail.com) I do not feel qualified to compare the value of this library to other options, but I am certainly willing to help with the patch process once a direction has been selected.
I will be at Postgres Open in Chicago this week, and I will be more than willing to further discuss the details of what we have found.
Regards,
Matt Kelly
On 9/16/14 12:06 PM, Matthew Kelly wrote: > The second and far more challenging problem is how do we fix this issue? > As of our last discussion, Peter Geoghegan revived the proposal of > using ICU as an alternative. > (http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=Cyf1xyzjwn-1CVo6kZEaWMKbxTS3jPhtjOig@mail.gmail.com) > I do not feel qualified to compare the value of this library to other > options, but I am certainly willing to help with the patch process once > a direction has been selected. It seems to me that this is a more general problem that can affect any data type that relies on anything external. For example, you could probably create a case where indexes are corrupted if you have two different time zone databases. Or what if you use PostGIS and one of the libraries it uses has different rounding behaviors in different versions? Even in the absence of such external dependencies, there will still be problems like this if you don't upgrade all nodes participating in replication at the same time. Clearly, this is worth documenting, but I don't think we can completely prevent the problem. There has been talk of a built-in index integrity checking tool. That would be quite useful.
On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > Clearly, this is worth documenting, but I don't think we can completely > prevent the problem. There has been talk of a built-in index integrity > checking tool. That would be quite useful. We could at least use the GNU facility for versioning collations where available, LC_IDENTIFICATION [1]. By not versioning collations, we are going against the express advice of the Unicode consortium (they also advise to do a strcmp() tie-breaker, something that I think we independently discovered in 2005, because of a bug report - this is what I like to call "the Hungarian issue". They know what our constraints are.). I recognize it's a tricky problem, because of our historic dependence on OS collations, but I think we should definitely do something. That said, I'm not volunteering for the task, because I don't have time. While I'm not sure of what the long term solution should be, it *is not* okay that we don't version collations. I think that even the best possible B-Tree check tool is a not a solution. [1] http://www.postgresql.org/message-id/CAEYLb_UTMgM2V_pP7qnuKZYmTYXoym-zNYVbwoU79=TuP8HE3A@mail.gmail.com -- Peter Geoghegan
On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > It seems to me that this is a more general problem that can affect any > data type that relies on anything external. For example, you could > probably create a case where indexes are corrupted if you have two > different time zone databases. Or what if you use PostGIS and one of > the libraries it uses has different rounding behaviors in different > versions? The timezone case you highlight here seems quite distinct from what Matthew is talking about, because in point of fact the on-disk representation is merely *interpreted* with reference to the timezone database. So, you could have an inconsistency between standbys concerning what the time was in a particular timezone at a particular timestamp value as reported by the timestamptz output function, but both standbys would be correct on their own terms, which isn't too bad. You still cannot have a situation where on a single standby, a value isn't returned by an index scan that patently exists in the table on the same standby (i.e. index corruption); the timezone isn't actually stored (just an offset from a special Postgres epoch). As for the PostGIS example, I think that they'd know better than to change the behavior of an established opclass B-Tree support function 1. If people that author opclasses don't read the documentation on how to do so correctly, what chance do regular DBAs have? Should they make sure that operator classes are authored correctly in each and every instance? Surely not. Even if I was wrong about all of this, we should treat text as a special case, a case worth making every effort for. -- Peter Geoghegan
On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote: > On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > > Clearly, this is worth documenting, but I don't think we can completely > > prevent the problem. There has been talk of a built-in index integrity > > checking tool. That would be quite useful. > > We could at least use the GNU facility for versioning collations where > available, LC_IDENTIFICATION [1]. By not versioning collations, we are > going against the express advice of the Unicode consortium (they also > advise to do a strcmp() tie-breaker, something that I think we > independently discovered in 2005, because of a bug report - this is > what I like to call "the Hungarian issue". They know what our > constraints are.). I recognize it's a tricky problem, because of our > historic dependence on OS collations, but I think we should definitely > do something. That said, I'm not volunteering for the task, because I > don't have time. While I'm not sure of what the long term solution > should be, it *is not* okay that we don't version collations. I think > that even the best possible B-Tree check tool is a not a solution. Personally I think we should just support ICU as an option. FreeBSD has been maintaining an out of tree patch for 10 years now so we know it works. The FreeBSD patch is not optimal though, these days ICU supports UTF-8 directly so many of the push-ups FreeBSD does are no longer necessary. It is often faster than glibc and the key sizes for strxfrm are more compact [1] which is relevent for the recent optimisation patch. Lets solve this problem for once and for all. [1] http://site.icu-project.org/charts/collation-icu4c48-glibc -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
Here is where I think the timezone and PostGIS cases are fundamentally different: I can pretty easily make sure that all my servers run in the same timezone. That's just good practice. I'm also going toinstall the same version of PostGIS everywhere in a cluster. I'll build PostGIS and its dependencies from the exact samesource files, regardless of when I build the machine. Timezone is a user level setting; PostGIS is a user level library used by a subset. glibc is a system level library, and text is a core data type, however. Changing versions to something that doesn't matchthe kernel can lead to system level instability, broken linkers, etc. (I know because I tried). Here are some subtleother problems that fall out: * Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the corruption.*A basebackup that is taken in production and placed on a backup server might not be valid on that server, oryour desktop machine, or on the spare you keep to do PITR when someone screws up.* Unless you keep _all_ of your clusterson the same OS, machines from your database spare pool probably won't be the right OS when you add them to the clusterbecause a member failed. Keep in mind here, by OS I mean CentOS versions. (we're running a mix of late 5.x and 6.x, because of our numerous issueswith the 6.x kernel) The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24". It doesn'tseem like the versioning is being actively maintained. I'm with Martjin here, lets go ICU, if only because it moves sorting to a user level library, instead of a system level. Martjin do you have a link to the out of tree patch? If not I'll find it. I'd like to apply it to a branch and startplaying with it. - Matt K On Sep 17, 2014, at 7:39 AM, Martijn van Oosterhout <kleptog@svana.org>wrote: > On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote: >> On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> Clearly, this is worth documenting, but I don't think we can completely >>> prevent the problem. There has been talk of a built-in index integrity >>> checking tool. That would be quite useful. >> >> We could at least use the GNU facility for versioning collations where >> available, LC_IDENTIFICATION [1]. By not versioning collations, we are >> going against the express advice of the Unicode consortium (they also >> advise to do a strcmp() tie-breaker, something that I think we >> independently discovered in 2005, because of a bug report - this is >> what I like to call "the Hungarian issue". They know what our >> constraints are.). I recognize it's a tricky problem, because of our >> historic dependence on OS collations, but I think we should definitely >> do something. That said, I'm not volunteering for the task, because I >> don't have time. While I'm not sure of what the long term solution >> should be, it *is not* okay that we don't version collations. I think >> that even the best possible B-Tree check tool is a not a solution. > > Personally I think we should just support ICU as an option. FreeBSD has > been maintaining an out of tree patch for 10 years now so we know it > works. > > The FreeBSD patch is not optimal though, these days ICU supports UTF-8 > directly so many of the push-ups FreeBSD does are no longer necessary. > It is often faster than glibc and the key sizes for strxfrm are more > compact [1] which is relevent for the recent optimisation patch. > > Lets solve this problem for once and for all. > > [1] http://site.icu-project.org/charts/collation-icu4c48-glibc > > -- > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ >> He who writes carelessly confesses thereby at the very outset that he does >> not attach much importance to his own thoughts. > -- Arthur Schopenhauer
On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly <mkelly@tripadvisor.com> wrote: > Here is where I think the timezone and PostGIS cases are fundamentally different: > I can pretty easily make sure that all my servers run in the same timezone. That's just good practice. I'm also goingto install the same version of PostGIS everywhere in a cluster. I'll build PostGIS and its dependencies from the exactsame source files, regardless of when I build the machine. > > Timezone is a user level setting; PostGIS is a user level library used by a subset. > > glibc is a system level library, and text is a core data type, however. Changing versions to something that doesn't matchthe kernel can lead to system level instability, broken linkers, etc. (I know because I tried). Here are some subtleother problems that fall out: > > * Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the corruption. > * A basebackup that is taken in production and placed on a backup server might not be valid on that server, or your desktopmachine, or on the spare you keep to do PITR when someone screws up. > * Unless you keep _all_ of your clusters on the same OS, machines from your database spare pool probably won't be theright OS when you add them to the cluster because a member failed. > > Keep in mind here, by OS I mean CentOS versions. (we're running a mix of late 5.x and 6.x, because of our numerous issueswith the 6.x kernel) > > The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24". It doesn'tseem like the versioning is being actively maintained. > > I'm with Martjin here, lets go ICU, if only because it moves sorting to a user level library, instead of a system level. Martjin do you have a link to the out of tree patch? If not I'll find it. I'd like to apply it to a branch and startplaying with it. What I find astonishing is that whoever maintains glibc (or the Red Hat packaging for it) thinks it's OK to change the collation order in a minor release. I'd understand changing it between, say, RHEL 6 and RHEL 7. But the idea that minor release, supposedly safe updates think they can whack this around without breaking applications really kind of blows my mind. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Let me double check that assertion before we go too far with it. Most of the problems I've seen are across 5 and 6 boundaries. I thought I had case where it was within a minor release butI can't find it right now. I'm going to dig. That being said the sort order changes whether you statically or dynamically link (demonstrated on 4+ machines running differentlinux flavors), so at the point I have no reason to trust the stability of the sort across any build. I legitimatelyquestion whether strcoll is buggy. Ex. I have cases where for three strings a, b and c: a > b, but (a || c)< (b || c). That's right postfixing doesn't hold. It actually calls into question the index scan optimization that occurswhen you do LIKE 'test%' even on a single machine, but I don't want to bite that off at the moment. My mentality has switched to 'don't trust any change until shown otherwise', so that may have bled into my last email. - Matt K. On Sep 17, 2014, at 8:17 AM, Robert Haas <robertmhaas@gmail.com>wrote: > On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly <mkelly@tripadvisor.com> wrote: >> Here is where I think the timezone and PostGIS cases are fundamentally different: >> I can pretty easily make sure that all my servers run in the same timezone. That's just good practice. I'm also goingto install the same version of PostGIS everywhere in a cluster. I'll build PostGIS and its dependencies from the exactsame source files, regardless of when I build the machine. >> >> Timezone is a user level setting; PostGIS is a user level library used by a subset. >> >> glibc is a system level library, and text is a core data type, however. Changing versions to something that doesn't matchthe kernel can lead to system level instability, broken linkers, etc. (I know because I tried). Here are some subtleother problems that fall out: >> >> * Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the corruption. >> * A basebackup that is taken in production and placed on a backup server might not be valid on that server, or your desktopmachine, or on the spare you keep to do PITR when someone screws up. >> * Unless you keep _all_ of your clusters on the same OS, machines from your database spare pool probably won't be theright OS when you add them to the cluster because a member failed. >> >> Keep in mind here, by OS I mean CentOS versions. (we're running a mix of late 5.x and 6.x, because of our numerous issueswith the 6.x kernel) >> >> The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24". It doesn'tseem like the versioning is being actively maintained. >> >> I'm with Martjin here, lets go ICU, if only because it moves sorting to a user level library, instead of a system level. Martjin do you have a link to the out of tree patch? If not I'll find it. I'd like to apply it to a branch and startplaying with it. > > What I find astonishing is that whoever maintains glibc (or the Red > Hat packaging for it) thinks it's OK to change the collation order in > a minor release. I'd understand changing it between, say, RHEL 6 and > RHEL 7. But the idea that minor release, supposedly safe updates > think they can whack this around without breaking applications really > kind of blows my mind. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company
On Tue, Sep 16, 2014 at 11:41 PM, Peter Geoghegan <pg@heroku.com> wrote: > The timezone case you highlight here seems quite distinct from what > Matthew is talking about, because in point of fact the on-disk > representation is merely *interpreted* with reference to the timezone > database. So, you could have an inconsistency between standbys > concerning what the time was in a particular timezone at a particular > timestamp value as reported by the timestamptz output function, but > both standbys would be correct on their own terms, which isn't too > bad. You could have a problem if you have an expression index on (timestamp AT TIME ZONE '...'). I may have the expression slightly wrong but I believe it is posisble to write an immutable expression that depends on the tzdata data as long as it doesn't depend on not the user's current time zone (which would be stable but not immutable). The actual likelihood of that situation might be much lower and the ability to avoid it higher but in theory I think Peter's right that it's the same class of problem. Generally speaking we try to protect against most environment dependencies that lead to corrupt databases by encoding them in the control file. Obviously we can't encode an entire collation in the controlfile though. We could conceivably have a corpus of representative strings that we sort and then checksum in the controlfile. It wouldn't be foolproof but if we collect interesting examples as we find them it might be a worthwhile safety check. Just brainstorming... I wonder if it would be possible to include any collation comparisons made in handling an index insert in the xlog record and have the standby verify those comparisons are valid on the standby. I guess that would be pretty hard to arrange code-wise since the comparisons could be coming from anywhere to say nothing of the wal bloat. Peter G, could go into more detail about collation versioning? What would the implications be for Postgres? -- greg
Why don't we have our collation data? It seems MySQL has already done this. http://dev.mysql.com/doc/refman/5.0/en/charset-collation-implementations.html I don't think we cannot achieve that because even MySQL accomplishes:-) Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii <ishii@postgresql.org> wrote: > I don't think we cannot achieve that because even MySQL accomplishes:-) We've always considered it an advantage that we're consistent with the collations in the rest of the system. Generally speaking the fact that Postgres integrates with the system rather than be a separate system unto itself. Consider bug reports like "I've configured my system to use fr_FR.UTF-8 and "sort" produces output in this order why is Postgres producing output in a different order? Or extension authors using strcoll and being surprised that the module gets inconsistent data from the database. Separately we always had a huge problem with ICU that it depended on storing everything in a UCS-16 native encoding and required converting to and from UTF-8 using an iterator interface. I heard that improved somewhat but from what I understand it would be a struggle to avoid copying every string before using it and consuming twice as much memory. No more using strings directly out of disk buffers. Then there's the concern that ICU is a *huge* dependency. ICU is itself larger than the entire Postgres install. It's a big burden on users to have to install and configure a second collation library in addition to the system library and a complete non-starter for embedded systems or low-memory systems. -- greg
On Wed, Sep 17, 2014 at 6:17 AM, Robert Haas <robertmhaas@gmail.com> wrote: > What I find astonishing is that whoever maintains glibc (or the Red > Hat packaging for it) thinks it's OK to change the collation order in > a minor release. I'd understand changing it between, say, RHEL 6 and > RHEL 7. But the idea that minor release, supposedly safe updates > think they can whack this around without breaking applications really > kind of blows my mind. Why wouldn't they feel entitled to? To quote UTS #10 [1]: """ Collation order is not fixed. Over time, collation order will vary: there may be fixes needed as more information becomes available about languages; there may be new government or industry standards for the language that require changes; and finally, new characters added to the Unicode Standard will interleave with the previously-defined ones. This means that collations must be carefully versioned. """ Indeed, they do version collations with LC_IDENTIFICATION. We just don't make any attempt to use the version information. In short, this is our fault. :-( [1] http://www.unicode.org/reports/tr10/#Stability -- Peter Geoghegan
On 9/17/14 10:47 AM, Tatsuo Ishii wrote: > Why don't we have our collation data? It seems MySQL has already done this. Where would you get the source data from? How would you maintain it?
On 9/16/14 5:57 PM, Peter Geoghegan wrote: > On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> Clearly, this is worth documenting, but I don't think we can completely >> prevent the problem. There has been talk of a built-in index integrity >> checking tool. That would be quite useful. > > We could at least use the GNU facility for versioning collations where > available, LC_IDENTIFICATION [1]. It looks like the revisions or dates reported by LC_IDENTIFICATION aren't ever updated for most locales.
On Wed, Sep 17, 2014 at 01:07:56PM +0000, Matthew Kelly wrote: > I'm with Martjin here, lets go ICU, if only because it moves sorting > to a user level library, instead of a system level. Martjin do you > have a link to the out of tree patch? If not I'll find it. I'd like > to apply it to a branch and start playing with it. http://people.freebsd.org/~girgen/postgresql-icu/README.html http://people.freebsd.org/~girgen/postgresql-icu/ Note I said optional. It is a large library for sure, but for some installations I think the benefits are sufficient. Mvg, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
On 9/17/14 9:07 AM, Matthew Kelly wrote: > Here is where I think the timezone and PostGIS cases are fundamentally different: > I can pretty easily make sure that all my servers run in the same timezone. That's just good practice. I'm also goingto install the same version of PostGIS everywhere in a cluster. I'll build PostGIS and its dependencies from the exactsame source files, regardless of when I build the machine. I wrote time zone *database*, not time zone. The time zone database is (in some configurations) part of glibc. I also wrote PostGIS dependent libraries, not PostGIS itself. If you are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those will most likely be different. (Heck, glibc could be different. Is glibc never allowed to fix insufficiencies in its floating-point implementation, for example?) Also, there is nothing that guarantees that the PostGIS version will be the same on both sides.
On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> We could at least use the GNU facility for versioning collations where >> available, LC_IDENTIFICATION [1]. > > It looks like the revisions or dates reported by LC_IDENTIFICATION > aren't ever updated for most locales. That's not surprising. There is zero controversy about how to correctly sort English text, for example. For other languages, that might be much less true. -- Peter Geoghegan
On 9/17/14 10:46 AM, Greg Stark wrote: > You could have a problem if you have an expression index on (timestamp > AT TIME ZONE '...'). I may have the expression slightly wrong but I > believe it is posisble to write an immutable expression that depends > on the tzdata data as long as it doesn't depend on not the user's > current time zone (which would be stable but not immutable). The > actual likelihood of that situation might be much lower and the > ability to avoid it higher but in theory I think Peter's right that > it's the same class of problem. I was thinking of something like a text column with "natural" input of time stamp information, and and index on that_column::timestamp. > Generally speaking we try to protect against most environment > dependencies that lead to corrupt databases by encoding them in the > control file. Obviously we can't encode an entire collation in the > controlfile though. We could conceivably have a corpus of > representative strings that we sort and then checksum in the > controlfile. It wouldn't be foolproof but if we collect interesting > examples as we find them it might be a worthwhile safety check. I think it could be useful in a number of situations if a type could stick some arbitrary additional information into a new column in pg_type, such as versions of libraries it depends on or storage format versions. Then again, collation isn't actually a property of any single type.
On 9/17/14 2:07 PM, Peter Geoghegan wrote: > On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> We could at least use the GNU facility for versioning collations where >>> available, LC_IDENTIFICATION [1]. >> >> It looks like the revisions or dates reported by LC_IDENTIFICATION >> aren't ever updated for most locales. > > > That's not surprising. There is zero controversy about how to > correctly sort English text, for example. For other languages, that > might be much less true. > a) There is plenty of controversy about how to sort English text on Stack Overflow. ;-) b) Even in an English locale you have to maintain a sort order for all Unicode characters, and that changes more than zero times. But it's quite clear from looking at the glibc git logs that no one is maintaining these version numbers.
On Wed, Sep 17, 2014 at 11:08 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > I also wrote PostGIS dependent libraries, not PostGIS itself. If you > are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those > will most likely be different. (Heck, glibc could be different. Is > glibc never allowed to fix insufficiencies in its floating-point > implementation, for example?) The operator class author has a responsibility to make sure that doesn't happen. If he or she should fail, then it's a bug, and possibly a failure of imagination on their part. This is the only way of thinking about it that makes sense. If you want to use a library feature in your opclass B-Tree support function 1, then you'd better be damned sure that it implies immutability insofar as that's possible. Sure, it's also possible that your users could be the victim on an unfortunate upstream bug that you couldn't reasonably predict, but when is that not true? In general, I am totally unconvinced by this line of argument. It implies that everyone has to be an expert on everything just to use Postgres. -- Peter Geoghegan
On Wed, Sep 17, 2014 at 7:46 AM, Greg Stark <stark@mit.edu> wrote: > You could have a problem if you have an expression index on (timestamp > AT TIME ZONE '...'). I may have the expression slightly wrong but I > believe it is posisble to write an immutable expression that depends > on the tzdata data as long as it doesn't depend on not the user's > current time zone (which would be stable but not immutable). The > actual likelihood of that situation might be much lower and the > ability to avoid it higher but in theory I think Peter's right that > it's the same class of problem. Really? If you have an "IMMUTABLE" expression predicated on something that isn't immutable, then it isn't actually immutable, regardless of the ability of Postgres to determine that. It's that simple. So, if that's true, then it sounds like a bug to me. I didn't check, but I think that AT TIME ZONE just changes the display format, and the relevant output function is only stable, and so this shouldn't be a problem. > Peter G, could go into more detail about collation versioning? What > would the implications be for Postgres? Well, I think one implication might be that Postgres won't start, just because you updated the OS (we have no way to get back results consistent with the old collation file, I would think). At least your database isn't corrupt, but that's still a really bad loss of availability. This makes me lean towards pursuing ICU support as part of any versioning scheme. There is a reason why everyone else does something similar. Apparently DB2 uses ICU. -- Peter Geoghegan
> On 9/17/14 10:47 AM, Tatsuo Ishii wrote: >> Why don't we have our collation data? It seems MySQL has already done this. > > Where would you get the source data from? How would you maintain it? Don't know. However seeing that that MySQL manages it, it should be possible for us. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
> On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii <ishii@postgresql.org> wrote: >> I don't think we cannot achieve that because even MySQL accomplishes:-) > > We've always considered it an advantage that we're consistent with the > collations in the rest of the system. Generally speaking the fact that > Postgres integrates with the system rather than be a separate system > unto itself. > > Consider bug reports like "I've configured my system to use > fr_FR.UTF-8 and "sort" produces output in this order why is Postgres > producing output in a different order? Or extension authors using > strcoll and being surprised that the module gets inconsistent data > from the database. I doubt it. glibc takes liberty to change the collation data release by release, but people don't seem to complain it. Then why would people complain the collation difference between PostgreSQL and glibc if there's any? Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
On Wed, Sep 17, 2014 at 10:06 AM, Matthew Kelly <mkelly@tripadvisor.com> wrote: > Let me double check that assertion before we go too far with it. > > Most of the problems I've seen are across 5 and 6 boundaries. I thought I had case where it was within a minor releasebut I can't find it right now. I'm going to dig. > > That being said the sort order changes whether you statically or dynamically link (demonstrated on 4+ machines runningdifferent linux flavors), so at the point I have no reason to trust the stability of the sort across any build. Ilegitimately question whether strcoll is buggy. Ex. I have cases where for three strings a, b and c: a > b, but (a ||c) < (b || c). That's right postfixing doesn't hold. It actually calls into question the index scan optimization thatoccurs when you do LIKE 'test%' even on a single machine, but I don't want to bite that off at the moment. > > My mentality has switched to 'don't trust any change until shown otherwise', so that may have bled into my last email. Of course, there's also the question of whether ICU would have similar issues. You're assuming that they *don't* whack the collation order around in minor releases, or at least that they do so to some lesser degree than glibc, but is that actually true? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Sep 17, 2014 at 5:16 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Of course, there's also the question of whether ICU would have similar > issues. You're assuming that they *don't* whack the collation order > around in minor releases, or at least that they do so to some lesser > degree than glibc, but is that actually true? No, but they're disciplined about it. They clearly do versioning properly, which seems to not be the case with glibc, based on Peter's remarks: http://userguide.icu-project.org/collation/architecture#TOC-Versioning (they talk about a 32-bit identifier here). PostgreSQL's problems in this area are exactly the same as every other database system's (the Unicode consortium anticipated these problems too, and as I pointed out have commented on these problems.). A bunch of prominent database systems are listed as using ICU. -- Peter Geoghegan
<div dir="ltr">We use ICU with postgres for many years in our mchar extension, which provides case-insensitive text datatype for popular russian financial system. I don't know if we may ask ICU to give us special BSD-compatible license?<br /></div>
On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov <obartunov@gmail.com> wrote: > We use ICU with postgres for many years in our mchar extension, which > provides case-insensitive text data type for popular russian financial > system. I don't know if we may ask ICU to give us special BSD-compatible > license ? I don't think that's necessary. Firebird uses ICU, and has similar licensing terms to PostgreSQL. -- Peter Geoghegan
On 09/17/2014 09:17 PM, Robert Haas wrote: > What I find astonishing is that whoever maintains glibc (or the Red > Hat packaging for it) thinks it's OK to change the collation order in > a minor release. I'd understand changing it between, say, RHEL 6 and > RHEL 7. But the idea that minor release, supposedly safe updates > think they can whack this around without breaking applications really > kind of blows my mind. If confirmed, it certainly requires some ... firm ... bug reports. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 18, 2014 at 1:09 PM, Peter Geoghegan <pg@heroku.com> wrote:
I don't think that's necessary. Firebird uses ICU, and has similarOn Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
> We use ICU with postgres for many years in our mchar extension, which
> provides case-insensitive text data type for popular russian financial
> system. I don't know if we may ask ICU to give us special BSD-compatible
> license ?
licensing terms to PostgreSQL.
Firebird uses MPL license, which is compatible with GPL
http://en.wikipedia.org/wiki/Mozilla_Public_License
http://en.wikipedia.org/wiki/Mozilla_Public_License
--
Peter Geoghegan
> On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov <obartunov@gmail.com> wrote: >> We use ICU with postgres for many years in our mchar extension, which >> provides case-insensitive text data type for popular russian financial >> system. I don't know if we may ask ICU to give us special BSD-compatible >> license ? > > I don't think that's necessary. Firebird uses ICU, and has similar > licensing terms to PostgreSQL. http://source.icu-project.org/repos/icu/icu/trunk/license.html ........permission notice appear in supporting documentation.Permission ishereby granted, free of charge, to any person obtaininga copy ofthis software and associated documentation files (the "Software"), todeal in the Software without restriction,including withoutlimitation the rights to use, copy, modify, merge, publish,distribute, and/or sell copies ofthe Software, and to permit personsto whom the Software is furnished to do so, provided that the abovecopyright notice(s)and this permission notice appear in all copiesof the Software and that both the above copyright notice(s) and thispermissionnotice appear in supporting documentation. ........ In my understanding PostgreSQL's manual MUST include the ICU license term (this is not a problem). What I am not so sure is, any software uses PostgreSQL also MUST include the ICU license or not. If yes, I think this is surely a problem. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
On Wed, Sep 17, 2014 at 9:35 PM, Tatsuo Ishii <ishii@postgresql.org> wrote: > In my understanding PostgreSQL's manual MUST include the ICU license > term (this is not a problem). What I am not so sure is, any software > uses PostgreSQL also MUST include the ICU license or not. If yes, I > think this is surely a problem. It uses the MIT license, which is less restrictive than even the BSD license. I believe one part (the Chinese/Japanese Word Break Dictionary Data) is BSD Licensed, though: https://ssl.icu-project.org/repos/icu/icu/trunk/license.html I don't think licensing would be a problem. -- Peter Geoghegan
On Thu, Sep 18, 2014 at 01:35:10PM +0900, Tatsuo Ishii wrote: > In my understanding PostgreSQL's manual MUST include the ICU license > term (this is not a problem). What I am not so sure is, any software > uses PostgreSQL also MUST include the ICU license or not. If yes, I > think this is surely a problem. Only if we're thinking of distributing it. If the user gets ICU from their distribution then there is no need to list the licence (just like we don't need to mention the licence of glibc). We only need link against it, not distribute it. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
On Wed, Sep 17, 2014 at 03:57:38PM +0100, Greg Stark wrote: > Then there's the concern that ICU is a *huge* dependency. ICU is > itself larger than the entire Postgres install. It's a big burden on > users to have to install and configure a second collation library in > addition to the system library and a complete non-starter for embedded > systems or low-memory systems. $ apt-cache show libicu52|grep Installed-Size Installed-Size: 27283 That's 27MB or less than 2 WAL files. Or about 4 times the template database, or which 3 are created during install and the first user database will be a fourth. The installed size of Postgres is 11MB not including any of the libraries it already depends on. Yes, it's not a small library but lets not get carried away. And if it's optional then low memory systems can configure it out. As for configuration, ICU doesn't require configuration just like glibc doesn't require configuration. Mvg, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
On Thu, Sep 18, 2014 at 3:25 PM, Martijn van Oosterhout <kleptog@svana.org> wrote:
On Thu, Sep 18, 2014 at 01:35:10PM +0900, Tatsuo Ishii wrote:
> In my understanding PostgreSQL's manual MUST include the ICU license
> term (this is not a problem). What I am not so sure is, any software
> uses PostgreSQL also MUST include the ICU license or not. If yes, I
> think this is surely a problem.
Only if we're thinking of distributing it. If the user gets ICU from
their distribution then there is no need to list the licence (just like
we don't need to mention the licence of glibc). We only need link
against it, not distribute it.
I understand how it'd works with extension, but not with core.
Have a nice day,--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIVAwUBVBrBQkvt++dL5i1EAQhDehAAiQg7iLKficvXMSfAvwR+8Qp7yTG3wiNu
oZ5ieomZCYhlTpXvHae9dWTJPKehjiCCy/zjVyLpFvfHQVptadjBVefkIFSS/d+V
Y7X3gPI3d+8Tc+ZGVZF/CX/5eG9iPgKbsDRpK0zs5j+C4D1HYjxLFf4jWYI/gTXN
Abfr6taSi4YrSAw/4bSMlSQMFDU/wmLx175R4f8j2CvEYmspgXe89i4QU2V14m6Y
bB+zGhkxxJZAubTq3UcPzDDeX3FH4KqS/4NTSZ1V1ceIWo3r9jRPLHpceAvQlHwP
e4eNnRkFZbTeLlOUcvd7N7qkEc2kDYEGXKyaNqr868N022mFiYx4AHwOU/U/dbmm
Xw22FpSTwkokmwugohr7wrL7tUJV7NtHrcEUyPd/2cuddIRvO2H2iV4wqR6ct0TZ
2RCd1b7bIKq+ywrGkySW1xplMhGmGygfPnUzqzlZ2f1YxcmK6PNMnpxldy5nvl3V
2rNnPOoWS3H+R6aE31sSGH+Gl9w6J4lvpPiTAwx8pGoBPi4fWWqvwHUPp7FNqsAO
8fjo00+MN9Vbg0YsqHiE6oCp2pKs3BJy3IHfZiw2nefh9UcEV29666atUBJjb3bw
hO65Km7uzoacX+WKm0XdmaQhdUwVJKIFFoH3sxnawA6+CUr4M8mUDtQicnd/ajRo
HX0lbgRk9z4=
=31r4
-----END PGP SIGNATURE-----
On 09/18/2014 04:12 PM, Oleg Bartunov wrote: > On Thu, Sep 18, 2014 at 3:25 PM, Martijn van Oosterhout <kleptog@svana.org> > wrote: > >> On Thu, Sep 18, 2014 at 01:35:10PM +0900, Tatsuo Ishii wrote: >>> In my understanding PostgreSQL's manual MUST include the ICU license >>> term (this is not a problem). What I am not so sure is, any software >>> uses PostgreSQL also MUST include the ICU license or not. If yes, I >>> think this is surely a problem. >> >> Only if we're thinking of distributing it. If the user gets ICU from >> their distribution then there is no need to list the licence (just like >> we don't need to mention the licence of glibc). We only need link >> against it, not distribute it. > > I understand how it'd works with extension, but not with core. The same it works with libxml, openssl, libreadline and all the other libraries you can build with. - Heikki
On Thu, Sep 18, 2014 at 6:51 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > The same it works with libxml, openssl, libreadline and all the other > libraries you can build with. I like the comparison with libxml. If we were to adopt ICU, it would be as a core component that makes collation versioning work, that in practice all packages use. It wouldn't actually be mandatory, but almost universally available in practice. I really think that long term, relying on the OS collations is not a good plan. It's a big contributing factor to our reticence on the question of when two different systems should be considered compatible for the purposes of physical replication. Not that I'm volunteering to work on it! -- Peter Geoghegan
On Wed, Sep 17, 2014 at 01:07:56PM +0000, Matthew Kelly wrote: > * Unless you keep _all_ of your clusters on the same OS, machines > from your database spare pool probably won't be the right OS when you > add them to the cluster because a member failed. There has been discussion about having master/streaming slaves use the same OS version, but a simple OS update of an existing master can break indexes too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +