Обсуждение: tzcode update
We included the public domain timezone library by Arthur David Olson back in 2004 into our source tree, but we haven't kept it up to date with the upstream changes since. We've made a number of small changes to our version of the library, including: - formatting changes, mostly thanks to pgindent - widened time_t to 8 bytes - we only include the parts of the library that we need which means that we can't just take the new library version and drop it in the src/timezone directory. We can debate whether those changes were a good idea or not, given that they make the merging harder, but my take is that it's not that bad given how seldom and little the upstream code changes. I was not able to find anything like release notes that would list the differences between tzcode2003e, which I believe is the version that we included back then, and the latest version tzcode2007k. So I just took a diff between those, and deduced from there what has changed. The main difference between those version seems to be support for 64-bit time_t, including - extended data file format, with 64-bit time values - extrapolation of DST transitions in a 400 year cycle, for values not directly covered by the transition table. In addition to that, they've added "public domain" notice to the top of ialloc.c, and some other cosmetic changes. We already had such a notice in our version of the file, but the original did not. I merged the upstream changes, mostly manually, trying to be faithful to the original diff to make future merges easier. But I also made some whitespace/comment changes, like we've done before to our version of the code: no double-stars in comments, braces on lines of their own. Attached is the resulting patch against Postgres CVS HEAD. In addition to manually merging the patch, I had to teach pg_next_dst_boundary how to extrapolate the DST transitions. The code there is copy-pasted from localsub, so I copy-pasted that change from there as well. Also, they now use a binary search instead of linear search in localsub, and I copied that change to pg_next_dst_bounday as well (that is why I started looking at this in the first place, http://archives.postgresql.org/pgsql-patches/2007-09/msg00291.php) I don't really know how to test this. It passes the regression tests, after the fixes to pg_dst_next_boundary, but I was expecting there to be some failures now that we support timezones for timestamps outside the 32-bit time_t range. Apparently our tests don't cover that? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Вложения
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > I was not able to find anything like release notes that would list the > differences between tzcode2003e, which I believe is the version that we > included back then, and the latest version tzcode2007k. So I just took a > diff between those, and deduced from there what has changed. Oh good, this has been on my to-do list for awhile ... but I'm happy to let you do it ;-) > I don't really know how to test this. It passes the regression tests, > after the fixes to pg_dst_next_boundary, but I was expecting there to be > some failures now that we support timezones for timestamps outside the > 32-bit time_t range. Apparently our tests don't cover that? Unless the 64-bit extension changed the semantics a lot more than I think, any given compiled tzdata file covers only a finite range of years. The extension makes it *possible* for the data to extend outside the time_t range, but you won't actually see a difference in behavior unless (a) you do extend the range (what's zic's default now?) and (b) you test a date falling within the extended range. So I'm not too surprised that there are no cases in the regression tests that notice. We should probably add some reaching out to 2100 or so. regards, tom lane
Tom Lane wrote: > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: >> I was not able to find anything like release notes that would list the >> differences between tzcode2003e, which I believe is the version that we >> included back then, and the latest version tzcode2007k. So I just took a >> diff between those, and deduced from there what has changed. > > Oh good, this has been on my to-do list for awhile ... but I'm happy > to let you do it ;-) > >> I don't really know how to test this. It passes the regression tests, >> after the fixes to pg_dst_next_boundary, but I was expecting there to be >> some failures now that we support timezones for timestamps outside the >> 32-bit time_t range. Apparently our tests don't cover that? > > Unless the 64-bit extension changed the semantics a lot more than I > think, any given compiled tzdata file covers only a finite range of > years. The extension makes it *possible* for the data to extend outside > the time_t range, but you won't actually see a difference in behavior > unless (a) you do extend the range (what's zic's default now?) and > (b) you test a date falling within the extended range. So I'm not > too surprised that there are no cases in the regression tests that > notice. We should probably add some reaching out to 2100 or so. zic does extend the range by default now, as witnessed by: postgres=# set timezone = 'Europe/London'; SET postgres=# SELECT '2090-07-01 15:00:00'::timestamptz; timestamptz ------------------------ 2090-07-01 15:00:00+01 (1 row) Their new format is best described by this mail by Arhur David Olson (the author of the library): http://article.gmane.org/gmane.comp.time.tz/474/match=64+bit > The name of the game would be to produce binary files with: > 1. data with 32-bit time values through 2037 (for use by old > systems) > 2. data with 64-bit time values for "historic" years > 3. a newline enclosed, POSIX-conforming > TZ-environment-variable-style string > telling what to do in years beyond those covered by 2 above. > > There are a few places (Cairo, Godthab, Chile) that can't be represented by > POSIX-conforming strings; for these we'd punt to the "write 400 years worth > of data and work modulo 400" approach (leaving off the trailing TZ string). I'll add some tests to cover timestamps > 2038. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > I'll add some tests to cover timestamps > 2038. Attached is a new patch, with a couple of new regression tests. No other changes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Вложения
I just noticed that I had accidentally reverted this change in the patch: > /* > * Note: the point of adding 4800 is to ensure we make the same > * assumptions as Postgres' Julian-date routines about the placement of > * leap years in centuries BC, at least back to 4713BC which is as far as > * we'll go. This is effectively extending Gregorian timekeeping into > * pre-Gregorian centuries, which is a tad bogus but it conforms to the > * SQL spec... > */ > #define LEAPS_THRU_END_OF(y) (((y) + 4800) / 4 - ((y) + 4800) / 100 + ((y) + 4800) / 400) vs original in tzcode2003e: > #define LEAPS_THRU_END_OF(y) ((y) / 4 - (y) / 100 + (y) / 400) Looking closer, I don't understand how that change was supposed to do anything. If I'm doing my math right, our +4800 version of the code can be written as: "y/4 - y/100 - y/400 + 1164". The only place this macro is used is this: > days -= ((int64) (newy - y)) * DAYSPERNYEAR + > LEAPS_THRU_END_OF(newy - 1) - > LEAPS_THRU_END_OF(y - 1); AFAICS, the constant " + 1164" is always cancelled out, making the exercise of adding 4800 a waste of time. Am I missing something? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > Looking closer, I don't understand how that change was supposed to do > anything. The point of that patch is to avoid an off-by-one result for years BC. The direction of rounding in integer division with a negative numerator is undefined in C (or at least used to be --- did C99 tighten this up?). regards, tom lane
Tom Lane wrote: > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: >> Looking closer, I don't understand how that change was supposed to do >> anything. > > The point of that patch is to avoid an off-by-one result for years BC. > The direction of rounding in integer division with a negative numerator > is undefined in C (or at least used to be --- did C99 tighten this up?). Oh, I see. In that case we're good. The corresponding new code in tzcode actually looks like this: > ! static int > ! leaps_thru_end_of(const int y) > ! { > ! return (y >= 0) ? (y / 4 - y / 100 + y / 400) : > ! -(leaps_thru_end_of(-(y + 1)) + 1); > ! } -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > Heikki Linnakangas wrote: >> I'll add some tests to cover timestamps > 2038. > Attached is a new patch, with a couple of new regression tests. No other > changes. Applied with minor revisions --- I found a couple of portability problems while testing here. int64_fast_t is already defined in some system headers and causes a conflict, so I just made it be int64 instead. Also there were undefined references to INT32_MIN and INT32_MAX, which could possibly have been fixed with more #include's, but I thought replacing them with a cast-based overflow check was at least as good. regards, tom lane
On 13/02 10.51, Heikki Linnakangas wrote: > Heikki Linnakangas wrote: > >I'll add some tests to cover timestamps > 2038. > > Attached is a new patch, with a couple of new regression tests. No other > changes. Ouch! This fails on our Solaris builds, because we build with the Solaris timezone files. And these apparently don't work beyond 2038 and don't know that Finland plans to have DST also in the summer of 2050... I haven't studied this in depth but I'm afraid the answer is "fix your own timezone files"? -- Bjorn Munch Database Technology Group, Sun Microsystems Trondheim, Norway
On Mon, Feb 18, 2008 at 04:32:58PM +0100, Bjorn Munch wrote: > On 13/02 10.51, Heikki Linnakangas wrote: > > Heikki Linnakangas wrote: > > >I'll add some tests to cover timestamps > 2038. > > > > Attached is a new patch, with a couple of new regression tests. No other > > changes. > > Ouch! This fails on our Solaris builds, because we build with the > Solaris timezone files. And these apparently don't work beyond 2038 > and don't know that Finland plans to have DST also in the summer of > 2050... > > I haven't studied this in depth but I'm afraid the answer is "fix your > own timezone files"? Or use the ones shipping with pg ;-) This sounds exactly like the problem scenario brought forward when the patch to use OS timezonedata was added. But I'd assume there are plans for Solaris to include this data as well in the future, no? You'll have to eventually.... //Magnus
Magnus Hagander <magnus@hagander.net> writes: > On Mon, Feb 18, 2008 at 04:32:58PM +0100, Bjorn Munch wrote: >> Ouch! This fails on our Solaris builds, because we build with the >> Solaris timezone files. And these apparently don't work beyond 2038 >> and don't know that Finland plans to have DST also in the summer of >> 2050... >> >> I haven't studied this in depth but I'm afraid the answer is "fix your >> own timezone files"? > Or use the ones shipping with pg ;-) Yeah. I included the far-future cases in the committed patch specifically to make it obvious if you were using tz files that lacked post-2038 support. I can't imagine that fixing this isn't pretty darn high on the Solaris to-do list, anyway. Financial apps doing, say, 30-year mortgage projections are broken *today* on platforms without post-Y2038 calendar support. regards, tom lane
On 18/02 14.22, Tom Lane wrote: > I can't imagine that fixing this isn't pretty darn high on the Solaris > to-do list, anyway. Financial apps doing, say, 30-year mortgage > projections are broken *today* on platforms without post-Y2038 calendar > support. Well, it IS mentioned in the BUGS section of at least one man page (localtime), so it's clearly a known problem. I don't know about 30-year mortgage, I don't think such financial apps care whether the day 30 days into the future has DST or not. :-) And apart from that, the date was correct. - Bjorn Munch
Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Mon, Feb 18, 2008 at 04:32:58PM +0100, Bjorn Munch wrote: >>> Ouch! This fails on our Solaris builds, because we build with the >>> Solaris timezone files. And these apparently don't work beyond 2038 >>> and don't know that Finland plans to have DST also in the summer of >>> 2050... >>> >>> I haven't studied this in depth but I'm afraid the answer is "fix your >>> own timezone files"? > >> Or use the ones shipping with pg ;-) > > Yeah. I included the far-future cases in the committed patch > specifically to make it obvious if you were using tz files that lacked > post-2038 support. Agreed, we want to give people a notice. What we could do is to add comments to the expected output file, that would show up in regression.diffs, that explain what the failure means. > I can't imagine that fixing this isn't pretty darn high on the Solaris > to-do list, anyway. Yep, and PostgreSQL 8.4 won't be released for quite some time. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com