Обсуждение: A date bug: number of day of October

Поиск
Список
Период
Сортировка

A date bug: number of day of October

От
Guillaume Perréal
Дата:
Hi all,

I think there is a bug in the PostgreSQL date system:

Try computing (replace XXXX with some differents years):'XXXX-10-01'::datetime + '1 month'::interval
Depending on the year, you can get 'XXXX-11-01' as expected or 'XXXX-10-31'
which is bug.
Guillaume Perréal - Stagiaire MIAG
Cemagref (URH), Lyon, France
Tél: (+33) 4.72.20.87.64


Re: A date bug: number of day of October

От
Peter Eisentraut
Дата:
Guillaume Perréal writes:

> Try computing (replace XXXX with some differents years):
>     'XXXX-10-01'::datetime + '1 month'::interval
> Depending on the year, you can get 'XXXX-11-01' as expected or 'XXXX-10-31'
> which is bug.

In the CET zone (where you probably live) daylight savings time ends in
October, so it is true that

    1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00

Of course a month is not defined as "24h * 30/31 days" but instead as the
time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered
wrong. It is especially wrong that the same thing happens if you use DATE
instead of TIMESTAMP (or DATETIME, now deprecated).

You get the above behaviour between 1996 and 2037. Before 1996 you can
observe the same behaviour with September, because back then we switched
in September! The system is pretty smart. Of course after 2037 we're
doomed anyway.

The bottom line is that INTERVAL in its current implementation has
deficiencies and it's not SQL compliant either.


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden

Re: A date bug: number of day of October

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> In the CET zone (where you probably live) daylight savings time ends in
> October, so it is true that

>     1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00

> Of course a month is not defined as "24h * 30/31 days" but instead as the
> time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered
> wrong. It is especially wrong that the same thing happens if you use DATE
> instead of TIMESTAMP (or DATETIME, now deprecated).

Actually, INTERVAL does know the difference between '1 month' and '31 days',
as illustrated by the fact that it gets these two cases right:

regression=# select '1999-6-1'::timestamp  + '1 month'::interval;
        ?column?
------------------------
 1999-07-01 00:00:00-04
(1 row)

regression=# select '1999-6-1'::timestamp  + '2 month'::interval;
        ?column?
------------------------
 1999-08-01 00:00:00-04
(1 row)

(remember June and July have different numbers of days).  The problem
here is a plain old code bug: after transforming the input value to
y/m/d/h/m/s/timezone, and correctly adding the N-month offset to
this symbolic form, timestamp_pl_span transforms back to a GMT-based
timestamp using *the same timezone offset*.

Thus, for example,

regression=# select '1999-10-1'::timestamp  + '1 month'::interval;
        ?column?
------------------------
 1999-10-31 23:00:00-05
(1 row)

for me (I live in EST5EDT, where Oct 1 is in daylight savings time
GMT-4, but Nov 1 is standard time GMT-5).

Correct behavior, IMHO, is to change to the local timezone appropriate
for the converted date before transforming y/m/d/etc to timestamp.
I have no idea how hard that is to do.  One kluge that comes to mind is
to convert the modified y/m/d/etc date to timestamp, convert that back
to local time to get a timezone, and then convert the same y/m/d/h/m/s
plus new timezone to timestamp.  But perhaps there's a cleaner/faster
way to do it.  I'm not real sure that said algorithm would give
plausible behavior if the result time falls within a DST transition
anyway.  (But what is plausible behavior in that case?)

Another issue: for intervals smaller than a month, INTERVAL currently
represents the value as X number of seconds.  Thus, since our last
DST->EST transition was early morning 1999/10/31,

regression=# select '1999-10-31'::timestamp + '1 day'::interval;
        ?column?
------------------------
 1999-10-31 23:00:00-05
(1 row)

which is fairly unintuitive --- though if I'd asked for +'24 hours'
I would accept it as correct.  This is not a code bug but designed
behavior.  ISTM that really, INTERVAL ought to have a three-part
representation: months (which can serve for larger units as well),
days, and sub-day units (which can all be converted to seconds).
But representing days as seconds breaks at DST boundaries.

            regards, tom lane

Re: A date bug: number of day of October

От
Guillaume Perréal
Дата:
Tom Lane wrote:
>
> Peter Eisentraut <peter_e@gmx.net> writes:
> > In the CET zone (where you probably live) daylight savings time ends in
> > October, so it is true that
>
> >       1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00
>
> > Of course a month is not defined as "24h * 30/31 days" but instead as the
> > time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered
> > wrong. It is especially wrong that the same thing happens if you use DATE
> > instead of TIMESTAMP (or DATETIME, now deprecated).
>
> Actually, INTERVAL does know the difference between '1 month' and '31 days',
> as illustrated by the fact that it gets these two cases right:
>
> regression=# select '1999-6-1'::timestamp  + '1 month'::interval;
>         ?column?
> ------------------------
>  1999-07-01 00:00:00-04
> (1 row)
>
> regression=# select '1999-6-1'::timestamp  + '2 month'::interval;
>         ?column?
> ------------------------
>  1999-08-01 00:00:00-04
> (1 row)
>
> (remember June and July have different numbers of days).  The problem
> here is a plain old code bug: after transforming the input value to
> y/m/d/h/m/s/timezone, and correctly adding the N-month offset to
> this symbolic form, timestamp_pl_span transforms back to a GMT-based
> timestamp using *the same timezone offset*.
>
> Thus, for example,
>
> regression=# select '1999-10-1'::timestamp  + '1 month'::interval;
>         ?column?
> ------------------------
>  1999-10-31 23:00:00-05
> (1 row)
>
> for me (I live in EST5EDT, where Oct 1 is in daylight savings time
> GMT-4, but Nov 1 is standard time GMT-5).
>
> Correct behavior, IMHO, is to change to the local timezone appropriate
> for the converted date before transforming y/m/d/etc to timestamp.
> I have no idea how hard that is to do.  One kluge that comes to mind is
> to convert the modified y/m/d/etc date to timestamp, convert that back
> to local time to get a timezone, and then convert the same y/m/d/h/m/s
> plus new timezone to timestamp.  But perhaps there's a cleaner/faster
> way to do it.  I'm not real sure that said algorithm would give
> plausible behavior if the result time falls within a DST transition
> anyway.  (But what is plausible behavior in that case?)
>
> Another issue: for intervals smaller than a month, INTERVAL currently
> represents the value as X number of seconds.  Thus, since our last
> DST->EST transition was early morning 1999/10/31,
>
> regression=# select '1999-10-31'::timestamp + '1 day'::interval;
>         ?column?
> ------------------------
>  1999-10-31 23:00:00-05
> (1 row)
>
> which is fairly unintuitive --- though if I'd asked for +'24 hours'
> I would accept it as correct.  This is not a code bug but designed
> behavior.  ISTM that really, INTERVAL ought to have a three-part
> representation: months (which can serve for larger units as well),
> days, and sub-day units (which can all be converted to seconds).
> But representing days as seconds breaks at DST boundaries.
>
>                         regards, tom lane

Thanks you all,

As I just use this code to get the number of seconds in a month (and don't care
anyway of the timezone), I think I could use this:

select date_part('epoch', date_trunc('month', '1999-10-01'::timestamp + '1 month
1 hour'::interval))-date_part('epoch', '1999-10-01'::timestamp);

Regards,

Guillaume Perréal - Stagiaire MIAG
Cemagref (URH), Lyon, France
Tél: (+33) 4.72.20.87.64