Re: date_trunc function in interval version

Поиск
Список
Период
Сортировка
От Yasir
Тема Re: date_trunc function in interval version
Дата
Msg-id CAA9OW9eLub13G8GYoNvbY6bmyMxm0sjxe0gjxETx4H5_Xvg6wA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: date_trunc function in interval version  (Przemysław Sztoch <przemyslaw@sztoch.pl>)
Ответы Re: date_trunc function in interval version
Re: date_trunc function in interval version
Список pgsql-hackers


On Sun, May 19, 2024 at 2:20 AM Przemysław Sztoch <przemyslaw@sztoch.pl> wrote:
Robert Haas wrote on 5/15/2024 9:29 PM:
On Mon, Mar 4, 2024 at 5:03 AM Przemysław Sztoch <przemyslaw@sztoch.pl> wrote:
Apparently the functionality is identical to date_bin.
When I saw date_bin in the documentation, I thought it solved all my problems.
Unfortunately, DST problems have many corner cases.
I tried to change date_bin several times, but unfortunately in some cases it would start working differently than before.
So, first of all, thanks for taking an interest and sending a patch.

In order for the patch to have a chance of being accepted, we would
need to have a clear understanding of exactly how this patch is
different from the existing date_bin(). If we knew that, we could
decide either that (a) date_bin does the right thing and your patch
does the wrong thing and therefore we should reject your patch, or we
could decide that (b) date_bin does the wrong thing and therefore we
should fix it, or we could decide that (c) both date_bin and what this
patch does are correct, in the sense of being sensible things to do,
and there is a reason to have both. But if we don't really understand
how they are different, which seems to be the case right now, then we
can't make any decisions. And what that means in practice is that
nobody is going to be willing to commit anything, and we're just going
to go around in circles.

Typically, this kind of research is the responsibility of the patch
author: you're the one who wants something changed, so that means you
need to provide convincing evidence that it should be. If someone else
volunteers to do it, that's also cool, but it absolutely has to be
done in order for there to be a chance of progress here. No committer
is going to say "well, we already have date_bin, but Przemysław says
his date_trunc is different somehow, so let's have both without
understanding how exactly they're different." That's just not a
realistic scenario. Just to name one problem, how would we document
each of them? Users would expect the documentation to explain how two
closely-related functions differ, but we will be unable to explain
that if we don't know the answer ourselves.

If you can't figure out exactly what the differences are by code
inspection, then maybe one thing you could do to help unblock things
here is provide some very clear examples of when they deliver the same
results and when they deliver different results. Although there are no
guarantees, that might lead somebody else to jump in and suggest an
explanation, or further avenues of analysis, or some other helpful
comment.

Personally, what I suspect is that there's already a way to do what
you want using date_bin(), maybe in conjunction with some casting or
some calls to other functions that we already have. But it's hard to
be sure because we just don't have the details. "DST problems have
many corner cases" and "in some cases [date_bin] would start working
differently than before" may be true statements as far as they go, but
they're not very specific complaints. If you can describe *exactly*
how date_bin fails to meet your expectations, there is a much better
chance that something useful will happen here.

I would also like to thank Robert for presenting the matter in detail.

My function date_trunc ( interval, timestamp, ...) is similar to original function date_trunc ( text, timestamp ...) .

My extension only gives more granularity.
We don't have a jump from hour to day. We can use 6h and 12h. It's the same with minutes.
We can round to 30 minutes, 20 minutes, 15 minutes, etc.

Using date_bin has a similar effect, but requires specifying the origin. According to this origin,
subsequent buckets are then calculated. The need to provide this origin is sometimes a very big problem.
Especially since you cannot use one origin when changing from summer to winter time.

If we use one origin for example begin of year: 2024-01-01 00:00:00 then:
# SET timezone='Europe/Warsaw';
# SELECT date_bin('1 day', '2024-03-05 11:22:33', '2024-01-01 00:00:00'), date_trunc('day', '2024-03-05 11:22:33'::timestamptz);
2024-03-05 00:00:00+01     2024-03-05 00:00:00+01    date_bin works ok, because we are before DST
# SELECT date_bin('1 day', '2024-05-05 11:22:33', '2024-01-01 00:00:00'), date_trunc('day', '2024-05-05 11:22:33'::timestamptz);
2024-05-05 01:00:00+02     2024-05-05 00:00:00+02    date_bin has problem, because we are in May after DST

If anyone has an idea how to make date_bin work like date_trunc, please provide an example.


Here is an example which will make date_bin() to behave like date_trunc(): 
# SELECT date_bin('1 day', '2024-05-05 11:22:33', '0001-01-01'::timestamp), date_trunc('day', '2024-05-05 11:22:33'::timestamptz);
      date_bin       |       date_trunc      
---------------------+------------------------
 2024-05-05 00:00:00 | 2024-05-05 00:00:00+02
(1 row)

In general, to make date_bin work similarly to date_trunc in PostgreSQL, you need to set the interval length appropriately and use an origin timestamp that aligns with the start of the interval you want to bin.

Here's how you can use date_bin to mimic the behavior of date_trunc:

Truncate to the Start of the Year:
# SELECT date_bin('1 year', timestamp_column, '0001-01-01'::timestamp) FROM your_table;
Truncate to the Start of the Month:
# SELECT date_bin('1 month', timestamp_column, '0001-01-01'::timestamp) FROM your_table;
Truncate to the Start of the Day:
# SELECT date_bin('1 day', timestamp_column, '0001-01-01'::timestamp) FROM your_table;
Truncate to the Start of the Hour:
# SELECT date_bin('1 hour', timestamp_column, '0001-01-01'::timestamp) FROM your_table;
Truncate to the Start of the Minute:
# SELECT date_bin('1 minute', timestamp_column, '0001-01-01'::timestamp) FROM your_table;


--
Przemysław Sztoch | Mobile +48 509 99 00 66

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: First draft of PG 17 release notes
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Why is citext/regress failing on hamerkop?