Thread: [HACKERS] log_destination=file

[HACKERS] log_destination=file

From: Magnus Hagander
Date:
Attached is a very much WIP (AKA rough draft) for $subject.

Right now we have two parameters controlling logging destination, and they work in interesting combinations:

log_destination=stderr, logging_collector=off -> log to stderr (makes sense)
log_destination=stderr, logging_collector=on  -> log to file (*highly* illogical for most users, who have to set stderr when they want a file)
log_destination=csvlog, logging_collector=on  -> log to csv file (makes sense)
log_destination=csvlog, logging_collector=off -> fail (ugh)

(beyond that we also have syslog and eventlog, but those are different and not touched by this patch)

My understanding is that the main reason for this is that we cannot change logging_collector without restarting postmaster, whereas we can change log_destination.

My suggestion is we work around this by just always starting the logging collector, even if we're not planning to use it. Then we'd just have:

log_destination = stderr -> log to stderr
log_destination = file -> log to file
log_destination = csvlog -> log to csv file

The main difference here is that log_destination=stderr would also go through the logging collector, which would then assemble the messages and write them out to its own stderr.
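To make that extra hop concrete, here is a minimal standalone sketch (illustrative only -- not code from the attached patch, and every name in it is invented for the example): a child process stands in for a backend and writes one finished log line into a pipe, and the parent stands in for the logging collector and copies whatever arrives onto the stderr it inherited.

    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int
    main(void)
    {
        int     logpipe[2];
        pid_t   pid;

        if (pipe(logpipe) != 0)
            return 1;

        pid = fork();
        if (pid == 0)
        {
            /* "backend": send one finished log line through the pipe */
            const char *line = "LOG:  statement: SELECT 1;\n";

            close(logpipe[0]);
            write(logpipe[1], line, strlen(line));
            return 0;
        }

        /* "collector": read from the pipe and forward to its own stderr */
        close(logpipe[1]);
        for (;;)
        {
            char    buf[8192];
            ssize_t n = read(logpipe[0], buf, sizeof(buf));

            if (n <= 0)
                break;
            write(STDERR_FILENO, buf, (size_t) n);
        }
        waitpid(pid, NULL, 0);
        return 0;
    }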

Do people see an actual problem with that? I agree it's an extra round of indirection there, but is that a problem? It would also cause one more background process to always be running, but again, is that really a problem? The overhead is not exactly large.

It would make configuration a lot more logical for logging. It would also make it possible to switch between all logging configurations without restarting.

The attached WIP is mostly working for the simple cases. It fails completely if the syslogger is restarted at this point, simply because I haven't figured out how to pass the original stderr down to the syslogger. I'm sure there are also many other smaller issues around it, but I wanted to get the discussion done before I spend the time to go through those.

Thoughts?

--
Attachments

Re: [HACKERS] log_destination=file

From: Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> My understanding is that the main reason for this is that we cannot change
> logging_collector without restarting postmaster, whereas we can change
> log_destination.

Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.

> My suggestion is we work around this by just always starting the logging
> collector, even if we're not planning to use it.

Umm....

> Do people see an actual problem with that? I agree it's an extra round of
> indirection there, but is that a problem? It would also cause one more
> backgorund process to always be running, but again, is that really a
> problem? The overhead is not exactly large.

You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them.  Maybe with some
measurements we could have a real discussion.
        regards, tom lane



Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:
On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
> My understanding is that the main reason for this is that we cannot change
> logging_collector without restarting postmaster, whereas we can change
> log_destination.

Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.

Right.

We could of course also say we only care about things generated by our ereport framework, in which case we don't need to redirect stderr and can just use a "regular pipe". But IIRC that's functionality we also specifically wanted (though I don't think I've ever needed it myself, presumably others have).


 
> My suggestion is we work around this by just always starting the logging
> collector, even if we're not planning to use it.

Umm....

> Do people see an actual problem with that? I agree it's an extra round of
> indirection there, but is that a problem? It would also cause one more
> backgorund process to always be running, but again, is that really a
> problem? The overhead is not exactly large.

You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them.  Maybe with some
measurements we could have a real discussion.

Well, not entirely. The first two are questions: "is this really a problem?"

The overhead of an extra process certainly isn't much; I'm willing to say that as an assertion.

The other two, less so; those are more questions than assertions.

Are you actually asking for a benchmark of whether logging gets slower? If so, could you suggest a workload to make an actual benchmark of it (where logging would be high enough that it could become a bottleneck -- and not writing the log data to disk, but the actual logging). I'm not sure what a good one would be.

--

Re: [HACKERS] log_destination=file

From: Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Right, because the decision whether to redirect stdout/stderr can't
>> be changed on the fly.

> Right.

> We could of course also say we only care about things generated by our
> ereport framework, in which case we don't need to redirect stderr and can
> just use a "regular pipe". But IIRC that's functionality we also
> specifically wanted (though I don't think I've ever needed it myself,
> presumably others have).

Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway.  Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython.  I don't find this to be negotiable.

> Are you actually asking for a benchmark of whether logging gets slower?

Yes.

> If so,
> could you suggest a workload to make an actual benchmark of it (where
> logging would be high enough that it could become a bottleneck -- and not
> writing the log data to disk, but the actual logging). I'm not sure what a
> good one would be.

pgbench with log_statement = all would be a pretty easy test case.
        regards, tom lane



Re: [HACKERS] log_destination=file

From: Greg Stark
Date:
On 31 August 2017 at 13:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yes, it's pretty important, because of assorted stuff not-under-our-
> control that doesn't know about ereport and will just print to stderr
> anyway.  Some examples: dynamic linker can't-resolve-symbol failure
> messages, glibc malloc corruption error messages, just about any external
> module in plperl or plpython.  I don't find this to be negotiable.

So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?

I'm actually asking because I'm more concerned with JSON logs or
msgpack logs. Currently these are supported with an emit_log_hook but
they can't capture these non-ereport logs either.

Also the CSV and emit_log_hook based logs don't have any convenient
way to turn them on and off and control the location and filename of
the logs. It would be nice if we could have something like

log_destinations='stderr=text,syslog=json,postgresql-%Y-%m-%d_%H%M%S.csv=csv'

>> Are you actually asking for a benchmark of whether logging gets slower?
>
> Yes.

Personally I don't think it's "performance" so much as operational
issues that are more concerning. For all we know there are people out
there who tried to use the logging collector and found it didn't work
well on some system -- perhaps it interacted with systemd or something
else on the system -- and they switched back to just using stderr. I
don't know how to flush these users out though if there are any. Just
making this change early in a release cycle is the best we can do.


-- 
greg



Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:


On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:
On 31 August 2017 at 13:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yes, it's pretty important, because of assorted stuff not-under-our-
> control that doesn't know about ereport and will just print to stderr
> anyway.  Some examples: dynamic linker can't-resolve-symbol failure
> messages, glibc malloc corruption error messages, just about any external
> module in plperl or plpython.  I don't find this to be negotiable.

So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?

Yeah, that's pretty much it.

Fixing that is also on my plan, but for later :)

 
I'm actually asking because I'm more concerned with JSON logs or
msgpack logs. Currently these are supported with an emit_log_hook but
they can't capture these non-ereport logs either.

Also the CSV and emit_log_hook based logs don't have any convenient
way to turn them on and off and control the location and filename of
the logs. It would be nice if we could have something like

log_destinations='stderr=text,syslog=json,postgresql-%Y-%m-%d_%H%M%S.csv=csv'

That could definitely be an interesting improvement as well. Decoupling the format from the destination would make a lot of sense. (Arguments can certainly be made for example for csv-to-syslog, when you have a custom syslog server)

 
>> Are you actually asking for a benchmark of whether logging gets slower?
>
> Yes.

Personally I don't think it's "performance" so much as operational
issues that are more concerning. For all we know there are people out
there who tried to use the logging collector and found it didn't work
well on some system -- perhaps it interacted with systemd or something
else on the system -- and they switched back to just using stderr. I
don't know how to flush these users out though if there are any. Just
making this change early in a release cycle is the best we can do.

The most common use of stderr, I think, is Debian/Ubuntu, which uses that together with a pg_ctl-based redirect. Or at least it used to (I remember being annoyed by that at least once..)

AFAIK all other major platforms already use the logging collector in the standard packages.
 
--

Re: [HACKERS] log_destination=file

From: Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:
>> So what happens now with these messages? My understanding is that
>> they're missing from the CSV logs and are simply inserted into the
>> text logs without any log_line_prefix? The logging collector doesn't
>> recognize these messages and reformat them for the CSV logs does it?

> Yeah, that's pretty much it.

> Fixing that is also on my plan, but for later :)

Keep in mind that you've got to be really, really conservative about
adding functionality in the logging collector process.  If it ever
crashes, we have problems.  If memory serves, the postmaster is able
to restart it, but we cannot restore the original stdout/stderr
destination, which is problematic if that's where its output had
been going.  So it's pretty close to being as mission-critical as
the postmaster itself.
        regards, tom lane



Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:


On Fri, Sep 1, 2017 at 6:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
> On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:
>> So what happens now with these messages? My understanding is that
>> they're missing from the CSV logs and are simply inserted into the
>> text logs without any log_line_prefix? The logging collector doesn't
>> recognize these messages and reformat them for the CSV logs does it?

> Yeah, that's pretty much it.

> Fixing that is also on my plan, but for later :)

Keep in mind that you've got to be really, really conservative about
adding functionality in the logging collector process.  If it ever
crashes, we have problems.  If memory serves, the postmaster is able
to restart it, but we cannot restore the original stdout/stderr
destination, which is problematic if that's where its output had
been going.  So it's pretty close to being as mission-critical as
the postmaster itself.

Yeah. That's one of the reasons I'm not trying to do it all in one batch.

But yeah, the postmaster restarts it just fine. And with the WIP patch I posted earlier, the message from the postmaster saying that it did so gets lost if you are logging to stderr. It does end up in the CSV file though. And I'm sure there are plenty of other corner cases around that one to consider.

--

Re: [HACKERS] log_destination=file

From: Dmitry Dolgov
Date:
Hi

> On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Are you actually asking for a benchmark of whether logging gets slower?
>
> Yes.
>
>> If so,
>> could you suggest a workload to make an actual benchmark of it (where
>> logging would be high enough that it could become a bottleneck -- and not
>> writing the log data to disk, but the actual logging). I'm not sure what a
>> good one would be.
>
> pgbench with log_statement = all would be a pretty easy test case.

This part of the discussion caught my attention, and I tried to perform such an
easy test. I was doing:

pgbench -S -j2 -c${clients} -T500 test

with `log_statement=all` and `log_destination=stderr`, which I assume should be
enough to get some approximation of the numbers.

scaling factor: 100
average latency (ms):

clients     patch       master
10          1.827       1.456
20          4.027       3.300
30          6.284       4.921
40          8.409       6.767
50          10.985      8.646

It seems that for this particular workload it was about 20-25% slower.

Re: [HACKERS] log_destination=file

From: Tom Lane
Date:
Dmitry Dolgov <9erthalion6@gmail.com> writes:
>> On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> pgbench with log_statement = all would be a pretty easy test case.

> It seems that for this particular workload it was about 20-25% slower.

Ouch.  That seems like rather a large hit :-(.  I actually expected
it to be close to negligible, but I don't think 20% qualifies.
        regards, tom lane



Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:
On Thu, Sep 7, 2017 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dmitry Dolgov <9erthalion6@gmail.com> writes:
>> On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> pgbench with log_statement = all would be a pretty easy test case.

> It seems that for this particular workload it was about 20-25% slower.

Ouch.  That seems like rather a large hit :-(.  I actually expected
it to be close to negligible, but I don't think 20% qualifies.

Agreed, that's a lot more than I expected. A few questions though:

1. where was stderr actually sent? To the console, to /dev/null or to a file?

2. Could you run the same thing (on the same machine) with the logging collector on and logging to file, without the patch? I'd assume that gives performance similar to running with the patch, but it would be good to confirm that.

And thanks for running the benchmark, saved me some time!


--

Re: [HACKERS] log_destination=file

From: Dmitry Dolgov
Date:
> On 8 September 2017 at 01:32, Magnus Hagander <magnus@hagander.net> wrote:
>
> 1. where was stderr actually sent? To the console, to /dev/null or to a file?

To the console (but I can also try other options, although I'm not sure if it would have any impact). 

> 2. Could you run the same thing (on the same machine) with the logging collector on and logging to file, without the patch? I'd assume that gives performance similar to running with the patch, but it would be good to confirm that.

Sure, I'll do it this weekend.

Re: [HACKERS] log_destination=file

From: Dmitry Dolgov
Date:
> On 7 September 2017 at 15:46, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>
> It seems that for this particular workload it was about 20-25% slower.

Hmm...looks like I provided misleading data, sorry. The numbers from previous
email are correct and I'm able to reproduce them, but surprisingly for me only
for one particular situation when `log_statement=all; log_destination=stderr`
and stderr is sent to a console with a tmux session running in it (and apparently
it has some impact on the performance, but I'm not sure how exactly). In all
other cases (when stderr is sent to a plain console, /dev/null, or we send logs
to a file) the numbers are different, and the difference between patch and master
is much less significant.

average latency (ms):

clients     patch       master
10          0.321       0.286
20          0.669       0.602
30          1.016       0.942
40          1.358       1.280
50          1.727       1.637

Re: [HACKERS] log_destination=file

From: Robert Haas
Date:
On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> average latency:
>
> clients     patch       master
> 10          0.321       0.286
> 20          0.669       0.602
> 30          1.016       0.942
> 40          1.358       1.280
> 50          1.727       1.637

That's still a noticeable slowdown, though.  And we've had previous
reports of the overhead of logging being significant as well:

http://postgr.es/m/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com

I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck.  With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] log_destination=file

From: Michael Paquier
Date:
On Wed, Nov 15, 2017 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>> average latency:
>>
>> clients     patch       master
>> 10          0.321       0.286
>> 20          0.669       0.602
>> 30          1.016       0.942
>> 40          1.358       1.280
>> 50          1.727       1.637
>
> That's still a noticeable slowdown, though.  And we've had previous
> reports of the overhead of logging being significant as well:
>
> http://postgr.es/m/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com
>
> I seem to recall a discussion, perhaps in-person, around the time Theo
> submitted that patch where it was reported that the logging collector
> could not be used on some systems he was working with because it
> became a major performance bottleneck.  With each backend writing its
> own messages to a file, it was tolerable, but when you tried to funnel
> everything through a single process, the back-pressure slowed down the
> entire system unacceptably.

I have moved this patch to the next CF for now. The original patch still applies.
-- 
Michael


Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:


On Tue, Nov 14, 2017 at 5:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> average latency:
>
> clients     patch       master
> 10          0.321       0.286
> 20          0.669       0.602
> 30          1.016       0.942
> 40          1.358       1.280
> 50          1.727       1.637

That's still a noticeable slowdown, though.  And we've had previous
reports of the overhead of logging being significant as well:

http://postgr.es/m/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com

I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck.  With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.

Finally found myself back at this one, because I still think this is a problem we definitely need to address (whether with this file or not).

The funneling into a single process is definitely an issue.

But we don't really solve that problem today with logging to stderr, do we? Because somebody has to pick up the log as it came from stderr. Yes, you get more overhead when sending the log to devnull, but that isn't really a realistic scenario. The question is what to do when you actually want to collect that much logging that quickly.

If each backend could actually log to *its own file*, then things would get sped up. But we can't do that today. Unless you use the hooks and build it yourself.

Per the thread referenced, using the hooks to handle the very-high-rate-logging case seems to be the conclusion. But is that still the conclusion, or do we feel we need to also have a native solution?

And if the conclusion is that hooks is the way to go for that, then is the slowdown of this patch actually a relevant problem to it?

--

Re: [HACKERS] log_destination=file

From: Robert Haas
Date:
On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Finally found myself back at this one, because I still think this is a
> problem we definitely need to address (whether with this file or not).
>
> The funneling into a single process is definitely an issue.
>
> But we don't really solve that problem today with logging to stderr, do we?
> Because somebody has to pick up the log as it came from stderr. Yes, you get
> more overhead when sending the log to devnull, but that isn't really a
> realistic scenario. The question is what to do when you actually want to
> collect that much logging that quickly.

I think it depends on where the bottleneck is.  If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you can
bring the CPU cycles of many processes to bear rather than just one.
If you're limited by the rate at which you can lay the file down on
disk, then turning off the logging collector doesn't help, but I don't
think that's the main problem.  Now, of course, if you're writing the
file to disk faster than a single process could do all those writes,
then you're probably also going to need multiple processes to keep up
with reading it, parsing it, etc.  But that's not a problem for
PostgreSQL core unless we decide to start shipping an in-core log
analyzer.

> If each backend could actually log to *its own file*, then things would get
> sped up. But we can't do that today. Unless you use the hooks and build it
> yourself.

That seems like a useful thing to support in core.

> Per the thread referenced, using the hooks to handle the
> very-high-rate-logging case seems to be the conclusion. But is that still
> the conclusion, or do we feel we need to also have a native solution?
>
> And if the conclusion is that hooks is the way to go for that, then is the
> slowdown of this patch actually a relevant problem to it?

I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks.  I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1".  That's just not friendly.

I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).
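To illustrate the fd dance being described, a rough sketch (not from any patch; the function names and the pipe fd parameter are invented for the example):

    #include <unistd.h>

    static int  saved_stderr = -1;      /* duplicate of the original stderr */

    /* at postmaster startup: keep a copy of wherever stderr points now */
    void
    save_original_stderr(void)
    {
        if (saved_stderr == -1)
            saved_stderr = dup(STDERR_FILENO);
    }

    /* route stderr into the syslogger pipe */
    void
    route_stderr_to_collector(int syslogger_pipe_write_fd)
    {
        dup2(syslogger_pipe_write_fd, STDERR_FILENO);
    }

    /* on a later config change: point stderr back at the saved original */
    void
    route_stderr_to_original(void)
    {
        if (saved_stderr != -1)
            dup2(saved_stderr, STDERR_FILENO);
    }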

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] log_destination=file

From: Tomas Vondra
Date:

On 01/22/2018 08:52 PM, Robert Haas wrote:
> On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Finally found myself back at this one, because I still think this is a
>> problem we definitely need to address (whether with this file or not).
>>
>> The funneling into a single process is definitely an issue.
>>
>> But we don't really solve that problem today with logging to stderr, do we?
>> Because somebody has to pick up the log as it came from stderr. Yes, you get
>> more overhead when sending the log to devnull, but that isn't really a
>> realistic scenario. The question is what to do when you actually want to
>> collect that much logging that quickly.
> 
> I think it depends on where the bottleneck is. If you're limited by 
> the speed at which a single process can write, shutting the logging 
> collector off and letting everyone write fixes it, because now you
> can bring the CPU cycles of many processes to bear rather than just
> one. If you're limited by the rate at which you can lay the file down
> on disk, then turning off the logging collector doesn't help, but I
> don't think that's the main problem. Now, of course, if you're
> writing the file to disk faster than a single process could do all
> those writes, then you're probably also going to need multiple
> processes to keep up with reading it, parsing it, etc. But that's not
> a problem for PostgreSQL core unless we decide to start shipping an
> in-core log analyzer.
> 

Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.

I find it hard to believe the collector would be limited by I/O when
writing the data to disk (buffered sequential writes and all that).

So I guess it's more about the process of collecting the data from all
the processes through the pipe, with the PIPE_MAX_PAYLOAD dance, etc.

I plan to do some experiments with log_statement=all, but my guess is
that the main issue is in transferring individual messages (which may be
further split into multiple chunks). What if we instead send the log
messages in larger batches? Of course, that would require an alternative
transport mechanism (say, through shared memory) and the collector would
have to merge them to maintain ordering. And then there's the issue that the
collector is a pretty critical component.
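To make the batching idea concrete, a toy sketch (illustrative only, not a proposal; the names are invented, and it ignores the shared-memory transport and the merge/ordering questions): a backend appends finished messages to a local buffer and only hands the whole buffer to the collector once it fills up.

    #include <string.h>
    #include <unistd.h>

    #define LOG_BATCH_SIZE 8192

    typedef struct
    {
        int     used;
        char    data[LOG_BATCH_SIZE];
    } LogBatch;

    /* push the accumulated batch to the collector in a single write */
    void
    batch_flush(LogBatch *batch, int collector_fd)
    {
        if (batch->used > 0)
        {
            write(collector_fd, batch->data, batch->used);
            batch->used = 0;
        }
    }

    /* append one finished message; flush first if it would not fit */
    void
    batch_append(LogBatch *batch, int collector_fd, const char *msg, int len)
    {
        if (len >= LOG_BATCH_SIZE)
        {
            /* oversized message: flush what we have and send it on its own */
            batch_flush(batch, collector_fd);
            write(collector_fd, msg, len);
            return;
        }
        if (batch->used + len > LOG_BATCH_SIZE)
            batch_flush(batch, collector_fd);
        memcpy(batch->data + batch->used, msg, len);
        batch->used += len;
    }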

FWIW I'm pretty sure we're not the only project facing the need to
collect a high volume of logs from many processes, so how do the other
projects handle this issue?

>>
>> If each backend could actually log to *its own file*, then things would get
>> sped up. But we can't do that today. Unless you use the hooks and build it
>> yourself.
> 
> That seems like a useful thing to support in core.
> 

Yeah, I agree with that.

>> Per the thread referenced, using the hooks to handle the
>> very-high-rate-logging case seems to be the conclusion. But is that still
>> the conclusion, or do we feel we need to also have a native solution?
>>
>> And if the conclusion is that hooks is the way to go for that, then is the
>> slowdown of this patch actually a relevant problem to it?
> 
> I think that if we commit what you've proposed, we're making it harder
> for people who have a high volume of logging but are not currently
> using hooks.  I think we should try really hard to avoid the situation
> where our suggested workaround for a server change is "go write some C
> code and maybe you can get back to the performance you had with
> release N-1".  That's just not friendly.
> 
> I wonder if it would be feasible to set things up so that the logging
> collector was always started, but whether or not backends used it or
> wrote directly to their original stderr was configurable (e.g. dup
> stderr elsewhere, then dup whichever output source is currently
> selected onto stderr, then dup the other one if the config is changed
> later).
> 

I think the hook system is a really powerful tool, but it seems a bit
awkward to force people to use it to improve performance like this ...
That seems like something the core should do out of the box.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] log_destination=file

From: Andres Freund
Date:
Hi,

On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:
> Finally found myself back at this one, because I still think this is a
> problem we definitely need to adress (whether with this file or not).

This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?

- Andres


Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:


On Fri, Mar 2, 2018 at 8:27 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:
> Finally found myself back at this one, because I still think this is a
> problem we definitely need to address (whether with this file or not).

This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?

I agree, let's boot it. It needs more discussion about what to actually do, and after that, how. Let's focus on other things for 11, and have this discussion again when the time comes. I'll go mark it.

--

Re: [HACKERS] log_destination=file

From: Robert Haas
Date:
On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Sorry for the naive question, but which of these bottlenecks are we
> actually hitting? I don't recall dealing with an actual production
> system where the log collector would be an issue, so I don't have a very
> good idea where the actual bottleneck is in this case.

Unfortunately, neither do I -- but I do believe that the problem is real.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] log_destination=file

From: Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Sorry for the naive question, but which of these bottlenecks are we
>> actually hitting? I don't recall dealing with an actual production
>> system where the log collector would be an issue, so I don't have a very
>> good idea where the actual bottleneck is in this case.

> Unfortunately, neither do I -- but I do believe that the problem is real.

Not to state the obvious ... but if we do not know where the bottleneck
is, isn't it a bit premature to be proposing changes to "fix" it?

But that argument just leads to the same conclusion that seems to be
prevalent at the moment, ie boot this to v12.

            regards, tom lane


Re: [HACKERS] log_destination=file

From: Magnus Hagander
Date:
On Fri, Mar 2, 2018 at 7:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Sorry for the naive question, but which of these bottlenecks are we
>> actually hitting? I don't recall dealing with an actual production
>> system where the log collector would be an issue, so I don't have a very
>> good idea where the actual bottleneck is in this case.

> Unfortunately, neither do I -- but I do believe that the problem is real.

Not to state the obvious ... but if we do not know where the bottleneck
is, isn't it a bit premature to be proposing changes to "fix" it?

But that argument just leads to the same conclusion that seems to be
prevalent at the moment, ie boot this to v12.

Just to be clear, it has already been booted to v12. 


--