Re: Parallel copy

From Tomas Vondra
Subject Re: Parallel copy
Date
Msg-id 20200225160051.q7df3mibkguubnwf@development
In response to Re: Parallel copy  (Andres Freund <andres@anarazel.de>)
Responses Re: Parallel copy  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Parallel copy  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
On Sun, Feb 23, 2020 at 05:09:51PM -0800, Andres Freund wrote:
>Hi,
>
>On 2020-02-19 11:38:45 +0100, Tomas Vondra wrote:
>> I generally agree with the impression that parsing CSV is tricky and
>> unlikely to benefit from parallelism in general. There may be cases with
>> restrictions making it easier (e.g. restrictions on the format) but that
>> might be a bit too complex to start with.
>>
>> For example, I had an idea to parallelise the parsing by splitting it
>> into two phases:
>
>FWIW, I think we ought to rewrite our COPY parsers before we go for
>complex schemes. They're way slower than a decent green-field
>CSV/... parser.
>

Yep, that's quite possible.

>
>> The one piece of information I'm missing here is at least a very rough
>> quantification of the individual steps of CSV processing - for example
>> if parsing takes only 10% of the time, it's pretty pointless to start by
>> parallelising this part and we should focus on the rest. If it's 50% it
>> might be a different story. Has anyone done any measurements?
>
>Not recently, but I'm pretty sure that I've observed CSV parsing to be
>way more than 10%.
>

Perhaps. I guess it'll depend on the CSV file (number of fields, ...),
so I still think we need to do some measurements first. I'm willing to
do that, but (a) I doubt I'll have time for that until after 2020-03,
and (b) it'd be good to agree on some set of typical CSV files.
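
To illustrate why the parsing share matters, here is a rough Amdahl's-law
sketch. The function name and the 8-worker figure are assumptions made up
for illustration, and it ignores any coordination overhead, so treat the
results as an upper bound rather than a prediction:

```python
def max_speedup(parse_fraction: float, workers: int) -> float:
    """Upper bound on overall speedup when only parsing is parallelised.

    parse_fraction: share of total COPY time spent parsing (0.0 to 1.0).
    workers: number of parallel parsing workers (hypothetical value).
    """
    if not 0.0 <= parse_fraction <= 1.0:
        raise ValueError("parse_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Amdahl's law: the serial part stays, the parallel part shrinks.
    return 1.0 / ((1.0 - parse_fraction) + parse_fraction / workers)


# The two shares mentioned above, with an assumed 8 workers.
for share in (0.10, 0.50):
    print(f"parsing = {share:.0%}: at most {max_speedup(share, 8):.2f}x")
```

Even with unlimited workers the bound is 1 / (1 - parse_fraction), i.e.
about 1.11x for a 10% share versus 2x for a 50% share, which is why the
measurements should come first.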

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


