Re: pgarchives: Bug report + Patches: loader can't handle message in multiple lists

From: Magnus Hagander
Subject: Re: pgarchives: Bug report + Patches: loader can't handle message in multiple lists
Date:
Msg-id: CABUevExZQk52_OQjjOxJFW9b13x5sEZ_wSknZz-Tg-62xR9Wyw@mail.gmail.com
In reply to: pgarchives: Bug report + Patches: loader can't handle message in multiple lists  (Célestin Matte <celestin.matte@cmatte.me>)
Responses: Re: pgarchives: Bug report + Patches: loader can't handle message in multiple lists
List: pgsql-www

On Wed, Mar 22, 2023 at 4:13 PM Célestin Matte <celestin.matte@cmatte.me> wrote:
>
> The messages loader from pgarchives may crash when importing 2 mailing lists with messages in common. In other
> words: the import script will fail when importing list1 and list2 if a message to list1 has list2 in CC:.

We certainly imported many thousands of such messages when we
initially loaded the postgres archives, so at *some* point it worked
-- it must be some change that came in later that broke it.


> I attach a patch (0001-loader-attempt-to-handle-message-in-multiple-lists.patch) that starts addressing the issue,
> but does not fully fix it, as the script can later crash (in storage.py line 234) because a message cannot be imported
> twice. Fixing this would require changing the way messages are stored in the database, using (messageid, listid) as a
> primary key instead of messageid, or allowing a message to belong to several threads using a message_thread table instead
> of a threadid column in messages.
> This patch is only for discussion.

How does it even get that far?  The code on line 32 of storage.py will
check if the message is already in the archives. How does it even get
to the point of trying to insert the message again?

I don't see why you'd need a different storage -- the entire point in
this case is that the message is stored only once, but it's tagged
with being on both lists (in list_threads). That is, a message belongs
to a thread, and the thread can belong to one or more lists.
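
A minimal sketch of that model, to make the point concrete. It assumes a
heavily simplified, hypothetical schema (only list_threads, messageid and
threadid are names from this discussion; everything else is made up for
illustration) and uses sqlite3 as a stand-in for PostgreSQL so that it is
self-contained. It is not the actual storage.py logic: the message row is
written once, and loading the same message for a second list only adds a
row to list_threads.

```python
import sqlite3

# Hypothetical, simplified schema; the real pgarchives tables have more columns.
SCHEMA = """
CREATE TABLE messages (
    id        INTEGER PRIMARY KEY,
    messageid TEXT NOT NULL UNIQUE,
    threadid  INTEGER NOT NULL
);
CREATE TABLE list_threads (
    threadid INTEGER NOT NULL,
    listid   INTEGER NOT NULL,
    PRIMARY KEY (threadid, listid)
);
"""


def load_message(conn, messageid, listid):
    """Store the message once; on a repeat, only tag its thread with the list."""
    row = conn.execute(
        "SELECT threadid FROM messages WHERE messageid = ?", (messageid,)
    ).fetchone()
    if row is None:
        # First sighting: start a new thread (real thread detection is omitted).
        threadid = conn.execute(
            "SELECT COALESCE(MAX(threadid), 0) + 1 FROM messages"
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO messages (messageid, threadid) VALUES (?, ?)",
            (messageid, threadid),
        )
    else:
        # Already archived: reuse the existing thread, do not insert again.
        threadid = row[0]
    # INSERT OR IGNORE is SQLite syntax; it makes the list tagging idempotent.
    conn.execute(
        "INSERT OR IGNORE INTO list_threads (threadid, listid) VALUES (?, ?)",
        (threadid, listid),
    )
    return threadid


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    load_message(conn, "<a@example.org>", 1)  # message arrives via list 1
    load_message(conn, "<a@example.org>", 2)  # same message, CC'ed to list 2
    n_messages = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
    lists = [
        r[0]
        for r in conn.execute("SELECT listid FROM list_threads ORDER BY listid")
    ]
    return n_messages, lists
```

Calling demo() loads one message for two lists and leaves a single row in
messages, with its thread tagged for both lists.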


> I also attach patch 0001-load_message-catch-postgres-UniqueViolation-errors-w.patch as a workaround to this issue, to
> catch and log such errors and keep importing an mbox without crashing when it happens (message will then only appear in
> the first imported list).
> This patch can be used/applied.

I'm not sure I like the idea of committing after every message and
then rolling back in the event of an error. If nothing else, this
should be done using a savepoint instead of a complete transaction.
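
A rough sketch of the savepoint approach, under stated assumptions: it
uses sqlite3 as a stand-in so it runs on its own, and the table and the
load_all function are hypothetical, not taken from the loader. The
SAVEPOINT, ROLLBACK TO SAVEPOINT and RELEASE SAVEPOINT statements also
exist in PostgreSQL, where the error to catch would be a unique violation
rather than sqlite3.IntegrityError. The whole import stays in one
transaction, and a failing message only undoes its own work.

```python
import sqlite3


def load_all(conn, messageids):
    """Insert every message in one transaction; skip duplicates via savepoints."""
    skipped = []
    conn.execute("BEGIN")
    for mid in messageids:
        conn.execute("SAVEPOINT msg")
        try:
            conn.execute("INSERT INTO messages (messageid) VALUES (?)", (mid,))
        except sqlite3.IntegrityError:
            # Undo only this message; earlier inserts in the transaction survive.
            conn.execute("ROLLBACK TO SAVEPOINT msg")
            skipped.append(mid)
        # The savepoint still exists after a rollback to it, so release it.
        conn.execute("RELEASE SAVEPOINT msg")
    conn.execute("COMMIT")
    return skipped


def demo():
    # isolation_level=None: the sqlite3 module issues no implicit BEGIN/COMMIT,
    # so the transaction boundaries above are exactly the ones written out.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE messages (messageid TEXT PRIMARY KEY)")
    skipped = load_all(conn, ["<a@x>", "<b@x>", "<a@x>", "<c@x>"])
    stored = [
        r[0]
        for r in conn.execute("SELECT messageid FROM messages ORDER BY messageid")
    ]
    return skipped, stored
```

In demo(), the duplicate "<a@x>" is reported as skipped while the other
three messages are committed together at the end.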


--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/


