Обсуждение: parallel restore item dependencies
OK, I've worked out why I am seeing deadlocks etc. from parallel restore on FK items. In my original patch, I looked at all the dependencies of a candidate item ansd compared them with the dependencies of the running items to see if there was a potential locking clash. However, Tom in his admirable reworking of my patch, restricted the list of potential clashing items (lockDeps) to "TABLE" items, if any. This would probably have been ok if we hadn't just beforehand transferred all TABLE dependencies in POST_DATA items to the corresponding TABLE DATA item. The result is that we get empty lockDeps lists on all items - I'm surprised we haven't had more complaints about deadlock or failing locks. A simple fix that would probably work would be to adjust the filter to include TABLE DATA items, so the relevant statement would read: if (tocsByDumpId[depid - 1] && (strcmp(tocsByDumpId[depid - 1]->desc, "TABLE") == 0 || strcmp(tocsByDumpId[depid- 1]->desc, "TABLE DATA") == 0)) lockids[nlockids++] = depid; Perhaps a better fix would move the code that sets up the lockDeps so that it runs before we adjust the dependencies. I'm moderately confident that either of these fixes will work, but I think this demonstrates the need for lots of testing, especially with complex data sets that have lots of dependencies and potentially deadlocking items. thoughts? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > OK, I've worked out why I am seeing deadlocks etc. from parallel restore > on FK items. > In my original patch, I looked at all the dependencies of a candidate > item ansd compared them with the dependencies of the running items to > see if there was a potential locking clash. However, Tom in his > admirable reworking of my patch, restricted the list of potential > clashing items (lockDeps) to "TABLE" items, if any. This would probably > have been ok if we hadn't just beforehand transferred all TABLE > dependencies in POST_DATA items to the corresponding TABLE DATA item. > The result is that we get empty lockDeps lists on all items - I'm > surprised we haven't had more complaints about deadlock or failing locks. [ scratches head... ] I coulda sworn I tested that when I was hacking it. I'm running low on steam tonight but will think more about this tomorrow. regards, tom lane
I wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> In my original patch, I looked at all the dependencies of a candidate >> item ansd compared them with the dependencies of the running items to >> see if there was a potential locking clash. However, Tom in his >> admirable reworking of my patch, restricted the list of potential >> clashing items (lockDeps) to "TABLE" items, if any. This would probably >> have been ok if we hadn't just beforehand transferred all TABLE >> dependencies in POST_DATA items to the corresponding TABLE DATA item. >> The result is that we get empty lockDeps lists on all items - I'm >> surprised we haven't had more complaints about deadlock or failing locks. > [ scratches head... ] I coulda sworn I tested that when I was hacking > it. I'm running low on steam tonight but will think more about this > tomorrow. I think I have reconstructed what happened: I tested this code before I decided that repointing the dependencies was a good idea, or else reordered the sequence of operations in fix_dependencies after that. It looks to me like the correct fix is just to look for TABLE DATA not TABLE while setting up lockDeps[], since all the entry types we care about are POST_DATA items. Anyway, I've committed that, please try it. regards, tom lane
Tom Lane wrote: > I wrote: > >> Andrew Dunstan <andrew@dunslane.net> writes: >> >>> In my original patch, I looked at all the dependencies of a candidate >>> item ansd compared them with the dependencies of the running items to >>> see if there was a potential locking clash. However, Tom in his >>> admirable reworking of my patch, restricted the list of potential >>> clashing items (lockDeps) to "TABLE" items, if any. This would probably >>> have been ok if we hadn't just beforehand transferred all TABLE >>> dependencies in POST_DATA items to the corresponding TABLE DATA item. >>> The result is that we get empty lockDeps lists on all items - I'm >>> surprised we haven't had more complaints about deadlock or failing locks. >>> > > >> [ scratches head... ] I coulda sworn I tested that when I was hacking >> it. I'm running low on steam tonight but will think more about this >> tomorrow. >> > > I think I have reconstructed what happened: I tested this code before > I decided that repointing the dependencies was a good idea, or else > reordered the sequence of operations in fix_dependencies after that. > It looks to me like the correct fix is just to look for TABLE DATA > not TABLE while setting up lockDeps[], since all the entry types we > care about are POST_DATA items. Anyway, I've committed that, please > try it. > > > Passes test. Thanks. andrew