UPDATE grabs multiple rows when it seems like it should only grab one

Поиск
Список
Период
Сортировка
От Kevin Burke
Тема UPDATE grabs multiple rows when it seems like it should only grab one
Дата
Msg-id CAEYV4pYAG+o2_SO7+Tc9aUF8A9aZW=bNi2H7yd39i9fDy5nEEg@mail.gmail.com
обсуждение исходный текст
Ответы Re: UPDATE grabs multiple rows when it seems like it should only grab one  (Kevin Burke <burke@shyp.com>)
Re: UPDATE grabs multiple rows when it seems like it should only grab one  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: UPDATE grabs multiple rows when it seems like it should only grab one  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
Hi,
I'm trying to write a job queue that grabs one job at a time from the
queue. I expect that the following query should update a maximum of one row
in the table:

UPDATE queued_jobs
SET status='in-progress',
        updated_at=now()
FROM (
        SELECT id AS inner_id
        FROM queued_jobs
        WHERE status='queued'
                AND name = $1
                AND run_after <= now()
        LIMIT 1
        FOR UPDATE
) find_job
WHERE queued_jobs.id = find_job.inner_id
        AND status='queued'
RETURNING id,
        name,
        attempts,
        run_after,
        expires_at,
        status,
        data,
        created_at,
        updated_at

However, I observe that multiple rows are updated. I am certain that it's a
single query updating multiple rows, because I observed this in the EXPLAIN
output, and also configured my application to crash if multiple rows were
returned, and could reliably trigger an application crash.

Here is the EXPLAIN output from a query when two rows were returned:

Update on queued_jobs  (cost=0.75..16.83 rows=1 width=120) (actual
time=3.011..67.515 rows=2 loops=1)
  ->  Nested Loop  (cost=0.75..16.83 rows=1 width=120) (actual
time=2.974..67.458 rows=2 loops=1)
        Join Filter: (queued_jobs.id = find_job.inner_id)
        Rows Removed by Join Filter: 475
        ->  Index Scan using queued_jobs_pkey on queued_jobs
 (cost=0.38..8.39 rows=1 width=80) (actual time=0.011..1.326 rows=477
loops=1)
              Filter: (status = 'queued'::job_status)
              Rows Removed by Filter: 1
        ->  Subquery Scan on find_job  (cost=0.38..8.42 rows=1 width=56)
(actual time=0.137..0.138 rows=1 loops=477)
              ->  Limit  (cost=0.38..8.41 rows=1 width=22) (actual
time=0.136..0.136 rows=1 loops=477)
                    ->  LockRows  (cost=0.38..8.41 rows=1 width=22) (actual
time=0.136..0.136 rows=1 loops=477)
                          ->  Index Scan using find_queued_job on
queued_jobs queued_jobs_1  (cost=0.38..8.40 rows=1 width=22) (actual
time=0.134..0.135 rows=2 loops=477)
                                Index Cond: ((name = $1) AND (run_after <=
now()))
                                Filter: (status = 'queued'::job_status)

Here's the EXPLAIN output from a "normal" query that only gets one row:

Update on queued_jobs  (cost=0.41..8.53 rows=1 width=120) (actual
time=3.730..3.733 rows=1 loops=1)
  ->  Nested Loop  (cost=0.41..8.53 rows=1 width=120) (actual
time=3.688..3.690 rows=1 loops=1)
        ->  Subquery Scan on find_job  (cost=0.00..0.08 rows=1 width=56)
(actual time=3.672..3.673 rows=1 loops=1)
              ->  Limit  (cost=0.00..0.07 rows=1 width=22) (actual
time=3.662..3.662 rows=1 loops=1)
                    ->  LockRows  (cost=0.00..2935.47 rows=42743 width=22)
(actual time=3.661..3.661 rows=1 loops=1)
                          ->  Seq Scan on queued_jobs queued_jobs_1
 (cost=0.00..2508.04 rows=42743 width=22) (actual time=1.362..1.375 rows=5
loops=1)
                                Filter: ((status = 'queued'::job_status)
AND (name = $1) AND (run_after <= now()))
                                Rows Removed by Filter: 1
        ->  Index Scan using queued_jobs_pkey on queued_jobs
 (cost=0.41..8.44 rows=1 width=80) (actual time=0.012..0.013 rows=1 loops=1)
              Index Cond: (id = find_job.inner_id)
              Filter: (status = 'queued'::job_status)

For convenience, I've posted these (and a table schema) here:
https://gist.github.com/kevinburkeshyp/ba5fdac337b3793628261de5fb26d6a3

I'm running Postgres 9.4.6 on a Mac 10.10.5, installed via Homebrew, with
the read committed isolation level. The client is a Go application with 8
concurrent database connections, using prepared statements with the
github.com/lib/pq client.

I also observe that this only seems to occur when I am simultaneously
inserting rows into the table. The inserts occur from a different Go
application, running on a separate process with a separate connection pool.

Any ideas? Maybe I don't understand SQL properly? I can reliably reproduce
this, please ping me if you'd like more information!

--
kevin

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: problem installing postgres in debian8 from debian repository
Следующее
От: Kevin Burke
Дата:
Сообщение: Re: UPDATE grabs multiple rows when it seems like it should only grab one