Обсуждение: Immutable function WAY slower than Stable function?

Поиск
Список
Период
Сортировка

Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:
Hi.  I was recently troubleshooting a function, and realized it had incorrectly been declared as Immutable, when it should have been declared Stable.  When I changed it to Stable, the query I was running ran dramatically faster. Digging into this a little more, this is what I found:

I've got a function (staff_inspector) that takes two arguments, an integer and a date.

I've got a convenience function that takes just an integer, and fills in the date.  (With a Stable function target_date()).

There is no performance difference between the two functions if both arguments are supplied.

If I use the convenience function however, the difference is dramatic.  The Stable version clocks in around 1.3 seconds, and the immutable version at around 23 seconds.

So I'm wondering if this is expected behavior, and if someone could explain it to me.  Also, if it is expected, I'm wondering whether it's worth noting in Section 36.6 ("Function Volatility Categories"), which led me to believe I might be risking a stale value by marking something as Immutable, but offers no hint about a ginormous performance penalty.

Here's some more detail and information.

I created two versions of the underlying function.  (staff_inspector_stable,staff_inspector_imm).  There is no noticeable  performance difference between them.  I created two versions of the convenience function, si_stable and si_imm.  They are identical, except for being declared Stable or Immutable.  

The first time I run any of these queries, they take about two seconds.  All subsequent calls take about 1.3 seconds.  Except the Immutable version, which clocks in at 20+ seconds.  This is true whether it calls the Stable or Immutable version of the underlying function:

spc=> EXPLAIN ANALYZE SELECT client_id,staff_inspector_stable(client_id,target_date()) FROM tbl_residence_own;                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=8.311..1990.601 rows=6983 loops=1)Planning time: 1.976 msExecution time: 2001.247 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,staff_inspector_stable(client_id,target_date()) FROM tbl_residence_own;                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.472..1298.537 rows=6983 loops=1)Planning time: 0.279 msExecution time: 1310.831 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,staff_inspector_imm(client_id,target_date()) FROM tbl_residence_own;                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.780..1299.082 rows=6983 loops=1)Planning time: 0.308 msExecution time: 1311.379 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,si_stable(client_id) FROM tbl_residence_own;                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.145..1300.551 rows=6983 loops=1)Planning time: 0.281 msExecution time: 1312.762 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,si_imm(client_id) FROM tbl_residence_own;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.537..22892.481 rows=6983 loops=1)Planning time: 0.079 ms
 Execution time: 22903.504 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,si_stable_calls_imm(client_id) FROM tbl_residence_own;                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=2.907..1291.235 rows=6983 loops=1)Planning time: 0.223 msExecution time: 1303.488 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,si_imm_calls_imm(client_id) FROM tbl_residence_own;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------Seq Scan on tbl_residence_own  (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.664..22868.734 rows=6983 loops=1)Planning time: 0.134 ms
 Execution time: 22879.761 ms
(3 rows)

And then here is the definitions of the functions:

CREATE OR REPLACE FUNCTION staff_inspector_stable( client INTEGER, asof DATE ) RETURNS INTEGER AS $$
SELECT
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE asof BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,asof) AND client_id=client LIMIT 1)
ILIKE 'SSP%' THEN
--- SSP answer   COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
---SPC answer   (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=client AND asof BETWEEN residence_date AND COALESCE(residence_date_end,asof) AND NOT ro.is_deleted LIMIT 1)

END
--LIMIT 1
$$ LANGUAGE SQL STABLE;

CREATE OR REPLACE FUNCTION staff_inspector_imm( client INTEGER, asof DATE ) RETURNS INTEGER AS $$
SELECT
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE asof BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,asof) AND client_id=client LIMIT 1)
ILIKE 'SSP%' THEN
--- SSP answer   COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
---SPC answer   (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=client AND asof BETWEEN residence_date AND COALESCE(residence_date_end,asof) AND NOT ro.is_deleted LIMIT 1)

END
--LIMIT 1
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION si_stable( client INTEGER ) RETURNS INTEGER AS $$
SELECT staff_inspector_stable(client,target_date());
$$ LANGUAGE SQL STABLE;

CREATE OR REPLACE FUNCTION si_imm( client INTEGER ) RETURNS INTEGER AS $$
SELECT staff_inspector_stable(client,target_date());
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION si_stable_calls_imm( client INTEGER ) RETURNS INTEGER AS $$
SELECT staff_inspector_imm(client,target_date());
$$ LANGUAGE SQL STABLE;

CREATE OR REPLACE FUNCTION si_imm_calls_imm( client INTEGER ) RETURNS INTEGER AS $$
SELECT staff_inspector_imm(client,target_date());
$$ LANGUAGE SQL IMMUTABLE;

And the version info (Postgres on Centos 6.10)

SELECT version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------PostgreSQL 9.6.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit
(1 row)

Happy to provide any additional relevant info, or for someone to point out what obvious thing I'm overlooking.  Thanks in advance!

Ken



--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Immutable function WAY slower than Stable function?

От
Tom Lane
Дата:
Ken Tanzer <ken.tanzer@gmail.com> writes:
> Hi.  I was recently troubleshooting a function, and realized it had
> incorrectly been declared as Immutable, when it should have been declared
> Stable.  When I changed it to Stable, the query I was running ran
> dramatically faster. Digging into this a little more, this is what I found:

EXPLAIN VERBOSE might shed a bit more light.  I suspect you're falling
foul of the rather arcane rules about whether a SQL function can be
inlined.  (Although it seems like only the wrapper function would be
inline-able anyway, so I'm not sure why the penalty is that high.)

            regards, tom lane


Re: Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:
On Mon, Aug 6, 2018 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ken Tanzer <ken.tanzer@gmail.com> writes:
> Hi.  I was recently troubleshooting a function, and realized it had
> incorrectly been declared as Immutable, when it should have been declared
> Stable.  When I changed it to Stable, the query I was running ran
> dramatically faster. Digging into this a little more, this is what I found:

EXPLAIN VERBOSE might shed a bit more light.  I suspect you're falling
foul of the rather arcane rules about whether a SQL function can be
inlined.  (Although it seems like only the wrapper function would be
inline-able anyway, so I'm not sure why the penalty is that high.)

                        regards, tom lane

Thanks Tom.  This was as talkative as I could get it.  I do see the difference on the Output line, but I don't know how to interperet it.

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_imm(client_id) FROM tbl_residence_own;                                                         QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------Seq Scan on public.tbl_residence_own  (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.771..22665.604 rows=6983 loops=1)  Output: client_id, si_imm(client_id)  Buffers: shared hit=199814Planning time: 0.156 msExecution time: 22677.333 ms
(5 rows)

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_stable(client_id) FROM tbl_residence_own;                                                         QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------Seq Scan on public.tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.100..1302.888 rows=6983 loops=1)  Output: client_id, staff_inspector_stable(client_id, target_date())  Buffers: shared hit=60174Planning time: 0.354 msExecution time: 1315.746 ms
(5 rows)

Cheers,
Ken

--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Immutable function WAY slower than Stable function?

От
Adrian Klaver
Дата:
On 08/06/2018 03:49 PM, Ken Tanzer wrote:
> Hi.  I was recently troubleshooting a function, and realized it had 
> incorrectly been declared as Immutable, when it should have been 
> declared Stable.  When I changed it to Stable, the query I was running 
> ran dramatically faster. Digging into this a little more, this is what I 
> found:
> 

> 
> Happy to provide any additional relevant info, or for someone to point 
> out what obvious thing I'm overlooking.  Thanks in advance!

What is the definition for target_date()?

> 
> Ken
> 
> 
> 
> -- 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:
On Mon, Aug 6, 2018 at 4:36 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

What is the definition for target_date()?

Hi Adrian.  Happy to provide this info.  Though on a side note, I don't understand why it should matter, if functions are black box optimization fences.  But here are the definitions:

CREATE OR REPLACE FUNCTION target_date() RETURNS date AS $$       SELECT target_date FROM target_date_current;
$$ LANGUAGE sql STABLE;
The target_date table and views:

CREATE TABLE tbl_target_date (   target_date_id          SERIAL PRIMARY KEY,   target_date         DATE NOT NULL,   effective_at        TIMESTAMP NOT NULL DEFAULT current_timestamp,   comment             TEXT,   --system fields   added_by            INTEGER NOT NULL REFERENCES tbl_staff (staff_id),   added_at            TIMESTAMP(0)     NOT NULL DEFAULT CURRENT_TIMESTAMP,   changed_by          INTEGER NOT NULL  REFERENCES tbl_staff (staff_id),   changed_at          TIMESTAMP(0)     NOT NULL DEFAULT CURRENT_TIMESTAMP,   is_deleted          BOOLEAN NOT NULL DEFAULT FALSE,   deleted_at          TIMESTAMP(0),   deleted_by          INTEGER REFERENCES tbl_staff(staff_id),   deleted_comment     TEXT,   sys_log         TEXT
);

CREATE VIEW target_date AS SELECT * FROM tbl_target_date WHERE NOT is_deleted;

CREATE VIEW target_date_current AS   SELECT * FROM target_date ORDER BY effective_at DESC LIMIT 1;

CREATE OR REPLACE FUNCTION target_date_no_edit_or_delete() RETURNS TRIGGER AS $$
   BEGIN   IF (TG_OP <> 'INSERT')   THEN RAISE EXCEPTION 'Target records cannot be changed or deleted.  (Attempted operation: %)',TG_OP;   END IF;   IF (NEW.target_date <> date_trunc('month',NEW.target_date))   THEN RAISE EXCEPTION 'Target date must be the first of a month';   END IF;   IF (NEW.target_date <= target_date())   THEN RAISE EXCEPTION 'Target date can only be moved forward';   END IF;   RETURN NEW;   END;

$$ LANGUAGE plpgsql;

CREATE TRIGGER protect_target_date BEFORE INSERT OR UPDATE OR DELETE ON tbl_target_date FOR EACH ROW EXECUTE PROCEDURE target_date_no_edit_or_delete();
CREATE TRIGGER target_date_no_trunacte BEFORE TRUNCATE ON tbl_target_date FOR STATEMENT EXECUTE PROCEDURE target_date_no_edit_or_delete();
And the tbl_residence_own, which was referenced in my sample queries:

spc=> \d tbl_residence_own                                                      Table "public.tbl_residence_own"           Column            |              Type              |                                  Modifiers
------------------------------+--------------------------------+------------------------------------------------------------------------------residence_own_id             | integer                        | not null default nextval('tbl_residence_own_residence_own_id_seq'::regclass)client_id                    | integer                        | not nullhousing_project_code         | character varying(10)          | not nullhousing_unit_code            | character varying(10)          | not nullresidence_date               | date                           | not nullresidence_date_end           | date                           |unit_rent_manual             | numeric(7,2)                   |utility_allowance_manual     | numeric(7,2)                   |is_active_manual             | boolean                        | not null default truewas_received_hap             | boolean                        |was_received_compliance      | boolean                        |moved_from_code              | character varying(10)          |chronic_homeless_status_code | character varying(10)          |lease_on_file                | boolean                        |moved_to_code                | character varying(10)          |departure_type_code          | character varying(10)          |departure_reason_code        | character varying(10)          |move_out_was_code            | character varying(10)          |returned_homeless            | boolean                        |was_deposit_returned         | boolean                        |comment_damage               | text                           |comment_deposit              | text                           |comment                      | text                           |old_access_id                | character varying              |old_utility_allowance        | numeric(9,2)                   |added_by                     | integer                        | not nulladded_at                     | timestamp(0) without time zone | not null default now()changed_by                   | integer                        | not nullchanged_at                   | timestamp(0) without time zone | not null default now()is_deleted                   | boolean                        | default falsedeleted_at                   | timestamp(0) without time zone |deleted_by                   | integer                        |deleted_comment              | text                           |sys_log                      | text                           |tenant_pays_deposit          | boolean                        | not null default falseis_coordinated_entry         | boolean                        |referral_source              | text                           |
Indexes:   "tbl_residence_own_pkey" PRIMARY KEY, btree (residence_own_id)   "tbl_residence_own_client_id" btree (client_id)   "tbl_residence_own_housing_project_code" btree (housing_project_code)   "tbl_residence_own_housing_unit_code" btree (housing_unit_code)   "tbl_residence_own_is_deleted" btree (is_deleted)   "tbl_residence_own_residence_date" btree (residence_date)   "tbl_residence_own_residence_date_end" btree (residence_date_end)
Check constraints:   "coordinated_entry_or_other" CHECK (xor(is_coordinated_entry, referral_source IS NOT NULL))   "date_sanity" CHECK (residence_date_end IS NULL OR residence_date <= residence_date_end)
Foreign-key constraints:   "tbl_residence_own_added_by_fkey" FOREIGN KEY (added_by) REFERENCES tbl_staff(staff_id)   "tbl_residence_own_changed_by_fkey" FOREIGN KEY (changed_by) REFERENCES tbl_staff(staff_id)   "tbl_residence_own_chronic_homeless_status_code_fkey" FOREIGN KEY (chronic_homeless_status_code) REFERENCES tbl_l_chronic_homeless_status(chronic_homeless_status_code)   "tbl_residence_own_client_id_fkey" FOREIGN KEY (client_id) REFERENCES tbl_client(client_id)   "tbl_residence_own_deleted_by_fkey" FOREIGN KEY (deleted_by) REFERENCES tbl_staff(staff_id)   "tbl_residence_own_departure_reason_code_fkey" FOREIGN KEY (departure_reason_code) REFERENCES tbl_l_departure_reason(departure_reason_code)   "tbl_residence_own_departure_type_code_fkey" FOREIGN KEY (departure_type_code) REFERENCES tbl_l_departure_type(departure_type_code)   "tbl_residence_own_housing_project_code_fkey" FOREIGN KEY (housing_project_code) REFERENCES tbl_l_housing_project(housing_project_code)   "tbl_residence_own_housing_unit_code_fkey" FOREIGN KEY (housing_unit_code) REFERENCES tbl_housing_unit(housing_unit_code)   "tbl_residence_own_move_out_was_code_fkey" FOREIGN KEY (move_out_was_code) REFERENCES tbl_l_exit_status(exit_status_code)   "tbl_residence_own_moved_from_code_fkey" FOREIGN KEY (moved_from_code) REFERENCES tbl_l_facility(facility_code)   "tbl_residence_own_moved_to_code_fkey" FOREIGN KEY (moved_to_code) REFERENCES tbl_l_facility(facility_code)
Triggers:   check_max_occupant AFTER INSERT OR UPDATE ON tbl_residence_own FOR EACH ROW EXECUTE PROCEDURE enforce_max_occupant()   tbl_residence_own_log_chg AFTER INSERT OR DELETE OR UPDATE ON tbl_residence_own FOR EACH ROW EXECUTE PROCEDURE table_log()   tbl_residence_own_no_unit_or_project_change BEFORE UPDATE ON tbl_residence_own FOR EACH ROW EXECUTE PROCEDURE tbl_residence_own_validate_modify()
Let me know if I can provide more info.  Thanks!

Ken


-- 
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Immutable function WAY slower than Stable function?

От
Adrian Klaver
Дата:
On 08/06/2018 04:44 PM, Ken Tanzer wrote:
> On Mon, Aug 6, 2018 at 4:36 PM Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> wrote:
> 
> 
>     What is the definition for target_date()?
> 
> 
> Hi Adrian.  Happy to provide this info.  Though on a side note, I don't 
> understand why it should matter, if functions are black box optimization 

Black boxes the bane of my existence, different field, control 
circuits:) My interest was because you are chaining functions and a 
chain is only as strong as it's weakest link. From the surface I am not 
seeing anything, will need to mull it over a bit.

> fences.  But here are the definitions:
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Immutable function WAY slower than Stable function?

От
Tom Lane
Дата:
Ken Tanzer <ken.tanzer@gmail.com> writes:
> Hi Adrian.  Happy to provide this info.  Though on a side note, I don't
> understand why it should matter, if functions are black box optimization
> fences.

They aren't, at least not when they are SQL-language functions that
meet the conditions for inlining.  The reason that EXPLAIN VERBOSE
is helpful here is that you can see whether the function got inlined
or not: do you see a call to the function, or a representation of
its body?

            regards, tom lane


Re: Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:
On Mon, Aug 6, 2018 at 7:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ken Tanzer <ken.tanzer@gmail.com> writes:
> Hi Adrian.  Happy to provide this info.  Though on a side note, I don't
> understand why it should matter, if functions are black box optimization
> fences.

They aren't, at least not when they are SQL-language functions that
meet the conditions for inlining. 

Yeah, I kinda realized after I sent this that I wasn't really making much sense.  :)  I appreciate your explanation though.

The reason that EXPLAIN VERBOSE
is helpful here is that you can see whether the function got inlined
or not: do you see a call to the function, or a representation of
its body?

I wasn't able to get anything more VERBOSE than what I sent before.  I re-copied it below.  (If there's another command or option I should be using, please advise.)

There is the difference in the two output lines, which I guess suggests that the stable one got inlined and the immutable one didn't?

   Output: client_id, si_imm(client_id)
   Output: client_id, staff_inspector_stable(client_id, target_date())

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_imm(client_id) FROM tbl_residence_own;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------Seq Scan on public.tbl_residence_own  (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.771..22665.604 rows=6983 loops=1)
   Output: client_id, si_imm(client_id)
   Buffers: shared hit=199814Planning time: 0.156 msExecution time: 22677.333 ms
(5 rows)

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_stable(client_id) FROM tbl_residence_own;                                                         QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------Seq Scan on public.tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.100..1302.888 rows=6983 loops=1)
   Output: client_id, staff_inspector_stable(client_id, target_date())
   Buffers: shared hit=60174Planning time: 0.354 msExecution time: 1315.746 ms
(5 rows)
Cheers,
Ken




--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Immutable function WAY slower than Stable function?

От
Laurenz Albe
Дата:
Ken Tanzer wrote:
> On Mon, Aug 6, 2018 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Ken Tanzer <ken.tanzer@gmail.com> writes:
> > > Hi.  I was recently troubleshooting a function, and realized it had
> > > incorrectly been declared as Immutable, when it should have been declared
> > > Stable.  When I changed it to Stable, the query I was running ran
> > > dramatically faster. Digging into this a little more, this is what I found:
>
> spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_imm(client_id) FROM tbl_residence_own;
>                                                           QUERY PLAN
        
 
>
-------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tbl_residence_own  (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.771..22665.604 rows=6983
loops=1)
>    Output: client_id, si_imm(client_id)
>    Buffers: shared hit=199814
>  Planning time: 0.156 ms
>  Execution time: 22677.333 ms
> (5 rows)
> 
> spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_stable(client_id) FROM tbl_residence_own;
>                                                           QUERY PLAN
       
 
>
------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.100..1302.888 rows=6983
loops=1)
>    Output: client_id, staff_inspector_stable(client_id, target_date())
>    Buffers: shared hit=60174
>  Planning time: 0.354 ms
>  Execution time: 1315.746 ms
> (5 rows)

That seems to say that the _stable function is running much faster.
Buth functions don't get inlined.

I'd dig into the functions and find out how long the queries in
them take.  auto_explain is a good helper for that.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com


Re: Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:
On Mon, Aug 6, 2018 at 11:05 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Ken Tanzer wrote:
> On Mon, Aug 6, 2018 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Ken Tanzer <ken.tanzer@gmail.com> writes:
>
> spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_imm(client_id) FROM tbl_residence_own;
>                                                           QUERY PLAN                                                           
> -------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tbl_residence_own  (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.771..22665.604 rows=6983 loops=1)
>    Output: client_id, si_imm(client_id)
>    Buffers: shared hit=199814
>  Planning time: 0.156 ms
>  Execution time: 22677.333 ms
> (5 rows)
>
> spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_stable(client_id) FROM tbl_residence_own;
>                                                           QUERY PLAN                                                         
> ------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tbl_residence_own  (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.100..1302.888 rows=6983 loops=1)
>    Output: client_id, staff_inspector_stable(client_id, target_date())
>    Buffers: shared hit=60174
>  Planning time: 0.354 ms
>  Execution time: 1315.746 ms
> (5 rows)

That seems to say that the _stable function is running much faster.

Agreed!
 
Buth functions don't get inlined.

OK, I clearly don't understand the output lines.  What does it mean then that the stable output line shows the underlying (two-argument) function, while the immutable one shows the convenience function?
 
Output: client_id, si_imm(client_id)
Output: client_id, staff_inspector_stable(client_id, target_date())

I'd dig into the functions and find out how long the queries in
them take.  auto_explain is a good helper for that.


I'm definitely not understanding why or how auto-explain would help here.  (Also, don't overlook the fact that both si_stable and si_imm have the exact same definition (except for stable v. immutable), and are calling the same function (staff_inspector_stable)).  I can load up the auto_explain stuff if it really will help, but in the meantime what I did was pull the query out of staff_inspector_stable, and inserted into the "SELECT client_id,staff_inspector( ... ) FROM tbl_residence_own" query, replacing the "client" (parameter 1) with the client_id from the tbl_residence_own, and replacing "asof" (parameter 2) with target_date(), which seems like it should closely mimic the original query.  

I ended up with the query below, and have attached the explain output to this email.  The actual timing is:
 Planning time: 2.737 msExecution time: 821.034 ms
Which is somewhat better than the function versions, though it seems unsurprisingly and not dramatically so.

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN   COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE   (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=5538 AND target_date() BETWEEN residence_date AND COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT 1)
END
FROM tbl_residence_own tro;

Let me know if that is not helpful.  Or if you need something from auto-explain, please help me with some more specifics.  Thanks!

Ken


--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.
Вложения

Re: Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:


On Tue, Aug 7, 2018 at 12:05 AM Ken Tanzer <ken.tanzer@gmail.com> wrote:
I'm definitely not understanding why or how auto-explain would help here.  (Also, don't overlook the fact that both si_stable and si_imm have the exact same definition (except for stable v. immutable), and are calling the same function (staff_inspector_stable)).  I can load up the auto_explain stuff if it really will help, but in the meantime what I did was pull the query out of staff_inspector_stable, and inserted into the "SELECT client_id,staff_inspector( ... ) FROM tbl_residence_own" query, replacing the "client" (parameter 1) with the client_id from the tbl_residence_own, and replacing "asof" (parameter 2) with target_date(), which seems like it should closely mimic the original query.  

I ended up with the query below, and have attached the explain output to this email.  The actual timing is:
 Planning time: 2.737 msExecution time: 821.034 ms
Which is somewhat better than the function versions, though it seems unsurprisingly and not dramatically so.

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN   COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE   (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=5538 AND target_date() BETWEEN residence_date AND COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT 1)
END
FROM tbl_residence_own tro;


Whoops, scratch that previous explain and query.  I accidentally left in a hard-coded client_id from earlier testing.  The correct query is:

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN
   COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
   (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=tro.client_id AND target_date() BETWEEN residence_date AND COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT 1)
END
FROM tbl_residence_own tro;
The corrected explain output is attached, and the actual timing was:

 Planning time: 2.741 ms
 Execution time: 2538.277 ms

Sorry for the confusion!  It's been a long day, and filled with many 20+ second waits for queries to finish. :)

Cheers,
Ken


--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.
Вложения

Re: Immutable function WAY slower than Stable function?

От
Laurenz Albe
Дата:
Ken Tanzer wrote: 
> > Buth functions don't get inlined.
> 
> OK, I clearly don't understand the output lines.  What does it mean then that the stable
> output line shows the underlying (two-argument) function, while the immutable one shows
> the convenience function?
>  
> Output: client_id, si_imm(client_id)
> Output: client_id, staff_inspector_stable(client_id, target_date())

I was wrong, obviously "si_imm" gets inlined and replaced with the other
function call in the STABLE case.

> > I'd dig into the functions and find out how long the queries in
> > them take.  auto_explain is a good helper for that.
> 
> I'm definitely not understanding why or how auto-explain would help here.  (Also, don't
> overlook the fact that both si_stable and si_imm have the exact same definition
> (except for stable v. immutable), and are calling the same function (staff_inspector_stable)).
> 
> Let me know if that is not helpful.  Or if you need something from auto-explain,
> please help me with some more specifics.  Thanks!

If you set

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 0
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_verbose = on
auto_explain.log_nested_statements = on

and run both the slow and the fast query, the log will contain the execution plans and
time for all SQL statements that were called from the functions.

Then you can identify in which nested query the time is spent, which should give us
more material to determine the cause.

The most likely explanation for the difference is that the same query is running with
different execution plans in both cases.

Do you notice a difference if you start a new database session and run the queries
several times?  Is there a difference in execution time from the sixth execution on?
If yes, generic plans may be part of the problem.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com


Re: Immutable function WAY slower than Stable function?

От
Adrian Klaver
Дата:
On 08/07/2018 12:38 AM, Ken Tanzer wrote:
> 
> 
> On Tue, Aug 7, 2018 at 12:05 AM Ken Tanzer <ken.tanzer@gmail.com 
> <mailto:ken.tanzer@gmail.com>> wrote:

> Whoops, scratch that previous explain and query.  I accidentally left in 
> a hard-coded client_id from earlier testing.  The correct query is:
> 
> EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
> CASE WHEN
> (SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN reg_spc_date AND
COALESCE(reg_spc_date_end,target_date())AND client_id=tro.client_id LIMIT 1)
 
> ILIKE 'SSP%' THEN
> 
>      COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND
agency_project_code='SSP'LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ'
ANDagency_project_code='SSP' LIMIT 1))
 
> ELSE
> 
>      (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE
client_id=tro.client_idAND target_date() BETWEEN residence_date AND COALESCE(residence_date_end,target_date()) AND NOT
ro.is_deletedLIMIT 1)
 
> END
> FROM tbl_residence_own tro;
> 
> The corrected explain output is attached, and the actual timing was:
> 
>   Planning time: 2.741 ms
>   Execution time: 2538.277 ms
> 
> Sorry for the confusion!  It's been a long day, and filled with many 20+ 
> second waits for queries to finish. :)

Hmm, whatever it is looks to be tied to si_* being IMMUTABLE. Just a 
thought but have you tried(NOTE: DEFAULT value):

CREATE OR REPLACE FUNCTION staff_inspector_stable( client INTEGER, asof 
DATE DEFAULT target_date()) RETURNS INTEGER AS $$
SELECT
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE asof BETWEEN reg_spc_date 
AND COALESCE(reg_spc_date_end,asof) AND client_id=client LIMIT 1)
ILIKE 'SSP%' THEN
--- SSP answer
     COALESCE((SELECT staff_id FROM staff_employment_current WHERE 
staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 
1),(SELECT staff_id FROM staff_employment_current WHERE 
staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
---SPC answer
     (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN 
l_housing_project USING (housing_project_code) WHERE client_id=client 
AND asof BETWEEN residence_date AND COALESCE(residence_date_end,asof) 
AND NOT ro.is_deleted LIMIT 1)

END
--LIMIT 1
$$ LANGUAGE SQL STABLE;

> 
> Cheers,
> Ken
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Immutable function WAY slower than Stable function?

От
Tom Lane
Дата:
Ken Tanzer <ken.tanzer@gmail.com> writes:
> On Mon, Aug 6, 2018 at 7:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The reason that EXPLAIN VERBOSE
>> is helpful here is that you can see whether the function got inlined
>> or not: do you see a call to the function, or a representation of
>> its body?

> There is the difference in the two output lines, which I guess suggests
> that the stable one got inlined and the immutable one didn't?

> *   Output: client_id, si_imm(client_id)

Yeah, exactly: here we have the original call to si_imm ...

> *   Output: client_id, staff_inspector_stable(client_id, target_date())

... but here, that's been replaced by si_imm's body.  So we've
successfully eliminated one layer of SQL-function call overhead.

The reason this happens in one case and not the other is that one of
the arcane rules for function inlining is that it's not allowed to make
the expression look more volatile than it was before.  So if si_imm()
is marked immutable, inlining it is disallowed if it would insert any
stable or volatile functions.  I can give you a disquisition on why
that is if you like, but it's not really relevant right now, since
clearly none of these functions should have been marked immutable.

Still, though, we've only eliminated the SQL-function overhead for one
out of three SQL functions that are getting called at each row, so you'd
think the inlining could not save more than 1/3rd of the runtime at the
absolute most.  How is it making more than an order-of-magnitude
difference?

I was confused about that too, so I set up a simple test case similar
to Ken's and poked into it a bit, and what I found out is that nested
SQL functions are just about completely broken performance-wise,
unless one or the other gets inlined :-(.  The reason is that the SQL
function mechanism is designed to cache the results of parsing and
planning the function body ... but only for the duration of the
calling query.  So if you issue a direct SQL command "SELECT myf(x)
FROM big_table", and myf() isn't inline-able, then myf's body gets
parsed/planned once and then re-used at each row of the table.  That's
not too awful.  But each execution of myf's body counts as a separate
"calling query" for this purpose, so if it's calling some other SQL
function that's also not inline-able, then we re-parse-and-plan that
other function for each row of big_table.  *That* is where the time
is going.

I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

In the meantime, I think you could dodge the issue by converting either
level of function into plpgsql.  Or just be careful about inline-ability
of convenience wrappers.

            regards, tom lane


Re: Immutable function WAY slower than Stable function?

От
Guyren Howe
Дата:
On Aug 7, 2018, at 9:10 , Tom Lane <tgl@sss.pgh.pa.us> wrote:

I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

Personally, I would love this. Far too few developers dig into doing things in the database. Conceptually, doing all the data things in the database should be faster, more versatile, more secure etc.

Modern SQL has tools that can make its awfulness tolerable. CTEs and SQL functions spring immediately to mind. But both suffer from optimization fences and corner cases like this that arguably make it not worth the effort.

Re: Immutable function WAY slower than Stable function?

От
Ken Tanzer
Дата:
On Tue, Aug 7, 2018 at 9:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I was confused about that too, so I set up a simple test case similar
to Ken's and poked into it a bit, and what I found out is that nested
SQL functions are just about completely broken performance-wise,
unless one or the other gets inlined :-(.  The reason is that the SQL
function mechanism is designed to cache the results of parsing and
planning the function body ... but only for the duration of the
calling query.  So if you issue a direct SQL command "SELECT myf(x)
FROM big_table", and myf() isn't inline-able, then myf's body gets
parsed/planned once and then re-used at each row of the table.  That's
not too awful.  But each execution of myf's body counts as a separate
"calling query" for this purpose, so if it's calling some other SQL
function that's also not inline-able, then we re-parse-and-plan that
other function for each row of big_table.  *That* is where the time
is going.


Glad you were able to dig into this and figure it out!

As a couple of follow-ups:

After I submitted the (corrected) explain for my attempt to inline-by-hand the SQL from the function directly into the query, that the inlined version took about twice as long as the call with the function.  (2.5 vs 1.3 seconds.)  That doesn't really make sense to me.  I guess I could use that auto_explain stuff to see what's going on, but if there's an obvious answer...

I've wondered in the past about Immutable functions, and particularly the scope & lifetime of "forever" in "guaranteed to return the same results given the same arguments forever."  I assume that's "for all users and all sessions," but either in theory or in practice is there a limit to how long a stale value might persist?  And, if you were to drop and recreate a function with the same name & parameters, would it start fresh at that point?  And is there a way to flush any caching?  (It's surely best to just declare Stable, but I'm wondering about cases that might have _very_ infrequently-changed values.)

Finally, I'm wondering if there's any legitimate reason for a immutable function to call anything stable or volatile, and if not would it make any sense to emit a warning when the function is created.
 
I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

That sounds great.  My earlier misguided statement notwithstanding, we've actually tried to write our functions in SQL wherever possible, assuming that the performance will be better than with a procedural language.

 
In the meantime, I think you could dodge the issue by converting either
level of function into plpgsql.  Or just be careful about inline-ability
of convenience wrappers.


Definitely yes on the second.  And to be clear, my actual problem was dodged before starting this thread.  I was aiming more towards understanding the behavior, which seemed either amiss or in need of some documentation.  And trying to be Postgres-helpful where I can in a non-developer way.  Kinda like "see something / say something" for software users. :)

Thanks again.

Ken

--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Immutable function WAY slower than Stable function?

От
Christophe Pettus
Дата:
> On Aug 7, 2018, at 11:42, Ken Tanzer <ken.tanzer@gmail.com> wrote:
> I assume that's "for all users and all sessions," but either in theory or in practice is there a limit to how long a
stalevalue might persist?  And, if you were to drop and recreate a function with the same name & parameters, would it
startfresh at that point?  And is there a way to flush any caching?  (It's surely best to just declare Stable, but I'm
wonderingabout cases that might have _very_ infrequently-changed values.) 

Well, the extreme case is an IMMUTABLE function used to create an expression index; then, the value lasts as long as
theindex does.  The best way to think of an IMMUTABLE is that it is a pure function, unchanged by system state.  (This
isone of the reasons that datetime-related functions are often STABLE rather than IMMUTABLE, due to time zone changes.) 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Immutable function WAY slower than Stable function?

От
Tom Lane
Дата:
Ken Tanzer <ken.tanzer@gmail.com> writes:
> I've wondered in the past about Immutable functions, and particularly the
> scope & lifetime of "forever" in "guaranteed to return the same results
> given the same arguments forever."  I assume that's "for all users and all
> sessions," but either in theory or in practice is there a limit to how long
> a stale value might persist?

The two issues I can think of offhand are (a) plans that depend on the
assumption of immutability might persist for the life of active sessions;
(b) if you created an index on the result of an immutable function,
changing the function's behavior would likely render the index
inconsistent.  (REINDEX would be enough to fix that, though.)

> And, if you were to drop and recreate a
> function with the same name & parameters, would it start fresh at that
> point?

DROP (not CREATE OR REPLACE) would surely get rid of any dependent
values.

> Finally, I'm wondering if there's any legitimate reason for a immutable
> function to call anything stable or volatile, and if not would it make any
> sense to emit a warning when the function is created.

We've been around on that before, and concluded that it's not worth it.
People have in the past used phony-immutable functions to coerce the
optimizer to do what they want --- in fact, that use case is one of the
reasons why we have the inlining restriction.  I'm not sure if such
practices are still popular, but somebody who was doing that would not
appreciate nannyism.

            regards, tom lane


Re: Immutable function WAY slower than Stable function?

От
Michal
Дата:
On 2018.08.07 18:10, Tom Lane wrote:
> I've had a to-do item to rewrite and improve the SQL function cache
> mechanism for a long time, but I hadn't thought it was high priority.
> Maybe it should be.
>
> In the meantime, I think you could dodge the issue by converting either
> level of function into plpgsql.  Or just be careful about inline-ability
> of convenience wrappers.

I now remember I stumbled across this same problem last year when I 
needed an immutable unaccent() function for the purpose of creating a 
functional index. I simply wrapped unaccent(), which is stable, in an 
immutable custom function (and accepting the need to recreate the index 
in case when unaccent rules change). I can't remember the specifics but 
I was also surprised to learn that query execution plans were not 
optimized at all when the function was immutable and in some cases they 
were even slower than when the function was stable, quite contrary to 
all logic and to what the docs say. So I think it would be great if this 
issue were resolved in Postgres.