Thread: ERROR: "invalid memory alloc request size" or "unexpected end of data" on large table
From:
Stephane Bailliez
Date:
(Posting on pgsql-perf as I'm questioning the pertinence of the settings; this might not be the best place for the overall problem: apologies.)

PostgreSQL 8.1.10
Linux Ubuntu: 2.6.17-12-server
4 GB RAM; the machine is only used for this.

I have fewer than 30 tables, 4 of them with between 10 and 40 million rows; size on disk is approximately 50 GB. Nothing spectacular about the install, it's mainly a sandbox.

Relevant bits of the postgresql.conf:

max_connections = 15
shared_buffers = 49152
work_mem = 16384
maintenance_work_mem = 32768
max_fsm_pages = 40000
effective_cache_size = 100000

I'm doing a rather 'simplistic' query, though heavy on hashing and aggregation. For the record:

select count(*) from action
 where action_date between '2007-10-01' and '2007-10-31';

returns 9647980.

The query is:

select tspent, count(*) from (
    select sum(time_spent)/60 as tspent from action
     where action_date between '2007-10-01' and '2007-10-31'
     group by action_date, user_id
) as a
group by tspent
order by tspent asc;

I receive a memory alloc error for a 1.5 GB request size, so I may have oversized something significantly that is exploding (work_mem?). (I was running an explain analyze and had a pgsql_tmp dir reaching 2.9 GB until it died with an error similar to the one from the query alone.)

ERROR: invalid memory alloc request size 1664639562
SQL state: XX000

Sometimes I instead get:

ERROR: unexpected end of data
SQL state: XX000

The table is along the lines of (sorry, I cannot give you the full table):

CREATE TABLE action (
    id SERIAL,
    action_date DATE NOT NULL,
    time_spent INT NOT NULL,
    user_id TEXT NOT NULL, -- user id is a 38 character string
    ...
);

CREATE INDEX action_action_date_idx ON action USING btree(action_date);

Here is an explain analyze for just one day:

"HashAggregate  (cost=709112.04..709114.54 rows=200 width=8) (actual time=9900.994..9902.188 rows=631 loops=1)"
"  ->  HashAggregate  (cost=706890.66..708001.35 rows=74046 width=49) (actual time=9377.654..9687.964 rows=122644 loops=1)"
"        ->  Bitmap Heap Scan on action  (cost=6579.73..701337.25 rows=740455 width=49) (actual time=2409.697..6756.027 rows=893351 loops=1)"
"              Recheck Cond: ((action_date >= '2007-10-01'::date) AND (action_date <= '2007-10-02'::date))"
"              ->  Bitmap Index Scan on action_action_date_idx  (cost=0.00..6579.73 rows=740455 width=0) (actual time=2373.837..2373.837 rows=893351 loops=1)"
"                    Index Cond: ((action_date >= '2007-10-01'::date) AND (action_date <= '2007-10-02'::date))"
"Total runtime: 9933.165 ms"

-- stephane
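[Editor's note: in PostgreSQL 8.1 these settings take no unit suffixes; shared_buffers and effective_cache_size are counted in 8 kB pages, while work_mem and maintenance_work_mem are in kB. A small Python sketch (my own arithmetic, not from the thread) converts the values above to megabytes, which shows that work_mem is only 16 MB and so is an unlikely source of a 1.5 GB allocation request:]

```python
# Convert the 8.1-era postgresql.conf values above to bytes for sanity checking.
# shared_buffers / effective_cache_size are in 8 kB pages; work_mem /
# maintenance_work_mem are in kB.
PAGE = 8 * 1024

settings = {
    "shared_buffers": 49152 * PAGE,         # pages -> bytes
    "work_mem": 16384 * 1024,               # kB -> bytes, per sort/hash node
    "maintenance_work_mem": 32768 * 1024,   # kB -> bytes
    "effective_cache_size": 100000 * PAGE,  # pages -> bytes (planner hint only)
}

for name, nbytes in settings.items():
    print(f"{name}: {nbytes / 1024**2:.0f} MB")
# shared_buffers: 384 MB, work_mem: 16 MB,
# maintenance_work_mem: 32 MB, effective_cache_size: 781 MB
```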
Stephane Bailliez <sbailliez@gmail.com> writes:
> ERROR: invalid memory alloc request size 1664639562

This sounds like corrupt data --- specifically, 1664639562 showing up
where a variable-width field's length word ought to be.  It may or may
not be relevant that the ASCII equivalent of that bit pattern is Jb8c ...
do you work with data that contains such substrings?

> Sometimes I do get:
> ERROR: unexpected end of data

If it's not 100% repeatable I'd start to wonder about flaky hardware.
Have you run memory and disk diagnostics on this machine recently?

regards, tom lane
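[Editor's note: Tom's ASCII observation can be checked with a short Python sketch. Interpreting the 32-bit value 1664639562 (0x6338624a) as little-endian bytes, the way an x86 backend would read a length word that ASCII data has overwritten, yields exactly the text "Jb8c":]

```python
import struct

# The bogus allocation request from the error message.
n = 1664639562

# View it as a 4-byte little-endian integer, as an x86 backend would see
# a varlena length word overwritten by ASCII data.
print(hex(n))                # 0x6338624a
print(struct.pack("<I", n))  # b'Jb8c'
```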
Re: ERROR: "invalid memory alloc request size" or "unexpected end of data" on large table
From:
Stephane Bailliez
Date:
Tom Lane wrote:
> This sounds like corrupt data --- specifically, 1664639562 showing
> up where a variable-width field's length word ought to be. It
> may or may not be relevant that the ASCII equivalent of that bit
> pattern is Jb8c ... do you work with data that contains such
> substrings?

Not specifically, but I cannot rule it out entirely: it 'could' be in one of the columns that may contain a combination of uppercase/lowercase/numbers; all other text entries would be lowercase.

> If it's not 100% repeatable I'd start to wonder about flaky hardware.
> Have you run memory and disk diagnostics on this machine recently?

I did extensive tests a month or two ago (long crunching queries running non-stop for 24h) which were OK, but honestly I cannot say I'm very trusting of this particular hardware. I would need to take it offline and memtest it for good, obviously.

I moved some data (and a bit more; the same table has 35M rows) to another machine (also 8.1.10 on Ubuntu with 2.6.17-10 SMP, 2 Xeon 2 GHz instead of a P4 3 GHz) and it passes with flying colors (though it's much, much slower with the same settings, so I need to check a few things there; I had the tmp dir topping 3 GB, so I'm not sure if I could have had more in memory).

Thanks for the insight, Tom.

-- stephane
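[Editor's note: if the corruption needs to be localized, one common approach (a hypothetical sketch, not discussed in the thread) is to bisect over the table's serial id column: repeatedly run "SELECT count(*) FROM action WHERE id BETWEEN lo AND hi" and narrow toward the range where the backend raises the alloc error. The probe function below is a stand-in for actually issuing that query:]

```python
# Sketch: bisect for the id range whose rows trigger the error.
# `probe(lo, hi)` is a hypothetical callback that runs
#     SELECT count(*) FROM action WHERE id BETWEEN lo AND hi
# and returns False when the backend raises the alloc error.
def find_bad_range(lo, hi, probe, min_span=1):
    while hi - lo > min_span:
        mid = (lo + hi) // 2
        if not probe(lo, mid):
            hi = mid            # error reproduces in the lower half
        elif not probe(mid + 1, hi):
            lo = mid + 1        # error reproduces in the upper half
        else:
            return None         # error did not reproduce in either half
    return (lo, hi)


# Usage with a fake probe that "fails" whenever row id 42 is in range:
bad = find_bad_range(1, 1000, lambda lo, hi: not (lo <= 42 <= hi))
print(bad)  # a narrow (lo, hi) span containing id 42
```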